Bug 1397005 - engine-setup fails on checking if dwhd is running
Summary: engine-setup fails on checking if dwhd is running
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Setup.Engine
Version: 4.0.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-4.0.6
Target Release: ---
Assignee: Yedidyah Bar David
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-21 11:36 UTC by Petr Matyáš
Modified: 2017-05-11 09:26 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-11-28 11:29:45 UTC
oVirt Team: Metrics
Embargoed:
rule-engine: ovirt-4.0.z+


Attachments
engine log (228.60 KB, text/plain)
2016-11-21 11:36 UTC, Petr Matyáš
dwhd log (65.96 KB, text/plain)
2016-11-21 12:30 UTC, Petr Matyáš
postgresql logs (1.33 KB, application/octet-stream)
2016-11-21 12:36 UTC, Petr Matyáš
yum log (125.03 KB, text/plain)
2016-11-21 13:33 UTC, Petr Matyáš
engine setup log (449.05 KB, text/plain)
2016-11-21 14:05 UTC, Petr Matyáš

Description Petr Matyáš 2016-11-21 11:36:52 UTC
Created attachment 1222361
engine log

Description of problem:
When engine-setup is run after upgrading from 4.0.5-7 to 4.0.6-1, it fails at the check for whether dwhd is running, even though the dwhd service is not running and should already have been stopped.
DWH is local, not remote.

Version-Release number of selected component (if applicable):
4.0.6-1

How reproducible:
always

Steps to Reproduce:
1. install 4.0.5-7
2. upgrade packages to 4.0.6-1
3. run engine-setup

Actual results:
engine-setup fails on the "dwhd is running" check

Expected results:
successful engine-setup

Additional info:
[ INFO  ] Stopping dwh service
[ INFO  ] Stopping Image I/O Proxy service
[ INFO  ] Stopping websocket-proxy service
[ ERROR ] dwhd is currently running. Its hostname is pm-rh40.rhev.lab.eng.brq.redhat.com. Please stop it before running Setup.
[ ERROR ] Failed to execute stage 'Transaction setup': dwhd is currently running

Comment 1 Yedidyah Bar David 2016-11-21 12:27:32 UTC
Please attach postgresql logs. Most likely this is a duplicate of bug 1286441.

Comment 2 Yedidyah Bar David 2016-11-21 12:28:14 UTC
However, considering the number of duplicates it has, perhaps we should try to do something.

Comment 3 Yedidyah Bar David 2016-11-21 12:28:56 UTC
And please also attach dwhd logs. Thanks.

Comment 4 Petr Matyáš 2016-11-21 12:30:34 UTC
Created attachment 1222379
dwhd log

Comment 5 Petr Matyáš 2016-11-21 12:36:31 UTC
Created attachment 1222380
postgresql logs

Comment 6 Yedidyah Bar David 2016-11-21 12:42:39 UTC
Please also attach yum logs. Sorry :-( Thanks.

Comment 7 Petr Matyáš 2016-11-21 13:33:19 UTC
Created attachment 1222397
yum log

Comment 8 Yedidyah Bar David 2016-11-21 13:42:26 UTC
Please also attach setup logs. Thanks.

Comment 9 Petr Matyáš 2016-11-21 14:05:36 UTC
Created attachment 1222399
engine setup log

...I should have packed the whole /var/log folder...

Comment 10 Yedidyah Bar David 2016-11-21 14:42:12 UTC
1. yum updated postgresql. yum.log:

Nov 21 10:15:35 Updated: postgresql-server-9.2.18-1.el7.x86_64

2. dwhd fails to reconnect. First error in dwhd log:

2016-11-21 10:18:47|sIAAti|yb0VIK|q0Ha1J|OVIRT_ENGINE_DWH|OsEnumUpdate|Default|6|Java Exception|tJDBCInput_4|org.postgresql.util.PSQLException:FATAL: terminating connection due to administrator command|1

It seems it never succeeded in reconnecting.

3. engine-setup stops dwhd "successfully":

2016-11-21 10:35:13 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'stop', 'ovirt-engine-dwhd.service'), rc=0

Nothing in stdout/stderr.

4. engine-setup checks the db and sees it's still "up":

2016-11-21 10:35:13 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:222 Result: [{'var_value': '1', 'var_datetime': None, 'var_name': 'DwhCurrentlyRunning'}]

Shirly, please have a look. This indeed seems like a duplicate of bug 1286441. But the fix there was supposed to work in this case: about 20 minutes passed between upgrading PG and stopping dwhd, which should have been enough for dwhd to reconnect. It might be related to the upgrade. IIRC all the duplicates there are upgrades. Perhaps dwhd with the old pg client library can't connect to the upgraded pg server.
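
For anyone re-checking the timeline in comment 10, here is a minimal sketch in Python. It is not engine-setup code: the helper names (parse_result_rows, dwh_flag_is_set, minutes_between) are hypothetical, and the only inputs are the log lines quoted above. It reads the DwhCurrentlyRunning row from the setup log line in step 4 and computes the gap between the yum update (step 1) and the systemctl stop (step 3). yum.log lines carry no year, so 2016 is assumed from the other logs.

```python
import ast
from datetime import datetime

# Setup log line quoted in step 4 of comment 10.
SETUP_LINE = (
    "2016-11-21 10:35:13 DEBUG otopi.ovirt_engine_setup.engine_common.database "
    "database.execute:222 Result: [{'var_value': '1', 'var_datetime': None, "
    "'var_name': 'DwhCurrentlyRunning'}]"
)


def parse_result_rows(line):
    """Return the list of row dicts that follows 'Result: ' in a setup log line."""
    _, _, payload = line.partition("Result: ")
    # The payload is a Python literal (list of dicts), so literal_eval is enough.
    return ast.literal_eval(payload)


def dwh_flag_is_set(rows):
    """True if any row reports DwhCurrentlyRunning with value '1'."""
    return any(
        row.get("var_name") == "DwhCurrentlyRunning" and row.get("var_value") == "1"
        for row in rows
    )


def minutes_between(start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """Elapsed minutes between two timestamps given in the same format."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


rows = parse_result_rows(SETUP_LINE)
flag_set = dwh_flag_is_set(rows)

# Step 1 (yum update, year assumed to be 2016) to step 3 (systemctl stop).
gap = minutes_between("2016-11-21 10:15:35", "2016-11-21 10:35:13")

print("DwhCurrentlyRunning still set after stop:", flag_set)
print("Minutes between PG update and dwhd stop:", round(gap, 1))
```

With the quoted lines, the flag is still set even though systemctl returned rc=0, and the gap comes to a little under 20 minutes, which matches the estimate above.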

Comment 12 Petr Matyáš 2016-11-28 11:29:45 UTC
This issue must have been related to some particular setup, and I can't reproduce it anymore.

