Bug 1383118

Summary: excessive SSL errors in the logs (vdsm.log) after hosted-engine update
Product: Red Hat Enterprise Virtualization Manager
Reporter: Marcus West <mwest>
Component: vdsm
Assignee: Dan Kenigsberg <danken>
Status: CLOSED DUPLICATE
QA Contact: Aharon Canan <acanan>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.0.3
CC: bazulay, bmcclain, fdeutsch, gklein, lsurette, masayag, mkalinin, mperina, mwest, pkliczew, rnori, srevivo, stirabos, ycui, ykaul
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-03 11:00:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marcus West 2016-10-10 01:11:38 UTC
## Description of problem:

After a recent upgrade to 4.0, vdsm.log shows excessive SSL errors:

JsonRpc (StompReactor)::ERROR::2016-10-10 10:07:03,341::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-10-10 10:07:08,792::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-10-10 10:07:13,392::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-10-10 10:07:28,223::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-10-10 10:07:34,641::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-10-10 10:07:43,040::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-10-10 10:07:48,490::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
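
The rate of these errors can be measured with a short script. The following is a minimal sketch that is not part of the original report: it assumes the `thread::LEVEL::timestamp::module::line::logger::(function) message` layout seen in the lines above, and the function names (`ssl_error_times`, `errors_per_minute`, `mean_gap_seconds`) are hypothetical helpers introduced only for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout, taken from the log lines quoted above:
# thread::LEVEL::timestamp,msec::module::lineno::logger::(func) message
LINE_RE = re.compile(
    r"^(?P<thread>.+?)::(?P<level>[A-Z]+)::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>[\w.]+)::"
    r"\((?P<func>\w+)\) (?P<msg>.*)$"
)


def ssl_error_times(lines):
    """Return the timestamps (second resolution) of ERROR lines mentioning an SSL error."""
    times = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == "ERROR" and "SSL error" in m.group("msg"):
            times.append(datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S"))
    return times


def errors_per_minute(times):
    """Count matching errors per calendar minute."""
    return Counter(t.strftime("%Y-%m-%d %H:%M") for t in times)


def mean_gap_seconds(times):
    """Average gap between consecutive errors, or None with fewer than two errors."""
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps) if gaps else None
```

Applied to the seven lines quoted above, this counts seven errors within the minute 10:07, with a mean gap of 7.5 seconds between them (milliseconds are ignored). To run it against a whole file, pass an open file object, for example `ssl_error_times(open("/var/log/vdsm/vdsm.log"))`; that path is the usual default location and is an assumption here.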

## Version-Release number of selected component (if applicable):

RHV 4.0
vdsm-4.18.13-1.el7ev.x86_64
ovirt-hosted-engine-setup-2.0.2.2-2.el7ev.noarch
ovirt-hosted-engine-ha-2.0.4-1.el7ev.noarch

## How reproducible:

Always

## Additional info:

It seems to be related to ovirt-ha-agent: if I stop that service, the errors stop.

Comment 2 Piotr Kliczewski 2016-10-11 09:50:25 UTC
Someone reported a similar issue on the users list, and after looking at the engine log we saw it was related to BZ #1371515. Please provide the engine log and, if this is the same issue, update your engine to the latest 4.0.

Comment 3 Marcus West 2016-10-11 23:29:34 UTC
Hi Piotr, I don't see any of those GetUserProfileQuery errors in my engine.log; I'll attach mine for confirmation. It's a recent upgrade from a 3.6 environment, currently on ovirt-engine-4.0.4.4-0.1.el7ev.noarch.

Comment 6 Piotr Kliczewski 2016-10-12 07:50:58 UTC
Looking at the engine logs, I can see that the host upgrade manager was started:

2016-10-06 15:00:53,591 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-4) [29ff55ba] Correlation ID: 29ff55ba, Call Stack: null, Custom Event ID: -1, Message: Host intelh6.rhev.gsslab.bne.redhat.com upgrade was started (User: admin@internal-authz).
2016-10-06 15:00:54,260 INFO  [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback] (DefaultQuartzScheduler10) [29ff55ba] Host 'intelh6.rhev.gsslab.bne.redhat.com' is on maintenance mode. Proceeding with Upgrade process.

and just after it finished we lost connectivity with the host:

2016-10-06 15:16:05,981 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (DefaultQuartzScheduler7) [1eac2823] Command 'GetAllVmStatsVDSCommand(HostName = intelh6.rhev.gsslab.bne.redhat.com, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='a9e052f6-d953-41cb-b46a-c6d284855d63', vds='Host[intelh6.rhev.gsslab.bne.redhat.com,a9e052f6-d953-41cb-b46a-c6d284855d63]'})' execution failed: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection failed

Moti, please take a look at the logs.

Comment 7 Martin Perina 2016-10-12 08:10:45 UTC
Moti is on PTO. Ravi, could you please take a look?

Comment 8 Piotr Kliczewski 2016-10-18 09:19:56 UTC
Simone, can you please take a look?

Comment 9 Ravi Nori 2016-10-24 14:13:42 UTC
According to the logs, the node stopped responding on reboot after the packages redhat-virtualization-host-image-update and redhat-virtualization-host-image-update-placeholder were updated. There are no errors during the update process, so it could be that the host network failed to initialize after the reboot.

Comment 10 Martin Perina 2016-10-24 15:24:27 UTC
Fabian, could you please take a look?

Comment 11 Fabian Deutsch 2016-10-28 09:29:47 UTC
A few questions:

1. Not explicitly stated, but: How did you "create" the RHVH 4 host? Was it a fresh install?
2. Did the upgrade complete successfully?
3. Did the host reboot?
4. Did the networking on the host come up correctly?
5. Were you able to manually activate the host?
6. Was everything working after you performed the activation?

Comment 12 Marcus West 2016-10-30 23:33:18 UTC
Hi Fabian,

> 1. Not explicitly stated, but: How did you "create" the RHVH 4 host? Was it a fresh install?
Yes, the hypervisors were wiped and freshly installed. They were previously installed with rhev-h (rhel7, for rhev 3.6).

> 2. Did the upgrade complete successfully?
'hosted-engine --deploy' created the VM successfully, but stopped at the point where it loads the restore database. I did this step manually, and then the upgrade was effectively completed. It's not obvious anywhere in the logs why the db was copied over but failed to load.

> 3. Did the host reboot?
I have rebooted the hosts, but the excessive logging is still there.

> 4. Did the networking on the host come up correctly?
Networking seems to be fine.

> 5. Were you able to manually activate the host?
Hosts appear to activate fine.

> 6. Was everything working after you performed the activation?
Apart from having to finish the upgrade manually, the RHEV environment seems to work fine.

Comment 13 Fabian Deutsch 2016-10-31 10:03:10 UTC
Oh, so it's about hosted engine, and it seems to have happened during an HE flow.

Comment 15 Piotr Kliczewski 2016-11-02 08:43:59 UTC
I think that this BZ is related to BZ #1349829.

Comment 16 Martin Perina 2016-11-03 11:00:05 UTC
Closing as a duplicate.

*** This bug has been marked as a duplicate of bug 1349829 ***