Bug 1569601

Summary: vdsm root ERROR failed to retrieve Hosted Engine HA info
Product: [oVirt] vdsm
Reporter: Petr Kubica <pkubica>
Component: General
Assignee: Dan Kenigsberg <danken>
Status: CLOSED DUPLICATE
QA Contact: Nikolai Sednev <nsednev>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.19.41
CC: bugs, dfediuck, pkubica
Target Milestone: ---
Keywords: Automation, AutomationBlocker
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-06-04 16:26:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Petr Kubica 2018-04-19 14:45:47 UTC
Description of problem:
The following exceptions were discovered during automation testing.

vdsm root ERROR failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 231, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 106, in get_all_stats
    stats = broker.get_stats_from_storage(service)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 233, in get_stats_from_storage
    result = self._checked_communicate(request)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 261, in _checked_communicate
    .format(message or response))
RequestError: Request failed: failed to read metadata: [Errno 2] No such file or directory: '/rhev/data-center/mnt/blockSD/66b761d5-5889-4f5e-bc3e-80ef2384b841/ha_agent/hosted-engine.metadata'

vdsm root ERROR failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 231, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 103, in get_all_stats
    with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect(retries, wait)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)
BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)

Version-Release number of selected component (if applicable):
vdsm-4.19.51-1.el7ev.x86_64

Steps to Reproduce:
Unclear; it happens during several different tests.

Actual results:
Exceptions in the journal log of the vdsmd service

Expected results:
No exceptions in the journal log of the vdsmd service
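
As an illustration of the expected result, here is a minimal Python sketch of logging a single ERROR line without a traceback when the broker call fails in an anticipated way. This is not the vdsm code: `BrokerError`, `get_ha_info`, `fetch_stats` and the logger name are hypothetical stand-ins for the errors and calls shown in the tracebacks above.

```python
import logging

# Hypothetical logger name; the real service logs under its own logger.
log = logging.getLogger("ha_info_sketch")


class BrokerError(Exception):
    """Hypothetical stand-in for errors such as RequestError or
    BrokerConnectionError seen in the tracebacks."""


def get_ha_info(fetch_stats):
    """Return HA stats, or None if they cannot be retrieved.

    fetch_stats is a hypothetical callable standing in for
    instance.get_all_stats() from the traceback.
    """
    try:
        return fetch_stats()
    except BrokerError as e:
        # Anticipated failure: one ERROR line, no traceback.
        log.error("failed to retrieve Hosted Engine HA info: %s", e)
        return None
    except Exception:
        # Unanticipated failure: keep the full traceback for debugging.
        log.exception("failed to retrieve Hosted Engine HA info")
        return None
```

The design choice is to reserve `log.exception` for failures that are not anticipated, so that a broker or storage outage produces an error message rather than a repeated stack trace.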

Comment 2 Doron Fediuck 2018-04-20 08:58:13 UTC
This may happen if you did anything causing the engine to disconnect the storage from the relevant host.
Which RHV version is this?
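
The storage-disconnect hypothesis could be checked on the affected host with a small sketch like the one below. It only inspects the file system; the helper name `metadata_status` is hypothetical, and the idea that the metadata path may be a symbolic link is an assumption rather than something stated in this report.

```python
import os


def metadata_status(path):
    """Classify why a metadata file might be unreadable.

    Hypothetical diagnostic helper; it does not talk to vdsm or the broker.
    """
    if os.path.exists(path):
        # exists() follows symlinks, so the target is reachable.
        return "present"
    if os.path.lexists(path):
        # The link itself exists but its target does not.
        return "dangling symlink"
    if not os.path.isdir(os.path.dirname(path)):
        # For example, the storage domain is no longer mounted or linked.
        return "parent directory missing"
    return "file missing"
```

For example, calling it with the path from the first traceback, `/rhev/data-center/mnt/blockSD/66b761d5-5889-4f5e-bc3e-80ef2384b841/ha_agent/hosted-engine.metadata`, would indicate whether the file, its link target, or the whole `ha_agent` directory is absent.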

Comment 3 Petr Kubica 2018-04-20 09:59:23 UTC
It was rhv-4.2.3-1 (exception is also in rhv-4.2.3-2) and probably it is caused by some test(s) but I think it wasn't anything destructive.

Comment 4 Sandro Bonazzola 2018-04-20 10:41:49 UTC
(In reply to Petr Kubica from comment #3)
> It was rhv-4.2.3-1 (exception is also in rhv-4.2.3-2) and probably it is
> caused by some test(s) but I think it wasn't anything destructive.

This can't be 4.2.3-1, since you said this happens with vdsm-4.19.51-1.el7ev.x86_64, which is RHV 4.1.11, not 4.2.3.

Comment 5 Petr Kubica 2018-04-22 07:47:18 UTC
Sorry Sandro,
This issue also happens in a different form on rhv-4.2.3 (see the See Also field); I mixed them up.

This was actually on rhv-4.1.11-2

Comment 6 Martin Sivák 2018-04-25 10:19:18 UTC
Is this the same situation as in https://bugzilla.redhat.com/show_bug.cgi?id=1569593 ?

Comment 7 Petr Kubica 2018-05-16 09:39:55 UTC
Yes, 
"User shouldn't see multiple printed exceptions in journal log. User should see only errors in this case. I cannot give you exact reproduction steps how to achieve this exception. I have only logs from automation from all machines which I posted in comment #1"

Comment 8 Martin Sivák 2018-06-04 16:26:40 UTC

*** This bug has been marked as a duplicate of bug 1569593 ***