Bug 1301571 - [hosted-engine-ha] Over iSCSI, VM doesn't start automatically; "failed to retrieve Hosted Engine HA info"
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: ovirt-hosted-engine-ha
Classification: oVirt
Component: Broker
Version: 1.3.3.6
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-3.6.3
Target Release: ---
Assignee: Martin Sivák
QA Contact: Elad
URL:
Whiteboard: sla
Depends On:
Blocks:
Reported: 2016-01-25 12:08 UTC by Elad
Modified: 2016-01-26 12:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-26 12:27:51 UTC
oVirt Team: SLA
Embargoed:
ylavi: ovirt-3.6.z?
rule-engine: blocker?
ebenahar: planning_ack?
ebenahar: devel_ack?
ebenahar: testing_ack?


Attachments:
HE logs, vdsm.log, messages (2.05 MB, application/x-gzip)
2016-01-25 12:08 UTC, Elad

Description Elad 2016-01-25 12:08:40 UTC
Created attachment 1117959
HE logs, vdsm.log, messages

Description of problem:
Deployed hosted-engine over iSCSI. During deployment, the following error messages appeared in setup.log:

Jan 25 12:10:14 green-vdsc.qa.lab.tlv.redhat.com vdsm[1005]: vdsm vds ERROR failed to retrieve Hosted Engine HA info
                                                             Traceback (most recent call last):
                                                               File "/usr/share/vdsm/API.py", line 1842, in _getHaInfo
                                                                 stats = instance.get_all_stats()
                                                               File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
                                                                 with broker.connection(self._retries, self._wait):
                                                               File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
                                                                 return self.gen.next()
                                                               File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
                                                                 self.connect(retries, wait)
                                                               File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
                                                                 raise BrokerConnectionError(error_msg)
                                                             BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)
Jan 25 12:10:31 green-vdsc.qa.lab.tlv.redhat.com vdsm[1005]: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect to broker, the number of errors has exceeded the limit (1)
Jan 25 12:10:31 green-vdsc.qa.lab.tlv.redhat.com vdsm[1005]: vdsm vds ERROR failed to retrieve Hosted Engine HA info
                                                             Traceback (most recent call last):
                                                               File "/usr/share/vdsm/API.py", line 1842, in _getHaInfo
                                                                 stats = instance.get_all_stats()
                                                               File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
                                                                 with broker.connection(self._retries, self._wait):
                                                               File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
                                                                 return self.gen.next()
                                                               File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
                                                                 self.connect(retries, wait)
                                                               File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
                                                                 raise BrokerConnectionError(error_msg)
                                                             BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)


HE VM did not start automatically.
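
The tracebacks above come from a context manager that tries to connect to the broker and raises once a retry limit is reached. The following is only a minimal sketch of that general pattern, not the ovirt_hosted_engine_ha code: the connect callable, the parameter semantics, and this BrokerConnectionError class are hypothetical stand-ins.

```python
import time
from contextlib import contextmanager


class BrokerConnectionError(Exception):
    """Hypothetical stand-in for the exception named in the traceback."""


@contextmanager
def connection(connect, retries=1, wait=0.0):
    """Yield a connection, giving up after `retries` failed attempts.

    `connect` is any callable that returns a connection object or
    raises OSError (assumption made for this sketch).
    """
    errors = 0
    while True:
        try:
            conn = connect()
            break
        except OSError:
            errors += 1
            if errors >= retries:
                raise BrokerConnectionError(
                    "gave up after %d failed attempt(s)" % errors
                )
            time.sleep(wait)
    try:
        yield conn
    finally:
        # Close the connection if the object supports it.
        close = getattr(conn, "close", None)
        if close is not None:
            close()
```

With a limit of 1, as in the log above, the first failed attempt is reported to the caller immediately, which is why a broker that is still starting up can produce this error even though the service becomes active shortly afterwards.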

Version-Release number of selected component (if applicable):
ovirt-vmconsole-1.0.0-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch
ovirt-setup-lib-1.0.1-1.el7ev.noarch
ovirt-vmconsole-host-1.0.0-1.el7ev.noarch
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-hosted-engine-setup-1.3.2.3-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.7-1.el7ev.noarch
vdsm-xmlrpc-4.17.18-0.el7ev.noarch
vdsm-4.17.18-0.el7ev.noarch
vdsm-python-4.17.18-0.el7ev.noarch
vdsm-hook-vmfex-dev-4.17.18-0.el7ev.noarch
vdsm-jsonrpc-4.17.18-0.el7ev.noarch
vdsm-yajsonrpc-4.17.18-0.el7ev.noarch
vdsm-cli-4.17.18-0.el7ev.noarch
vdsm-infra-4.17.18-0.el7ev.noarch


How reproducible:
Over iSCSI - Always

Steps to Reproduce:
1. Deploy hosted engine over iSCSI


Actual results:
At the end of the deployment, HE VM does not start automatically:

[root@green-vdsc ~]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : green-vdsc.qa.lab.tlv.redhat.com
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : True
crc32                              : 2e0a207d
Host timestamp                     : 73063


Both the ha-broker and ha-agent services are active.
Starting the VM manually with hosted-engine --vm-start succeeded.

Expected results:
The HE VM should start automatically at the end of the deployment.

Additional info:
HE logs, vdsm.log, messages

Comment 1 Red Hat Bugzilla Rules Engine 2016-01-25 12:44:08 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 2 Martin Sivák 2016-01-25 13:12:02 UTC
Elad, did you use clean storage? How did you install the host?

The tracebacks seem to be of no consequence as the reason for not starting the VM automatically is actually pretty simple:

Local maintenance                  : True

I do not think this is a bug since you were able to start the VM manually and get the status correctly.

Can you please explain how the maintenance mode happened? And try hosted-engine --set-maintenance --mode=none to see whether it will start the vm automatically (it might take a minute or so to initiate the start)?
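
The check described above can be sketched as a small script that parses the "Key : value" lines printed by hosted-engine --vm-status and reports whether local maintenance explains why the VM is down. The function names are hypothetical, and the sample text is copied from the report; real output may list several hosts, in which case this simple parser would keep only the last value seen for each key.

```python
import json

# Sample output copied from the bug description.
SAMPLE = """\
--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : green-vdsc.qa.lab.tlv.redhat.com
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : True
crc32                              : 2e0a207d
Host timestamp                     : 73063
"""


def parse_vm_status(text):
    """Collect 'Key : value' lines into a dict; other lines are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def why_not_started(fields):
    """Return a short reason the engine VM is not running, or None."""
    if fields.get("Local maintenance") == "True":
        return "host is in local maintenance"
    engine = json.loads(fields.get("Engine status", "{}"))
    if engine.get("vm") == "down":
        return "VM is down: " + engine.get("reason", "unknown reason")
    return None


fields = parse_vm_status(SAMPLE)
print(why_not_started(fields))
```

For the status shown in the description, the maintenance flag is reported before the engine status is even examined, matching the diagnosis in this comment.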

Comment 3 Elad 2016-01-26 09:17:10 UTC
(In reply to Martin Sivák from comment #2)
> Elad, did you use clean storage? How did you install the host?

The storage I'm using is clean: I create a new LUN for each HE deployment and clean up the old ones.

> The tracebacks seem to be of no consequence as the reason for not starting
> the VM automatically is actually pretty simple:
> 
> Local maintenance                  : True
> 
> I do not think this is a bug since you were able to start the VM manually
> and get the status correctly.
> 
> Can you please explain how the maintenance mode happened? And try
> hosted-engine --set-maintenance --mode=none to see whether it will start the
> vm automatically (it might take a minute or so to initiate the start)?


I did not do anything to cause this; it was just a regular deployment over iSCSI.

I tested this 4 times on 2 different hosts and it reproduced 4/4 times.

Comment 4 Martin Sivák 2016-01-26 10:20:29 UTC
Hmm, were the hosts clean? No old hosted engine config files or similar?

Simone: are we setting the maintenance mode during deploy somehow?

Comment 5 Simone Tiraboschi 2016-01-26 10:23:33 UTC
(In reply to Martin Sivák from comment #4)
> Simone: are we setting the maintenance mode during deploy somehow?

No, we don't

Comment 6 Elad 2016-01-26 12:27:51 UTC
Martin, following your comment #4, I re-installed my host before deploying. At the end of the deployment, the VM started automatically.
Closing as WORKSFORME.

