Bug 1301571
Summary: | [hosted-engine-ha] Over iSCSI, VM doesn't start automatically; "failed to retrieve Hosted Engine HA info" | ||||||
---|---|---|---|---|---|---|---|
Product: | [oVirt] ovirt-hosted-engine-ha | Reporter: | Elad <ebenahar> | ||||
Component: | Broker | Assignee: | Martin Sivák <msivak> | ||||
Status: | CLOSED WORKSFORME | QA Contact: | Elad <ebenahar> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 1.3.3.6 | CC: | acanan, bugs, dfediuck, ebenahar, stirabos, ylavi | ||||
Target Milestone: | ovirt-3.6.3 | Keywords: | Regression | ||||
Target Release: | --- | Flags: | ylavi: ovirt-3.6.z? rule-engine: blocker? ebenahar: planning_ack? ebenahar: devel_ack? ebenahar: testing_ack? | ||||
Hardware: | x86_64 | ||||||
OS: | Unspecified | ||||||
Whiteboard: | sla | ||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2016-01-26 12:27:51 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | SLA | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Elad, did you use clean storage? How did you install the host?

The tracebacks seem to be of no consequence as the reason for not starting the VM automatically is actually pretty simple:

Local maintenance : True

I do not think this is a bug since you were able to start the VM manually and get the status correctly.

Can you please explain how the maintenance mode happened? And try hosted-engine --set-maintenance --mode=none to see whether it will start the vm automatically (it might take a minute or so to initiate the start)?

(In reply to Martin Sivák from comment #2)
> Elad, did you use clean storage? How did you install the host?

The storage I'm using is clean. I'm creating a new LUN for each HE deployment and cleaning the old ones.

> The tracebacks seem to be of no consequence as the reason for not starting
> the VM automatically is actually pretty simple:
>
> Local maintenance : True
>
> I do not think this is a bug since you were able to start the VM manually
> and get the status correctly.
>
> Can you please explain how the maintenance mode happened? And try
> hosted-engine --set-maintenance --mode=none to see whether it will start the
> vm automatically (it might take a minute or so to initiate the start)?

I did not do anything to make this happen, just a regular deployment over iSCSI.
I tested this 4 times over 2 different hosts, reproduced 4/4.

Hmm, were the hosts clean? No old hosted engine config files or so?

Simone: are we setting the maintenance mode during deploy somehow?

(In reply to Martin Sivák from comment #4)
> Simone: are we setting the maintenance mode during deploy somehow?

No, we don't.

Martin, following your comment #4, I re-installed my host and before deploying. At the end of the deployment, the VM started automatically.

Closing as WORKSFORME.
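The diagnosis in the comments comes down to reading the "Local maintenance" field in the hosted-engine --vm-status output quoted in this report. As an illustration only, here is a minimal Python sketch of that check. It assumes every field is printed as "key : value", as in the quoted output. The helper names parse_vm_status and explain_no_autostart are hypothetical and are not part of ovirt-hosted-engine-ha. The sketch works on pasted text and does not call the hosted-engine tool itself.

```python
import json

# Status text copied from the report; on a real host this would be the
# captured output of `hosted-engine --vm-status`.
SAMPLE_STATUS = """\
--== Host 1 status ==--
Status up-to-date : True
Hostname : green-vdsc.qa.lab.tlv.redhat.com
Host ID : 1
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 0
stopped : False
Local maintenance : True
crc32 : 2e0a207d
Host timestamp : 73063
"""


def parse_vm_status(text):
    """Hypothetical helper: collect 'key : value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ' : ' only, so the JSON value stays intact.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def explain_no_autostart(fields):
    """Hypothetical helper: say why the engine VM may not have been started."""
    engine = json.loads(fields.get("Engine status", "{}"))
    if fields.get("Local maintenance") == "True":
        return ("local maintenance is set; try "
                "hosted-engine --set-maintenance --mode=none")
    if engine.get("vm") == "down":
        return "VM is down without local maintenance; check the HA logs"
    return "VM is not reported as down"


if __name__ == "__main__":
    status = parse_vm_status(SAMPLE_STATUS)
    print(explain_no_autostart(status))
```

Run against the status text from this report, the check takes the local maintenance branch, which matches the reading given in comment #2.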
Created attachment 1117959 [details]
HE logs, vdsm.log, messages

Description of problem:
Deployed hosted-engine over iSCSI. During deployment, got the following error message in setup.log:

Jan 25 12:10:14 green-vdsc.qa.lab.tlv.redhat.com vdsm[1005]: vdsm vds ERROR failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1842, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
    with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect(retries, wait)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)
BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)
Jan 25 12:10:31 green-vdsc.qa.lab.tlv.redhat.com vdsm[1005]: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect to broker, the number of errors has exceeded the limit (1)
Jan 25 12:10:31 green-vdsc.qa.lab.tlv.redhat.com vdsm[1005]: vdsm vds ERROR failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1842, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
    with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect(retries, wait)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)
BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)

HE VM did not start automatically.

Version-Release number of selected component (if applicable):
ovirt-vmconsole-1.0.0-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch
ovirt-setup-lib-1.0.1-1.el7ev.noarch
ovirt-vmconsole-host-1.0.0-1.el7ev.noarch
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-hosted-engine-setup-1.3.2.3-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.7-1.el7ev.noarch
vdsm-xmlrpc-4.17.18-0.el7ev.noarch
vdsm-4.17.18-0.el7ev.noarch
vdsm-python-4.17.18-0.el7ev.noarch
vdsm-hook-vmfex-dev-4.17.18-0.el7ev.noarch
vdsm-jsonrpc-4.17.18-0.el7ev.noarch
vdsm-yajsonrpc-4.17.18-0.el7ev.noarch
vdsm-cli-4.17.18-0.el7ev.noarch
vdsm-infra-4.17.18-0.el7ev.noarch

How reproducible:
Over iSCSI - Always

Steps to Reproduce:
1. Deploy hosted engine over iSCSI

Actual results:
At the end of the deployment, HE VM does not start automatically:

[root@green-vdsc ~]# hosted-engine --vm-status

--== Host 1 status ==--

Status up-to-date : True
Hostname : green-vdsc.qa.lab.tlv.redhat.com
Host ID : 1
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 0
stopped : False
Local maintenance : True
crc32 : 2e0a207d
Host timestamp : 73063

Both ha-broker and ha-agent services are active.
Started the VM manually using --vm-start successfully.

Expected results:
HA VM should start automatically at the end of the deployment

Additional info:
HE logs, vdsm.log, messages