Bug 1208489
| Summary: | HE active hyper-visor not responding to "hosted-engine --vm-status" after "iptables -I INPUT -s 10.35.160.108 -j DROP" cast. | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Nikolai Sednev <nsednev> | ||||
| Component: | ovirt-hosted-engine-ha | Assignee: | Dudi Maroshi <dmaroshi> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Nikolai Sednev <nsednev> | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | low | ||||||
| Version: | 3.5.1 | CC: | cshao, dougsland, fdeutsch, gklein, gpadgett, huiwa, istein, juwu, lsurette, rbarry, rgolan, sherold, ycui, ykaul | ||||
| Target Milestone: | ovirt-3.6.0-rc | Keywords: | Triaged | ||||
| Target Release: | 3.6.0 | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: |
When a self-hosted engine host client requested status from the Manager virtual machine (hosted-engine --vm-status) and a connection to the storage domain could not be established, the client hung indefinitely waiting for a response from the ovirt-ha-broker. With this update, a connection timeout has been added, and if the storage domain cannot be accessed, an appropriate error message is returned.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-03-09 19:49:03 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | SLA | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
Description
Nikolai Sednev
2015-04-02 11:43:18 UTC
Created attachment 1010128
all logs
The status verb reads the current stats from the storage. If the storage is blocked, the utility waits for it. We can add a timeout to the utility. If the timeout expires, the utility will report that it cannot access the shared storage.

Just a note on the reproducer: it doesn't need to be a 2-host setup. 1 host and an iptables rule to drop the packets will suffice.

(In reply to Roy Golan from comment #4)
> just a note on the reproducer, it doesn't need to be 2 host setup. 1 host
> and an IPTABLES rule to drop the packets will suffice.

Yep, it's known, and it was also tested with a single host, but a second host was required to check that HA passes the HE-VM properly.

Problem reconstructed and repeated.

Diagnostics:
------------
When a hosted engine client requests status from the hosted engine and there is no connection to the storage domain, the client hangs indefinitely, waiting for a response from the hosted-engine-broker.

Solution:
---------
The fix is to add a timeout to the call that sets the storage domain information, in lib.brokerlink.set_storage_domain.

*** Bug 1085523 has been marked as a duplicate of this bug. ***
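For readers who want to see the idea behind the fix, here is a minimal sketch of putting a timeout on a blocking request so that a silent peer produces an error instead of a hang. It is not the actual patch to lib.brokerlink.set_storage_domain: the RequestError class is a stand-in for the one seen in the traceback, request_with_timeout and demo_silent_peer are hypothetical names, the timeout values are arbitrary, and the code is written for Python 3 although the host in this report runs Python 2.7.

```python
import socket


class RequestError(Exception):
    """Hypothetical stand-in for the client's request error type."""


def request_with_timeout(sock, request, timeout=2.0):
    """Send a request and wait at most `timeout` seconds for a reply.

    Without settimeout() the recv() call below would block forever
    when the peer never answers, which is the hang described above.
    """
    sock.settimeout(timeout)
    try:
        sock.sendall(request)
        reply = sock.recv(4096)
    except socket.timeout as e:
        # Turn the low-level timeout into a readable request error.
        raise RequestError("Failed to get a reply: {0}".format(e))
    if not reply:
        raise RequestError("Connection closed by peer")
    return reply


def demo_silent_peer(timeout=0.2):
    """Simulate a peer that accepts the request but never replies."""
    client, server = socket.socketpair()
    try:
        try:
            request_with_timeout(client, b"set-storage-domain\n", timeout)
        except RequestError as e:
            return str(e)
        return "unexpected reply"
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo_silent_peer())
```

The caller gets an exception after the timeout expires and can decide how to report it, which matches the behaviour verified in the following comment.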
The previously active host no longer hangs after the "hosted-engine --vm-status" command is run on it, but it takes a few seconds to respond:
# hosted-engine --vm-status
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 117, in <module>
    if not status_checker.print_status():
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 60, in print_status
    all_host_stats = ha_cli.get_all_host_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 160, in get_all_host_stats
    return self.get_all_stats(self.StatModes.HOST)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 103, in get_all_stats
    self._configure_broker_conn(broker)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 180, in _configure_broker_conn
    dom_type=dom_type)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain
    .format(sd_type, options, e))
ovirt_hosted_engine_ha.lib.exceptions.RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'iscsi', 'sd_uuid': 'df2356f7-8272-401a-97f7-63c14f37ec7a'}: Connection timed out
After removing the iptables rule, the host responds correctly:
# hosted-engine --vm-status
--== Host 1 status ==--
Status up-to-date : True
Hostname : alma03.qa.lab.tlv.redhat.com
Host ID : 1
Engine status : {"health": "good", "vm": "up", "detail": "up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : d69bf92a
Host timestamp : 5992
--== Host 2 status ==--
Status up-to-date : True
Hostname : alma04.qa.lab.tlv.redhat.com
Host ID : 2
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 3d75f9e9
Host timestamp : 3790
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHEA-2016-0422.html