Created attachment 850985 [details]
setup log

Description of problem:
traceback when calling hosted-engine --vm-status:

Traceback (most recent call last):
  File "/usr/lib64/python2.6/runpy.py", line 122, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.6/runpy.py", line 34, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 111, in <module>
    if not status_checker.print_status():
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 58, in print_status
    all_host_stats = ha_cli.get_all_host_stats()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 137, in get_all_host_stats
    return self.get_all_stats(self.StatModes.HOST)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 86, in get_all_stats
    constants.SERVICE_TYPE)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 171, in get_stats_from_storage
    result = self._checked_communicate(request)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 198, in _checked_communicate
    raise RequestError("Request failed: {0}".format(msg))
ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed: success

Version-Release number of selected component (if applicable):
is31

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Created attachment 850996 [details] agent log
Created attachment 850997 [details] broker log
1) The agent is unable to acquire its sanlock lease:

   Exception: Failed to initialize sanlock: cannot get lock on host id 1: host already holds lock on a different host id

2) This is the only host connected to the cluster, and because it has no host ID it has not published its state yet. The metadata file contains only zeros:

   [root@slot-6 ha_agent]# hexdump hosted-engine.metadata
   0000000 0000 0000 0000 0000 0000 0000 0000 0000
   *
   00fb000

3) Because of this, get_all_stats_for_service_type returns an empty string.

4) Because of this, the RPC brokerlink.get_stats_from_storage returns only "success ".

5) brokerlink._checked_communicate expects a "success" reply to contain at least one additional part.

6) Because it does not, the code falls through to the else clause, which reports "Request failed: success".

I think sanlock still holds an old lease from a previous test run and refuses to release it and take a new one from the fresh set of IDs (following an environment cleanup).
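
To illustrate steps 4 to 6, here is a minimal sketch of how a reply parser of this shape can turn an empty but successful reply into "Request failed: success". This is not the actual brokerlink code: the function names, the RequestError stand-in, and the exact splitting logic are assumptions made for illustration, based only on the behaviour described above.

```python
class RequestError(Exception):
    """Hypothetical stand-in for ovirt_hosted_engine_ha.lib.exceptions.RequestError."""


def checked_communicate(response):
    """Assumed parsing logic: a status word followed by a payload.

    The reply is expected to look like "success <payload>". When the
    payload is empty, stripping the reply leaves only "success", the
    split yields a single part, and the error branch is taken.
    """
    msg = response.strip()
    parts = msg.split(" ", 1)
    if len(parts) == 2 and parts[0] == "success":
        return parts[1]
    # A bare "success" ends up here, producing the confusing message.
    raise RequestError("Request failed: {0}".format(msg))


def checked_communicate_tolerant(response):
    """One possible way to accept an empty payload (illustrative only)."""
    msg = response.strip()
    parts = msg.split(" ", 1)
    if parts[0] == "success":
        # No second part means the broker had nothing to report.
        return parts[1] if len(parts) == 2 else ""
    raise RequestError("Request failed: {0}".format(msg))


if __name__ == "__main__":
    try:
        checked_communicate("success ")
    except RequestError as exc:
        print(exc)
    print(repr(checked_communicate_tolerant("success ")))
```

Whether an empty payload should be treated as valid or reported with a clearer error (for example, that no host has published its state yet) is a design decision for the real code; the sketch only shows why the current message is misleading.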
sanlock cleanup and new setup solved problem
(In reply to movciari from comment #4)
> sanlock cleanup and new setup solved problem

So is this still relevant? If not, we'll close it.
removing old needinfo