Description of problem:

After deploy, shutdown and startup, 'hosted-engine --vm-status' keeps showing 'Engine status : null'.

broker.log has:

Thread-209::ERROR::2019-12-25 12:56:39,241::submonitor_base::119::ovirt_hosted_engine_ha.broker.submonitor_base.SubmonitorBase::(_worker) Error executing submonitor engine-health, args {'address': '0', 'use_ssl': 'true', 'vm_uuid': 'b3bc7f7b-2b88-4758-8192-05242f61ba21'}
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitor_base.py", line 115, in _worker
    self.action(self._options)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitors/engine_health.py", line 117, in action
    self._update_stats(stats, vdsm_ts, local_ts)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitors/engine_health.py", line 122, in _update_stats
    if not self._newer_timestamp(vdsm_ts, local_ts):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitors/engine_health.py", line 160, in _newer_timestamp
    return local_ts > self._stats_local_timestamp
TypeError: '>' not supported between instances of 'int' and 'NoneType'

Version-Release number of selected component (if applicable):
Current master

How reproducible:
Always, I think. Not sure why it does not happen right after deploy, but it does happen after a reboot.

Steps to Reproduce:
1. Deploy hosted-engine
2. Set global maintenance, shut down the engine machine, shut down the hosts
3. Start the hosts
4. Disable global maintenance

Actual results:
See above

Expected results:
Should show the correct engine status

Additional info:
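The traceback ends in a comparison of a fresh integer timestamp against `self._stats_local_timestamp`, which is presumably still `None` before the submonitor has recorded its first sample after the restart. A minimal sketch of the failure and one possible guard (class and method names follow the traceback; the guard itself is an illustrative assumption, not necessarily the actual upstream fix):

```python
class SubmonitorSketch:
    """Illustrates the engine-health timestamp comparison from the traceback."""

    def __init__(self):
        # No stats have been recorded yet after a broker restart,
        # so the last-seen local timestamp starts out as None.
        self._stats_local_timestamp = None

    def _newer_timestamp_buggy(self, local_ts):
        # Reproduces the broker.log error when _stats_local_timestamp is None:
        # TypeError: '>' not supported between instances of 'int' and 'NoneType'
        return local_ts > self._stats_local_timestamp

    def _newer_timestamp_guarded(self, local_ts):
        # Treat "no previous timestamp" as "any new timestamp is newer".
        if self._stats_local_timestamp is None:
            return True
        return local_ts > self._stats_local_timestamp


mon = SubmonitorSketch()
try:
    mon._newer_timestamp_buggy(360)
except TypeError as e:
    print("buggy:", e)
print("guarded:", mon._newer_timestamp_guarded(360))
```

With the guard in place the first health sample after startup is accepted instead of killing the submonitor thread, which would let the broker publish a real engine status rather than leaving it at null.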
Verified on http://bob-dr.lab.eng.brq.redhat.com/builds/4.4/rhv-4.4.0-29

Scenario 1:
hosted-engine --set-maintenance --mode=global
hosted-engine --vm-status
hosted-engine --vm-poweroff
Status after poweroff: http://pastebin.test.redhat.com/852935
hosted-engine --vm-start
Command VM.getStats with args {'vmID': '9862d825-5d39-493b-b692-597dcb8496be'} failed: (code=1, message=Virtual machine does not exist: {'vmId': '9862d825-5d39-493b-b692-597dcb8496be'})
VM in WaitForLaunch status after start: http://pastebin.test.redhat.com/852934
hosted-engine --set-maintenance --mode=none

Scenario 2:
hosted-engine --set-maintenance --mode=global
hosted-engine --vm-status
hosted-engine --vm-poweroff
Power off all three hosts in the setup
Start the hosts
hosted-engine --vm-start
hosted-engine --set-maintenance --mode=none
hosted-engine --vm-status

Correct status, no error:
...
--== Host ocelot03.qa.lab.tlv.redhat.com (id: 3) status ==--

Host ID                            : 3
Host timestamp                     : 360
Score                              : 3400
Engine status                      : {"vm": "up", "health": "good", "detail": "Up"}
Hostname                           : ocelot03.qa.lab.tlv.redhat.com
Local maintenance                  : False
stopped                            : False
crc32                              : c841b4e0
conf_on_shared_storage             : True
local_conf_timestamp               : 360
Status up-to-date                  : True
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=360 (Tue Apr 7 19:59:07 2020)
	host-id=3
	score=3400
	vm_conf_refresh_time=360 (Tue Apr 7 19:59:07 2020)
	conf_on_shared_storage=True
	maintenance=False
	state=EngineUp
	stopped=False
This bugzilla is included in the oVirt 4.4.0 release, published on May 20th, 2020. Since the problem described in this bug report should be resolved in the oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.