+++ This bug was initially created as a clone of Bug #1463094 +++

Description of problem:
I have an HC installation, and for some reason hosted-engine.conf references a storage domain ID for hosted_storage that does not exist on the system. As a result, the HE VM stays in the paused state and never comes back up.

Errors from agent.log file:
=============================
MainThread::ERROR::2017-06-20 11:52:49,244::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 409, in start_monitoring
    self._initialize_storage_images()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 662, in _initialize_storage_images
    self._config.refresh_vm_conf()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 492, in refresh_vm_conf
    content = self._get_file_content_from_shared_storage(VM)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 461, in _get_file_content_from_shared_storage
    config_volume_path = self._get_config_volume_path()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 179, in _get_config_volume_path
    conf_vol_uuid
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/heconflib.py", line 330, in get_volume_path
    root=envconst.SD_MOUNT_PARENT,
RuntimeError: Path to volume 6418b659-213e-45c4-a5a9-704f84273143 not found in /rhev/data-center/mnt

vdsm.logs:
=======================
2017-06-19 15:55:21,101+0530 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/10.70.36.78:_data/c98d0e85-8c0b-4056-8527-9267c2a97a93/dom_md/metadata (monitor:485)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 483, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 362, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: (u'/rhev/data-center/mnt/glusterSD/10.70.36.78:_data/c98d0e85-8c0b-4056-8527-9267c2a97a93/dom_md/metadata', 1, bytearray(b"/usr/bin/dd: failed to open \'/rhev/data-center/mnt/glusterSD/10.70.36.78:_data/c98d0e85-8c0b-4056-8527-9267c2a97a93/dom_md/metadata\': No such file or directory\n"))
2017-06-19 15:55:21,160+0530 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/10.70.36.78:_vmstore/35bf3a60-a596-4bf1-9037-39cf8fc6158c/dom_md/metadata (monitor:485)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 483, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 362, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: (u'/rhev/data-center/mnt/glusterSD/10.70.36.78:_vmstore/35bf3a60-a596-4bf1-9037-39cf8fc6158c/dom_md/metadata', 1, bytearray(b"/usr/bin/dd: failed to open \'/rhev/data-center/mnt/glusterSD/10.70.36.78:_vmstore/35bf3a60-a596-4bf1-9037-39cf8fc6158c/dom_md/metadata\': No such file or directory\n"))
2017-06-19 15:55:23,450+0530 INFO (MainThread) [vds] Received signal 15, shutting down (vdsm:68)

Version-Release number of selected component (if applicable):

How reproducible:
Hit it once

Steps to Reproduce:
1. Do not have any steps to reproduce.

Actual results:
The hosted engine VM remains in the paused state and never comes up, and hosted-engine.conf refers to a storage domain ID for the engine that does not exist.

Expected results:
hosted-engine.conf should not refer to a storage domain ID that does not exist.

Additional info:

--- Additional comment from RamaKasturi on 2017-06-20 03:19:59 EDT ---

sosreports are present in the link below:
=============================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1463094/

--- Additional comment from Sahina Bose on 2017-07-14 05:19:13 EDT ---

I'm assigning this to the Hosted engine team. Please reassign if needed.

--- Additional comment from Doron Fediuck on 2017-07-18 07:29:03 EDT ---

Is this a case where an existing RHVH was re-used? I.e., the domain existed in a previous installation and was persisted by the RHVH infra.

--- Additional comment from RamaKasturi on 2017-09-06 02:23:56 EDT ---

The issue happened during an RHV-H upgrade. I have not checked whether a domain with that ID existed before upgrading the nodes.

--- Additional comment from Doron Fediuck on 2017-09-06 04:37:34 EDT ---

Ryan, are you familiar with such an upgrade issue?

--- Additional comment from Ryan Barry on 2017-09-06 07:11:14 EDT ---

I am not -- we haven't seen this. Rama, can you provide the versions upgraded to/from?

--- Additional comment from RamaKasturi on 2017-09-06 07:29:45 EDT ---

Hi Ryan,

I tried upgrading from 4.1.2 to 4.1.3.

Thanks
kasturi

--- Additional comment from Ryan Barry on 2017-09-06 08:32:39 EDT ---

How was hosted engine deployed? Cockpit or CLI?

--- Additional comment from RamaKasturi on 2017-09-06 08:34:42 EDT ---

Hosted engine was deployed using cockpit.
Can we close this as not reproducible, since we do not seem to have any steps to reproduce?
Closing because the requested data was not provided. Please re-open if you can provide it.
(In reply to Sahina Bose from comment #1)
> Can we close this as not repro since we do not seem to have any steps to
> reproduce?

Yes, this has not been hit again.
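If this is hit again, a quick check like the one below could help capture the mismatch described in the report before the host is touched. It is only a sketch, not the agent's own lookup: it assumes hosted-engine.conf is a plain key=value file with an `sdUUID` entry and that it lives at /etc/ovirt-hosted-engine/hosted-engine.conf (both are assumptions, not taken from this bug), and it simply looks for a directory named after that ID under /rhev/data-center/mnt, the root named in the agent.log error. The helper names are hypothetical.

```python
import os


def read_conf_value(conf_path, key):
    """Return the value of `key` from a key=value file, or None if absent."""
    with open(conf_path) as conf:
        for line in conf:
            line = line.strip()
            # Skip blank lines, comments, and anything that is not key=value.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == key:
                return value.strip()
    return None


def find_domain_dirs(mount_root, sd_uuid, max_depth=3):
    """Return directories named `sd_uuid` at most `max_depth` levels below mount_root.

    With max_depth=3 this covers layouts such as
    <mount_root>/glusterSD/<server>:_<volume>/<sd_uuid>.
    """
    matches = []
    root_depth = mount_root.rstrip(os.sep).count(os.sep)
    for current, dirs, _files in os.walk(mount_root):
        depth = current.rstrip(os.sep).count(os.sep) - root_depth
        if sd_uuid in dirs:
            matches.append(os.path.join(current, sd_uuid))
            dirs.remove(sd_uuid)  # no need to look inside the domain itself
        if depth + 1 >= max_depth:
            dirs[:] = []  # do not descend any further
    return matches


def check_hosted_engine_domain(conf_path, mount_root):
    """Return (sd_uuid, matching_dirs) for the domain named in the conf file."""
    sd_uuid = read_conf_value(conf_path, "sdUUID")  # key name is an assumption
    if sd_uuid is None:
        return None, []
    return sd_uuid, find_domain_dirs(mount_root, sd_uuid)
```

On an affected host this might be called as `check_hosted_engine_domain("/etc/ovirt-hosted-engine/hosted-engine.conf", "/rhev/data-center/mnt")`. An empty list for a non-empty ID would match the symptom reported here, and attaching that output together with the conf file would give the data that was missing for this bug.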