Bug 1463094 - Different storage domain id for the hosted_storage is referred which does not exist in the system
Status: CLOSED WORKSFORME
Product: ovirt-engine
Classification: oVirt
Component: BLL.HostedEngine
Version: 4.1.2.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.1.8
Target Release: ---
Assigned To: Doron Fediuck
QA Contact: Artyom
Depends On:
Blocks: Gluster-HC-3 1489369
Reported: 2017-06-20 02:30 EDT by RamaKasturi
Modified: 2017-10-30 11:47 EDT

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1489369
Environment:
Last Closed: 2017-10-30 11:47:57 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
sabose: ovirt‑4.1?
mgoldboi: planning_ack+
sabose: devel_ack?
sabose: testing_ack?


Attachments: None
Description RamaKasturi 2017-06-20 02:30:02 EDT
Description of problem:
I have an HC installation, and for some reason hosted-engine.conf refers to a storage domain ID for hosted_storage that does not exist on the system. Because of this, the HE VM is always in a paused state and never comes back up.
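
One way to confirm the mismatch is to compare the storage domain ID recorded in the configuration file with the domain directories that are actually mounted. The following is a minimal sketch, not part of ovirt-hosted-engine-ha. It assumes the configuration lives at /etc/ovirt-hosted-engine/hosted-engine.conf, that the domain ID is stored under an `sdUUID` key, and that each mounted domain appears as a UUID-named directory a few levels below /rhev/data-center/mnt (as in the vdsm paths in this report). The helper names `read_conf` and `find_domain_dirs` are hypothetical.

```python
import os

CONF_PATH = "/etc/ovirt-hosted-engine/hosted-engine.conf"  # assumed location
MOUNT_ROOT = "/rhev/data-center/mnt"  # root named in the agent.log error


def read_conf(path):
    """Parse simple key=value lines into a dict, skipping blanks and comments."""
    values = {}
    with open(path) as conf:
        for line in conf:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def find_domain_dirs(sd_uuid, root=MOUNT_ROOT, max_depth=3):
    """Return directories named sd_uuid found at most max_depth levels below root."""
    matches = []
    root = root.rstrip(os.sep)
    base_depth = root.count(os.sep)
    for dirpath, dirnames, _ in os.walk(root):
        depth = dirpath.count(os.sep) - base_depth
        if sd_uuid in dirnames:
            matches.append(os.path.join(dirpath, sd_uuid))
        if depth + 1 >= max_depth:
            dirnames[:] = []  # prune: do not descend any further
    return matches
```

On an affected host, `find_domain_dirs(read_conf(CONF_PATH).get("sdUUID"))` returning an empty list would suggest that the configured domain is not mounted anywhere under the mount root.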

Errors from agent.log file:
=============================

MainThread::ERROR::2017-06-20 11:52:49,244::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 409, in start_monitoring
    self._initialize_storage_images()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 662, in _initialize_storage_images
    self._config.refresh_vm_conf()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 492, in refresh_vm_conf
    content = self._get_file_content_from_shared_storage(VM)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 461, in _get_file_content_from_shared_storage
    config_volume_path = self._get_config_volume_path()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 179, in _get_config_volume_path
    conf_vol_uuid
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/heconflib.py", line 330, in get_volume_path
    root=envconst.SD_MOUNT_PARENT,
RuntimeError: Path to volume 6418b659-213e-45c4-a5a9-704f84273143 not found in /rhev/data-center/mnt

vdsm.logs:
=======================
2017-06-19 15:55:21,101+0530 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/10.70.36.78:_data/c98d0e85-8c0b-4056-8527-9267c2a97a93/dom_md/metadata (monitor:485)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 483, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 362, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: (u'/rhev/data-center/mnt/glusterSD/10.70.36.78:_data/c98d0e85-8c0b-4056-8527-9267c2a97a93/dom_md/metadata', 1, bytearray(b"/usr/bin/dd: failed to open \'/rhev/data-center/mnt/glusterSD/10.70.36.78:_data/c98d0e85-8c0b-4056-8527-9267c2a97a93/dom_md/metadata\': No such file or directory\n"))
2017-06-19 15:55:21,160+0530 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/10.70.36.78:_vmstore/35bf3a60-a596-4bf1-9037-39cf8fc6158c/dom_md/metadata (monitor:485)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 483, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 362, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: (u'/rhev/data-center/mnt/glusterSD/10.70.36.78:_vmstore/35bf3a60-a596-4bf1-9037-39cf8fc6158c/dom_md/metadata', 1, bytearray(b"/usr/bin/dd: failed to open \'/rhev/data-center/mnt/glusterSD/10.70.36.78:_vmstore/35bf3a60-a596-4bf1-9037-39cf8fc6158c/dom_md/metadata\': No such file or directory\n"))
2017-06-19 15:55:23,450+0530 INFO  (MainThread) [vds] Received signal 15, shutting down (vdsm:68)
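
To compare the domains vdsm is monitoring with the ID in hosted-engine.conf, the storage domain UUIDs can be pulled out of the failing paths. This is a minimal sketch, assuming each error names a path of the form <domain UUID>/dom_md/metadata as in the excerpt above and that the log text has no line breaks inside a path. The function name `failing_domains` is hypothetical.

```python
import re

# Matches ".../<storage domain UUID>/dom_md/metadata" as seen in the excerpt.
METADATA_PATH = re.compile(
    r"/(?P<sd>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})/dom_md/metadata"
)


def failing_domains(log_text):
    """Return the distinct domain UUIDs named in metadata paths, in first-seen order."""
    seen = []
    for match in METADATA_PATH.finditer(log_text):
        sd = match.group("sd")
        if sd not in seen:
            seen.append(sd)
    return seen
```

The resulting list can then be checked against the domain ID recorded in hosted-engine.conf.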


Version-Release number of selected component (if applicable):


How reproducible:
Hit it once

Steps to Reproduce:
1. No known steps to reproduce.


Actual results:
The hosted engine VM remains in a paused state and never comes up, and hosted-engine.conf refers to a storage domain ID for the engine that does not exist.

Expected results:
hosted-engine.conf should not refer to a storage domain ID that does not exist.

Additional info:
Comment 1 RamaKasturi 2017-06-20 03:19:59 EDT
Sosreports are available at the link below:
=============================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1463094/
Comment 2 Sahina Bose 2017-07-14 05:19:13 EDT
I'm assigning this to the Hosted Engine team. Please reassign if needed.
Comment 3 Doron Fediuck 2017-07-18 07:29:03 EDT
Is this a case where an existing RHVH was re-used?
I.e., the domain existed in a previous installation and was persisted by the RHVH infra.
Comment 4 RamaKasturi 2017-09-06 02:23:56 EDT
The issue happened during an RHV-H upgrade. I have not checked whether a domain with that ID existed before the nodes were upgraded.
Comment 5 Doron Fediuck 2017-09-06 04:37:34 EDT
Ryan,
are you familiar with such an upgrade issue?
Comment 6 Ryan Barry 2017-09-06 07:11:14 EDT
I am not -- we haven't seen this.

Rama, can you provide the versions upgraded to/from?
Comment 7 RamaKasturi 2017-09-06 07:29:45 EDT
Hi Ryan,

    I tried upgrading from 4.1.2 to 4.1.3.

Thanks
kasturi
Comment 8 Ryan Barry 2017-09-06 08:32:39 EDT
How was hosted engine deployed? Cockpit or CLI?
Comment 9 RamaKasturi 2017-09-06 08:34:42 EDT
Hosted engine was deployed using Cockpit.
Comment 10 Ryan Barry 2017-09-07 07:54:21 EDT
I can't reproduce this on base RHVH, but I'm setting up a gluster environment, since I imagine that was used for storage...
Comment 11 Sahina Bose 2017-10-11 06:56:10 EDT
Ryan - do you know the cause for this? Was it reproducible?
Comment 12 Ryan Barry 2017-10-11 08:04:26 EDT
I also wasn't able to reproduce it on an RHHI environment.
Comment 13 Yaniv Kaul 2017-10-30 11:47:57 EDT
I'm closing for the time being. If it reproduces, please re-open.
