[hosted-engine] [iSCSI support] Putting an iSCSI domain in maintenance while the hosted-engine is installed on a LUN from the same storage server causes the setup to become non operational
Created attachment 950796: logs
Description of problem:
On a hosted-engine setup installed using iSCSI, I created an iSCSI storage domain located on the same storage server as the engine's VM disk. I set the domain to maintenance mode. The whole hosted-engine setup became inactive because the host disconnected from the storage server where the engine's VM disk is located.
Version-Release number of selected component (if applicable):
rhev 3.5 vt7
rhel6.6 host
ovirt-hosted-engine-setup-1.2.1-1.el6ev.noarch
rhevm-3.5.0-0.17.beta.el6ev.noarch
vdsm-4.16.7.1-1.el6ev.x86_64
How reproducible:
Always
Steps to Reproduce:
1. Deploy hosted-engine using iSCSI
2. Create an iSCSI storage domain using a LUN from the same storage server where the engine's VM disk is located. Create one more storage domain (NFS)
3. Put the iSCSI domain in maintenance
Actual results:
The whole setup becomes inactive because the host disconnected from the iSCSI storage server
Thread-2551::INFO::2014-10-26 15:07:42,210::logUtils::44::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=1, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'connection': u'lion.qa.lab.tlv.redhat.com:/export/elad/1', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'password': '******', u'id': u'529690d4-1afd-4ebb-a3f9-91009c157496', u'port': u''}], options=None)
Thread-2551::DEBUG::2014-10-26 15:07:42,211::mount::227::Storage.Misc.excCmd::(_runcmd) /usr/bin/sudo -n /bin/umount -f -l /rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_1 (cwd None)
Thread-2694::ERROR::2014-10-26 15:09:16,505::API::1699::vds::(_getHaInfo) failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1679, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 100, in get_all_stats
    stats = broker.get_stats_from_storage(service)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 232, in get_stats_from_storage
    result = self._checked_communicate(request)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 260, in _checked_communicate
    .format(message or response))
RequestError: Request failed: <type 'exceptions.OSError'>
Expected results:
When putting an iSCSI storage domain in maintenance while the engine's VM disk is located on the same storage server, the host should not disconnect from the targets
Workaround:
Reconnect to the iSCSI sessions using iscsiadm
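For anyone applying the workaround from a script, here is a minimal sketch of the re-login step. It only wraps the standard `iscsiadm -m node -T <target> -p <portal> --login` invocation. The helper names, the target IQN, and the portal address are placeholders I made up for illustration; the real values for the hosted-engine LUN are not given in this report.

```python
import subprocess


def build_iscsi_login_cmd(target_iqn, portal):
    """Build the iscsiadm argv that logs in to one target on one portal.

    target_iqn and portal are supplied by the caller; nothing here is
    taken from the hosted-engine configuration.
    """
    return ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login"]


def relogin(target_iqn, portal, dry_run=True):
    """Re-establish the iSCSI session (hypothetical helper).

    With dry_run=True the command is only returned, not executed, so it
    can be inspected first. Running it for real needs root on the host.
    """
    cmd = build_iscsi_login_cmd(target_iqn, portal)
    if not dry_run:
        subprocess.check_call(cmd)
    return cmd


if __name__ == "__main__":
    # Placeholder values, replace with the target that holds the engine disk.
    print(" ".join(relogin("iqn.2014-10.com.example:he-target", "192.0.2.10:3260")))
```

The dry-run default is deliberate: on a host that has already lost the engine's storage, it is safer to review the command before executing it.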
Additional info: logs
The engine currently has no record of this LUN. It needs a way to know about the LUN before we can block this operation. Moving to integration to set up this data so the storage flows can use it.
(In reply to Doron Fediuck from comment #2)
> Will you get a different behaviour when working with NFS?
No, because on NFS, moving a domain to maintenance only unmounts the specific export path of that storage domain.
http://gerrit.ovirt.org/34783 adds the HE disk to the engine.
Now that the LUN is known to the engine, is any other change required on the hosted-engine side?
Moving back to storage.
(In reply to Tal Nisan from comment #6)
> Iirc upon disconnection there is a check if the connections to disconnect
> are not used by other entities in the system and those are filtered out
So, in other words, this is already solved?
Moving an iSCSI domain to maintenance while the engine disk is located on the same storage server no longer causes a disconnection from the storage server, because the LUN that contains the engine disk now exists in the DB.
Verified using rhev3.5 vt12
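To illustrate the filtering described in the comments above (connections still used by other entities are dropped from the disconnect list), here is a rough sketch. This is not the engine's actual code: the `Connection` type, the function name, and the choice to match on portal plus IQN are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical, simplified view of a storage server connection.
Connection = namedtuple("Connection", ["id", "portal", "iqn"])


def connections_safe_to_disconnect(domain_connections, connections_in_use_elsewhere):
    """Return only the connections no other entity still relies on.

    domain_connections: connections of the domain going to maintenance.
    connections_in_use_elsewhere: connections referenced by other
    entities known to the DB, e.g. the LUN holding the engine disk.
    Two connections are treated as the same session if portal and IQN
    match (an assumption for this sketch).
    """
    in_use = set((c.portal, c.iqn) for c in connections_in_use_elsewhere)
    return [c for c in domain_connections if (c.portal, c.iqn) not in in_use]


if __name__ == "__main__":
    he_lun = Connection("he", "192.0.2.10:3260", "iqn.example:shared")
    domain = [
        Connection("c1", "192.0.2.10:3260", "iqn.example:shared"),
        Connection("c2", "192.0.2.20:3260", "iqn.example:other"),
    ]
    # Only c2 is returned; c1 shares a session with the engine disk's LUN.
    print(connections_safe_to_disconnect(domain, [he_lun]))
```

This also shows why the problem reproduced before the HE disk was added to the engine: with no record of the engine's LUN, the in-use set was empty and every connection of the domain was disconnected.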