Bug 989131

Summary: [vdsm] host is unable to connect to the pool after connectivity issues have been solved
Product: Red Hat Enterprise Virtualization Manager
Reporter: Elad <ebenahar>
Component: vdsm
Assignee: Ayal Baron <abaron>
Status: CLOSED DUPLICATE
QA Contact: Elad <ebenahar>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.3.0
CC: abaron, amureini, bazulay, hateya, iheim, lpeer, scohen, yeylon
Target Milestone: ---
Keywords: Regression, Triaged
Target Release: 3.3.0
Hardware: x86_64
OS: Unspecified
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-08-08 07:40:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: logs
Flags: none

Description Elad 2013-07-27 18:53:08 UTC
Created attachment 779177
logs

Description of problem:
The host fails in connectStoragePool after connectivity problems to one of the pool's domains were resolved. vdsm is unable to find the pool's master domain:

StoragePoolMasterNotFound: Cannot find master domain: 'spUUID=1def1ef4-b354-424d-9fbe-25e40400db64, msdUUID=b38adea3-3a54-4f65-a7aa-07a17482be00'



Version-Release number of selected component (if applicable):
vdsm-4.12.0-rc1.12.git8ee6885.el6.x86_64
rhevm-3.3.0-0.9.master.el6ev.noarch
libvirt-0.10.2-18.el6_4.9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up a block pool with more than one host and more than one data domain.
2. Block connectivity between the HSM host and a non-master domain using iptables; the engine will set the host to 'non-operational'.
3. Restore connectivity and activate the host.
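
For anyone reproducing steps 2 and 3, the following is a minimal sketch of the iptables commands involved. It only builds the command lines and does not run them. The exact rule used in the original test is not recorded in this report, so the OUTPUT-chain DROP rule, the helper name `iptables_rule`, and the address 192.0.2.10 (a documentation-range placeholder for the storage server) are all assumptions.

```python
def iptables_rule(action, storage_ip):
    """Build (but do not run) an iptables command line.

    action: "block" appends a DROP rule for traffic going to storage_ip,
            "unblock" deletes that same rule.
    Hypothetical helper; the rule shape is an assumption, not taken
    from the bug report.
    """
    flag = {"block": "-A", "unblock": "-D"}[action]
    return ["iptables", flag, "OUTPUT", "-d", storage_ip, "-j", "DROP"]


def as_shell(cmd):
    """Render a command list as a single line for copy and paste."""
    return " ".join(cmd)
```

Calling `as_shell(iptables_rule("block", "192.0.2.10"))` gives the line to run as root on the HSM host for step 2; the `"unblock"` variant gives the matching line for step 3.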

Actual results:
The host fails to connect to the pool because it is unable to find the master storage domain:

Thread-41181::ERROR::2013-07-27 20:47:35,006::task::850::TaskManager.Task::(_setError) Task=`48f9660b-c25e-497e-880e-54bef55fbb60`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 991, in connectStoragePool
    masterVersion, options)
  File "/usr/share/vdsm/storage/hsm.py", line 1038, in _connectStoragePool
    res = pool.connect(hostID, scsiKey, msdUUID, masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 698, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1235, in __rebuild
    masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1594, in getMasterDomain
    raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
StoragePoolMasterNotFound: Cannot find master domain: 'spUUID=1def1ef4-b354-424d-9fbe-25e40400db64, msdUUID=b38adea3-3a54-4f65-a7aa-07a17482be00'
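
When scanning a vdsm log for this failure, a small script can pull the pool and master domain UUIDs out of the error line. This is a sketch that assumes the message format is exactly as shown in the traceback above; the function name `find_master_not_found` is hypothetical and not part of vdsm.

```python
import re

# Matches the error line as it appears in the traceback above.
MASTER_NOT_FOUND = re.compile(
    r"StoragePoolMasterNotFound: Cannot find master domain: "
    r"'spUUID=(?P<spUUID>[0-9a-f-]{36}), msdUUID=(?P<msdUUID>[0-9a-f-]{36})'"
)


def find_master_not_found(lines):
    """Return a list of (spUUID, msdUUID) tuples, one per matching line."""
    hits = []
    for line in lines:
        match = MASTER_NOT_FOUND.search(line)
        if match:
            hits.append((match.group("spUUID"), match.group("msdUUID")))
    return hits
```

Passing an open log file (for example the attached vdsm log) as `lines` returns every pool/master pair for which the lookup failed.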


The storage pool is not present under /rhev/data-center/:

[root@green-vdsa data-center]# ll
total 12
drwxr-xr-x. 2 vdsm kvm 4096 Feb  6 16:17 9ca4f342-afd8-4c5f-97ca-0039d5d261d4
drwxr-xr-x. 2 vdsm kvm 4096 Jul 25 10:37 hsm-tasks
drwxr-xr-x. 7 vdsm kvm 4096 Jul 22 16:37 mnt
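
The same check as the listing above can be scripted. This is a sketch assuming that a connected pool appears as an entry named after its spUUID under /rhev/data-center, as the listing suggests; `pool_dir_present` is a hypothetical helper, not a vdsm function.

```python
import os


def pool_dir_present(sp_uuid, root="/rhev/data-center"):
    """Return True if an entry named sp_uuid exists directly under root."""
    return os.path.isdir(os.path.join(root, sp_uuid))
```

On the affected host, `pool_dir_present("1def1ef4-b354-424d-9fbe-25e40400db64")` would be expected to return False, since the listing shows no entry with that name.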


Expected results:
After the connectivity problem to one of the domains is resolved, the host should be able to connect to the pool.

Additional info:
See the attached logs.

Comment 3 Allon Mureinik 2013-08-08 07:40:37 UTC

*** This bug has been marked as a duplicate of bug 986652 ***