Created attachment 731657 [details]
logs

Description of problem:
I ran a scenario with two storage domains on two different storage servers and started an upgrade while there was no connectivity to the master storage domain. Once connectivity from the hosts to the storage was restored, the old master (the domain that was blocked) started to upgrade, but the conversion from v2 to v3 failed because an image path does not exist.

Version-Release number of selected component (if applicable):
sf12; upgrade of a v1 pool created on vdsm-4.10-1.8 to vdsm-4.10.2-13.0.el6ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. With hosts running vdsm-4.10-1.8, create an iSCSI 3.0 pool with 2 domains located on different storage servers.
2. Create VMs with two disks (one on each domain) and run them.
3. Upgrade the hosts to vdsm-4.10.2-13.0.el6ev.x86_64.
4. Block connectivity from both hosts to the master domain only, using iptables.
5. After the reconstruct completes and the domain becomes inactive, restore connectivity and check the VG metadata version.
6. Stop the VMs.
7. Deactivate the domain.
8. Try to activate the domain.

Actual results:
The upgrade fails because, for some reason, the links for the domain were not created.

Expected results:
The links should be created and the upgrade should succeed.

Additional info:
I attached the logs from the upgrade and the vdsm log from trying to activate the domain (where the failure occurs).
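For orientation, the "links" mentioned in the actual results are the per-pool symlinks under /rhev/data-center that point at the real block-domain directories. Below is a minimal sketch of that layout, built in a temp directory with the UUIDs from this report; it is an illustration, not vdsm code:

```python
import os
import tempfile

# UUIDs taken from this report; any values work for the illustration.
SP_UUID = "9a9db723-c63b-469d-9545-d8cc9407822b"
SD_UUID = "cf28adb9-28e7-49f9-88d8-0e6d12336bd9"

def build_domain_links(root):
    """Build the layout domain activation is expected to leave behind:

    <root>/mnt/blockSD/<sdUUID>/images  -- the real directory tree
    <root>/<spUUID>/<sdUUID>            -- a symlink into mnt/blockSD
    """
    real_dom = os.path.join(root, "mnt", "blockSD", SD_UUID)
    os.makedirs(os.path.join(real_dom, "images"))
    pool_dir = os.path.join(root, SP_UUID)
    os.makedirs(pool_dir)
    link = os.path.join(pool_dir, SD_UUID)
    os.symlink(real_dom, link)
    return link

root = tempfile.mkdtemp()
link = build_domain_links(root)
# produceVolume walks through this link during the upgrade; when the link
# is missing, the ImagePathError in the attached logs is raised instead.
print(os.path.isdir(os.path.join(link, "images")))  # True
```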
Thread-7455::ERROR::2013-04-04 18:37:14,515::task::833::TaskManager.Task::(_setError) Task=`b8410171-89fa-4f90-b235-56114f4f9bbb`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 840, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 41, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1156, in activateStorageDomain
    pool.activateSD(sdUUID)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1062, in activateSD
    self._convertDomain(dom)
  File "/usr/share/vdsm/storage/sp.py", line 1033, in _convertDomain
    domain.getRealDomain(), isMsd, targetFormat)
  File "/usr/share/vdsm/storage/imageRepository/formatConverter.py", line 360, in convert
    converter(repoPath, hostId, imageRepo, isMsd)
  File "/usr/share/vdsm/storage/imageRepository/formatConverter.py", line 212, in v3DomainConverter
    v3ReallocateMetadataSlot(domain, allVolumes)
  File "/usr/share/vdsm/storage/imageRepository/formatConverter.py", line 174, in v3ReallocateMetadataSlot
    vol = domain.produceVolume(imgUUIDs[0], volUUID)
  File "/usr/share/vdsm/storage/blockSD.py", line 577, in produceVolume
    return blockVolume.BlockVolume(repoPath, self.sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/blockVolume.py", line 80, in __init__
    volume.Volume.__init__(self, repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/volume.py", line 128, in __init__
    self.validate()
  File "/usr/share/vdsm/storage/blockVolume.py", line 89, in validate
    volume.Volume.validate(self)
  File "/usr/share/vdsm/storage/volume.py", line 140, in validate
    self.validateImagePath()
  File "/usr/share/vdsm/storage/blockVolume.py", line 404, in validateImagePath
    raise se.ImagePathError(imageDir)
ImagePathError: Image path does not exist or cannot be accessed/created:
('/rhev/data-center/9a9db723-c63b-469d-9545-d8cc9407822b/cf28adb9-28e7-49f9-88d8-0e6d12336bd9/images/983339ed-14d1-4c5e-9704-3183bf80ae8c',)
Thread-7455::DEBUG::2013-04-04 18:37:14,515::task::852::TaskManager.Task::(_run) Task=`b8410171-89fa-4f90-b235-56114f4f9bbb`::Task._run: b8410171-89fa-4f90-b235-56114f4f9bbb ('cf28adb9-28e7-49f9-88d8-0e6d12336bd9', '9a9db723-c63b-469d-9545-d8cc9407822b') {} failed - stopping task

[root@cougar02 ~]# vgs -o vg_all | grep --color MDT_VERSION
lvm2 uCWJqy-pvZQ-3zxd-VJca-Hs6Q-j1ub-S8wO4s 6bbbe226-7456-46da-8fc4-4c4d59472436 wz--n- 99.62g 80.75g 128.00m 797 646 0 0 1 19 0 60 RHAT_storage_domain,MDT_ROLE=Master,MDT_POOL_DESCRIPTION=iSCSI,MDT_LOCKPOLICY=,MDT_PV0=pv:1Dafna-lion1365076&44&uuid:duUBrS-En02-607t-KjKa-Gc3d-xW1K-OwT5VG&44&pestart:0&44&pecount:797&44&mapoffset:0,MDT_CLASS=Data,MDT_MASTER_VERSION=2,MDT_DESCRIPTION=Dafna-lion,MDT_LOCKRENEWALINTERVALSEC=5,MDT_IOOPTIMEOUTSEC=1,MDT_TYPE=ISCSI,MDT_LOGBLKSIZE=512,MDT_SDUUID=6bbbe226-7456-46da-8fc4-4c4d59472436,MDT_LEASERETRIES=3,MDT_LEASETIMESEC=5,MDT_PHYBLKSIZE=512,MDT_VGUUID=uCWJqy-pvZQ-3zxd-VJca-Hs6Q-j1ub-S8wO4s,MDT_POOL_UUID=9a9db723-c63b-469d-9545-d8cc9407822b,MDT_VERSION=3,MDT_POOL_DOMAINS=cf28adb9-28e7-49f9-88d8-0e6d12336bd9:Attached&44&6bbbe226-7456-46da-8fc4-4c4d59472436:Active,MDT_POOL_SPM_ID=2,MDT_POOL_SPM_LVER=3,MDT__SHA_CKSUM=db61549e4d7decb11adb595fe649889dc3feab16 2 2 63.99m 128.00m unmanaged
lvm2 jfhiis-ChHI-jKra-40jz-Le49-3P9i-8v4CrN cf28adb9-28e7-49f9-88d8-0e6d12336bd9 wz--n- 99.62g 77.75g 128.00m 797 622 0 0 1 18 0 66 RHAT_storage_domain,MDT_VERSION=2,MDT_POOL_DOMAINS=cf28adb9-28e7-49f9-88d8-0e6d12336bd9:Active&44&6bbbe226-7456-46da-8fc4-4c4d59472436:Active,MDT_ROLE=Master,MDT_POOL_DESCRIPTION=iSCSI,MDT_LOCKPOLICY=,MDT_PV0=pv:1Dafna-target313649124&44&uuid:Wzl00P-KeCL-WJmv-cZpK-CXw7-elcL-zQe7lL&44&pestart:0&44&pecount:797&44&mapoffset:0,MDT_POOL_UUID=9a9db723-c63b-469d-9545-d8cc9407822b,MDT_IOOPTIMEOUTSEC=10,MDT_CLASS=Data,MDT_LEASETIMESEC=60,MDT_MASTER_VERSION=1,MDT__SHA_CKSUM=1aa16ed0ca4bc764e40a1aa13bed008e09fc37f7,MDT_LOCKRENEWALINTERVALSEC=5,MDT_SDUUID=cf28adb9-28e7-49f9-88d8-0e6d12336bd9,MDT_DESCRIPTION=five,MDT_POOL_SPM_ID=2,MDT_TYPE=ISCSI,MDT_LOGBLKSIZE=512,MDT_LEASERETRIES=3,MDT_PHYBLKSIZE=512,MDT_POOL_SPM_LVER=1,MDT_VGUUID=jfhiis-ChHI-jKra-40jz-Le49-3P9i-8v4CrN 2 2 63.99m 128.00m unmanaged
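The relevant fields in the vgs output above are the MDT_VERSION tags: 3 on the new master VG, still 2 on the blocked old master. A small helper to extract the version from such a tag string — an illustration only, not vdsm's actual metadata parser:

```python
def mdt_version(tag_string):
    """Return the integer MDT_VERSION from a comma-separated VG tag string.

    Note: the '&44&' sequences inside tag values are vdsm's encoding of
    embedded commas (ASCII 44), so splitting on ',' still isolates whole
    tags here.
    """
    for tag in tag_string.split(","):
        if tag.startswith("MDT_VERSION="):
            return int(tag.split("=", 1)[1])
    raise ValueError("MDT_VERSION tag not found")

# Abbreviated tag strings from the two VGs above:
new_master = "RHAT_storage_domain,MDT_ROLE=Master,MDT_VERSION=3,MDT_POOL_SPM_ID=2"
old_master = "RHAT_storage_domain,MDT_VERSION=2,MDT_ROLE=Master"
print(mdt_version(new_master), mdt_version(old_master))  # 3 2
```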
The reproducer for this is simple:

1. Data center 3.0 with two data block domains.
2. Put the non-master domain in maintenance.
3. Upgrade the data center to 3.1.
4. Make sure you have no links for the domain in maintenance:
   # rm -rf /rhev/data-center/<spUUID>/<deactivatedSdUUID>
   # rm -rf /rhev/data-center/mnt/blockSD/<deactivatedSdUUID>
5. Activate the domain.

You should notice the upgrade starting and then failing with the following exceptions:

Traceback (most recent call last):
  File "/usr/share/vdsm/storage/blockVolume.py", line 408, in validateImagePath
    os.mkdir(imageDir, 0755)
Traceback (most recent call last):
  ...
  File "/usr/share/vdsm/storage/blockVolume.py", line 411, in validateImagePath
    raise se.ImagePathError(imageDir)
ImagePathError: Image path does not exist or cannot be accessed/created: ('/rhev/data-center/a2834714-c9d8-4316-878d-3af799f10feb/5db46280-a002-4d4e-b5cf-59533f9aa36d/images/aa136091-6d13-4088-a089-cacfed0bd7d6',)

And the domain remains deactivated. A fix has been posted upstream.
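The upstream patch itself is not reproduced here, but the failure mode is easy to model: validateImagePath calls os.mkdir under the pool's domain link, and since the link removed in step 4 was never recreated, the mkdir fails and activation aborts. A sketch of the general idea — recreate the missing domain link before creating the image directory — using hypothetical helper and UUID names, not the actual fix:

```python
import os
import tempfile

def ensure_image_path(repo_path, sp_uuid, sd_uuid, img_uuid):
    """Create <repoPath>/<spUUID>/<sdUUID>/images/<imgUUID>, recreating the
    domain link to the real blockSD directory first if it is missing.

    Illustrative only; vdsm's real logic lives in blockSD/blockVolume.
    """
    real_dom = os.path.join(repo_path, "mnt", "blockSD", sd_uuid)
    link = os.path.join(repo_path, sp_uuid, sd_uuid)
    if not os.path.islink(link) and not os.path.isdir(link):
        # This is the step the failing code path skips: without the link,
        # os.mkdir(imageDir) has no parent to create into.
        os.makedirs(os.path.dirname(link), exist_ok=True)
        os.symlink(real_dom, link)
    image_dir = os.path.join(link, "images", img_uuid)
    os.makedirs(image_dir, exist_ok=True)
    return image_dir

# Mimic the reproducer in a temp dir: the real domain tree exists,
# but the pool-level link has been removed.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "mnt", "blockSD", "sd-1", "images"))
path = ensure_image_path(root, "sp-1", "sd-1", "img-1")
print(os.path.isdir(path))  # True
```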
*** Bug 950579 has been marked as a duplicate of this bug. ***
Checked on RHVM-3.2 - SF15
vdsm-4.10.2-17.0.el6ev.x86_64
rhevm-3.2.0-10.21.master.el6ev.noarch

I've managed to upgrade vdsm and the pool, and also to activate the domain.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0886.html