Created attachment 714607 [details]
vdsm.log

Description of problem:
Cannot export a VM with SF11. In vdsm.log there is a reference to a path which does not exist:

5ef86a24-283b-4436-9396-7c511e20a2b2::ERROR::2013-03-22 16:07:01,178::image::560::Storage.Image::(_createTargetImage) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 542, in _createTargetImage
    srcVolUUID=volParams['parent'])
  File "/usr/share/vdsm/storage/fileSD.py", line 285, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/volume.py", line 415, in create
    imgPath = image.Image(repoPath).create(sdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 124, in create
    os.mkdir(imageDir)
OSError: [Errno 2] No such file or directory: '/rhev/data-center/ffec9aa4-692c-11e2-9e91-001a4a013f3a/131d564c-52d1-4bba-8d60-39e889a8bc08/images/6aff0cf7-c11e-4ce6-863b-dcf2fa5fb387'

The UUID 'ffec9aa4-692c-11e2-9e91-001a4a013f3a' is not the UUID of my DC at all; it is the POOL_UUID from the export domain metadata:

# zcat /tmp/vdsm.log.gz | grep =ffec9aa4-692c-11e2-9e91-001a4a013f3a
Thread-7570::DEBUG::2013-03-22 16:07:00,929::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Backup', 'DESCRIPTION=str02-nfs-export', 'IOOPTIMEOUTSEC=1', 'LEASERETRIES=3', 'LEASETIMESEC=5', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'MASTER_VERSION=0', 'POOL_UUID=ffec9aa4-692c-11e2-9e91-001a4a013f3a', 'REMOTE_PATH=10.34.63.204:/mnt/export/nfs/export', 'ROLE=Regular', 'SDUUID=131d564c-52d1-4bba-8d60-39e889a8bc08', 'TYPE=NFS', 'VERSION=0', '_SHA_CKSUM=30fbc3acc41aa41401319d58fd79115755a81d95']

Version-Release number of selected component (if applicable):
sf11
vdsm-xmlrpc-4.10.2-12.0.el6ev.noarch
vdsm-4.10.2-12.0.el6ev.x86_64
vdsm-cli-4.10.2-12.0.el6ev.noarch
vdsm-python-4.10.2-12.0.el6ev.x86_64
Red Hat Enterprise Linux Server release 6.4 (Santiago)

How reproducible:
100%

Steps to Reproduce:
1. Have a VM.
2. Try to export it.

Actual results:
Export fails.

Expected results:
Export should work.

Additional info:
The export is NFS:

10.34.63.199:/jb01 on /rhev/data-center/mnt/10.34.63.199:_jb01 type nfs (rw,soft,nosharecache,timeo=600,retrans=6,nfsvers=3,addr=10.34.63.199)
10.34.63.204:/mnt/export/nfs/export on /rhev/data-center/mnt/10.34.63.204:_mnt_export_nfs_export type nfs (rw,soft,nosharecache,timeo=600,retrans=6,nfsvers=3,addr=10.34.63.204)
10.34.63.204:/home/iso/shared on /rhev/data-center/mnt/10.34.63.204:_home_iso_shared type nfs (rw,soft,nosharecache,timeo=600,retrans=6,nfsvers=3,addr=10.34.63.204)
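For context on the ENOENT above: os.mkdir() creates only the final path component, so when the pool link /rhev/data-center/<POOL_UUID> was never created (the POOL_UUID in the domain metadata being stale), the mkdir of the image directory fails exactly as in the traceback. A minimal sketch reproducing just that failure mode, not vdsm code; the paths are copied from the log:

import errno
import os

# The stale pool link from the export domain metadata; this directory
# does not exist under /rhev/data-center on the host.
repo = '/rhev/data-center/ffec9aa4-692c-11e2-9e91-001a4a013f3a'
image_dir = os.path.join(repo,
                         '131d564c-52d1-4bba-8d60-39e889a8bc08/images',
                         '6aff0cf7-c11e-4ce6-863b-dcf2fa5fb387')

try:
    os.mkdir(image_dir)   # unlike os.makedirs(), parents are not created
except OSError as e:
    # Errno 2 (ENOENT): an intermediate path component is missing,
    # matching the OSError in the traceback above.
    assert e.errno == errno.ENOENT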
Please attach the engine log as well.
Jiri, I suspect something is wrong with the export path. Can you please execute the following commands on the hypervisor:

tree /rhev/data-center/ffec9aa4-692c-11e2-9e91-001a4a013f3a
ls -l /rhev/data-center/ffec9aa4-692c-11e2-9e91-001a4a013f3a/131d564c-52d1-4bba-8d60-39e889a8bc08/images/62787b06-d836-4e14-afc7-023e08aeee96
Created attachment 715897 [details] engine.log
[root@dell-r210ii-03 ~]# tree /rhev/data-center/ffec9aa4-692c-11e2-9e91-001a4a013f3a
/rhev/data-center/ffec9aa4-692c-11e2-9e91-001a4a013f3a [error opening dir]

0 directories, 0 files
[root@dell-r210ii-03 ~]# ls -l /rhev/data-center/ffec9aa4-692c-11e2-9e91-001a4a013f3a/131d564c-52d1-4bba-8d60-39e889a8bc08/images/62787b06-d836-4e14-afc7-023e08aeee96
ls: cannot access /rhev/data-center/ffec9aa4-692c-11e2-9e91-001a4a013f3a/131d564c-52d1-4bba-8d60-39e889a8bc08/images/62787b06-d836-4e14-afc7-023e08aeee96: No such file or directory
*** Bug 924835 has been marked as a duplicate of this bug. ***
According to its own metadata, the export domain 131d564c-52d1-4bba-8d60-39e889a8bc08 belongs to pool ffec9aa4-692c-11e2-9e91-001a4a013f3a. Despite this, the metadata of pool a05c6f22-2a40-4f39-a2a8-aa91b539b217 marks this export domain as part of that pool. After an extensive search of _all_ the vdsm logs on the host, it cannot be determined which failed operation left the export domain in such a state.

########################################################################

Export domain metadata:

Thread-155232::INFO::2013-03-25 08:47:49,369::fileSD::302::Storage.StorageDomain::(validate) sdUUID=131d564c-52d1-4bba-8d60-39e889a8bc08
Thread-155232::DEBUG::2013-03-25 08:47:49,371::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Backup', 'DESCRIPTION=str02-nfs-export', 'IOOPTIMEOUTSEC=1', 'LEASERETRIES=3', 'LEASETIMESEC=5', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'MASTER_VERSION=0', 'POOL_UUID=ffec9aa4-692c-11e2-9e91-001a4a013f3a', 'REMOTE_PATH=10.34.63.204:/mnt/export/nfs/export', 'ROLE=Regular', 'SDUUID=131d564c-52d1-4bba-8d60-39e889a8bc08', 'TYPE=NFS', 'VERSION=0', '_SHA_CKSUM=30fbc3acc41aa41401319d58fd79115755a81d95']

Pool a05c6f22-2a40-4f39-a2a8-aa91b539b217 metadata:

CLASS=Data
DESCRIPTION=str03-jb01-data
IOOPTIMEOUTSEC=10
LEASERETRIES=3
LEASETIMESEC=60
LOCKPOLICY=
LOCKRENEWALINTERVALSEC=5
MASTER_VERSION=1
POOL_DESCRIPTION=Default
POOL_DOMAINS=131d564c-52d1-4bba-8d60-39e889a8bc08:Active,a7e5f59c-2877-475b-8afc-f760ba63defb:Active,cc4d884d-15d9-4e35-b869-4330245c1b94:Active
POOL_SPM_ID=2
POOL_SPM_LVER=37
POOL_UUID=a05c6f22-2a40-4f39-a2a8-aa91b539b217
REMOTE_PATH=10.34.63.199:/jb01
ROLE=Master
SDUUID=cc4d884d-15d9-4e35-b869-4330245c1b94
TYPE=NFS
VERSION=3
_SHA_CKSUM=2d71c0a6c4450b8476bbbd557a7f90df2bba21a3
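To make the inconsistency explicit, a small sketch; parse_md() is illustrative, not a vdsm API, and the literal values are copied from the two metadata dumps above:

def parse_md(lines):
    # Turn 'KEY=value' metadata lines into a dict.
    return dict(line.split("=", 1) for line in lines if "=" in line)

export_md = parse_md([
    "POOL_UUID=ffec9aa4-692c-11e2-9e91-001a4a013f3a",
    "SDUUID=131d564c-52d1-4bba-8d60-39e889a8bc08",
])
pool_md = parse_md([
    "POOL_UUID=a05c6f22-2a40-4f39-a2a8-aa91b539b217",
    "POOL_DOMAINS=131d564c-52d1-4bba-8d60-39e889a8bc08:Active,"
    "a7e5f59c-2877-475b-8afc-f760ba63defb:Active,"
    "cc4d884d-15d9-4e35-b869-4330245c1b94:Active",
])

# The domain is listed as Active in the pool's POOL_DOMAINS...
listed = export_md["SDUUID"] in pool_md["POOL_DOMAINS"]
# ...yet its own POOL_UUID points at a different pool.
consistent = export_md["POOL_UUID"] == pool_md["POOL_UUID"]
print("listed in pool: %s, POOL_UUID consistent: %s" % (listed, consistent))
# -> listed in pool: True, POOL_UUID consistent: False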
Created attachment 715965 [details] All engine logs
In the engine logs, failed attempts to attach the export domain were found, as well as recurrent failed attempts to activate it. The fact that the sdUUID of the export domain is listed as part of the pool, in spite of the failed attaches, is probably the consequence of a wrong reconstruct master.
Pool and domain metadata should be manually corrected in order to use this export.
(In reply to comment #10)
> Pool and domain metadata should be manually corrected in order to use this
> export.

Domain was probably force detached from the pool. This is a known design issue and will be taken care of once we get rid of the export domain entirely.
(In reply to comment #11)
> (In reply to comment #10)
> > Pool and domain metadata should be manually corrected in order to use this
> > export.
>
> Domain was probably force detached from the pool.
> This is a known design issue and will be taken care of once we get rid of
> the export domain entirely.

I don't think so. We need to understand how the system got to a state where the export domain metadata contains the wrong pool while the export domain status is up and active, hence the export process starts and fails. Moving back to ASSIGNED; scrubbing is needed from the engine side.
(In reply to comment #12)
> I don't think so. We need to understand how the system got to a state where
> the export domain metadata contains the wrong pool while the export domain
> status is up and active, hence the export process starts and fails.
> Moving back to ASSIGNED; scrubbing is needed from the engine side.

Ack, I missed the part where it became up on the engine.
Comment #10 is correct: the POOL_UUID in the export domain metadata was not my storage pool ID. I don't know exactly how it happened (comment #11 + comment #12), but the truth is that this export domain is "shared" by our team. I have now force-detached the export domain (as I could not remove it), manually removed the ID from POOL_UUID and the _SHA_CKSUM, and re-imported the export domain. I can import from the domain now. If you think this was caused by a PEBKAC issue, feel free to close the BZ.
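For anyone hitting the same state, roughly what I did to the export domain's dom_md/metadata file. This is a sketch under my own assumptions, not a supported procedure: the path is inferred from the standard NFS domain layout and the mounts above, the domain must be detached first, back the file up, and whether vdsm tolerates a missing _SHA_CKSUM line on the next read is an assumption based on it having worked here:

# Path assumed from the export mount and sdUUID quoted earlier.
md_path = ("/rhev/data-center/mnt/10.34.63.204:_mnt_export_nfs_export/"
           "131d564c-52d1-4bba-8d60-39e889a8bc08/dom_md/metadata")

with open(md_path) as f:
    lines = f.read().splitlines()

fixed = []
for line in lines:
    if line.startswith("POOL_UUID="):
        fixed.append("POOL_UUID=")   # clear the stale pool reference
    elif line.startswith("_SHA_CKSUM="):
        continue                     # drop the checksum of the old content
    else:
        fixed.append(line)

with open(md_path, "w") as f:
    f.write("\n".join(fixed) + "\n")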
Closing for now (this smells like manual intervention: modification of the metadata was in place). Feel free to re-open if the issue reproduces.