Bug 908648
| Summary: | Export VM & discard snapshots fails with "low level Image copy failed" when using compatibility mode 3.1 & 3.2 | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Eyal Edri <eedri> |
| Component: | vdsm | Assignee: | Yaniv Bronhaim <ybronhei> |
| Status: | CLOSED DUPLICATE | QA Contact: | Barak Dagan <bdagan> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 3.2.0 | CC: | bazulay, dyasny, ewarszaw, hateya, iheim, lpeer, mgoldboi, michal.skrivanek, oramraz, pstehlik, ykaul |
| Target Milestone: | --- | Keywords: | Regression, TestBlocker |
| Target Release: | 3.2.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | infra | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-02-17 14:42:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 890951 | | |
Description (Eyal Edri, 2013-02-07 09:05:02 UTC)

Created attachment 694345 [details]
engine & host logs: includes vdsm, libvirt, engine, and sanlock
I see this error on our zstream environment as well:

2013-02-08 09:34:12,308 - MainThread - plmanagement.matrix-test-composer - ERROR - Entity rest_templ1 not found for collection 'templates'.
2013-02-08 09:34:12,498 - MainThread - plmanagement.error_fetcher - ERROR - Errors fetched from VDC(jenkins-vm-17.qa.lab.tlv.redhat.com):
2013-02-08 09:34:04,293 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-81) Failed in HSMGetAllTasksStatusesVDS method
2013-02-08 09:34:04,293 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-81) Error code CopyImageError and error message VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = low level Image copy failed
2013-02-08 09:34:04,294 INFO [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-81) SPMAsyncTask::PollTask: Polling task 4367540b-2e43-42f5-8edc-e15e3009b607 (Parent Command AddVmTemplate, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'cleanSuccess'.
2013-02-08 09:34:04,301 ERROR [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-81) BaseAsyncTask::LogEndTaskFailure: Task 4367540b-2e43-42f5-8edc-e15e3009b607 (Parent Command AddVmTemplate, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
-- Result: cleanSuccess
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = low level Image copy failed,
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = low level Image copy failed
2013-02-08 09:34:04,302 INFO [org.ovirt.engine.core.bll.EntityAsyncTask] (QuartzScheduler_Worker-81) EntityAsyncTask::EndActionIfNecessary: All tasks of entity fe964e8b-12c2-406c-86ff-266b88fa528a has ended -> executing EndAction
2013-02-08 09:34:04,330 ERROR [org.ovirt.engine.core.bll.AddVmTemplateCommand] (pool-3-thread-50) [518] Ending command with failure: org.ovirt.engine.core.bll.AddVmTemplateCommand
2013-02-08 09:34:04,337 ERROR [org.ovirt.engine.core.bll.CreateImageTemplateCommand] (pool-3-thread-50) [64860853] Ending command with failure: org.ovirt.engine.core.bll.CreateImageTemplateCommand

#################################################################

vdsm version (si27): vdsm-4.10.2-1.3.el6.x86_64

An image on an NFS domain is deleted with postZero = True.
Post-zeroing NFS images is forbidden because it is irrelevant there.
Thread-1603::INFO::2013-02-07 09:24:58,910::logUtils::37::dispatcher::(wrapper) Run and protect: deleteImage(sdUUID='72385e09-7f27-4db2-951c-3795fae0b391', spUUID='a7fddec3-aea5-4d4e-a3a0-435d18aabee5', imgUUID='fd9eafa5-64c7-4a18-8449-a8ea21e3a8b4', postZero='true', force='false')
Thread-1603::INFO::2013-02-07 09:24:58,910::fileSD::302::Storage.StorageDomain::(validate) sdUUID=72385e09-7f27-4db2-951c-3795fae0b391
Thread-1603::DEBUG::2013-02-07 09:24:58,914::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=DataDomainRest', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'MASTER_VERSION=1', 'POOL_DESCRIPTION=DefaultRest', 'POOL_DOMAINS=527e2e5c-5736-4bd5-8ce0-0efcc06aa8ab:Active,72385e09-7f27-4db2-951c-3795fae0b391:Active,ea82189d-3486-4926-880a-bb3c45b18b82:Active', 'POOL_SPM_ID=1', 'POOL_SPM_LVER=1', 'POOL_UUID=a7fddec3-aea5-4d4e-a3a0-435d18aabee5', 'REMOTE_PATH=10.35.64.106:/export/jenkins-automation-rpm-vm30_nfs_2013_02_07_9_8_29_765150', 'ROLE=Master', 'SDUUID=72385e09-7f27-4db2-951c-3795fae0b391', 'TYPE=NFS', 'VERSION=3', '_SHA_CKSUM=9194c5945c950223206ef5b24a0777a39db80658']
813af784-199a-4c61-a8ef-cd633eacf588::ERROR::2013-02-07 09:24:59,340::task::833::TaskManager.Task::(_setError) Task=`813af784-199a-4c61-a8ef-cd633eacf588`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 840, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 307, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/fileSD.py", line 354, in zeroImage
    "fileSD %s should not be zeroed." % (imgUUID, sdUUID))
SourceImageActionError: Error during source image manipulation: 'image=fd9eafa5-64c7-4a18-8449-a8ea21e3a8b4, source domain=72385e09-7f27-4db2-951c-3795fae0b391: image fd9eafa5-64c7-4a18-8449-a8ea21e3a8b4 on a fileSD 72385e09-7f27-4db2-951c-3795fae0b391 should not be zeroed.'
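The guard that produces this error can be sketched as follows. This is a hypothetical simplification, not vdsm's actual code; the class and method names merely mirror the `fileSD.zeroImage` frame and exception in the traceback above.

```python
class SourceImageActionError(Exception):
    """Raised when a source image is manipulated in a forbidden way."""


class FileStorageDomain:
    """Minimal sketch of a file-based (e.g. NFS) storage domain."""

    def __init__(self, sdUUID):
        self.sdUUID = sdUUID

    def zeroImage(self, imgUUID):
        # File-based domains are never post-zeroed: deleting the image
        # file already discards the data, unlike block (LVM) domains
        # where freed extents may later be handed to another volume.
        raise SourceImageActionError(
            "image %s on a fileSD %s should not be zeroed."
            % (imgUUID, self.sdUUID))

    def deleteImage(self, imgUUID, postZero=False):
        if postZero:
            # The deleteImage(..., postZero='true') call in the log
            # above ends up on this path.
            self.zeroImage(imgUUID)
        # otherwise: remove the image files normally (omitted)
```

Under this sketch, any `deleteImage` call with `postZero` set on a file domain fails with `SourceImageActionError`, matching the task failure in the log.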
The reason for the engine failure, unrelated to the previous error:
928055c8-30fb-482e-8546-71e009790279::ERROR::2013-02-07 09:28:06,277::remoteFileHandler::190::Storage.CrabRPCProxy::(callCrabRPCFunction) Problem with handler, treating as timeout
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 180, in callCrabRPCFunction
    rawLength = self._recvAll(LENGTH_STRUCT_LENGTH, timeout)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 149, in _recvAll
    timeLeft):
  File "/usr/lib64/python2.6/contextlib.py", line 83, in helper
    return GeneratorContextManager(func(*args, **kwds))
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 133, in _poll
    res = misc.NoIntrPoll(self._poller.poll, timeout)
  File "/usr/share/vdsm/storage/misc.py", line 1348, in NoIntrPoll
    return pollfun(timeout)
error: (4, 'Interrupted system call')
928055c8-30fb-482e-8546-71e009790279::ERROR::2013-02-07 09:28:06,278::image::900::Storage.Image::(copy) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 881, in copy
    size=srcVol.getVolumeSize(bs=1), dstvolType=dstVol.getType())
  File "/usr/share/vdsm/storage/volume.py", line 1023, in qemuConvert
    recoveryCallback=baseAsyncTasksRollback)
  File "/usr/share/vdsm/storage/misc.py", line 254, in watchCmd
    recoveryCallback(proc)
  File "/usr/share/vdsm/storage/volume.py", line 971, in baseAsyncTasksRollback
    str(misc.getProcCtime(proc.pid))]))
  File "/usr/share/vdsm/storage/task.py", line 1027, in pushRecovery
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1098, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 742, in _save
    getProcPool().os.rename(origTaskDir, origTaskDir + BACKUP_EXT)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 275, in callCrabRPCFunction
    *args, **kwargs)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 191, in callCrabRPCFunction
    raise Timeout()
Timeout
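The `error: (4, 'Interrupted system call')` in the first traceback is EINTR escaping from the poll. `misc.NoIntrPoll` is meant to retry exactly that case, though judging by the traceback the interrupted call still leaked out here. A minimal sketch of the EINTR-retry pattern such a helper implements (a hypothetical simplification, not vdsm's actual implementation):

```python
import errno
import select


def no_intr_poll(pollfun, timeout=-1):
    """Call pollfun(timeout), retrying while it is interrupted by a signal.

    A poll interrupted by a signal fails with EINTR (errno 4,
    'Interrupted system call'); any other error propagates to the caller.
    """
    while True:
        try:
            return pollfun(timeout)
        except (OSError, select.error) as e:
            if e.args[0] != errno.EINTR:
                raise
            # EINTR: the wait was merely interrupted; poll again.
```

A caller would pass the bound poll method, e.g. `no_intr_poll(poller.poll, timeout)`; note that on Python 2.6, as in this environment, `select.poll` raised `select.error` rather than `OSError`, and retries were not automatic as they are since PEP 475.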
any ETA on addressing this bug? It's a major test blocker for almost all automation tests for 3.2 (and possibly 3.1.z).

Isn't it a dup of bug 911139?

*** This bug has been marked as a duplicate of bug 910667 ***