Bug 874481
Summary: 3.1 - vdsm [Live Storage Migration]: VM fails to resume after being paused on ENOSPC

Product: Red Hat Enterprise Linux 6
Component: vdsm
Version: 6.3
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Reporter: Dafna Ron <dron>
Assignee: Federico Simoncelli <fsimonce>
QA Contact: Dafna Ron <dron>
CC: abaron, aburden, amureini, bazulay, hateya, iheim, ilvovsky, jbiddle, lpeer, thildred, ykaul
Keywords: ZStream
Target Milestone: rc
Target Release: 6.3
Whiteboard: storage
Fixed In Version: vdsm-4.9.6-42.0
Last Closed: 2012-12-04 19:13:58 UTC
Type: Bug
Doc Type: Bug Fix

Doc Text:
Previously, if a virtual disk was being migrated to a storage domain and that storage domain ran out of available space, the virtual machine paused on an unknown storage error and could not be resumed. The virtual machine now resumes automatically once space has been made available on the storage domain.
Can you check if this is general for live snapshot? i.e. create a preallocated disk, take a live snapshot and after that start writing heavily to fill up the disk?

Are all the other errors in the log expected?

libvirtEventLoop::ERROR::2012-11-07 12:07:22,250::libvirtvm::1982::vm.Vm::(_onBlockJobEvent) vmId=`febcd70c-42d6-42ca-8ead-0b5d637d87e8`::Live merge completed for an unexpected path: /rhev/data-center/28425e84-d608-4a44-a60d-a6cd3bcd9ce5/2d52459b-ccd1-4a99-bff3-1264ca77a2c4/images/2ce2b332-4028-4e4e-82f9-384bc7d79bb4/037b28fe-dda9-4417-bd01-9e376dd3f890

...

Thread-38182::INFO::2012-11-07 12:15:26,418::logUtils::37::dispatcher::(wrapper) Run and protect: getVolumeSize(sdUUID='af5bb9ad-ebe7-4ade-8d0d-a6882d11af44', spUUID='28425e84-d608-4a44-a60d-a6cd3bcd9ce5', imgUUID='67b5d784-d415-4ec2-91b9-65de15adcdb9', volUUID='7ccb8f4e-de9d-4f57-b4fd-5fbb8151a686', options=None)
Thread-38182::WARNING::2012-11-07 12:15:26,418::blockVolume::102::Storage.Volume::(getVSize) Could not get size for vol af5bb9ad-ebe7-4ade-8d0d-a6882d11af44/7ccb8f4e-de9d-4f57-b4fd-5fbb8151a686 using optimized methods
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/blockVolume.py", line 94, in getVSize
    return _getDeviceSize(lvm.lvPath(sdobj.sdUUID, volUUID)) / bs
  File "/usr/share/vdsm/storage/blockVolume.py", line 65, in _getDeviceSize
    with open(devPath, "rb") as f:
IOError: [Errno 2] No such file or directory: '/dev/af5bb9ad-ebe7-4ade-8d0d-a6882d11af44/7ccb8f4e-de9d-4f57-b4fd-5fbb8151a686'

...
70c1202c-9f17-4591-82ac-d8254f3b3bcf::ERROR::2012-11-07 14:02:36,687::task::853::TaskManager.Task::(_setError) Task=`9c6d56d0-a289-4d06-93f2-5ec41670f70a`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/vm.py", line 814, in __afterReplicaExtension
    self.__verifyVolumeExtension(volInfo)
  File "/usr/share/vdsm/vm.py", line 809, in __verifyVolumeExtension
    volInfo['domainID'], volInfo['volumeID']))
RuntimeError: Volume extension failed for vda (domainID: 5140b72b-0eac-4d4c-b475-5724bc9c27b7, volumeID: da3a4804-44d9-48fb-b24c-99bf646f4d14)

...

libvirtEventLoop::INFO::2012-11-07 14:29:13,800::libvirtvm::2221::vm.Vm::(_onAbnormalStop) vmId=`1e6b4b28-9c6d-4574-a66b-293d87f1bd50`::abnormal vm stop device virtio-disk0 error eother
libvirtEventLoop::DEBUG::2012-11-07 14:29:13,801::libvirtvm::2679::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`1e6b4b28-9c6d-4574-a66b-293d87f1bd50`::event Suspended detail 2 opaque None

(In reply to comment #2)
> are all the other errors in the log expected?
>
> libvirtEventLoop::ERROR::2012-11-07
> 12:07:22,250::libvirtvm::1982::vm.Vm::(_onBlockJobEvent)
> vmId=`febcd70c-42d6-42ca-8ead-0b5d637d87e8`::Live merge completed for an
> unexpected path:
> /rhev/data-center/28425e84-d608-4a44-a60d-a6cd3bcd9ce5/2d52459b-ccd1-4a99-bff3-1264ca77a2c4/images/2ce2b332-4028-4e4e-82f9-384bc7d79bb4/037b28fe-dda9-4417-bd01-9e376dd3f890

Logging issue. Not a problem.
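The RuntimeError above comes from a post-extension check: after requesting that the replica volume be extended, vdsm re-reads the volume size and fails the task if the volume did not actually grow (which is exactly what happens when the target domain is out of space). A minimal sketch of that check, using hypothetical names and callables as stand-ins for vdsm internals (the real code is `__verifyVolumeExtension` in /usr/share/vdsm/vm.py):

```python
def verify_volume_extension(vol_info, get_volume_size, required_size):
    """Fail loudly if a requested volume extension did not take effect.

    vol_info, get_volume_size and required_size are hypothetical
    stand-ins: vol_info carries the identifiers seen in the log above,
    and get_volume_size(domainID, volumeID) queries the storage layer.
    """
    size = get_volume_size(vol_info["domainID"], vol_info["volumeID"])
    if size is None or size < required_size:
        # Mirrors the message seen in the traceback above.
        raise RuntimeError(
            "Volume extension failed for %s (domainID: %s, volumeID: %s)"
            % (vol_info["name"], vol_info["domainID"], vol_info["volumeID"]))
    return size
```

When the SPM cannot reach the storage (as in this test, where the SPM connection was blocked), the size never grows, the check raises, and the extension task ends in the ERROR state logged above.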
> Thread-38182::INFO::2012-11-07
> 12:15:26,418::logUtils::37::dispatcher::(wrapper) Run and protect:
> getVolumeSize(sdUUID='af5bb9ad-ebe7-4ade-8d0d-a6882d11af44',
> spUUID='28425e84-d608-4a44-a60d-a6cd3bcd9ce5',
> imgUUID='67b5d784-d415-4ec2-91b9-65de15adcdb9',
> volUUID='7ccb8f4e-de9d-4f57-b4fd-5fbb8151a686', options=None)
> Thread-38182::WARNING::2012-11-07
> 12:15:26,418::blockVolume::102::Storage.Volume::(getVSize) Could not get
> size for vol
> af5bb9ad-ebe7-4ade-8d0d-a6882d11af44/7ccb8f4e-de9d-4f57-b4fd-5fbb8151a686
> using optimized methods
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/blockVolume.py", line 94, in getVSize
>     return _getDeviceSize(lvm.lvPath(sdobj.sdUUID, volUUID)) / bs
>   File "/usr/share/vdsm/storage/blockVolume.py", line 65, in _getDeviceSize
>     with open(devPath, "rb") as f:
> IOError: [Errno 2] No such file or directory:
> '/dev/af5bb9ad-ebe7-4ade-8d0d-a6882d11af44/7ccb8f4e-de9d-4f57-b4fd-5fbb8151a686'

Another logging issue; I think upstream it's fixed (no backtrace).

> 70c1202c-9f17-4591-82ac-d8254f3b3bcf::ERROR::2012-11-07
> 14:02:36,687::task::853::TaskManager.Task::(_setError)
> Task=`9c6d56d0-a289-4d06-93f2-5ec41670f70a`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 861, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/vm.py", line 814, in __afterReplicaExtension
>     self.__verifyVolumeExtension(volInfo)
>   File "/usr/share/vdsm/vm.py", line 809, in __verifyVolumeExtension
>     volInfo['domainID'], volInfo['volumeID']))
> RuntimeError: Volume extension failed for vda (domainID:
> 5140b72b-0eac-4d4c-b475-5724bc9c27b7, volumeID:
> da3a4804-44d9-48fb-b24c-99bf646f4d14)
>
> ...
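The WARNING-plus-traceback above is the "optimized" size probe noticing that the LV device node is not present under /dev, after which vdsm falls back to slower metadata-based methods. A sketch of such a probe with the quieter fallback the reply describes, i.e. returning a sentinel instead of letting the IOError propagate as a backtrace (hypothetical helper; the real one is `_getDeviceSize` in storage/blockVolume.py and its caller `getVSize`):

```python
import os

def get_device_size(dev_path, block_size=512):
    """Return the size of dev_path in block_size units, or None if the
    device node does not exist. Hypothetical sketch of the fast path:
    open the node and seek to the end to learn its size in bytes."""
    try:
        with open(dev_path, "rb") as f:
            f.seek(0, os.SEEK_END)
            return f.tell() // block_size
    except IOError:
        # Missing LV node (e.g. not activated on this host): signal the
        # caller to use the slower metadata-based lookup instead of
        # logging a full traceback.
        return None
```

On a regular file the same seek-to-end trick works, which is what makes the helper easy to exercise outside a host with actual LVs.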
> libvirtEventLoop::INFO::2012-11-07
> 14:29:13,800::libvirtvm::2221::vm.Vm::(_onAbnormalStop)
> vmId=`1e6b4b28-9c6d-4574-a66b-293d87f1bd50`::abnormal vm stop device
> virtio-disk0 error eother
> libvirtEventLoop::DEBUG::2012-11-07
> 14:29:13,801::libvirtvm::2679::vm.Vm::(_onLibvirtLifecycleEvent)
> vmId=`1e6b4b28-9c6d-4574-a66b-293d87f1bd50`::event Suspended detail 2 opaque
> None

This is the actual effect of not extending the LV (the SPM connection to the storage was blocked), so yes, within the scope of the test it is expected. The only thing that puzzles me is "eother", since we expect "enospc", but looking at the code it makes no difference in how it is handled.

Merged Ia5535cc8ad55441046e81aa329fac82a354592b9

This bug is verified on vdsm-4.9.6-42.0.el6_3.x86_64. The ERROR below is related to https://bugzilla.redhat.com/show_bug.cgi?id=873145:

1591e96c-0f5f-4a21-892a-af2b9eeac74f::ERROR::2012-11-12 12:16:12,792::threadPool::221::Misc.ThreadPool.WorkerThread::(run) Task <function runTask at 0x17d8e60> failed
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/threadPool.py", line 213, in run
    cmd(args)
  File "/usr/share/vdsm/storage/storage_mailbox.py", line 75, in runTask
    ctask.prepare(cmd, *args)
  File "/usr/share/vdsm/storage/task.py", line 1164, in prepare
    raise self.error
RuntimeError: Volume extension failed for hda (domainID: b25b91ca-ebd7-4d98-8b8c-7dc7aa1cb4f5, volumeID: c85f8e24-c29a-4bf8-bccc-a99e9ce8dfed)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1508.html
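The fix verified above boils down to bookkeeping: remember that the guest paused because of a storage I/O error, and resume it once the pending volume extension finally succeeds. A toy model of that flow, with hypothetical names (the real handlers are `_onAbnormalStop` in libvirtvm.py and the extension-completion callbacks in vm.py); per the comment above, "eother" is treated the same way as "enospc":

```python
# Assumption: both error classes take the same resume path, as noted above.
RESUMABLE_ERRORS = ("enospc", "eother")

class PausedVm:
    """Toy model of the pause/auto-resume bookkeeping."""

    def __init__(self, vm_id):
        self.vm_id = vm_id
        self.paused_on_io_error = False

    def on_abnormal_stop(self, device, error):
        # libvirt reports the failing device and an error class, e.g.
        # "abnormal vm stop device virtio-disk0 error eother".
        if error in RESUMABLE_ERRORS:
            self.paused_on_io_error = True

    def on_extension_done(self, success):
        # Once space is available and the extension succeeded, resume
        # instead of leaving the guest paused on "unknown storage error".
        if success and self.paused_on_io_error:
            self.paused_on_io_error = False
            return "resume"   # real code would issue a libvirt resume
        return "stay-paused"
```

Before the fix, the "resume" branch was effectively never taken for this scenario, which is why the guest stayed paused even after space was freed.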
Created attachment 640689 [details]
logs

Description of problem:
I live-migrated a VM's disk and, during the migration, filled the destination storage so that the VM would run out of space. The VM pauses on an unknown storage error and cannot be resumed even once we clean up the storage.

Version-Release number of selected component (if applicable):
si24
vdsm-4.9.6-41.0.el6_3.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a large preallocated disk and run the VM.
2. Move the disk to a second storage domain.
3. Move a thin-provisioned disk to the same domain.

Actual results:
The VM with the preallocated disk pauses and is not resumed even after we free more space in the storage.

Expected results:
The VM should resume automatically.

Additional info:
logs