Bug 837003
| Summary: | 3.1.z - VM suspend leads to paused state instead of suspended | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Oded Ramraz <oramraz> |
| Component: | vdsm | Assignee: | Michal Skrivanek <michal.skrivanek> |
| Status: | CLOSED WORKSFORME | QA Contact: | Oded Ramraz <oramraz> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 6.3 | CC: | abaron, acathrow, bazulay, bugzilla-qe-tlv, dyasny, eedri, iheim, mgoldboi, oramraz, pnovotny, sgrinber, ykaul |
| Target Milestone: | rc | Keywords: | Regression, Reopened, TestBlocker |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | virt | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 826575 | Environment: | |
| Last Closed: | 2013-01-06 17:11:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 826575 | | |
| Bug Blocks: | | | |
| Attachments: | | | |
Description (Oded Ramraz, 2012-07-02 13:39:01 UTC)
The above happened due to an I/O error, so qemu paused the VM:

"""libvirtError: unable to close /rhev/data-center/61285a30-7edd-467a-8596-ce62bb4f3d44/602a3739-d688-4471-ae17-47007e18154b/images/6a21b667-da7e-4bdc-a8fd-e35dfc6bc9bf/3909323e-ecd3-4bd2-97ee-7314f460d49c: Input/output error"""

Can it be reproduced often? On a different storage domain? Reopening: the issue still persists.
Version:
vdsm-4.9.6-37.0.el6_3.x86_64
libvirt-0.9.10-21.el6_3.4.x86_64
Reproducer:
1. In webadmin, create new VM in NFS data center.
2. Add a disk to the VM (1 GB, thin provisioned, bootable; no guest OS is needed) and update the VM to boot from it.
3. Start the VM.
4. Once the VM is up, suspend it.
Results:
The VM ends up Paused instead of Suspended and remains stuck in that state. Shutting down or stopping the VM does not work from this point.
Expected results:
VM should be suspended.
Additional info:
VDSM log excerpt (see full log in attachment):
Thread-689::DEBUG::2012-10-15 14:15:09,982::BindingXMLRPC::880::vds::(wrapper) return vmHibernate with {'status': {'message': 'Hibernation process starting', 'code': 0}}
...
...
Thread-690::ERROR::2012-10-15 14:15:11,410::task::853::TaskManager.Task::(_setError) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 861, in _run
return fn(*args, **kargs)
File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
res = f(*args, **kwargs)
File "/usr/share/vdsm/storage/hsm.py", line 2814, in teardownImage
img.teardown(sdUUID, imgUUID, volUUID)
File "/usr/share/vdsm/storage/image.py", line 354, in teardown
chain = self.getChain(sdUUID, imgUUID, volUUID)
File "/usr/share/vdsm/storage/image.py", line 286, in getChain
uuidlist = volclass.getImageVolumes(self.repoPath, sdUUID, imgUUID)
File "/usr/share/vdsm/storage/fileVolume.py", line 387, in getImageVolumes
produceVolume(imgUUID, volid).
File "/usr/share/vdsm/storage/fileSD.py", line 174, in produceVolume
return fileVolume.FileVolume(repoPath, self.sdUUID, imgUUID, volUUID)
File "/usr/share/vdsm/storage/fileVolume.py", line 71, in __init__
volume.Volume.__init__(self, repoPath, sdUUID, imgUUID, volUUID)
File "/usr/share/vdsm/storage/volume.py", line 127, in __init__
self.validate()
File "/usr/share/vdsm/storage/volume.py", line 140, in validate
self.validateVolumePath()
File "/usr/share/vdsm/storage/fileVolume.py", line 557, in validateVolumePath
raise se.VolumeDoesNotExist(self.volUUID)
VolumeDoesNotExist: Volume does not exist: ('f6ee3a1c-e0e3-470c-a175-63c17eafb0bd',)
Thread-690::DEBUG::2012-10-15 14:15:11,430::task::872::TaskManager.Task::(_run) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::Task._run: e4d8ba80-4109-4e5f-b553-99a197cfce75 ('3774dc69-8ba3-456e-b584-4c08e3ba3826', 'e42f70cc-802e-46be-b931-867450cc6282', 'ffc59ea3-64f7-4fbf-ae9a-889a6efbf331') {} failed - stopping task
Thread-690::DEBUG::2012-10-15 14:15:11,434::task::1199::TaskManager.Task::(stop) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::stopping in state preparing (force False)
Thread-690::DEBUG::2012-10-15 14:15:11,439::task::978::TaskManager.Task::(_decref) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::ref 1 aborting True
Thread-690::INFO::2012-10-15 14:15:11,443::task::1157::TaskManager.Task::(prepare) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::aborting: Task is aborted: 'Volume does not exist' - code 201
Thread-690::DEBUG::2012-10-15 14:15:11,448::task::1162::TaskManager.Task::(prepare) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::Prepare: aborted: Volume does not exist
Created attachment 627512: VDSM log, starting at the moment of VM suspension.
Currently debugging with pdufek. It looks like either a libvirtd bug or a missing piece of configuration on the vdsm side.

Verified that suspend works correctly when /etc/libvirt/qemu.conf is changed to run qemu as vdsm:kvm instead of root:root. The NFS share is not writable by "root", only by the "vdsm" user.

After investigation, the affected VMs were on storage provided by TLV engops, a non-Linux NFS server with root squashing enabled. I suspect a bug or incompatibility in their NFSv4 implementation. Switching back to NFSv3 or disabling root_squash likely resolves the issue. Would you please retry with those settings and/or on a different storage server? I would also suggest removing the Regression keyword.

No response, and there is nothing to fix here, hence closing. Feel free to reopen.
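The workarounds discussed above can be sketched as configuration fragments. This is a hedged illustration only: the qemu.conf `user`/`group` settings and the `root_squash`/`vers=3` options are standard libvirt and NFS knobs, but the export path and mount point below are hypothetical placeholders, not taken from this report's environment.

```
# /etc/libvirt/qemu.conf on the hypervisor: run qemu as vdsm:kvm
# instead of the default root:root, so writes to the NFS share are
# not blocked by server-side root squashing (the workaround verified above).
user = "vdsm"
group = "kvm"

# Alternatively, on the NFS server, disable root squashing for the
# export (export path is a hypothetical placeholder):
#   /etc/exports:
#   /export/vmstore  *(rw,sync,no_root_squash)

# Or mount the storage domain with NFSv3 instead of NFSv4
# (server and mount point are hypothetical placeholders):
#   mount -t nfs -o vers=3 nfsserver:/export/vmstore /rhev/data-center/mnt/nfsserver:_export_vmstore
```

Note that changing qemu.conf affects all qemu processes started by libvirtd on that host, so the server-side fixes (no_root_squash or NFSv3) are the narrower workarounds.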