Bug 837003
| Field | Value |
|---|---|
| Summary | 3.1.z - VM suspend leads to paused state instead of suspended |
| Product | Red Hat Enterprise Linux 6 |
| Component | vdsm |
| Version | 6.3 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED WORKSFORME |
| Severity | high |
| Priority | medium |
| Reporter | Oded Ramraz <oramraz> |
| Assignee | Michal Skrivanek <michal.skrivanek> |
| QA Contact | Oded Ramraz <oramraz> |
| CC | abaron, acathrow, bazulay, bugzilla-qe-tlv, dyasny, eedri, iheim, mgoldboi, oramraz, pnovotny, sgrinber, ykaul |
| Target Milestone | rc |
| Target Release | --- |
| Keywords | Regression, Reopened, TestBlocker |
| Whiteboard | virt |
| Doc Type | Bug Fix |
| Story Points | --- |
| Clone Of | 826575 |
| Bug Depends On | 826575 |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | --- |
| Cloudforms Team | --- |
| Last Closed | 2013-01-06 17:11:39 UTC |
Description (Oded Ramraz, 2012-07-02 13:39:01 UTC)
The above happened due to an ioerror:

    libvirtError: unable to close /rhev/data-center/61285a30-7edd-467a-8596-ce62bb4f3d44/602a3739-d688-4471-ae17-47007e18154b/images/6a21b667-da7e-4bdc-a8fd-e35dfc6bc9bf/3909323e-ecd3-4bd2-97ee-7314f460d49c: Input/output error

so qemu pauses the VM. Can it be reproduced often? On a different SD?

Reopening. The issue still persists.

Version:
vdsm-4.9.6-37.0.el6_3.x86_64
libvirt-0.9.10-21.el6_3.4.x86_64

Reproducer:
1. In webadmin, create a new VM in an NFS data center.
2. Add a disk to the VM (1 GB, thin provisioned, bootable; no guest OS needed) and update the VM to boot from it.
3. Start the VM.
4. When it is up, suspend the VM.

Results:
The VM is Paused instead of Suspended and is stuck in the paused state. Shutting down or stopping the VM does not work from this point.

Expected results:
The VM should be suspended.

Additional info:
VDSM log excerpt (see the full log in the attachment):

    Thread-689::DEBUG::2012-10-15 14:15:09,982::BindingXMLRPC::880::vds::(wrapper) return vmHibernate with {'status': {'message': 'Hibernation process starting', 'code': 0}}
    ...
    Thread-690::ERROR::2012-10-15 14:15:11,410::task::853::TaskManager.Task::(_setError) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::Unexpected error
    Traceback (most recent call last):
      File "/usr/share/vdsm/storage/task.py", line 861, in _run
        return fn(*args, **kargs)
      File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
        res = f(*args, **kwargs)
      File "/usr/share/vdsm/storage/hsm.py", line 2814, in teardownImage
        img.teardown(sdUUID, imgUUID, volUUID)
      File "/usr/share/vdsm/storage/image.py", line 354, in teardown
        chain = self.getChain(sdUUID, imgUUID, volUUID)
      File "/usr/share/vdsm/storage/image.py", line 286, in getChain
        uuidlist = volclass.getImageVolumes(self.repoPath, sdUUID, imgUUID)
      File "/usr/share/vdsm/storage/fileVolume.py", line 387, in getImageVolumes
        produceVolume(imgUUID, volid)
      File "/usr/share/vdsm/storage/fileSD.py", line 174, in produceVolume
        return fileVolume.FileVolume(repoPath, self.sdUUID, imgUUID, volUUID)
      File "/usr/share/vdsm/storage/fileVolume.py", line 71, in __init__
        volume.Volume.__init__(self, repoPath, sdUUID, imgUUID, volUUID)
      File "/usr/share/vdsm/storage/volume.py", line 127, in __init__
        self.validate()
      File "/usr/share/vdsm/storage/volume.py", line 140, in validate
        self.validateVolumePath()
      File "/usr/share/vdsm/storage/fileVolume.py", line 557, in validateVolumePath
        raise se.VolumeDoesNotExist(self.volUUID)
    VolumeDoesNotExist: Volume does not exist: ('f6ee3a1c-e0e3-470c-a175-63c17eafb0bd',)
    Thread-690::DEBUG::2012-10-15 14:15:11,430::task::872::TaskManager.Task::(_run) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::Task._run: e4d8ba80-4109-4e5f-b553-99a197cfce75 ('3774dc69-8ba3-456e-b584-4c08e3ba3826', 'e42f70cc-802e-46be-b931-867450cc6282', 'ffc59ea3-64f7-4fbf-ae9a-889a6efbf331') {} failed - stopping task
    Thread-690::DEBUG::2012-10-15 14:15:11,434::task::1199::TaskManager.Task::(stop) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::stopping in state preparing (force False)
    Thread-690::DEBUG::2012-10-15 14:15:11,439::task::978::TaskManager.Task::(_decref) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::ref 1 aborting True
    Thread-690::INFO::2012-10-15 14:15:11,443::task::1157::TaskManager.Task::(prepare) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::aborting: Task is aborted: 'Volume does not exist' - code 201
    Thread-690::DEBUG::2012-10-15 14:15:11,448::task::1162::TaskManager.Task::(prepare) Task=`e4d8ba80-4109-4e5f-b553-99a197cfce75`::Prepare: aborted: Volume does not exist

Created attachment 627512 [details]
VDSM log
VDSM log starting at the moment of VM suspension.
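The "Paused instead of Suspended" symptom hinges on *why* libvirt paused the domain: a pause that is part of saving state is a normal step of vmHibernate, while a pause forced by a disk I/O error (the libvirtError above) strands the VM. A minimal sketch of that distinction, with libvirt's numeric virDomainState/virDomainPausedReason values hardcoded for illustration; this is not vdsm's actual state handling:

```python
# Illustrative only -- values mirror libvirt's virDomainState and
# virDomainPausedReason enums; vdsm's real logic is more involved.
VIR_DOMAIN_PAUSED = 3          # domain state: execution suspended
VIR_DOMAIN_PAUSED_USER = 1     # pause requested by the user/management
VIR_DOMAIN_PAUSED_SAVE = 3     # pause while saving state (hibernation)
VIR_DOMAIN_PAUSED_IOERROR = 5  # pause forced by a disk I/O error

def classify_pause(state, reason):
    """Map a (state, reason) pair to the status a manager would report."""
    if state != VIR_DOMAIN_PAUSED:
        return "not paused"
    if reason in (VIR_DOMAIN_PAUSED_USER, VIR_DOMAIN_PAUSED_SAVE):
        return "suspending"          # expected step of vmHibernate
    if reason == VIR_DOMAIN_PAUSED_IOERROR:
        return "paused (I/O error)"  # the stuck state seen in this bug
    return "paused (other)"
```

On a live host, the same information can be read with `virsh domstate --reason <domain>`, which is a quick way to tell an I/O-error pause from a suspend in progress.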
Currently debugging with pdufek. It looks like either a libvirtd bug or a misconfiguration (or the lack of one) by vdsm.

Verified that it works correctly when /etc/libvirt/qemu.conf is changed to use vdsm:kvm instead of root:root. The NFS share is not writable for "root", only for the "vdsm" user.

After investigation, the affected VMs were on storage provided by TLV engops, a non-Linux-based NFS server with root_squash enabled. I suspect a bug/incompatibility in their NFSv4 implementation. Switching back to NFSv3 or disabling root_squash likely solves the issue. Would you please retry with those settings and/or on a different storage server? I'd also suggest removing the Regression keyword.

No response; there's nothing to fix here, hence closing. Feel free to reopen.
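The workarounds mentioned above correspond to ordinary configuration changes. A sketch of both, assuming a hypothetical export path; the qemu.conf `user`/`group` options and the exports(5) `no_root_squash` option are standard, but verify them against your distribution's defaults:

```conf
# /etc/libvirt/qemu.conf (on the host): run qemu processes as vdsm:kvm
# rather than root:root, so writes to a root-squashed NFS share succeed.
user = "vdsm"
group = "kvm"

# /etc/exports (on the NFS server): alternatively, disable root squashing
# for the export. The path and client spec here are hypothetical.
/exports/data *(rw,sync,no_root_squash)
```

Falling back to NFSv3, the other suggestion in the comment, is a client-side mount option (e.g. `-o vers=3`) rather than a server-side export change.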