Bug 1708031
| Summary: | Failure to resume a VM from suspend: Changed state to Down: internal error: Child process (gzip -dc) | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | wang_meng <wang_meng> | ||||
| Component: | General | Assignee: | Dan Kenigsberg <danken> | ||||
| Status: | CLOSED DUPLICATE | QA Contact: | Lukas Svaty <lsvaty> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 4.30.13 | CC: | bugs, rbarry | ||||
| Target Milestone: | ovirt-4.4.0 | Flags: | rbarry: ovirt-4.4? | ||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2019-05-10 01:20:04 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: | |||||||
Created attachment 1565926
vdsm.log from the host where the VM ran.
*** Bug 1708030 has been marked as a duplicate of this bug. ***

Hi Wang - We are waiting for a platform fix in 7.7, which will hopefully resolve this. It's not a vdsm bug per se, since we're waiting for libvirt. I'd suggest using lzo as a workaround until 7.7 is available, and we'll re-evaluate then.

Thanks a lot, Ryan. You can close this bug, since it does not belong here.

*** This bug has been marked as a duplicate of bug 1503468 ***
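The workaround mentioned in the comments amounts to changing one setting in libvirt's qemu.conf. The following is a minimal Python sketch of that edit, not an official tool. The helper name `set_save_image_format` is hypothetical, and the value "lzop" is taken from the reporter's description, which sets save_image_format="lzop" in /etc/libvirt/qemu.conf. The sketch works on the file's text only and does not touch the real file.

```python
import re

# An uncommented setting, e.g.:  save_image_format = "gzip"
_ACTIVE = re.compile(r'^[ \t]*save_image_format[ \t]*=.*$', re.MULTILINE)
# A commented-out example, e.g.:  #save_image_format = "raw"
_COMMENTED = re.compile(r'^[ \t]*#[ \t]*save_image_format[ \t]*=.*$', re.MULTILINE)


def set_save_image_format(conf_text, fmt="lzop"):
    """Return conf_text with save_image_format set to fmt.

    Hypothetical helper: replaces an active setting if one exists,
    otherwise the first commented-out one, otherwise appends a new line.
    """
    new_line = 'save_image_format = "{}"'.format(fmt)
    for pattern in (_ACTIVE, _COMMENTED):
        if pattern.search(conf_text):
            # A callable replacement avoids backslash processing in new_line.
            return pattern.sub(lambda m: new_line, conf_text, count=1)
    # No existing setting: append, making sure it starts on its own line.
    sep = "" if (not conf_text or conf_text.endswith("\n")) else "\n"
    return conf_text + sep + new_line + "\n"


if __name__ == "__main__":
    sample = '# example config\n#save_image_format = "raw"\n'
    print(set_save_image_format(sample))
```

In practice the result would be written back to /etc/libvirt/qemu.conf. It is an assumption here, not something stated in this report, that libvirtd must then be restarted and that the lzop binary must be installed on the host for the setting to take effect.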
Description of problem:
When a VM is resumed from suspend, the guest operating system always reboots, so the live memory state that should have been restored on resume is lost.

Version-Release number of selected component (if applicable):
vdsm-4.30.13-1.el7

How reproducible:
Always

Steps to Reproduce:
1. Create a VM in a cluster backed by iSCSI/FC storage.
2. Open some files in the VM.
3. Click the pause button in the oVirt UI to suspend the VM.
4. Click the run button to resume the VM from the suspended state.

Actual results:
The VM reboots its OS and comes up in a freshly initialized state, with no files open.

Expected results:
The VM resumes, and the files that were open are still open in the VM.

Additional info:
1. vdsm reports an exception:

2019-05-09 08:42:39,705+0800 ERROR (vm/f060e3d3) [virt.vm] (vmId='f060e3d3-fc16-463c-bfe4-46c63ddfe97e') The vm start process failed (vm:937)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 866, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2828, in _run
    self._connection.restore(fname)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 4513, in restore
    if ret == -1: raise libvirtError ('virDomainRestore() failed', conn=self)
libvirtError: internal error: Child process (gzip -dc) unexpected exit status 2:
gzip: stdin: decompression OK, trailing garbage ignored

2019-05-09 08:42:39,705+0800 INFO (vm/f060e3d3) [virt.vm] (vmId='f060e3d3-fc16-463c-bfe4-46c63ddfe97e') Changed state to Down: internal error: Child process (gzip -dc) unexpected exit status 2:
gzip: stdin: decompression OK, trailing garbage ignored
(code=1) (vm:1675)
2019-05-09 08:42:39,710+0800 INFO (vm/f060e3d3) [virt.vm] (vmId='f060e3d3-fc16-463c-bfe4-46c63ddfe97e') Stopping connection (guestagent:455)

2. The problem happens only on iSCSI/FC storage, not on NFS.

3. The problem seems to be related to this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1503468
After I set save_image_format="lzop" in /etc/libvirt/qemu.conf, the problem disappeared. I think the cause is the one described in Comment 45 of the above bug (1503468). If this is not considered a bug, what should I do to avoid the problem?
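The gzip message in the log ("decompression OK, trailing garbage ignored") says that extra bytes followed the end of the compressed stream. The Python sketch below reproduces only that condition. It assumes, without confirmation from this report, that the extra bytes come from the saved image sitting on a block volume larger than the compressed data, which would fit the problem appearing on iSCSI/FC but not on NFS. The helper name `split_gzip_stream` and the 4096-byte zero padding are hypothetical and chosen for illustration.

```python
import gzip
import zlib


def split_gzip_stream(blob):
    """Decompress the first gzip stream in blob.

    Returns (payload, trailing), where trailing holds whatever bytes
    follow the end of the compressed stream.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    payload = decomp.decompress(blob)
    return payload, decomp.unused_data


if __name__ == "__main__":
    memory = b"guest memory page " * 100           # stand-in for saved VM state
    image = gzip.compress(memory) + b"\0" * 4096   # assumed padding after the stream

    payload, trailing = split_gzip_stream(image)
    print("payload intact:", payload == memory)
    print("trailing bytes:", len(trailing))
```

The payload decompresses correctly even though trailing bytes are present. That matches the log, where gzip reports the decompression as OK but still exits with status 2, and libvirt treats the non-zero exit status as a failure.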