Bug 1708031 - Failure to resume a VM from suspend: Changed state to Down: internal error: Child process (gzip -dc)
Keywords:
Status: CLOSED DUPLICATE of bug 1503468
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.30.13
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.4.0
Target Release: ---
Assignee: Dan Kenigsberg
QA Contact: Lukas Svaty
URL:
Whiteboard:
Duplicates: 1708030
Depends On:
Blocks:
 
Reported: 2019-05-09 02:22 UTC by wang_meng@massclouds.com
Modified: 2019-05-10 01:20 UTC
CC: 2 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-05-10 01:20:04 UTC
oVirt Team: Virt
Embargoed:
rbarry: ovirt-4.4?


Attachments
vdsm.log from the host where the VM ran (12.47 MB, text/plain)
2019-05-09 02:25 UTC, wang_meng@massclouds.com
no flags

Description wang_meng@massclouds.com 2019-05-09 02:22:17 UTC
Description of problem:
  When resuming a VM from suspend, the VM always reboots the guest operating system, losing the live memory state that should be restored on resume.
     
Version-Release number of selected component (if applicable):
  vdsm-4.30.13-1.el7

How reproducible:
  Always

Steps to Reproduce:
1. Create a VM on a cluster backed by iSCSI/FC storage.
2. Open some files in VM.
3. Click the suspend button in the oVirt UI to suspend the VM.
4. Click the run button to resume the VM from the suspended state.

Actual results:
   The VM reboots its OS and comes up in a freshly initialized state, with no files open.

Expected results:
   The VM resumes, and the previously opened files are still open in the VM.

Additional info:
1. vdsm reports an exception:
2019-05-09 08:42:39,705+0800 ERROR (vm/f060e3d3) [virt.vm] (vmId='f060e3d3-fc16-463c-bfe4-46c63ddfe97e') The vm start process failed (vm:937)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 866, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2828, in _run
    self._connection.restore(fname)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 4513, in restore
    if ret == -1: raise libvirtError ('virDomainRestore() failed', conn=self)
libvirtError: internal error: Child process (gzip -dc) unexpected exit status 2:
gzip: stdin: decompression OK, trailing garbage ignored

2019-05-09 08:42:39,705+0800 INFO  (vm/f060e3d3) [virt.vm] (vmId='f060e3d3-fc16-463c-bfe4-46c63ddfe97e') Changed state to Down: internal error: Child process (gzip -dc) unexpected exit status 2:
gzip: stdin: decompression OK, trailing garbage ignored
 (code=1) (vm:1675)
2019-05-09 08:42:39,710+0800 INFO  (vm/f060e3d3) [virt.vm] (vmId='f060e3d3-fc16-463c-bfe4-46c63ddfe97e') Stopping connection (guestagent:455)
2. This problem happens only with iSCSI/FC storage, not with NFS.
3. The problem seems related to bug https://bugzilla.redhat.com/show_bug.cgi?id=1503468.
   After I changed save_image_format="lzop" in /etc/libvirt/qemu.conf, the problem disappeared. I think the reason is the one given in comment 45 of that bug (1503468).
   If this is not a vdsm problem, what should I do to avoid it?
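A minimal shell sketch of the mechanism described in comment 45 of bug 1503468, as I understand it (the assumption: on block storage the saved memory image lives on a device larger than the compressed stream, so stale bytes follow the gzip data):

   $ echo state | gzip > image         # stands in for the compressed memory image
   $ printf 'stale bytes' >> image     # non-zero padding past the end of the stream
   $ gzip -dc image > /dev/null
   gzip: image: decompression OK, trailing garbage ignored
   $ echo $?
   2

gzip exits with status 2 on this warning, and libvirt treats any non-zero exit of its decompression child as fatal, which matches the "unexpected exit status 2" in the log above.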

Comment 1 wang_meng@massclouds.com 2019-05-09 02:25:59 UTC
Created attachment 1565926 [details]
vdsm.log from the host where the VM ran.

Comment 2 Ryan Barry 2019-05-10 00:44:01 UTC
*** Bug 1708030 has been marked as a duplicate of this bug. ***

Comment 3 Ryan Barry 2019-05-10 00:48:47 UTC
Hi Wang -

We are waiting for a platform fix in 7.7 which will hopefully resolve this.

It's not a vdsm bug per se, since we're waiting for libvirt. I'd suggest using lzo as a workaround until 7.7 is available, and we'll re-evaluate then.
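For reference, a sketch of that workaround as a config change (the libvirtd restart step is an assumption about how qemu.conf changes take effect, not something stated in this bug):

   # /etc/libvirt/qemu.conf
   save_image_format = "lzop"

   # restart libvirtd so the setting takes effect (assumption):
   $ systemctl restart libvirtd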

Comment 4 wang_meng@massclouds.com 2019-05-10 01:18:49 UTC
Thanks a lot, Ryan. 

So you can close this bug, since this was not the right place to file it.

Comment 5 Ryan Barry 2019-05-10 01:20:04 UTC

*** This bug has been marked as a duplicate of bug 1503468 ***

