Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1733804

Summary:

Resume of a suspended VM fails with error "Child process (gzip -dc) unexpected exit status 2"

Product:

[oVirt] ovirt-engine

Reporter:

Polina <pagranat>

Component:

BLL.Virt

Assignee:

Michal Skrivanek <michal.skrivanek>

Status:

CLOSED DEFERRED

QA Contact:

meital avital <mavital>

Severity:

unspecified

Docs Contact:

Priority:

unspecified

Version:

4.3.5.4

CC:

bugs, rbarry

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

Virt

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
engine & vdsm logs	none
qemu log of failed VM	none
engine /var/log/	none
host /var/log/ for source and destination	none

Description Polina 2019-07-28 19:30:33 UTC

Created attachment 1594112 [details]
engine & vdsm logs

Description of problem: A suspended VM that is part of an affinity group fails to resume with the error: "Exit message: Wake up from hibernation failed:internal error: Child process (gzip -dc) unexpected exit status 2:
gzip: stdin: decompression OK, trailing garbage ignored"
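
For readers unfamiliar with this message: gzip documents exit status 2 as "a warning occurred", and "trailing garbage ignored" means the compressed stream itself decompressed cleanly but was followed by extra bytes. The sketch below reproduces that condition in pure Python. It is only an illustration, not the code path libvirt or vdsm uses; `split_gzip_stream` and the sample data are hypothetical names introduced here.

```python
import gzip
import zlib
from typing import Tuple


def split_gzip_stream(data: bytes) -> Tuple[bytes, bytes]:
    """Decompress the first gzip member in data.

    Returns (payload, trailing), where trailing holds any bytes that
    follow the end of the gzip member.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    payload = decoder.decompress(data)
    if not decoder.eof:
        raise ValueError("gzip stream is truncated")
    # Bytes past the end of the gzip member are kept in unused_data.
    return payload, decoder.unused_data


# Hypothetical stand-in for a saved memory image followed by stale bytes.
image = gzip.compress(b"guest memory pages") + b"stale bytes"
payload, trailing = split_gzip_stream(image)
print(len(payload), "payload bytes,", len(trailing), "trailing bytes")
```

A non-empty `trailing` value corresponds to the situation gzip warns about: the payload is intact, yet the caller sees a non-zero exit status and may treat it as a failure.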

Version-Release number of selected component (if applicable): ovirt-engine-4.3.5.4-0.1.el7.noarch

How reproducible: 80% in the following scenario

Steps to Reproduce:

1. Create an affinity group with a hard positive VM affinity rule for VM1, VM2, and VM3 (in Cluster/Affinity Groups).
2. Run the VMs on host1.
3. Suspend one VM from the group. This succeeds.
4. Select all three VMs and migrate them to host2 with the closure option ("Migrate VMs in Affinity" in the Migrate window). This succeeds: two VMs are migrated.
5. Try to resume the suspended VM.

Actual results: VM fails to run.

2019-07-28 18:54:51,105+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-1) [] VM '155c0288-1fde-47f4-8014-b6449d9ee51e'(golden_env_mixed_virtio_2_0) moved from 'RestoringState' --> 'Down'
2019-07-28 18:54:51,170+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-1) [] EVENT_ID: VM_DOWN_ERROR(119), VM golden_env_mixed_virtio_2_0 is down with error. Exit message: Wake up from hibernation failed:internal error: Child process (gzip -dc) unexpected exit status 2:
gzip: stdin: decompression OK, trailing garbage ignored

Expected results:
The VM must run on host2.

Additional info: The resume sometimes succeeds but fails often. In the attached engine.log, see the error at 2019-07-28 18:54:51,170+03.
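
To locate such entries in a large engine.log, a small filter along these lines may help. It is a sketch that assumes the log lines follow the format quoted above; the regular expression and the function name `find_vm_down_errors` are illustrative and not part of any oVirt tool.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches ERROR lines carrying a VM_DOWN_ERROR(119) audit event,
# in the format quoted in this report.
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2,4})\s+ERROR\s+"
    r".*EVENT_ID: VM_DOWN_ERROR\(119\), VM (?P<vm>\S+) is down with error\."
    r" Exit message: (?P<msg>.*)$"
)


def find_vm_down_errors(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (timestamp, vm_name, exit_message) for each matching line."""
    for line in lines:
        match = EVENT_RE.match(line.rstrip("\n"))
        if match:
            yield match.group("ts"), match.group("vm"), match.group("msg")


# Two lines copied from this report: one INFO line and one ERROR event.
sample = [
    "2019-07-28 18:54:51,105+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] "
    "(ForkJoinPool-1-worker-1) [] VM '155c0288-1fde-47f4-8014-b6449d9ee51e'"
    "(golden_env_mixed_virtio_2_0) moved from 'RestoringState' --> 'Down'",
    "2019-07-28 18:54:51,170+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] "
    "(ForkJoinPool-1-worker-1) [] EVENT_ID: VM_DOWN_ERROR(119), VM golden_env_mixed_virtio_2_0 "
    "is down with error. Exit message: Wake up from hibernation failed:internal error: "
    "Child process (gzip -dc) unexpected exit status 2:",
]
for ts, vm, msg in find_vm_down_errors(sample):
    print(ts, vm, msg)
```

With a real log, pass an open file object instead of `sample`. Note that the gzip warning text appears on the following line of the log, so it is not part of the captured exit message.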

Comment 1 Polina 2019-07-28 19:39:24 UTC

Created attachment 1594113 [details]
qemu log of failed VM

Comment 2 Ryan Barry 2019-07-29 10:07:37 UTC

Full /var/log would be helpful in this case

Comment 3 Polina 2019-07-30 16:09:12 UTC

Created attachment 1594688 [details]
engine /var/log/

Attached the /var/log directories for the engine, the source host, and the destination host.
The scenario (the same as in the description): three VMs in a hard positive VM affinity rule are running on host_mixed_2. The VM golden_env_mixed_virtio_2_0 is suspended. Then all three VMs are selected and migrated with the closure option to host_mixed_1. Two VMs are migrated successfully. The attempt to resume the suspended VM fails.

The ERROR happens at:

2019-07-30 17:28:00,652+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-10) [] EVENT_ID: VM_DOWN_ERROR(119), VM golden_env_mixed_virtio_2_0 is down with error. Exit message: Wake up from hibernation failed:internal error: Child process (gzip -dc) unexpected exit status 2:
gzip: stdin: decompression OK, trailing garbage ignored

2019-07-30 17:28:02,359+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-67833) [5833f98] EVENT_ID: USER_FAILED_RUN_VM(54), Failed to run VM golden_env_mixed_virtio_2_0  (User: admin@internal-authz).

The bug is quite reproducible. Sometimes the scenario just needs to be repeated two or three times.

Comment 4 Polina 2019-07-30 16:20:37 UTC

Created attachment 1594702 [details]
host /var/log/ for source and destination

Comment 5 Ryan Barry 2019-07-30 16:41:10 UTC

Thanks Polina.

So, this is still on block storage, and possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=1503468 (and the related change from lzo to gzip). Is it reproducible on NFS?

Comment 6 Michal Skrivanek 2020-03-18 15:50:25 UTC

This bug didn't get any attention for a while, we didn't have the capacity to make any progress. If you deeply care about it or want to work on it please assign/target accordingly

Comment 7 Michal Skrivanek 2020-03-18 15:54:55 UTC

This bug didn't get any attention for a while, we didn't have the capacity to make any progress. If you deeply care about it or want to work on it please assign/target accordingly

Comment 8 Michal Skrivanek 2020-04-01 14:49:10 UTC

ok, closing. Please reopen if still relevant/you want to work on it.

Comment 9 Michal Skrivanek 2020-04-01 14:52:05 UTC

ok, closing. Please reopen if still relevant/you want to work on it.