Bug 630989 - HVM guest w/ UP and PV driver hangs after live migration or suspend/resume [rhel-5.5.z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.5
Hardware: All
OS: Linux
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jiri Pirko
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On: 629773
Blocks:
 
Reported: 2010-09-07 14:51 UTC by RHEL Program Management
Modified: 2015-05-05 01:21 UTC
CC List: 15 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, migrating a hardware virtual machine (HVM) guest with a uniprocessor (UP) kernel and paravirtualized (PV) drivers could cause the guest to stop responding. With this update, HVM guest migration works as expected.
Clone Of:
Environment:
Last Closed: 2010-11-09 18:07:39 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2010:0839 0 normal SHIPPED_LIVE Moderate: kernel security and bug fix update 2010-11-09 18:06:20 UTC

Description RHEL Program Management 2010-09-07 14:51:06 UTC
This bug has been copied from bug #629773 and has been proposed
to be backported to 5.5 z-stream (EUS).

Comment 2 Jiri Pirko 2010-10-11 08:55:24 UTC
in kernel 2.6.18-194.20.1.el5

xen-hvm-fix-up-suspend-resume-migration-w-pv-drivers.patch

Comment 4 Lei Wang 2010-10-27 10:17:14 UTC
I can reproduce and verify this bug on the x86_64 platform; details below:

Reproduce this issue with:
host:
RHEL-5.5 x86_64
kernel-xen-2.6.18-194.el5

guest:
RHEL-5.5 x86_64
Guest with one vCPU and a netfront (PV network) vif.
(Could not reproduce this issue with a 32-bit guest.)

HVM guest hangs after restore 100% of the time.
HVM guest hangs after migration approximately 50% of the time.
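The save/restore and migration steps above can be sketched as an xm session. This is a hypothetical sketch: the domain name "hvm55", the save path, and the target host are placeholders, and it requires a RHEL-5.5 Xen host with xm available.

```
# Save/restore: with kernel-xen-2.6.18-194.el5 the guest hangs on restore.
xm save hvm55 /var/lib/xen/save/hvm55.save
xm restore /var/lib/xen/save/hvm55.save

# Live migration: hangs roughly half the time on the unfixed kernel.
xm migrate --live hvm55 target-host.example.com
```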

Verify this issue with:
kernel-xen-2.6.18-194.24.1.el5
Save/restore and migration work correctly.
======================================================
However, on an IA64 host with 2.6.18-194.24.1.el5xen, restoring the HVM guest from a saved image failed with an error:

[root@dhcp-66-82-141 bug630989]# xm restore vm3.save
Error: Restore failed
Usage: xm restore <CheckpointFile>

Restore a domain from a saved state.

Found "ERROR Internal error: HVM Restore is unsupported" in xend.log
... ...
[2010-09-18 02:05:27 xend 2879] INFO (XendCheckpoint:181) restore hvm domain 7, apic=0, pae=0
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: boot, val: dc
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: fda, val: None
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: fdb, val: None
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: soundhw, val: None
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: localtime, val: 0
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: serial, val: None
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: std-vga, val: None
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: isa, val: None
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: vcpus, val: 1
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: acpi, val: 1
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: usb, val: None
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: usbdevice, val: None
[2010-09-18 02:05:27 xend 2879] DEBUG (image:330) args: k, val: None
[2010-09-18 02:05:27 xend 2879] DEBUG (image:390) No VNC passwd configured for vfb access
[2010-09-18 02:05:27 xend 2879] DEBUG (XendCheckpoint:200) restore:shadow=0x0, _static_max=0x400, _static_min=0x400,
[2010-09-18 02:05:27 xend 2879] DEBUG (balloon:145) Balloon: 3076320 KiB free; need 1065024; done.
[2010-09-18 02:05:27 xend 2879] DEBUG (XendCheckpoint:217) [xc_restore]: /usr/lib/xen/bin/xc_restore 19 7 1 2 1 0 0
[2010-09-18 02:05:27 xend 2879] INFO (XendCheckpoint:353) ERROR Internal error: HVM Restore is unsupported
[2010-09-18 02:05:27 xend 2879] INFO (XendCheckpoint:353) Restore exit with rc=1
[2010-09-18 02:05:27 xend.XendDomainInfo 2879] DEBUG (XendDomainInfo:2189) XendDomainInfo.destroy: domid=7
[2010-09-18 02:05:27 xend.XendDomainInfo 2879] ERROR (XendDomainInfo:2198) XendDomainInfo.destroy: xc.domain_destroy failed.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2194, in destroy
    xc.domain_pause(self.domid)
Error: (3, 'No such process')
[2010-09-18 02:05:27 xend.XendDomainInfo 2879] INFO (XendDomainInfo:2330) Dev 768 still active, looping...
[2010-09-18 02:05:27 xend.XendDomainInfo 2879] INFO (XendDomainInfo:2330) Dev 768 still active, looping...
[2010-09-18 02:05:27 xend.XendDomainInfo 2879] INFO (XendDomainInfo:2330) Dev 768 still active, looping...
[2010-09-18 02:05:28 xend.XendDomainInfo 2879] INFO (XendDomainInfo:2330) Dev 768 still active, looping...
[2010-09-18 02:05:28 xend.XendDomainInfo 2879] INFO (XendDomainInfo:2330) Dev 768 still active, looping...
[2010-09-18 02:05:28 xend.XendDomainInfo 2879] INFO (XendDomainInfo:2330) Dev 768 still active, looping...
[2010-09-18 02:05:28 xend.XendDomainInfo 2879] INFO (XendDomainInfo:2330) Dev 768 still active, looping...
[2010-09-18 02:05:28 xend.XendDomainInfo 2879] DEBUG (XendDomainInfo:2114) UUID Created: True
[2010-09-18 02:05:28 xend.XendDomainInfo 2879] DEBUG (XendDomainInfo:2115) Devices to release: [5], domid = 7
[2010-09-18 02:05:28 xend.XendDomainInfo 2879] DEBUG (XendDomainInfo:2127) Releasing PVFB backend devices ...
[2010-09-18 02:05:28 xend 2879] ERROR (XendDomain:284) Restore failed
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 279, in domain_restore_fd
    return XendCheckpoint.restore(self, fd, relocating=relocating)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 221, in restore
    forkHelper(cmd, fd, handler.handler, True)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 341, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_restore 19 7 1 2 1 0 0 failed

Comment 5 Lei Wang 2010-10-27 10:29:23 UTC
Hi Miroslav,

Would you please help confirm this issue?
Do we need to verify this issue on the IA64 platform?
I saw "Architecture: x86, IA64" in the description of bug 629773, from which this bug was cloned.
If not, I think this bug can be verified based on the test results on the x86_64 host in comment 4.

Thanks
Lei Wang

Comment 6 Miroslav Rezanina 2010-10-27 11:03:57 UTC
No, this is not an issue on IA64, as HVM save/migrate is not supported there.

Comment 7 Lei Wang 2010-10-28 01:47:26 UTC
Thanks, Miroslav

According to comment 4 and comment 6, moving to VERIFIED.

Comment 9 errata-xmlrpc 2010-11-09 18:07:39 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0839.html

Comment 10 Martin Prpič 2010-11-11 13:56:45 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, migrating a hardware virtual machine (HVM) guest with a uniprocessor (UP) kernel and paravirtualized (PV) drivers could cause the guest to stop responding. With this update, HVM guest migration works as expected.

