Description of problem:
Live migration of an HVM guest with UP and the PV driver causes the guest to hang approximately 50% of the time. The customer does not experience this problem without UP and without the PV driver. Live migration is similar to resume/restore: if the customer does a restore on the same machine, the domain hangs 100% of the time. The dump trace follows.

crash> bt -a
PID: 2595  TASK: ffff81001362f0c0  CPU: 0  COMMAND: "suspend"
 #0 [ffff8100115abca0] schedule at ffffffff80063f96
 #1 [ffff8100115abca8] thread_return at ffffffff80063ff8
 #2 [ffff8100115abd78] read_reply at ffffffff880fc1fa
 #3 [ffff8100115abe68] _spin_unlock_irqrestore at ffffffff80065b50
 #4 [ffff8100115abe98] __xen_suspend at ffffffff880fb96a
 #5 [ffff8100115abed8] xen_suspend at ffffffff880fb605
 #6 [ffff8100115abee8] kthread at ffffffff80032bdc
 #7 [ffff8100115abf48] kernel_thread at ffffffff8005efb1

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux Version Number: 5.5
Release Number:
Architecture: x86, IA64
Kernel Version: 2.6.18-194.el5xen
Related Package Version: none
Related Middleware / Application: none

How reproducible:
Create an HVM guest with UP and the PV driver, then do suspend/resume.

Actual results:
HVM guest hangs.

Expected results:
Guest does not hang.

Additional info:
I am seeing if we can get the dump.
This is the UP variant of BZ #555910.
As Miroslav says, this is the UP variant of bug 555910, but we can take a fresh look at both cases, starting with the UP case this time. I've tried a couple of experimental patches, but haven't had any luck keeping xenbus from jumping in on the suspend. Unfortunately, the upstream code is quite a bit different in this area, but we should still try an upstream 2.6.18-based kernel on the fully virtualized guest to see what happens.
Created attachment 443458
Patch fixing save/restore with xen vnif device

This is a backport of c/s 15691 fixing this problem.
*** Bug 573926 has been marked as a duplicate of this bug. ***
Test packages containing the fix can be downloaded from people.redhat.com/mrezanin/bz629773.
in kernel-2.6.18-222.el5

You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Previously, migrating a hardware virtual machine (HVM) guest with both UP and PV drivers may have caused the guest to stop responding. With this update, HVM guest migration works as expected.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-Previously, migrating a hardware virtual machine (HVM) guest with both UP and PV drivers may have caused the guest to stop responding. With this update, HVM guest migration works as expected.
+Previously, migrating a hardware virtual machine (HVM) guest with both, UP and PV drivers, may have caused the guest to stop responding. With this update, HVM guest migration works as expected.
QA verified this bug with kernel-xen-2.6.18-232.el5:
1. Start a RHEL-5.4 HVM guest with vcpus=1 and memory=1024, using netfront as the network device type.
2. Save and restore the guest.

With the kernel-xen package that lacks the patch (kernel-xen-2.6.18-194.8.1.el5.x86_64.rpm), the guest hangs after migration. With kernel-xen-2.6.18-232.el5, the guest runs well after restore. Changing this bug to VERIFIED.
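For anyone repeating the verification, the save/restore step can be scripted. This is only a minimal sketch, assuming the guest is managed with the xm toolstack on the dom0; the domain name "rhel54-hvm" and the checkpoint path are hypothetical placeholders, and checking whether the guest still responds after the restore is left to the tester.

```python
import subprocess
from typing import Callable, List, Sequence


def save_restore_commands(domain: str, checkpoint: str) -> List[List[str]]:
    """Build the xm command lines for one save/restore cycle.

    `domain` and `checkpoint` are supplied by the caller; the values used
    in the usage example below are hypothetical.
    """
    return [
        ["xm", "save", domain, checkpoint],  # suspend the guest to a file
        ["xm", "restore", checkpoint],       # bring it back from that file
    ]


def run_cycle(
    domain: str,
    checkpoint: str,
    runner: Callable[[Sequence[str]], object] = subprocess.check_call,
) -> None:
    """Run one save/restore cycle, stopping at the first failing command."""
    for argv in save_restore_commands(domain, checkpoint):
        runner(argv)


if __name__ == "__main__":
    # Hypothetical guest name and checkpoint path; adjust for the test host.
    run_cycle("rhel54-hvm", "/var/tmp/rhel54-hvm.chk")
```

The runner is a parameter so that the command sequence can be inspected or dry-run on a machine without Xen installed. A hang of the kind reported here would show up after the restore, for example as an unresponsive console or network, not as a failing xm exit code.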
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html