Red Hat Bugzilla – Bug 507765
XenD loses PVFB config info after PV guest reboot
Last modified: 2014-02-02 17:37:14 EST
Description of problem:
The virtual console of a PV guest in virt-manager stops working after a reboot command is issued, even though the OS boots up (it is possible to log in to it using ssh). The virtual console of an HVM guest works well. To make the virtual console work again, the workarounds are:
*Click the "Shut Down" button in virt-manager, then click the "Run" button after the PV guest has shut down.
*Issue "shutdown -h now", then click the "Run" button.
Version-Release number of selected component (if applicable):
[root@maxxm ~]# uname -a
Linux maxxm.rx3600-7.test 2.6.18-152.el5xen #1 SMP Wed Jun 3 19:21:01 EDT 2009 ia64 ia64 ia64 GNU/Linux
[root@maxxm ~]# rpm -qa | grep virt-manager
Steps to Reproduce:
1.Install RHEL5.4 alpha1 (xen) using installation number as Dom0.
2.Install a PV guest (rhel5.4a1) as domU.
3.Reboot the domU (using a system command in the domU or the virt-manager reboot button).
4.After shutdown, the virt-manager console interface always displays: "Console not configured for guest". The domU has actually rebooted, and its SSH port is open.
Actual results:
After reboot, the virtual console of the PV guest in virt-manager does not work.
Expected results:
The virtual console of the PV guest in virt-manager should display the whole reboot process and work well.
Created attachment 349200 [details]
Created attachment 349201 [details]
Created attachment 349202 [details]
Looks like this is either a xen or libvirt bug (reassigning to libvirt for now). Rebooting a VM using virsh makes the <display> device completely disappear from the XML. Some attachments coming.
Created attachment 349406 [details]
Output from 'start, dumpxml, reboot, dumpxml'
Created attachment 349407 [details]
Output from 'virsh start, xm list --long, virsh reboot, xm list --long'
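The dumpxml comparison described above can be sketched as a small check. This is only an illustration: it assumes the display device shows up as a <graphics> element under <devices> (as in libvirt's domain XML format), and the two XML snippets are invented placeholders, not the contents of the attachments.

```python
import xml.etree.ElementTree as ET


def graphics_devices(domain_xml):
    """Return the attributes of every <graphics> element under <devices>."""
    root = ET.fromstring(domain_xml)
    return [dict(g.attrib) for g in root.findall("./devices/graphics")]


def display_lost(before_xml, after_xml):
    """True if a graphics device was present before and is gone after."""
    return bool(graphics_devices(before_xml)) and not graphics_devices(after_xml)


# Placeholder XML for illustration only (hypothetical guest name and port).
BEFORE = """<domain type='xen' id='10'>
  <name>pvguest</name>
  <devices>
    <graphics type='vnc' port='5900'/>
    <console type='pty'/>
  </devices>
</domain>"""

AFTER = """<domain type='xen' id='11'>
  <name>pvguest</name>
  <devices>
    <console type='pty'/>
  </devices>
</domain>"""

print(display_lost(BEFORE, AFTER))
```

In practice the two strings would be the saved output of 'virsh dumpxml' taken before and after 'virsh reboot'.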
*** Bug 509944 has been marked as a duplicate of this bug. ***
Aaaaaaaaaaaahh, Cole's log from comment #8 is slightly misleading. In the second invocation of 'xm list --long' the guest has not yet rebooted: note that the domain ID is still '10', where you'd expect it to be 11.
I can reproduce this and when it happens, 'xm list --long' ceases to give back any info about the vfb device, hence libvirt can't report the VNC info.
Can't think of anything offhand that would cause this, but it is clearly a must-fix blocker.
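The two checks from this comment (has the domain ID changed, and is the vfb device still listed) could be scripted roughly as follows. This is a sketch under assumptions: it expects the 'xm list --long' output to contain "(domid N)" and "(device (vfb ...))" entries, and the sample strings are hypothetical, heavily shortened stand-ins for real output.

```python
import re


def domain_id(sxp_text):
    """Return the first (domid N) value found in the text, or None."""
    match = re.search(r"\(domid\s+(\d+)\)", sxp_text)
    return int(match.group(1)) if match else None


def has_vfb(sxp_text):
    """True if a (device (vfb ...)) entry is present in the text."""
    return re.search(r"\(device\s*\(vfb\b", sxp_text) is not None


def reboot_completed(before, after):
    """A reboot is only considered complete once the domain ID has increased."""
    old, new = domain_id(before), domain_id(after)
    return old is not None and new is not None and new > old


# Hypothetical, shortened samples; real output has many more fields.
BEFORE = "(domain (domid 10) (name pvguest) (device (vfb (type vnc))))"
AFTER = "(domain (domid 11) (name pvguest))"

print(reboot_completed(BEFORE, BEFORE))  # same ID: guest has not rebooted yet
print(reboot_completed(BEFORE, AFTER), has_vfb(AFTER))
```

Comparing the domain IDs first avoids drawing conclusions from a listing taken before the guest has actually come back up.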
Created attachment 350951 [details]
PVFB backend removal fix
I've found out what the problem was. It was introduced by the fix for BZ #439182 (PVFB devices xenstore leak). That fix removed the PVFB devices, but XenD was unable to set the devices up again after a restart. This patch therefore avoids detaching PVFB devices when domains are restarting.
(In reply to comment #13)
> Created an attachment (id=350951) [details]
> PVFB backend removal fix
> I've found out what the problem was. The problem was introduced by fix for BZ
> #439182 about PVFB devices xenstore leak. The problem here is that it removed
> the PVFB devices but XenD was unable to set the devices back again after
> restart so I did a patch to avoid those devices removal when restarting VMs so
> this is the patch not to detach PVFB devices when restarting domains.
Hi, I tried adding this patch to the package and retesting, but the issue still exists.
These are my steps:
1.Install the xen source rpm package: xen-3.0.3-88.el5.src.rpm.
2.Copy the patch into /usr/src/redhat/SOURCES/ and modify the SPEC file "xen.spec" to add the patch.
3.Build the binary and source packages with "rpmbuild -ba xen.spec". This produces the rpm packages "xen-3.0.3-88.ia64.rpm" and "xen-libs-3.0.3-88.ia64.rpm".
4.Replace the installed packages with the rebuilt ones:
rpm -e xen-3.0.3-88.el5
rpm -e --nodeps xen-libs-3.0.3-88.el5
rpm -ivh xen-libs-3.0.3-88.ia64.rpm xen-3.0.3-88.ia64.rpm
5.After the above steps, I retested, but the issue still exists.
I am not sure whether my procedure is right, or whether the patch is not taking effect. Please have a look. Thanks.
Well, could you grab the /var/log/xen/xend.log output? I have added some more logging there, so I should be able to tell whether it's been recompiled OK. Also, I didn't test it on ia64 at all, only on x86_64.
Anyway, did you restart the xend service after installing the new RPMs?
Created attachment 351029 [details]
(In reply to comment #15)
Yes. After installing the new packages (step 4 in comment #14), I rebooted the system and then started the test (step 5 in comment #14).
Comment #16 has the xend.log file.
Thanks for your reply.
OK, you do have my patch applied (according to the "Restart in progress" contents). What's strange is that this is set to False, yet the line above says the domain was shut down because it's rebooting. Apparently I can't trust the already existing xenstore entry that should mean the domain is restarting. Hopefully I will find a working solution soon. The strange thing is that I tried it on my Xeon workstation (x86_64 system) and it was working fine. It may work in some circumstances, which is certainly not good, so give me some time to come up with a new solution.
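A rough way to repeat this check on another machine is to scan xend.log for the "Restart in progress" marker and read the value logged next to it. The sketch below assumes the marker is followed by True or False on the same line; the exact log format is not shown in this report, so the sample lines are invented placeholders.

```python
import re

MARKER = "Restart in progress"


def restart_flags(log_lines):
    """Collect the True/False value logged after the marker on each matching line.

    Lines that contain the marker but no recognisable value yield None.
    """
    flags = []
    for line in log_lines:
        if MARKER not in line:
            continue
        tail = line.split(MARKER, 1)[1]
        match = re.search(r"\b(True|False)\b", tail)
        flags.append(match.group(1) == "True" if match else None)
    return flags


# Invented placeholder lines; the real xend.log layout may differ.
SAMPLE = [
    "DEBUG (hypothetical) domain shut down, reason: reboot",
    "DEBUG (hypothetical) Restart in progress: False",
]

print(restart_flags(SAMPLE))
```

An empty result would suggest the rebuilt package is not the one running; a False value logged during a reboot matches the behaviour described in this comment. To run it against a real log, pass the open /var/log/xen/xend.log file object as log_lines.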
Created attachment 351563 [details]
PVFB backend removal fix update
This is an updated version of my patch. The previous version worked in some conditions but not in others, so I chose a different approach and created a new variable to identify whether a restart is in progress (the existing variable had problems because it was sometimes removed before the code reached the part required for this patch). It's been tested in a series of about 20 PV guest reboots and it was working fine.
In about half an hour, after brew has finished the job, the RPMs with this patch applied will be available at http://people.redhat.com/minovotn/xen as the -89mig version of the xen RPMs, compiled for all available platforms, so please do some testing with those RPMs...
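The idea behind the updated patch (a dedicated flag that marks a restart, consulted when devices are torn down) can be illustrated with a toy model. This is not the patch and not xend's code: ToyDomain, devices_to_detach and the device names are hypothetical, and treating every non-vfb device as detached on restart is a simplifying assumption of the model.

```python
def devices_to_detach(devices, restart_in_progress):
    """Return the devices to remove on teardown.

    While a restart is in progress the vfb device is left in place;
    on a plain shutdown everything is detached.
    """
    if restart_in_progress:
        return [dev for dev in devices if dev != "vfb"]
    return list(devices)


class ToyDomain:
    """Toy stand-in for a domain object (hypothetical, not xend's class)."""

    def __init__(self, devices):
        self.devices = list(devices)
        # Dedicated flag owned by the domain object, so it cannot
        # disappear before the teardown code consults it.
        self.restart_in_progress = False

    def teardown(self):
        for dev in devices_to_detach(self.devices, self.restart_in_progress):
            self.devices.remove(dev)

    def reboot(self):
        self.restart_in_progress = True
        try:
            self.teardown()
        finally:
            self.restart_in_progress = False

    def shutdown(self):
        self.teardown()


guest = ToyDomain(["vbd", "vif", "vfb"])
guest.reboot()
print(guest.devices)
```

The flag is set before teardown and cleared in a finally block, so a failed teardown does not leave the model stuck in the restarting state.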
Well, the RPMs are at the URL described in comment #20. Could you please try those RPMs, Adam?
Yes, of course. I will test it later.
Thanks Adam, please comment on this BZ then.
Could you please test the package from comment #20 and check that this patch does not reintroduce your problems with restarting Xen guests (https://bugzilla.redhat.com/show_bug.cgi?id=439182)?
Hi Michal, the patch from comment #20 has fixed this issue. It works well when the PV guest reboots. Thanks Michal.
Created attachment 354341 [details]
New fix for this BZ
Unfortunately it seems this fix broke some other things, most certainly BZ #486157 too. So this is the new updated version of this fix that should work fine. You can find test RPMs with this patch applied at:
Please provide test results after you finish your testing ...
I updated to xen-3.0.3-90mig.el5.ia64.rpm and xen-libs-3.0.3-90mig.el5.ia64.rpm, rebooted the dom0 system, and then tested it. The issue could not be reproduced.
Thanks for your testing Adam. This should be the package that Santwana should test as well because it contains the newest version of this patch.
Fix built into xen-3.0.3-91.el5
Hi, I can't reproduce this in xen-3.0.3-80.el5. I tested the bug with both xen-3.0.3-80.el5 and xen-3.0.3-91.el5 on an ia64 machine. In both cases, the virtual console of the PV guest in virt-manager displays the whole reboot process and works well.
Has this bug been fixed in xen-3.0.3-80.el5?
This was a regression since 5.3, which means that it was working in xen-3.0.3-80.el5. It was then broken during development and now, in xen-3.0.3-91.el5, it should work again.
Verified on xen-3.0.3-91.el5
*** Bug 509099 has been marked as a duplicate of this bug. ***
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.