Red Hat Bugzilla – Bug 201298
Intel test machine (Conroe) hangs on reboot with VMX enabled
Last modified: 2009-12-14 15:41:07 EST
The Intel test machines of model number D3C51SYBANCNRB fail to reboot under Xen
if VMX is enabled in the BIOS. The reboot code of Xen is identical to that of
Linux, in particular, it attempts a reboot via the keyboard reset line, and if
that fails, it tries to create a triple fault by zeroing the IDT.
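For readers following along, the fallback order described above can be modelled roughly as below. This is only a sketch: the method names and the `attempt_reboot` helper are made up for illustration and are not taken from the Xen or Linux sources, and each "method" is a stand-in callable rather than real port I/O.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical model: each reboot method is a callable that returns True if
# the machine reset. On real hardware a successful reset never returns.
RebootMethod = Tuple[str, Callable[[], bool]]


def attempt_reboot(methods: Iterable[RebootMethod]) -> Optional[str]:
    """Try each method in order and return the name of the first that works.

    Returns None if every method fails, which corresponds to the hang
    reported in this bug.
    """
    for name, method in methods:
        if method():
            return name
    return None


# The order described in the report: keyboard reset line first, then a
# triple fault forced by loading an empty IDT.
# Stand-ins for a machine where neither method resets the box.
hanging_machine = [
    ("keyboard_reset", lambda: False),
    ("triple_fault", lambda: False),
]

# Stand-ins for a machine where the keyboard reset fails but the
# triple fault succeeds.
working_machine = [
    ("keyboard_reset", lambda: False),
    ("triple_fault", lambda: True),
]
```

With these stand-ins, `attempt_reboot(working_machine)` reports the triple fault as the method that took effect, while `attempt_reboot(hanging_machine)` returns `None`.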
This works perfectly if VMX is disabled in the BIOS. This also works if
start_vmx is not called from Xen. In fact, the reboot still works if start_vmx
is followed immediately by a call to stop_vmx.
Booting Xen with reboot=b (which uses a different reboot strategy where it jumps
into a BIOS address in real mode) causes it to reboot correctly even with VMX
enabled.
What I'd like to know from Intel is:
1) What is the reason that the aforementioned reboot strategies (keyboard +
triple fault, a standard reboot strategy for many years in Linux) fail once VMX
has been enabled at some earlier point?
2) If you think this reboot strategy is flawed, what reboot strategy do you
recommend that can work across the full range of x86 hardware that Linux/Xen
supports?
We also have the same problem.
Native RHEL4/SLES10 could be rebooted on Conroe without any problems, but xen0
could not be rebooted.
To reboot xen0 on Conroe we have to press the reset button manually.
We thought it was a hardware fault of Conroe.
I will recheck this issue, and see if it's a bug in Xen VT.
Yunfeng - any update?
Also fails for me on the same product code. Attaching successful and failing
boot logs in case they are useful.
Created attachment 134562 [details]
Boot logs from kernel-2.6.17-1.2573.fc6PAE
Reboot/poweroff work correctly on this kernel.
Created attachment 134563 [details]
Boot logs from kernel-2.6.17-1.2573.fc6xen
Reboot/poweroff always hang on this kernel.
Thanks for the dmesgs Stephen. At this point I'm only interested in poweroff
since your reboot issue can be explained by the fact that Xen enables VMX while
baremetal does not.
I can see two differences between your setup and mine. Firstly, your ACPI BIOS
is different from mine, and secondly, I need to test using the same kernel as
you to see if that could make my baremetal poweroff work consistently.
I've observed an interesting phenomenon with poweroff on my machine. It seems to
work if I leave it either on or off for an extended period of time. However, it
only works once. That is, if I power it back on after a successful poweroff and
immediately try to halt, it fails to power off.
So could you do an experiment for me? See if you could do three or four
successive poweroffs (on baremetal of course) and let me know whether they all
succeed.
Using the 1.2600 PAE kernel, bare metal poweroff failed on the first attempt.
Strange, as it used to work reliably; but then again I haven't tried it
recently, as all my recent work has been using a -xen kernel on that box, and
I've just got used to having to poweroff manually with that kernel.
Created attachment 135530 [details]
Patch to work around reboot issue.
This patch works around the reboot issue by rebooting through the BIOS if VMX
is detected to be on. It works on my machine. Please let me know if it allows
your machines to reboot.
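The decision the workaround makes can be sketched as follows. This is not the attached patch, whose contents are not reproduced in this thread; it assumes that "VMX is detected to be on" means the CR4.VMXE bit (bit 13 of CR4, per the Intel SDM) is set, and the function names and return strings are hypothetical.

```python
# CR4.VMXE is bit 13 of control register CR4 (Intel SDM).
CR4_VMXE = 1 << 13


def vmx_is_on(cr4: int) -> bool:
    """Return True if the VMXE bit is set in the given CR4 value.

    Assumption: this is what "VMX is detected to be on" means here.
    """
    return bool(cr4 & CR4_VMXE)


def choose_reboot_method(cr4: int) -> str:
    """Pick a reboot strategy from a CR4 value (hypothetical helper).

    With VMX on, go through the BIOS (the reboot=b path); otherwise keep
    the usual keyboard reset followed by a triple fault.
    """
    if vmx_is_on(cr4):
        return "bios"
    return "keyboard+triple_fault"
```

For example, `choose_reboot_method(0x2000)` selects the BIOS path, while a CR4 value with bit 13 clear keeps the default strategy.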
Is this patch not included in 3.0.3? If it is included, please close it.
This patch is not part of 3.0.3. I haven't submitted it yet because I'm waiting
for confirmation that it works on a machine other than mine. Thanks.
change QA contact
Is this problem still present on Xen 3.1?
It would be interesting to check if the problem happens when KVM is used to
enable VMX, too.
The motherboard in question has been upgraded long ago. So unfortunately I'm no
longer in a position to reproduce this.
I also have a Conroe/Mequon, and with RHEL-5.1, the issue still happens. A
couple of interesting points:
1) danpb noticed that inside the dom0, if you do "echo b >
/proc/sysrq-trigger", the box *will* reboot successfully.
2) Based on 1), I took a quick gander at the shutdown code in dom0 and the HV.
In terms of the dom0, there is not too much interesting; it really just traps
out to the HV with a shutdown event to do the shutdown. However, I didn't see
any significant differences between the "crash" case above and a "shutdown -h now".
If you want me to do any additional testing, I'm happy to do it; I've just sort
of put it on the back burner since it doesn't seem all that important.
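For anyone who wants to script the SysRq test from point 1), a minimal sketch is below. The `sysrq` helper and its `path` parameter are made up for illustration; the parameter exists only so the function can be exercised against an ordinary file instead of the real trigger.

```python
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def sysrq(command: str, path: str = SYSRQ_TRIGGER) -> None:
    """Write a single SysRq command character to the trigger file.

    Equivalent in effect to `echo <command> > /proc/sysrq-trigger`.
    The `path` argument is a hypothetical addition for safe testing.
    """
    if len(command) != 1:
        raise ValueError("SysRq commands are a single character")
    with open(path, "w") as trigger:
        trigger.write(command)
```

Calling `sysrq("b")` as root in dom0 reboots the box immediately, without syncing or unmounting filesystems, so only do this on a test machine.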
Based on the date this bug was created, it appears to have been reported
against rawhide during the development of a Fedora release that is no
longer maintained. In order to refocus our efforts as a project we are
flagging all of the open bugs for releases which are no longer
maintained. If this bug remains in NEEDINFO thirty (30) days from now,
we will automatically close it.
If you can reproduce this bug in a maintained Fedora version (7, 8, or
rawhide), please change this bug to the respective version and change
the status to ASSIGNED. (If you're unable to change the bug's version
or status, add a comment to the bug and someone will change it for you.)
Thanks for your help, and we apologize again that we haven't handled
these issues to this point.
We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.
This bug has been in NEEDINFO for more than 30 days since feedback was
first requested. As a result we are closing it.
If you can reproduce this bug in the future against a maintained Fedora
version please feel free to reopen it against that version.
The process we're following is outlined here: