Red Hat Bugzilla – Bug 181856
Xen guest reporting, "BUG: soft lockup detected on CPU#0!"
Last modified: 2007-11-30 17:11:24 EST
Receiving error message while running a xen guest that states, "BUG: soft
lockup detected on CPU#0!"
This seems to occur a lot when running 'yum -y update' on both the xen host and
guest, both of which are running Fedora Core 5 Test 2. The error occurs at other
times as well, but seems virtually 100% reproducible for me if I run yum on
both the host and guest at the same time. The console output from one attempt
to run yum is attached to this bug report.
The guest console does not respond for several seconds upon receiving this
message, but does return and continue exactly where it left off without any
Nothing is recorded in /var/log/dmesg when these errors occur.
Created attachment 124800 [details]
Xen guest console showing error messages
This problem existed in every version of the hypervisor/guest kernels I have
used up to and including kernel-xen-hypervisor-2.6.15-1.1955_FC5/kernel-xen-
I am also not sure if the system's low specs could be contributing to the
frequency of the error messages. The system is a Dell Inspiron 8000 Laptop
with a PIII 700 MHz processor, and 512 MB of RAM.
This still occurs on FC5T3 host/guests. I have been running through some
different scenarios to try to determine what is causing this.
It seems that the guest locks up when the host starts the line that reads:
developmen: ################################################## 4303/4303
It looks as though this happens any time I am connected to the xen console
(using 'xm console fc5t3xen') and try to run yum on both simultaneously. It
does not matter if it is two tabs in a Gnome Terminal, two separate Gnome
Terminals, two ssh sessions to the host in which the guest console is
connected, a Gnome session open on the desktop plus an ssh session to the
host with the guest console open, etc.
It basically occurs any time I run yum on both while using the xen console to
initiate the process for the guest.
However, if I open an SSH session to each individually (or just an ssh session
to the guest at least) and run yum from both without attempting to use the xen
guest console, then everything appears to work perfectly normally. Because of
that I tend to believe that the guest is having a problem updating its console
while the host is updating one of its terminal sessions.
These bugs are being closed since a large number of updates have been released
after the FC5 test1 and test2 releases. Kindly update your system by running
'yum update' as the root user, or try out the third and final test version of
FC5, which is being released in a short while, and verify whether the bugs are
still present on the system. Reopen or file new bug reports as appropriate
after confirming the presence of this issue. Thanks.
Read previous comment. This still occurs with FC5T3 guest and host on a clean
This is almost certainly because of the raised priority dom0 has over domU. It
may well be worthwhile modifying HV scheduler defaults to deal with this case.
From the xen-devel mailing list. It may be an idea to have defaults like this
in our hypervisor...
Date: Tue, 21 Feb 2006 11:01:06 +1100
From: James Harper
Subject: RE: [Xen-devel] dom0 starves guests off CPU
I found this too when doing a compile in dom0. Search the archives for a
thread titled 'Performance problems' from January this year.
xm sched-sedf <domID> 0 0 0 1 1
was suggested there and it works for me!
for the recent xen-devel thread.
xm sched-sedf 0 0 0 0 1 1 seems to have made the two play much more nicely
together. While compiling and running yum on the host I was able to run yum
from the guest console and not once suffer one of these error messages.
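For anyone unsure what the six values in that command mean, here is a minimal sketch that only builds the command line and labels each position. The parameter names (period, slice, latency, extratime, weight) are my reading of the xm man page for this Xen release and should be treated as an assumption; the helper name sedf_command is made up for illustration. As I understand it, a period and slice of 0 with extratime set to 1 gives the domain no reserved CPU slice, so it only runs in spare time shared out by weight.

```python
# Sketch only: builds the argument list for the workaround; it does not run xm.
# The parameter names and their order are assumptions taken from the xm man page.


def sedf_command(domain, period=0, slice_=0, latency=0, extratime=1, weight=1):
    """Return the argv list for `xm sched-sedf` with the given settings."""
    values = (domain, period, slice_, latency, extratime, weight)
    return ["xm", "sched-sedf"] + [str(v) for v in values]


if __name__ == "__main__":
    # Domain ID 0 is dom0; the defaults mirror the values used in this report.
    print(" ".join(sedf_command(0)))
```

The list form can be handed to subprocess.run on the host if you want to apply it from a script; run as root, since xm needs it.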
*** Bug 185081 has been marked as a duplicate of this bug. ***
This should be fixed now in kernel-xen0-2.6.15-1.2054_FC5.
Appears to be working well. Can't force a soft lockup message out my
Created attachment 126269 [details]
Errors still occurring with 2054 kernel under high CPU load
I hate to do this after saying that everything was good. I downloaded the
BOINC client (http://boinc.berkeley.edu) last night to play with, because I
was bored, and left it running all night.
When I woke up in the morning the console connected to my xen guest had all the
errors in the attachment sitting on it.
The problem seems to have improved in that, when I catch the error message
occurring, the guest is immediately responsive again, unlike in the past
where it would hang for several seconds, possibly even minutes. Whatever is
catching the condition and spitting out the error message seems almost too
sensitive. Previously, even if I did not see the message I would know that
something was wrong by the guest locking up and being unresponsive.
Now, I would have no idea there was a problem, save for the error message...
What if you re-run the above load all night with the manual dom0 workaround?
xm sched-sedf <domID> 0 0 0 1 1
Running the manual workaround seems to prevent the errors from occurring
altogether. Without it, the errors still seem more difficult to produce than
with previous kernels, but still possible.
*** Bug 186049 has been marked as a duplicate of this bug. ***
There was a spec file problem which prevented the updated scheduler defaults
from taking effect in 1.2054; we are preparing an update kernel to fix this.
I'm also seeing this, on a lowly pIII 450, 768MB ram, with no real load on
either dom0 or domU. FC5 final, yum updated. Not too much to add, but didn't see
a way to get me on the cc list w/out adding a comment. :-/
This should be fixed with the latest kernels on fedora-updates-testing
(currently at 2.6.16-1.2069_FC5), can you confirm?
Still happening with 2069 installed on the host and guest. Attached a copy of
my console output from the guest.
Created attachment 126892 [details]
Console output with Xen0/XenU 2069 installed
This is the output from a console after installing the 2069 kernel from Fedora
What if you try the manual setting again:
xm sched-sedf <domID> 0 0 0 1 1
I entered in the manual fix after getting the errors (I received about 5 error
messages in about an hour of testing before entering the manual setting) and
once again get no error messages with the manual setting.
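To put a number on "about 5 error messages in about an hour" when comparing kernels or settings, a small script can count the messages in saved console output. This is a rough sketch, assuming the console text has been captured to a string or file and that the message matches the wording in this bug's summary; the function name and the sample text are made up for illustration.

```python
import re
from collections import Counter

# Matches the message text quoted in this bug's summary.
LOCKUP_RE = re.compile(r"BUG: soft lockup detected on CPU#(\d+)!")


def count_soft_lockups(console_text):
    """Return a Counter mapping CPU number to soft lockup message count."""
    return Counter(int(m.group(1)) for m in LOCKUP_RE.finditer(console_text))


if __name__ == "__main__":
    # Made-up sample text, not taken from any attachment on this bug.
    sample = (
        "BUG: soft lockup detected on CPU#0!\n"
        "some other console line\n"
        "BUG: soft lockup detected on CPU#0!\n"
    )
    for cpu, count in sorted(count_soft_lockups(sample).items()):
        print("CPU#%d: %d soft lockup message(s)" % (cpu, count))
```

Running it against a capture taken with and without the manual setting gives a like-for-like comparison over the same test period.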
Created attachment 127197 [details]
I've reworked the hypervisor patch to correctly simulate the effect of the
I haven't been able to reproduce the problem with the existing patch, but heavy
testing with this new patch has not revealed any new problems.
Not sure when Xen will be enabled in the rawhide kernel rpm, so you may want to
try this patch with an srpm build -- just drop this patch on top of the file
xen-sched-sedf.patch in SOURCES.
I applied the patch and recompiled, and I have been running for about 3 days now
with the host at as close to 100% CPU as I could keep it the entire time. I
performed various tasks with the guest at various times and was unable to
reproduce the errors. It looks like this patch is working well.
Thanks for the testing.