Bug 445504 - kernel crash when three virtualized guests are running at the same time
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.2
Hardware: All Linux
Priority: medium Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Xen Maintainance List
QA Contact: Martin Jenner
Keywords: Reopened
Depends On:
Blocks:
Reported: 2008-05-07 05:05 EDT by Michal Nowak
Modified: 2013-03-07 21:04 EST (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-07-22 08:16:46 EDT


Attachments
xend.log (35.92 KB, text/plain)
2008-05-09 11:50 EDT, Michal Nowak
xend.log old one (126.05 KB, text/plain)
2008-05-09 11:51 EDT, Michal Nowak

Description Michal Nowak 2008-05-07 05:05:40 EDT
Description of problem:

Boot a physical system with kernel-xen and prepare at least three guests in
virt-manager. In my case: rawhide + rhel5.2 (both full virt) + rhel4.7
(para). Start them one after another; the crash happens within a few seconds
after the third guest is started.

Version-Release number of selected component (if applicable):

kernel-xen-2.6.18-92.el5
virt-manager-0.5.3-8.el5

How reproducible:
always

Not tested with kernel-2.6.18-*.el5
Comment 1 Bill Burns 2008-05-07 08:03:27 EDT
Can you provide the console output from the crash?
Comment 2 Michal Nowak 2008-05-09 11:50:49 EDT
Nothing much of interest in the log, from my POV:

Before the first crash:

May  7 09:37:30 dhcp-lab-198 kernel: virbr0: port 7(vif13.0) entering disabled state
May  7 09:37:30 dhcp-lab-198 kernel: device vif13.0 left promiscuous mode
May  7 09:37:30 dhcp-lab-198 kernel: virbr0: port 7(vif13.0) entering disabled state
May  7 09:37:31 dhcp-lab-198 logger: /etc/xen/scripts/blktap: xenstore-read /local/domain/13/vm failed.
May  7 09:37:31 dhcp-lab-198 logger: /etc/xen/scripts/blktap: /etc/xen/scripts/blktap failed; error detected.
May  7 09:38:19 dhcp-lab-198 kernel: tap tap-14-51712: 2 getting info
May  7 09:38:19 dhcp-lab-198 kernel: device vif14.0 entered promiscuous mode
May  7 09:38:19 dhcp-lab-198 kernel: ADDRCONF(NETDEV_UP): vif14.0: link is not ready
May  7 09:38:20 dhcp-lab-198 kernel: ADDRCONF(NETDEV_CHANGE): vif14.0: link becomes ready
May  7 09:38:20 dhcp-lab-198 kernel: blktap: ring-ref 522, event-channel 9, protocol 1 (x86_32-abi)
May  7 09:38:20 dhcp-lab-198 kernel: virbr0: port 7(vif14.0) entering listening state
May  7 09:38:35 dhcp-lab-198 kernel: virbr0: port 7(vif14.0) entering learning state
May  7 09:38:50 dhcp-lab-198 kernel: virbr0: topology change detected, propagating
May  7 09:38:50 dhcp-lab-198 kernel: virbr0: port 7(vif14.0) entering forwarding state

...

Then it worked for 10 minutes. I started the third guest and the machine
crashed and restarted.

Before the second crash:

May  7 10:25:45 dhcp-lab-198 kernel: device tap5 entered promiscuous mode
May  7 10:25:45 dhcp-lab-198 kernel: xenbr0: port 7(tap5) entering learning state
May  7 10:25:45 dhcp-lab-198 kernel: xenbr0: topology change detected, propagating
May  7 10:25:45 dhcp-lab-198 kernel: xenbr0: port 7(tap5) entering forwarding state
May  7 10:25:46 dhcp-lab-198 kernel: device vif7.0 entered promiscuous mode

<< immediate crash here

Attaching /var/log/xen/xend.log and /var/log/xen/xend.log.1; the interesting
parts seem to be the end of $NAME.1 and the beginning of $NAME.

I am not able to provide the output from right after the crash because the
machine simply restarted. If you know of a place where the console output is
redirected, let me know.
Comment 3 Michal Nowak 2008-05-09 11:50:59 EDT
Created attachment 304961 [details]
xend.log
Comment 4 Michal Nowak 2008-05-09 11:51:20 EDT
Created attachment 304962 [details]
xend.log old one
Comment 5 Bill Burns 2008-05-09 13:24:58 EDT
Sorry for not being clear. Can you run with serial console and capture that
output? That will show us the fault/panic info we need. Also is this on any
specific hardware? How much memory is on the system and how much are you giving
to each guest?

Thanks

Comment 6 Michal Nowak 2008-05-12 11:06:06 EDT
Nothing special: Dell Precision 490, dual 64-bit Xeon, 2 GB memory; 512 MB per
guest. I am not able to reproduce it now, but I was able to do so the week
before.

Let's re-open it when it crashes again, but for now...

Thanks for your time, Bill.
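
A quick tally of the memory figures reported in this comment, as a minimal Python sketch. The function name is hypothetical, and the calculation deliberately ignores the memory reserved by the Xen hypervisor itself and any per-guest overhead, so the real headroom for dom0 would be somewhat lower than the number it prints.

```python
# Memory figures taken from the comment above: 2 GB host, 512 MB per guest.
HOST_MEMORY_MB = 2 * 1024
GUEST_MEMORY_MB = 512
GUEST_COUNT = 3


def remaining_for_dom0(host_mb, guest_mb, guests):
    """Return the memory (in MB) left after all guests are allocated.

    Simplification (assumption): hypervisor and per-guest overhead are ignored.
    """
    return host_mb - guest_mb * guests


if __name__ == "__main__":
    left = remaining_for_dom0(HOST_MEMORY_MB, GUEST_MEMORY_MB, GUEST_COUNT)
    print(f"{GUEST_COUNT} guests use {GUEST_MEMORY_MB * GUEST_COUNT} MB; "
          f"about {left} MB remain for dom0 and Xen")
```

With three guests running, at most a quarter of the host memory is left for dom0 and the hypervisor. This does not establish a cause for the crash; it only puts the reported numbers side by side.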
Comment 7 Bill Burns 2008-05-12 11:30:13 EDT
Ok, thanks.
Comment 8 Michal Nowak 2008-05-12 15:38:36 EDT
Umm, I've been hit by this again when juggling several guests... Could you
please point me to a guide on how to set up a serial console?

The thing is that when it crashes it immediately restarts, so I am unable to
catch any error message, if there is one.

Is it even possible to have Dom0 shut down but the hypervisor still running
and possibly providing the error output? The crashes are irregular but
frequent, so I should be able to catch one eventually.
Comment 9 Bill Burns 2008-05-12 16:32:39 EDT
In the grub configuration, on the "kernel" line (which loads xen.gz), you usually
add:
  console=com1 com1=115200,8n1
then on the next module line, where the dom0 kernel gets loaded, you add:
  console=ttyS0,115200

Then you need a serial cable to the box from another system, and
use the ttywatch program on that system to monitor it.
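
For illustration, a grub.conf stanza with both additions in place might look roughly like the sketch below. Only the two console settings come from the comment above. The title, the `root (hd0,0)` partition, the `root=` device, and the exact file names are assumptions based on the kernel-xen-2.6.18-92.el5 version reported in this bug, and should be checked against the stanza already present on the machine.

```
# Hypothetical stanza; only the console= / com1= options come from this bug.
title Red Hat Enterprise Linux Server (2.6.18-92.el5xen)
        root (hd0,0)
        # Hypervisor line: send Xen's own console to the first serial port.
        kernel /xen.gz-2.6.18-92.el5 console=com1 com1=115200,8n1
        # dom0 kernel line: send the Linux console to the same port.
        module /vmlinuz-2.6.18-92.el5xen ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200
        module /initrd-2.6.18-92.el5xen.img
```

The baud rate on the hypervisor line and on the dom0 line should match (115200 in both cases here), and the monitoring system must use the same setting.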

Comment 10 Michal Nowak 2008-07-22 08:16:46 EDT
Can't provide more data; I haven't been hit by this for a long time. Thanks for
your time and the nice guidance, Bill.
