Bug 184378 - Xen kernel BUG crash on x86_64
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel-xen
Version: rawhide
Hardware: x86_64 Linux
Priority: medium
Severity: high
Assigned To: Juan Quintela
QA Contact: Brian Brock
Duplicates: 185235
Depends On:
Blocks: 179599
Reported: 2006-03-08 06:05 EST by Aleksander Adamowski
Modified: 2008-08-02 19:40 EDT
Fixed In Version: 2.6.20-1.2307.fc5
Doc Type: Bug Fix
Last Closed: 2007-04-17 09:21:30 EDT

Attachments
Crash screenshot including debug info (14.60 KB, image/png)
2006-03-08 06:07 EST, Aleksander Adamowski

Description Aleksander Adamowski 2006-03-08 06:05:36 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060202 Fedora/1.7.12-1.5.2

Description of problem:
When booting the latest Xen kernel for x86_64 on an HP ProLiant DL385 with two dual-core Opteron processors (AMD Opteron(tm) Processor 265, 1800 MHz), the kernel crashes with the message:

Kernel BUG at arch/x86_64/mm/fault-xen.c:292
invalid opcode: 0000 [1]  SMP
CPU 1
...... more info in the screenshot I'm attaching

Reproducibility: sometimes (in about half of the cases the system boots up fine).


contents of /proc/cpuinfo:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 33
model name      : AMD Opteron(tm) Processor 265
stepping        : 2
cpu MHz         : 1804.116
cache size      : 1024 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni cmp_legacy
bogomips        : 4512.23
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 33
model name      : AMD Opteron(tm) Processor 265
stepping        : 2
cpu MHz         : 1804.116
cache size      : 1024 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni cmp_legacy
bogomips        : 4512.23
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp


Version-Release number of selected component (if applicable):
2.6.15-1.2025_FC5xen0

How reproducible:
Sometimes

Steps to Reproduce:
1. Boot the latest xen0 kernel on an HP ProLiant DL385 with two dual-core Opteron 265 processors


Additional info:
Comment 1 Aleksander Adamowski 2006-03-08 06:07:16 EST
Created attachment 125792
Crash screenshot including debug info
Comment 2 Aleksander Adamowski 2006-03-08 06:09:48 EST
See also bug 183221, which covers a hard system hang of the same machine under
earlier Xen kernels for x86_64.
Comment 3 Aleksander Adamowski 2006-03-14 04:56:45 EST
What's strange is that if the system boots successfully to domain 0, it works
fine afterwards. I'm currently running 2.6.15-1.2038_FC5xen0 and have accumulated
3 days of uptime.

The problem is that guest domains have no network connectivity (maybe this is
related to the fact that the crash stack trace I've attached shows the crash
occurred in the tg3 driver?).

The NIC in the machine is a Broadcom Corporation NetXtreme BCM5704 Gigabit
Ethernet (rev 10).
Comment 4 Andy Burns 2006-03-15 16:07:17 EST
This is very similar to crashes I've experienced with 2041_FC5xen0 and
2054_FC5xen0. The crash happens intermittently at boot time; if it doesn't
happen, the system is then stable.

The full log is attached to https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=185235
but the pertinent bit seems to be the same as above:

Kernel BUG at arch/x86_64/mm/fault-xen.c:292
invalid opcode: 0000 [1] SMP
CPU 0
Modules linked in: snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc
hw_random i2c_i801 i2c_core dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd ahci
libata sd_mod scsi_mod
Pid: 615, comm: udevd Not tainted 2.6.15-1.2041_FC5xen0 #1
Comment 5 Stephen Tweedie 2006-03-16 14:58:30 EST
*** Bug 185235 has been marked as a duplicate of this bug. ***
Comment 6 Stephen Tweedie 2007-03-16 11:08:09 EDT
Is this still reproducible on the latest stable release+updates?  Thanks.
Comment 7 Aleksander Adamowski 2007-04-16 11:19:42 EDT
I no longer experience crashes with 2.6.20-1.2307.fc5xen0 or
2.6.18-1.2239.fc5xen0.
