Bug 207254 - Bug message on boot, freeze or crash on load
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: kernel-xen
Version: 6
Hardware: i686 Linux
Priority: medium, Severity: high
Assigned To: Xen Maintainance List
QA Contact: Virtualization Bugs
Depends On: 206757
Reported: 2006-09-20 04:12 EDT by Daniel Tschan
Modified: 2009-12-14 15:39 EST (History)

Doc Type: Bug Fix
Last Closed: 2008-02-26 18:22:11 EST


Attachments
Complete kernel and lspci -vv output (49.93 KB, text/plain)
2006-09-20 04:12 EDT, Daniel Tschan
5 additional kernel backtraces (21.67 KB, text/plain)
2006-09-20 14:43 EDT, Daniel Tschan

Description Daniel Tschan 2006-09-20 04:12:20 EDT
Description of problem:
kernel-xen-2.6.17-1.2647.fc6 issues the following bug report when booting:
BUG: warning at kernel/lockdep.c:1814/trace_hardirqs_on() (Not tainted)
 [<c0405666>] show_trace_log_lvl+0x58/0x177
 [<c0405c6b>] show_trace+0xd/0x10
 [<c0405ca9>] dump_stack+0x19/0x1b
 [<c0436442>] trace_hardirqs_on+0xa4/0x120
 [<c0404e5f>] restore_all+0x37/0x3a
DWARF2 unwinder stuck at restore_all+0x37/0x3a
Leftover inexact backtrace:
Inexact backtrace:
 [<c0405c6b>] show_trace+0xd/0x10
 [<c0405ca9>] dump_stack+0x19/0x1b
 [<c0436442>] trace_hardirqs_on+0xa4/0x120
 [<c0404e5f>] restore_all+0x37/0x3a


It then either freezes or crashes when the system is under load:
BUG: unable to handle kernel paging request at virtual address 6f6c6700
 printing eip:
c048321c
294d9000 -> *pde = 00000000:c674d001
2714d000 -> *pme = 00000000:00000000
Oops: 0000 [#1]
SMP 
last sysfs file: /devices/system/cpu/cpu1/cpufreq/scaling_setspeed

Please see attachment for complete backtrace!
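
The faulting address 6f6c6700 consists mostly of byte values in the printable ASCII range, which can sometimes hint that a pointer was overwritten with string data. A minimal sketch for checking this follows; the helper name address_as_ascii is hypothetical, and the result is only a heuristic, not a diagnosis of this crash.

```python
import struct


def address_as_ascii(addr: int) -> dict:
    """Render the four bytes of a 32-bit address as text, in both byte orders."""

    def printable(raw: bytes) -> str:
        # Replace non-printable bytes with '.' so the result is safe to display.
        return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

    return {
        # i686 stores integers little-endian, so this is the in-memory byte order.
        "little_endian": printable(struct.pack("<I", addr)),
        "big_endian": printable(struct.pack(">I", addr)),
    }


# Faulting virtual address taken from the oops above.
result = address_as_ascii(0x6F6C6700)
print(result)
```

If several oopses show addresses that decode to readable text, that may point toward memory corruption rather than a single faulty code path.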

Configuration:
Gigabyte GA-965P-DS3 mainboard with Intel P965, ICH8 (not ICH8R), JMicron JMB363
Intel Core 2 Duo E6600
4x 1GB DDR2-675
Software RAID 5 with 3x 250 GB SATA2 disks
Gigabyte GV-NX76G256D-RH PCIe graphics card with nVidia GeForce 7600 GS
LG GSA-H10N IDE DVD writer connected to JMB363


Version-Release number of selected component (if applicable):
2.6.17-1.2647.fc6xen

How reproducible:
Always

Steps to Reproduce:
1. Boot kernel-xen-2.6.17-1.2647.fc6
2. Execute rpm -Va to generate CPU and disk load
  
Actual results:
System either freezes completely or crashes

Expected results:
System runs stably under high load

Additional info:
Attachment with complete kernel and lspci -vv output
Comment 1 Daniel Tschan 2006-09-20 04:12:21 EDT
Created attachment 136715
Complete kernel and lspci -vv output
Comment 2 Stephen Tweedie 2006-09-20 04:21:25 EDT
Hmm, the oops here is not obviously related to xen.  Does the non-xen
2.6.17-1.2647.fc6 kernel run reliably for you?  Does the crash always look the
same or is the backtrace different each time?
Comment 3 Daniel Tschan 2006-09-20 05:06:37 EDT
2.6.17-1.2647.fc6 hasn't shown any of these symptoms so far, neither the message
on boot nor a freeze or crash. The PAE kernels, however, do not boot; see bug
#206757. So the problem may be related to PAE. The backtrace is different each time.
Comment 4 Stephen Tweedie 2006-09-20 06:06:10 EDT
Could you please supply several example backtraces?  Without that it's
impossible to look for any sort of pattern here.  Thanks!
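
One way to look for a pattern across several backtraces is to count how often each function appears. The following is a minimal sketch under stated assumptions: the helper names frames and common_frames are hypothetical, and the sample data is just the two trace fragments from the boot warning in the description, not the attached crash traces.

```python
import re
from collections import Counter

# Matches frame lines such as " [<c0405c6b>] show_trace+0xd/0x10".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(trace: str) -> set:
    """Return the set of function names that appear in one backtrace."""
    return set(FRAME_RE.findall(trace))


def common_frames(traces: list) -> Counter:
    """Count in how many backtraces each function name appears."""
    counts = Counter()
    for trace in traces:
        counts.update(frames(trace))
    return counts


# Sample input: the exact and inexact traces from the boot warning.
boot_warning = """
 [<c0405666>] show_trace_log_lvl+0x58/0x177
 [<c0405c6b>] show_trace+0xd/0x10
 [<c0405ca9>] dump_stack+0x19/0x1b
 [<c0436442>] trace_hardirqs_on+0xa4/0x120
 [<c0404e5f>] restore_all+0x37/0x3a
"""
inexact = """
 [<c0405c6b>] show_trace+0xd/0x10
 [<c0405ca9>] dump_stack+0x19/0x1b
 [<c0436442>] trace_hardirqs_on+0xa4/0x120
 [<c0404e5f>] restore_all+0x37/0x3a
"""

counts = common_frames([boot_warning, inexact])
for name, n in counts.most_common():
    print(n, name)
```

Functions that show up in most or all traces are candidates for a shared cause; frames that appear only once are more likely incidental.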
Comment 5 Daniel Tschan 2006-09-20 14:43:59 EDT
Created attachment 136764
5 additional kernel backtraces

Sure. I attached 5 additional backtraces. During the 5 crashes I observed 2
freezes. But I just remembered now that I might be able to get info out of
these with magic sysrq. Please tell me if that would be useful or if you need
anything else.
Comment 6 Stephen Tweedie 2006-09-20 16:31:38 EDT
Hmm, it definitely does look like it could be related to the PAE problem:
kernel-xen is built with PAE enabled by default, but it uses highmem (i.e. >4GB)
memory in different ways because of how the hypervisor parcels memory out to the
kernel.

If PAE is not working, then the -xen kernel is unlikely to do any better,
although it may fail in different ways.  We'd really need to get the underlying
PAE problem fixed in order to be able to test the -xen case.
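
To illustrate why PAE matters on a machine with 4x 1GB of RAM, here is a rough arithmetic sketch. The function name and the 1 GiB size of the address-space hole reserved for PCI devices are assumptions for illustration only; the actual layout of the reporter's system is not stated in this bug.

```python
GIB = 2**30

ADDRESSABLE_32BIT = 2**32  # 4 GiB of physical address space without PAE
ADDRESSABLE_PAE = 2**36    # 64 GiB with classic 36-bit PAE


def ram_reachable_without_pae(installed: int, mmio_hole: int) -> int:
    """RAM usable below the 4 GiB boundary when part of that range is
    reserved for device memory (mmio_hole is a hypothetical size)."""
    return min(installed, ADDRESSABLE_32BIT - mmio_hole)


installed = 4 * GIB  # 4x 1GB modules, as in the reported configuration
hole = 1 * GIB       # assumed size of the PCI/MMIO hole

reachable = ram_reachable_without_pae(installed, hole)
needs_pae = installed - reachable  # RAM that is only reachable above 4 GiB

print(reachable // GIB, "GiB reachable without PAE")
print(needs_pae // GIB, "GiB reachable only with PAE")
```

Under these assumptions, a non-PAE kernel simply never touches the memory above the boundary, which would be consistent with the non-PAE kernel running reliably while the PAE and -xen kernels fail.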
Comment 7 Daniel Tschan 2006-10-30 02:24:07 EST
The problem is still present in kernel-xen-2.6.18-1.2798.fc6.i686 but can no
longer be reproduced as easily. It seems to be caused by incorrect
initialization of agpgart. Please see the new comments on bug #206757.
Comment 8 Matthew Miller 2007-04-06 14:06:55 EDT
Fedora Core 5 and Fedora Core 6 are, as we're sure you've noticed, no longer
test releases. We're cleaning up the bug database and making sure important bug
reports filed against these test releases don't get lost. It would be helpful if
you could test this issue with a released version of Fedora or with the latest
development / test release. Thanks for your help and for your patience.

[This is a bulk message for all open FC5/FC6 test release bugs. I'm adding
myself to the CC list for each bug, so I'll see any comments you make after this
and do my best to make sure every issue gets proper attention.]
Comment 9 Red Hat Bugzilla 2007-07-24 21:33:40 EDT
change QA contact
Comment 10 Chris Lalancette 2008-02-26 18:22:11 EST
This report targets FC6, which is now end-of-life.

Please re-test against Fedora 7 or later, and if the issue persists, open a new bug.

Thanks
