Bug 229913

Summary: kernel immediately gets 'Bad RIP Value' with memory remap, Asus P5B 4GB Core 2
Product: [Fedora] Fedora
Reporter: Francis Upton <francisu>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: high
Version: 6
CC: clive.m.messer, ctac113, hugo.paredes, redhat-bugzilla, triage, wtogami
Hardware: x86_64
OS: Linux
Whiteboard: bzcl34nup
Fixed In Version: 2.6.24.3-50.fc8
Doc Type: Bug Fix
Last Closed: 2008-04-04 06:58:36 UTC
Attachments:
Description | Flags
Shot of boot failure | none
first shot of failure (one angle) | none
first shot of failure (another angle) | none
second page of stack trace | none
The first image of screen from the oops of 2.6.20-1.2944 | none
The second image of screen from the oops of 2.6.20-1.2944 | none

Description Francis Upton 2007-02-24 07:55:59 UTC
Description of problem:

The system installed and runs fine with the BIOS memory remap option disabled.
It is running the 2.6.19-1.2911.fc6 kernel and the latest BIOS version (1004).
It shows 3GB of memory, but that's expected.

When I enable memory remap, the kernel immediately prints a stack trace as the
OS starts to boot.  I'm sorry, but I don't know how to capture the stack trace
to provide it.  If it's important, I will find out and do so.

Version-Release number of selected component (if applicable):

2.6.19-1.2911.fc6 

How reproducible:


Steps to Reproduce:
1. Change BIOS to enable memory remap
2. Boot
  
Actual results:

Boot immediately fails.

Expected results:

Should boot and use 4GB of memory.

Additional info:

Could this be related to:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=193435

Comment 1 Chuck Ebbert 2007-02-24 16:29:20 UTC
You can capture the stack trace by taking a picture of the screen with a digital
camera and attaching that to this bugzilla entry.

Also, please try booting with the kernel option "iommu=soft".


Comment 2 Hugo Paredes 2007-02-24 16:47:12 UTC
I have the same problem. 
I tried the kernel option "iommu=soft" and no luck. 

Comment 3 Chuck Ebbert 2007-02-24 16:54:04 UTC
There is a fix for IOMMU-caused kernel panics in
kernel-2.6.19-1.2911.6.3.fc6 in testing, available at:

http://download.fedora.redhat.com/pub/fedora/linux/core/updates/testing/6/

I don't know if it fixes the problem, but it might.


Comment 4 Hugo Paredes 2007-02-24 17:19:03 UTC
I have just tried the testing kernel kernel-2.6.19-1.2911.6.3.fc6, with no luck.

From what I have read in some forums, this could be a kernel problem or a
kernel+BIOS problem.

Comment 5 Francis Upton 2007-02-24 20:29:44 UTC
Created attachment 148747 [details]
Shot of boot failure

This happened with this kernel: kernel-2.6.19-1.2911.6.3.fc6, with the
iommu=soft option specified.

Let me know if you need more information or if you want me to try anything
else.

Comment 6 Chuck Ebbert 2007-02-26 14:39:14 UTC
The interesting part of the oops has scrolled off the screen.
Try booting with kernel option "vga=1" to get 50-line mode.
And turn off the camera's flash so we can see the center of the screen.



Comment 7 Francis Upton 2007-02-26 22:13:11 UTC
Created attachment 148834 [details]
first shot of failure (one angle)

Comment 8 Francis Upton 2007-02-26 22:14:20 UTC
Created attachment 148835 [details]
first shot of failure (another angle)

Comment 9 Francis Upton 2007-02-26 22:17:51 UTC
Created attachment 148837 [details]
second page of stack trace

I have 3 attachments. The ones labeled "first shot" are the initial boot screen
at the point where it froze.

The one labeled "second page" happened in a previous boot (where I did not have
my camera ready).  In that boot, a page similar to the "first shot" above
appeared, and then maybe 30 seconds later this additional stack trace appeared.

Both boots were identical in every respect, using the kernel, options and h/w
described in my most recent comments.

Comment 10 Francis Upton 2007-03-06 06:02:01 UTC
Just updated to the 2.6.19-1.2911.6.5.fc6 kernel and the problem persists.

Let me know if you would like me to capture the stack trace from the boot with
the new kernel.

Comment 11 Stanislav Sukholet 2007-03-13 06:07:53 UTC
*** Bug 230985 has been marked as a duplicate of this bug. ***

Comment 12 Francis Upton 2007-03-14 23:46:48 UTC
Should I file this as a bug report for kernel.org, since it seems like several
people are hitting this problem?

Comment 13 Martin Nichol 2007-03-15 22:00:34 UTC
Having the same problem with all versions of the FC6 kernel: regular, PAE and
64-bit.  I was able to boot Ubuntu 6.10 without any problems.  I have not
determined what the differences are that make one work and the other fail.

Comment 14 Hugo Paredes 2007-03-15 22:41:50 UTC
Updated to kernel-2.6.19-1.2911.6.5.fc6 and the problem persists.

Comment 15 Hugo Paredes 2007-03-15 22:43:37 UTC
Sorry for my last comment!
I have updated to kernel-2.6.20-1.2925.fc6 and the problem persists

Comment 16 Stanislav Sukholet 2007-03-20 07:16:39 UTC
The Xen version of the kernel does not appear to have this bug:


$ dmesg | head -4
Linux version 2.6.19-1.2911.6.5.fc6xen (brewbuilder.redhat.com) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)) #1 SMP Sun Mar 4 16:23:59 EST 2007
Command line: ro root=LABEL=ROOT video=ywrap vga=0x315
BIOS-provided physical RAM map:
Xen: 0000000000000000 - 00000000f2f86000 (usable)

$ head -1 /proc/meminfo
MemTotal: 3971352 kB
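
The MemTotal figure quoted in this thread can be pulled out of /proc/meminfo programmatically. A minimal sketch, assuming the usual "MemTotal: <number> kB" line layout shown above; the helper names mem_total_kb and read_meminfo are hypothetical:

```python
def mem_total_kb(meminfo_text: str) -> int:
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line layout: "MemTotal:   <number> kB"; the number is field 1.
            return int(line.split()[1])
    raise ValueError("no MemTotal line found")


def read_meminfo() -> str:
    """Read the live /proc/meminfo (Linux only)."""
    with open("/proc/meminfo") as f:
        return f.read()
```

On a Linux system, mem_total_kb(read_meminfo()) gives the value to compare against the installed RAM.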

Comment 17 Francis Upton 2007-03-21 20:10:36 UTC
The XEN kernel as in comment 16 works fine for me as well.

Comment 18 Martin Nichol 2007-04-15 19:56:03 UTC
Upgraded to kernel-2.6.20-1.2944.fc6 and kernel-PAE-2.6.20-1.2944.fc6.

With the memory remapping enabled:
kernel-2.6.20-1.2944.fc6 boots but only sees 2 of the 4 gig with a message to
use the PAE kernel to access the full 4 gigs.  It used to kernel panic before.

kernel-PAE-2.6.20-1.2944.fc6 still kernel panics on boot up.

Have not tried the 64-bit kernel.  

Comment 19 Stanislav Sukholet 2007-04-20 06:57:32 UTC
64-bit kernel: kernel-2.6.20-1.2944.fc6 hangs with the same diagnostics.
Different memmap= options do not solve the problem.

The Xen version of the kernel boots perfectly.

Comment 20 Chuck Ebbert 2007-04-20 23:29:37 UTC
Could someone upload screen pictures of failure with kernel 2944?


Comment 21 Nathan G. Grennan 2007-04-23 21:26:54 UTC
I looked at upgrading to 4GB on my P5B-Deluxe board. I found this bug and
reconsidered. But this morning I found a forum post about Ubuntu and this same
issue. Someone had found that blacklisting the Intel AGP driver fixed the issue.
From what it said, it sounded like Ubuntu builds the AGP drivers as modules, and
hence they can be prevented from loading, whereas Fedora seems to compile them in.

I purchased the additional 2gb of ram and installed. Then I tried disabling the
AGP drivers in the configuration of a 2.6.20-1.2944 x86_64 kernel and
recompiled. The resulting rpm had a config with AGP enabled, but the Intel AGP
driver as a module. I then moved the module file out of the /lib/modules
directory and booted the kernel. It did successfully boot. 

I then had an issue with the kernel missing the dist tag and recompiled again
with it. This time I forgot to move the Intel AGP driver module out of
/lib/modules, and that resulted in a system reboot around the starting of udev.

My next goal is to recompile with all the AGP drivers as modules, and then
blacklist the Intel AGP driver. I will then reenable the remap option in the
bios, make sure it boots, and then check that everything else works.

As a side note, Linux did seem to see all the memory, but I didn't have a lot of
time to look it over.
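
Whether the Intel AGP driver ended up loaded after such a rebuild can be checked from /proc/modules. A minimal sketch, assuming the driver is listed there under the name intel_agp; the helper names module_loaded and read_proc_modules are hypothetical:

```python
def module_loaded(proc_modules_text: str, name: str) -> bool:
    """Return True if `name` is the first field of any /proc/modules line."""
    # Loaded modules are listed with underscores, so normalise hyphens.
    wanted = name.replace("-", "_")
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


def read_proc_modules() -> str:
    """Read the live /proc/modules (Linux only)."""
    with open("/proc/modules") as f:
        return f.read()
```

Only the first field of each line is compared, so a module that merely lists intel_agp as a dependency does not count as a match.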

Comment 22 Chuck Ebbert 2007-04-23 21:31:10 UTC
Can you boot stock kernel 2944 and repost the screen pictures?
I don't have debug info for the older kernels installed.


Comment 23 Nathan G. Grennan 2007-04-23 21:39:31 UTC
Yes, I can try later. Hopefully it doesn't just reboot like it seems to do with
the driver as a module.

Comment 24 Hugo Paredes 2007-04-23 21:43:12 UTC
I have included the kernel option agp=off and I have successfully booted with my
4GB RAM.
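
To confirm that the running kernel really was booted with the option, the command line in /proc/cmdline can be inspected. A minimal sketch, assuming options are whitespace-separated tokens; the helper names has_kernel_option and read_cmdline are hypothetical:

```python
def has_kernel_option(cmdline: str, option: str) -> bool:
    """Return True if `option` appears as a whole token on the command line."""
    # Splitting on whitespace avoids matching substrings of other options.
    return option in cmdline.split()


def read_cmdline() -> str:
    """Read the command line the running kernel was booted with (Linux only)."""
    with open("/proc/cmdline") as f:
        return f.read()
```

On a Linux system, has_kernel_option(read_cmdline(), "agp=off") reports whether the workaround is active for the current boot.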


Comment 25 Nathan G. Grennan 2007-04-23 22:01:29 UTC
Ah, very nice and straightforward. What video card and driver do you use? I am
curious whether there are any side effects to disabling AGP.

Comment 26 Hugo Paredes 2007-04-23 22:10:00 UTC
I am using an NVidia 7600GT PCIe 256MB card and the livna nvidia driver (nvidia-1.0.9755-4).

01:00.0 VGA compatible controller: nVidia Corporation G70 [GeForce 7600 GT] (rev a1)

I have just tested the agp=off kernel option. So far I haven't found any side
effects with my video card. I am currently using Gnome and Beryl with no problems.
If I have any problems using this option I will post here.

Additional info: I am using kernel kernel-2.6.20-1.2944.fc6.x86-64 (fc6)

[root@phantom ~]# cat /proc/meminfo 
MemTotal:      4017504 kB


Comment 27 Nathan G. Grennan 2007-04-23 22:14:07 UTC
That is great news. I have the exact same card, and I am using the same exact
driver.

Your memory total seems to match what I saw while I had it working earlier.

Comment 28 Nathan G. Grennan 2007-04-25 17:58:42 UTC
Here is the url of someone who talks about some of the details of the bug:
http://myweb.facstaff.wwu.edu/~riedesg/sysadmin1138/2006/12/opensuse-102-on-asus-p5b-deluxe.html

I flashed my BIOS and bricked my motherboard. I ended up getting a P5B Premium
this time. It is basically the same board, but has its own version of the BIOS.
It seems to have the same issue, and the same fix works.

I am only getting 4017404 kB, which is interestingly 100 kB less than Hugo. That
is roughly 3.8 GB; the shortfall from 4 GB is about 174 MB, which is the amount
reported as reserved during boot:

Memory: 4015460k/6291456k available (2454k kernel code, 177876k reserved, 1459k
data, 316k init)
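
The figures above can be checked with a little arithmetic. A minimal sketch, assuming the kernel's "k" and "kB" both mean units of 1024 bytes; the helper names are hypothetical:

```python
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 * 1024


def kib_to_mib(kib: int) -> float:
    return kib / KIB_PER_MIB


def kib_to_gib(kib: int) -> float:
    return kib / KIB_PER_GIB


mem_total_kib = 4017404   # MemTotal reported in this comment
available_kib = 4015460   # "available" from the boot-time Memory: line
reserved_kib = 177876     # "reserved" from the same line

print(round(kib_to_gib(mem_total_kib), 2))   # 3.83
print(round(kib_to_mib(reserved_kib), 1))    # 173.7
# Available plus reserved comes to within about 1 MiB of 4 GiB (4194304 KiB).
print(available_kib + reserved_kib)          # 4193336
```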

Comment 29 Nathan G. Grennan 2007-04-25 18:01:02 UTC
Created attachment 153431 [details]
The first image of screen from the oops of 2.6.20-1.2944

Comment 30 Nathan G. Grennan 2007-04-25 18:02:35 UTC
Created attachment 153432 [details]
The second image of screen from the oops of 2.6.20-1.2944

Comment 31 Michael 2007-04-30 22:13:11 UTC
Just an FYI but adding agp=off fixes this for my Asus P5B-E and Asus P5B Deluxe



Comment 32 Clive Messer 2007-05-27 03:18:42 UTC
Works for me too with 'agp=off' and P5B Deluxe. (4GB with 
kernel-PAE-2.6.20-1.2948.fc6.)

Comment 33 Chuck Ebbert 2007-05-30 20:33:56 UTC
Should be fixed in kernel 1.2952

Comment 34 Clive Messer 2007-06-01 13:44:23 UTC
Does not appear to be fixed in kernel-PAE-2.6.20-1.2952.fc6. Still 
requires 'agp=off' to boot.

Comment 35 Stanislav Sukholet 2007-12-13 01:55:46 UTC
The Enterprise kernel-2.6.18-8.1.15.el5 has the same bug (and the same workaround).
But the new kernel 2.6.18-53.1.4.el5 works without agp=off! That's just fine!
That's what we want!

Thanks everybody

Comment 36 Bug Zapper 2008-04-04 06:21:36 UTC
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers

Comment 37 Francis Upton 2008-04-04 06:58:36 UTC
This has been fixed:

 2.6.24.3-50.fc8 (probably before)

more /proc/meminfo
MemTotal:      4064604 kB