Bug 212809 - 32bit kernel with PAE and SMP enabled leads to corruption
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Brian Brock

Reported: 2006-10-29 04:03 EST by Frode Nordahl
Modified: 2007-11-16 20:14 EST (History)
Doc Type: Bug Fix
Last Closed: 2007-10-12 16:06:26 EDT
Description Frode Nordahl 2006-10-29 04:03:42 EST
Description of problem:
When using the default Red Hat SMP kernel with PAE enabled, this system does not work. Programs fail to
start with an "invalid executable" error, and so on.

The disk subsystem code complains about attempts to access beyond the end of the device, for example:
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=13350692872, limit=40960000
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=14679774112, limit=40960000
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=56335120, limit=40960000
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=14421294016, limit=40960000
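
To make the size of these overruns easier to see, the log lines can be parsed and each `want` value compared with `limit`. The following is only a minimal sketch: the function name `parse_overruns` and the regular expression are illustrative assumptions, not part of any Red Hat tool, and the sample input is simply the log excerpt quoted above. The values are assumed to be counts of 512-byte sectors.

```python
import re

# Matches the device line of the kernel message, for example:
#   "dm-3: rw=0, want=13350692872, limit=40960000"
# The pattern is an assumption derived from the log excerpt in this report.
PATTERN = re.compile(
    r"(?P<dev>\S+): rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)

SAMPLE = """\
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=13350692872, limit=40960000
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=14679774112, limit=40960000
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=56335120, limit=40960000
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=14421294016, limit=40960000
"""


def parse_overruns(log_text):
    """Return (device, want, limit, excess) for every request past the limit."""
    results = []
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match is None:
            continue  # not a "want=/limit=" line
        want = int(match.group("want"))
        limit = int(match.group("limit"))
        if want > limit:
            results.append((match.group("dev"), want, limit, want - limit))
    return results


if __name__ == "__main__":
    for dev, want, limit, excess in parse_overruns(SAMPLE):
        print(f"{dev}: want={want} limit={limit} excess={excess}")
```

Three of the four requests in the excerpt exceed the device limit by a factor of several hundred, which looks more like corrupted block numbers than a slightly oversized read.
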

PAE is a very special feature and I can't see that a great number of users would make use of it. Those 
really needing it will go 64-bit anyway.

Why can't this be hidden away in the -hugemem series of kernels as it was before? It only introduces
bugs and headaches for those who do not need PAE.

Version-Release number of selected component (if applicable):
All.

How reproducible:
100%

Steps to Reproduce:
1. Install a comparable system in 32-bit mode
2. Try to use it.
  
Actual results:
A non-working system.

Expected results:
A working system.

Additional info:
Intel SE7520JR2 motherboard
2 x Intel Xeon 3.0GHz with 2MB cache EM64T capable, running in 32-bit mode, no HyperThreading
4 GB RAM
Comment 1 Jason Baron 2007-03-02 16:13:00 EST
Hmm, can you be specific about which kernel version you are running? If this is a
PAE issue, can you try booting the UP kernel, which does not have PAE, to verify this?
Comment 2 Frode Nordahl 2007-03-04 10:30:14 EST
(In reply to comment #1)
> Hmm, can you be specific about which kernel version you are running? If this is a
> PAE issue, can you try booting the UP kernel, which does not have PAE, to verify this?

Version: kernel-2.6.9-42.0.3.EL

The system is currently running the UP kernel, as it does not work with Red Hat's SMP kernel.

I have also verified that it does work if I compile my own version of the SMP kernel from Red Hat sources
with PAE disabled.
Comment 3 Jason Baron 2007-04-13 16:13:04 EDT
ok, do you have a testcase for us to help debug this? thanks.
Comment 4 Frode Nordahl 2007-04-16 03:55:35 EDT
Boot up a system comparable to this in 32-bit mode with SMP kernel:
Intel SE7520JR2 motherboard
2 x Intel Xeon 3.0GHz with 2MB cache EM64T capable, running in 32-bit mode, no HyperThreading
4 GB RAM

Look for strange ext3 error messages during boot, then try to launch some programs and observe that some
of them won't start. You might be able to log in, but starting a second program will result in "invalid
executable" messages.

Observe that if you boot the same system with a UP kernel, all problems are gone. Compile a custom SMP
kernel with the Red Hat default config, except with PAE disabled, and observe that the problems are still
gone. Boot the default Red Hat SMP kernel with PAE enabled, and observe that the system is still
broken.
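
When comparing the kernels in these steps, it can help to confirm whether a given build actually has PAE enabled. The sketch below is an assumption-laden illustration, not an official procedure: it assumes the build configuration is installed as `/boot/config-<release>`, and that PAE shows up as `CONFIG_X86_PAE=y` or `CONFIG_HIGHMEM64G=y`. The helper name `kernel_has_pae` is hypothetical.

```python
import os
import platform

# Config symbols assumed to indicate a PAE build on 32-bit x86 kernels.
PAE_SYMBOLS = {"CONFIG_X86_PAE", "CONFIG_HIGHMEM64G"}


def kernel_has_pae(config_text):
    """Return True if the kernel config text sets a PAE-related symbol to 'y'."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and "# CONFIG_... is not set" comments
        key, sep, value = line.partition("=")
        if sep and value == "y":
            enabled.add(key)
    return bool(enabled & PAE_SYMBOLS)


if __name__ == "__main__":
    # Assumed location of the running kernel's build configuration.
    path = "/boot/config-" + platform.release()
    if os.path.exists(path):
        with open(path) as handle:
            print("PAE enabled:", kernel_has_pae(handle.read()))
    else:
        print("No kernel config found at", path)
```

Running it under each of the three kernels (UP, custom SMP without PAE, default SMP) would show which builds have PAE enabled, assuming the config files are present.
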
Comment 5 Frode Nordahl 2007-04-17 09:42:43 EDT
There seem to be PAE-related firmware fixes available for the RAID controller used in this system, an Intel
SRCS16S (OEM version of the MegaRAID SATA 300-8X).

BUT my original complaint against using PAE in the stock kernel stands, for exactly this reason.

Having it enabled by default opens up a plethora of bugs that need not be there. Stick to non-PAE and tell
4 GB users to go 64-bit, or at least include a non-PAE kernel so we don't have to do custom builds.
Comment 6 Jason Baron 2007-10-12 16:06:26 EDT
We will keep your suggestion in mind, but there are just too many systems with
more than 4 GB, and we try to add kernels to the test matrix. As of now there are no
plans for a non-PAE SMP kernel for RHEL4. Thanks.
