Description of problem:
When using the default Red Hat SMP kernel with PAE enabled, this system does not work. Programs fail to start with an "invalid executable" error, and so on. The disk subsystem code complains about attempts to access beyond the end of the device, for example:

Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=13350692872, limit=40960000
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=14679774112, limit=40960000
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=56335120, limit=40960000
Oct 27 14:10:49 hostname kernel: attempt to access beyond end of device
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=14421294016, limit=40960000

PAE is a very special feature, and I can't see a great number of users making use of it. Those who really need it will go 64-bit anyway. Why can't it be hidden away in the -hugemem series of kernels as it was before? It only introduces bugs and headaches for those who do not need PAE.

Version-Release number of selected component (if applicable):
All.

How reproducible:
100%

Steps to Reproduce:
1. Install a comparable system in 32-bit mode
2. Try to use it.

Actual results:
A non-working system.

Expected results:
A working system.

Additional info:
Intel SE7520JR2 motherboard
2 x Intel Xeon 3.0GHz with 2MB cache, EM64T capable, running in 32-bit mode, no HyperThreading
4 GB RAM
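To make the size of the overshoot in the log excerpt easier to see, here is a minimal Python sketch that parses the `want=`/`limit=` pairs and reports how far each request lies beyond the end of the device. The function name `parse_beyond_end` and the field names are made up for illustration, and the conversion to bytes assumes the kernel reports these values in 512-byte sectors. It only restates the numbers from the log and does not diagnose the cause.

```python
import re

# The dm-3 lines copied from the log excerpt in this report.
LOG = """\
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=13350692872, limit=40960000
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=14679774112, limit=40960000
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=56335120, limit=40960000
Oct 27 14:10:49 hostname kernel: dm-3: rw=0, want=14421294016, limit=40960000
"""

# Matches "<device>: rw=<n>, want=<n>, limit=<n>".
PATTERN = re.compile(
    r"(?P<dev>[\w-]+): rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)

SECTOR_BYTES = 512  # assumption: want/limit are counted in 512-byte sectors


def parse_beyond_end(text):
    """Return one dict per 'want/limit' line found in text (hypothetical helper)."""
    entries = []
    for match in PATTERN.finditer(text):
        want = int(match.group("want"))
        limit = int(match.group("limit"))
        entries.append({
            "dev": match.group("dev"),
            "want": want,
            "limit": limit,
            "overshoot_sectors": want - limit,
            "want_exceeds_32_bits": want >= 2 ** 32,
        })
    return entries


if __name__ == "__main__":
    for entry in parse_beyond_end(LOG):
        overshoot_gib = entry["overshoot_sectors"] * SECTOR_BYTES / 2 ** 30
        print(entry["dev"], entry["want"], entry["limit"],
              "overshoot GiB: %.1f" % overshoot_gib,
              "want >= 2**32:", entry["want_exceeds_32_bits"])
```

Every requested sector in the excerpt is past the reported limit, and three of the four values do not fit in 32 bits.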
Hmm, can you be specific about which kernel version you are running? If this is a PAE issue, can you try booting the UP kernel, which does not have PAE, to verify this?
(In reply to comment #1)
> hmmm can you be specific as to what kernel version you are running? if this a
> pae issue, can you try booting the UP kernel which does not have PAE to verify this.

Version: kernel-2.6.9-42.0.3.EL

The system is currently running the UP kernel, as it does not work with Red Hat's SMP kernel. I have also verified that it does work if I compile my own version of the SMP kernel from the Red Hat sources with PAE disabled.
OK, do you have a test case for us to help debug this? Thanks.
1. Boot a system comparable to this one in 32-bit mode with the SMP kernel:
   Intel SE7520JR2 motherboard
   2 x Intel Xeon 3.0GHz with 2MB cache, EM64T capable, running in 32-bit mode, no HyperThreading
   4 GB RAM
2. Look for strange ext3 error messages during boot, then try to launch some programs and note that some of them won't start. You might be able to log in, but starting a second program will result in "invalid executable" messages.
3. Boot the same system with a UP kernel and note that all problems are gone.
4. Compile a custom SMP kernel with the Red Hat default config, except with PAE disabled, and note that the problems are still gone.
5. Boot the default Red Hat SMP kernel with PAE enabled and note that the system is still broken.
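When comparing the stock and custom kernels in the steps above, it can help to confirm which build actually has PAE turned on. The following is a rough Python sketch, not part of the original report. It assumes the kernel config is installed as `/boot/config-<release>` and that PAE shows up as `CONFIG_X86_PAE=y` or `CONFIG_HIGHMEM64G=y`. The helper name `pae_enabled` is hypothetical.

```python
import os
import platform

# Assumption: either of these options set to "y" means a PAE build on 32-bit x86.
PAE_OPTIONS = {"CONFIG_X86_PAE", "CONFIG_HIGHMEM64G"}


def pae_enabled(config_text):
    """Return True if the kernel config text enables a PAE option (hypothetical helper)."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_HIGHMEM64G is not set" and blank lines.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if value == "y":
            enabled.add(key)
    return bool(enabled & PAE_OPTIONS)


if __name__ == "__main__":
    # Assumed location of the running kernel's config file.
    path = "/boot/config-" + platform.release()
    if os.path.exists(path):
        with open(path) as handle:
            print(path, "PAE enabled:", pae_enabled(handle.read()))
    else:
        print("No kernel config found at", path)
```

Running it once under each kernel gives a quick check that the custom build really differs from the stock one in the PAE setting.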
There seem to be PAE-related firmware fixes available for the RAID controller used in this system, an Intel SRCS16S (OEM version of the MegaRAID SATA 300-8X). BUT my original complaint against using PAE in the stock kernel stands, for exactly this reason. Having it enabled by default opens up a plethora of bugs that need not be there. Stick to non-PAE and tell 4 GB users to go 64-bit, or at least include a non-PAE kernel so we don't have to do custom builds.
We will keep your suggestion in mind, but there are just too many systems with > 4GB and we try to add kernels to the test matrix. As of now there are no plans for a non-PAE SMP kernel for RHEL4. Thanks.