Red Hat Bugzilla – Bug 198657
RHEL4-U4-B2:Kernel Panic while boot-up during installation on PE6800.
Last modified: 2013-03-06 00:59:47 EST
Description of problem:
When we do a PXE install of RHEL4-U4-B2 x86_64 (kernel 2.6.9-40) on a PE6800 (with
2GB memory), the kernel panics during boot-up.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install RHEL4-U4-B2 (kernel 2.6.9-40).
2. On boot-up, the kernel panics.
Actual results:
The kernel fails to boot properly and a kernel panic is observed.
Expected results:
The kernel should boot up fine and the installation should complete successfully.
Additional info:
1. Tried passing the noapic parameter during boot-up, but still there is a kernel panic.
2. RHEL4-U3 on the same machine boots up fine.
Created attachment 132313 [details]
Serial console output of Kernel panic
Can you please try 'nolapic' too?
We don't have this hardware in house. Can you please tell us which RHEL4-U4
kernel worked last, if any? Once we have that data we can try to narrow down
when this issue was introduced.
This crash happens because phys_cpu_present_map does not include an entry for
the APIC ID of the CPU that's running at boot. Access to hardware will make
this easier to track down.
We located the PE6800 in Westford, but apparently it installed RHEL4U4 just
fine. I believe I know what's going on, though:
The console output in Comment 1 indicates that the system on which this problem
occurs (a) has only 4 CPU cores while our system has 16, and (b) does not
enumerate them from 0 (it has 8/9/14/15).
What the console dump does not show is which of these CPUs is flagged by the
ACPI tables as the boot CPU. It turns out that if the first CPU detected by
Linux (i.e. the first one listed in the MADT) is *not* flagged as the boot CPU,
we could have trouble booting the install kernel. The reason is that the
install kernel is limited to 1 CPU (NR_CPUS=1). This means that it will only
use the first CPU it finds in the MADT and will scan (but not use) the others.
The kernel sets the "boot_cpu_id" variable based on the CPU_BOOTPROCESSOR flag
in the MADT, though. This means that it's possible for the system to be running
on a CPU other than the boot CPU at boot time. If only one CPU is allowed, then
the bit for "boot_cpu_id" will not be found in phys_cpu_present_map. I think
this is why we're tripping over this bug.
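The hypothesis above can be illustrated with a small user-space model. This is only a sketch of the reasoning in this comment, not the kernel's actual code: the struct layout, the scan_madt() and boot_cpu_present() helpers, and the 64-bit bitmap are all hypothetical simplifications, and which APIC ID carries the boot flag on the failing system is an assumption, since the console dump does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one CPU entry in the table. */
struct madt_lapic {
    unsigned apic_id;
    bool is_boot_cpu;   /* models the CPU_BOOTPROCESSOR flag */
};

/* Hypothetical snapshot of the state this comment talks about. */
struct boot_state {
    uint64_t phys_cpu_present_map;  /* bit n set => APIC ID n recorded */
    unsigned boot_cpu_id;           /* APIC ID flagged as boot CPU */
    unsigned cpus_recorded;
};

/*
 * Walk every entry. boot_cpu_id follows the boot flag no matter what,
 * but only the first nr_cpus entries are recorded in the present map.
 */
static struct boot_state scan_madt(const struct madt_lapic *e, size_t n,
                                   unsigned nr_cpus)
{
    struct boot_state st = {0, 0, 0};

    assert(nr_cpus >= 1);
    for (size_t i = 0; i < n; i++) {
        assert(e[i].apic_id < 64);          /* model limit, not a kernel one */
        if (e[i].is_boot_cpu)
            st.boot_cpu_id = e[i].apic_id;
        if (st.cpus_recorded < nr_cpus) {
            st.phys_cpu_present_map |= (uint64_t)1 << e[i].apic_id;
            st.cpus_recorded++;
        }
    }
    return st;
}

/* True if the bit for boot_cpu_id is set in the present map. */
static bool boot_cpu_present(const struct boot_state *st)
{
    return ((st->phys_cpu_present_map >> st->boot_cpu_id) & 1) != 0;
}
```

In this model, scanning entries for APIC IDs 8, 14, 9, 15 with the boot flag on any entry other than the first and nr_cpus set to 1 leaves the bit for boot_cpu_id unset, which is the condition suspected here. With a larger nr_cpus the same table passes the check.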
Questions for Dell:
(1) Under what circumstances would a PE6800 start enumerating CPUs from other
than 0, and is this a supported configuration?
(2) Under what circumstances would a CPU other than the first one in the ACPI
MADT table be flagged as the boot processor?
(3) How common and/or likely are (1) and (2) under normal operating conditions?
Sorry for the misinformation in comment #3; for some reason it is not in the
inventory database. Both Jason and I looked. I will ask Matt about that. Thanks Jeff
OK, it actually is in the database, but it is different from 99% of the other Dell
systems. Using the search in the inventory database I looked for:
dmi rh_bios_vendor = Dell Computer Corporation
dmi rh_bios_vendor = Dell inc.
That query picks up 99% of the systems. For some reason the dmi data for this
system is just Dell
When we pass both the "noapic" and "nolapic" parameters, the kernel boots up fine.
Passing only "noapic" or only "nolapic" does not solve the problem.
Issue is seen even on kernel-2.6.9-39.
When we pass both the "noapic" and "nolapic" parameters, the kernel boots up fine and
the installation completes successfully.
After the installation, when we check /proc/cmdline, only noapic is specified.
When we removed the noapic parameter and rebooted the system, the system comes up
So basically we have a problem with the boot kernel, but the normal kernel
boots up fine.
Does Red Hat have plans to document this issue as a KB article?
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this enhancement by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This enhancement is not yet committed for inclusion in an Update
release.
Does Red Hat have a fix for this issue? Can we have a commitment for RHEL 4.5?
committed in stream U5 build 42.6. A test kernel with this patch is available
The issue is reproducible with test kernel-2.6.9-42.7.EL.x86_64.
When we try to do a PXE install of kernel-2.6.9-42.7.EL x86_64, the kernel panics.
I have attached the serial console output.
Created attachment 135778 [details]
Serial console output of kernel panic on kernel-2.6.9-42.7
Question for Dell:
Can you reproduce this problem by installing RHEL4-U3 on the system, then
installing the UP (Non-SMP) RHEL4-U4 kernel and booting that? I'd like to see
if we can reproduce this on something other than the bootstrap kernel.
If you can reproduce it this way I may have a kernel for you to test.
Per comment #30, changing status to NEEDINFO.
After installing RHEL4-U3 (2.6.9-34) on the PE6800, we installed the RHEL4-U4
(2.6.9-42.EL) UP (non-SMP) kernel. Booting into the RHEL4-U4 kernel results in a
kernel panic.
Re Comment #32: Thank you for testing this. I will shortly provide a test
kernel that you can try out on this system to see if it fixes the problem.
Try the kernel at http://people.redhat.com/jparadis/bz198657 and let me know if
it boots successfully on this hardware...
(In reply to comment #34)
> Try the kernel at http://people.redhat.com/jparadis/bz198657 and let me know if
> it boots successfully on this hardware...
The issue is reproducible with the test kernel provided in comment #34.
We installed the test kernel (kernel-2.6.9-42.10.bz198657.EL); booting into it
results in a kernel panic.
I have been trying to reproduce this issue in-house. The problem I have is that
the crash shows that the only CPUs online are APIC IDs 8, 14, 9, and 15. My
system shows 16 logical processors: 0-15. I tried pulling Processors 1 and 2
from my box to get a similar configuration (one where the boot processor is not
APIC ID 0), but now it just won't come up at all. What am I supposed to do to
replicate the originator's configuration?
I am able to reproduce this issue on the PE6800 and PE6850 with only 2 CPUs.
The issue is reproducible on both single-core and dual-core processor machines.
It is reproducible when only 2 CPUs are installed in a 4-CPU machine. If all
4 CPUs are installed, the issue is not seen.
Reply to comment #37 : If processors 3 and 4 are removed from a 4-CPU machine
then the issue can be reproduced.
If you are running a 2-processor system, then the system must have processors 3
and 4 removed. (Only 1-, 2- and 4-processor configurations are supported.) What BIOS
versions are the systems at Red Hat and at Dell running? BIOS A04 is the
latest available (it can be found at support.dell.com).
Our current BIOS version is A02. I am pulling down the latest version from Dells
OK, after reconfiguring the system and installing the latest BIOS, we are able
to reproduce this error in house with the PE6800 located in the Westford
office. The panic posted in this bz and the one I now have are identical.
Created attachment 138731 [details]
This is the patch causing this problem; it was added in U4 and resolved bugs
176612 and 174627.
Created attachment 139019 [details]
patch that resolves this issue
committed in stream U5 build 42.21. A test kernel with this patch is available
The issue is fixed with the test kernel (kernel-2.6.9-42.21.EL.x86_64.rpm).
We installed the test kernel on the PE6800 and the kernel boots up fine.
This bugzilla has Keywords: Regression.
Since no regressions are allowed between releases,
it is also being marked as a blocker for this release.
Please resolve ASAP.
We were told that the workaround for this issue would be documented in the RH
knowledgebase. We reviewed the content a few months ago, but we do not see the KB
entry yet. We have seen many instances of this issue on Dell mailing lists.
Please publish the KB entry ASAP; otherwise we will have to wait until April 2007
for RHEL 4.5 GA.
KB information :
Why does a Dell PE6800 system encounter a kernel panic when doing a PXE-install
with Red Hat Enterprise Linux 4 Update 4 on an x86-64 kernel?
Article ID: 8755
Last update: 12-19-06
Patch is in -50.EL and has been verified by two partners.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.