Bug 311411 - [RHEL5 RT] Kernel panic - not syncing: Out of memory
Summary: [RHEL5 RT] Kernel panic - not syncing: Out of memory
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Version: 1.0
Hardware: i386
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Michal Schmidt
QA Contact:
URL: http://rhts.lab.boston.redhat.com/cgi...
Whiteboard:
Depends On:
Blocks:
Reported: 2007-09-28 17:41 UTC by Jeff Burke
Modified: 2008-02-27 19:58 UTC

Fixed In Version: 2.6.21-46.el5rt
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-10-26 09:22:38 UTC
Target Upstream Version:
Embargoed:


Description Jeff Burke 2007-09-28 17:41:34 UTC
Description of problem:
 Testing kernel 2.6.21-39rt with the RHEL5.1 distro RHEL5.1-Server-20070920.1 on
hp-dl585-02.rhts.boston.redhat.com. The RT kernel panics early in boot.

Version-Release number of selected component (if applicable):
 2.6.21-39rt

How reproducible:
 Always

Steps to Reproduce:
1. Install RHEL5.1 (RHEL5.1-Server-20070920.1, i386 variant) on
hp-dl585-02.rhts.boston.redhat.com.
2. Install the RT kernel.
3. Reboot.
  
Actual results:
Booting command-list
root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83
kernel /vmlinuz-2.6.21-39.el5rt ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200 debug earlyprintk=ttyS0,115200
   [Linux-bzImage, setup=0x1e00, size=0x214ae0]
initrd /initrd-2.6.21-39.el5rt.img
   [Linux-initrd @ 0x37d0c000, 0x2e3151 bytes]

Linux version 2.6.21-39.el5rt (brewbuilder.redhat.com) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #1 SMP PREEMPT RT Thu Sep 20 11:18:50 EDT 2007
BIOS-provided physical RAM map:
sanitize start
sanitize end
copy_e820_map() start: 0000000000000000 size: 000000000009f400 end: 000000000009f400 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 000000000009f400 size: 0000000000000c00 end: 00000000000a0000 type: 2
copy_e820_map() start: 00000000000f0000 size: 0000000000010000 end: 0000000000100000 type: 2
copy_e820_map() start: 0000000000100000 size: 00000000f56f6800 end: 00000000f57f6800 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 00000000f57f6800 size: 0000000000009800 end: 00000000f5800000 type: 3
copy_e820_map() start: 00000000fdc00000 size: 0000000000001000 end: 00000000fdc01000 type: 2
copy_e820_map() start: 00000000fdc10000 size: 0000000000001000 end: 00000000fdc11000 type: 2
copy_e820_map() start: 00000000fdc20000 size: 0000000000001000 end: 00000000fdc21000 type: 2
copy_e820_map() start: 00000000fdc30000 size: 0000000000001000 end: 00000000fdc31000 type: 2
copy_e820_map() start: 00000000fec00000 size: 0000000000001000 end: 00000000fec01000 type: 2
copy_e820_map() start: 00000000fec10000 size: 0000000000001000 end: 00000000fec11000 type: 2
copy_e820_map() start: 00000000fec20000 size: 0000000000001000 end: 00000000fec21000 type: 2
copy_e820_map() start: 00000000fee00000 size: 0000000000010000 end: 00000000fee10000 type: 2
copy_e820_map() start: 00000000ff800000 size: 0000000000800000 end: 0000000100000000 type: 2
copy_e820_map() start: 0000000100000000 size: 00000016fffff000 end: 00000017fffff000 type: 1
copy_e820_map() type is E820_RAM
 BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
 BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000f57f6800 (usable)
 BIOS-e820: 00000000f57f6800 - 00000000f5800000 (ACPI data)
 BIOS-e820: 00000000fdc00000 - 00000000fdc01000 (reserved)
 BIOS-e820: 00000000fdc10000 - 00000000fdc11000 (reserved)
 BIOS-e820: 00000000fdc20000 - 00000000fdc21000 (reserved)
 BIOS-e820: 00000000fdc30000 - 00000000fdc31000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fec10000 - 00000000fec11000 (reserved)
 BIOS-e820: 00000000fec20000 - 00000000fec21000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 00000017fffff000 (usable)
97407MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000f4fa0
NX (Execute Disable) protection: active
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->   229376
  HighMem    229376 -> 25165823
early_node_map[1] active PFN ranges
    0:        0 -> 25165823
bootmem alloc of 1409286144 bytes failed!
Kernel panic - not syncing: Out of memory
 [<c0405fec>] dump_trace+0x5f/0x107
 [<c04060ae>] show_trace_log_lvl+0x1a/0x2f
 [<c04066d0>] show_trace+0x12/0x14
 [<c040675a>] dump_stack+0x16/0x18
 [<c0428340>] panic+0x50/0xf6
 [<c07e14cc>] __alloc_bootmem+0x2e/0x33
 [<c07e1506>] __alloc_bootmem_node+0x35/0x3c
 [<c07e1d40>] free_area_init_node+0xd7/0x3ec
 [<c07e246c>] free_area_init_nodes+0x139/0x140
 [<c07d4f7e>] zone_sizes_init+0x42/0x48
 [<c07d528e>] setup_arch+0x30a/0x36d
 [<c07ce79f>] start_kernel+0x6b/0x490
 =======================

Expected results:
 This should boot.

Additional info:

Comment 1 Jeff Burke 2007-09-28 17:44:07 UTC
The RHEL5.1 installation went fine. Only after installing the RT kernel did this
issue start. Some additional data about the hardware:

********** System Information **********
Hostname                = hp-dl585-02.rhts.boston.redhat.com
Kernel Version          = 2.6.18-48.el5
Machine Hardware Name   = i686
Processor Type          = athlon
uname -a output         = Linux hp-dl585-02.rhts.boston.redhat.com 2.6.18-48.el5 #1 SMP Mon Sep 17 17:26:31 EDT 2007 i686 athlon i386 GNU/Linux
Swap Size               = 8191 MB
Mem Size                = 3890 MB
Number of Processors    = 8
System Release          = Red Hat Enterprise Linux Server release 5.1 Beta (Tikanga)
Command Line            = ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200
System NMI Interrupts   = NMI:          0          0          0          0          0          0          0          0
********** LSPCI **********
00:03.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07)
00:04.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
00:04.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03)
00:04.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
00:07.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
00:07.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
00:08.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
00:08.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:1a.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:1a.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:1a.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:1a.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:1b.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:1b.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:1b.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:1b.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
01:00.1 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
01:02.0 System peripheral: Compaq Computer Corporation Integrated Lights Out Controller (rev 01)
01:02.2 System peripheral: Compaq Computer Corporation Integrated Lights Out Processor (rev 01)
01:03.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
02:04.0 RAID bus controller: Compaq Computer Corporation Smart Array 5i/532 (rev 01)
02:06.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 10)
02:06.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 10)
04:09.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
04:09.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
04:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
04:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
04:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
04:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
04:0c.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
04:0c.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
********** Modprob **********
alias eth0 tg3
alias eth1 tg3
alias scsi_hostadapter cciss
********** Module Information **********
Checking module information autofs4:
Checking module information hidp:
Bluetooth HIDP ver 1.1
Checking module information rfcomm:
Bluetooth RFCOMM ver 1.8
Checking module information l2cap:
Bluetooth L2CAP ver 2.8
Checking module information bluetooth:
Bluetooth Core ver 2.10
Checking module information sunrpc:
Checking module information ipv6:
IPv6 protocol stack for Linux
Checking module information dm_multipath:
device-mapper multipath target
Checking module information video:
ACPI Video Driver
Checking module information sbs:
Smart Battery System ACPI interface driver
Checking module information backlight:
Backlight Lowlevel Control Abstraction
Checking module information i2c_ec:
ACPI EC SMBus driver
Checking module information button:
ACPI Button Driver
Checking module information battery:
ACPI Battery Driver
Checking module information asus_acpi:
Asus Laptop ACPI Extras Driver
Checking module information ac:
ACPI AC Adapter Driver
Checking module information parport_pc:
PC-style parallel port driver
Checking module information lp:
Checking module information parport:
Checking module information i2c_amd756:
AMD756/766/768/8111 and nVidia nForce SMBus driver
Checking module information i2c_core:
I2C-Bus main module
Checking module information serio_raw:
Raw serio driver
Checking module information floppy:
Checking module information ide_cd:
ATAPI CD-ROM Driver
Checking module information amd_rng:
H/W RNG driver for AMD chipsets
Checking module information cdrom:
Checking module information k8temp:
AMD K8 core temperature monitor
Checking module information hwmon:
hardware monitoring sysfs/class support
Checking module information tg3:
Broadcom Tigon3 ethernet driver
Checking module information k8_edac:
MC support for AMD K8 memory controllers -  Ver: 2.0.2 Sep 17 2007
Checking module information pcspkr:
PC Speaker beeper driver
Checking module information edac_mc:
Core library routines for MC reporting
Checking module information dm_snapshot:
device-mapper snapshot target
Checking module information dm_zero:
device-mapper dummy target returning zeros
Checking module information dm_mirror:
device-mapper mirror target
Checking module information dm_mod:
device-mapper driver
Checking module information cciss:
Driver for HP Controller SA5xxx SA6xxx version 3.6.16-RH1
Checking module information sd_mod:
SCSI disk (sd) driver
Checking module information scsi_mod:
SCSI core
Checking module information ext3:
Second Extended Filesystem with journaling extensions
Checking module information jbd:
Checking module information ehci_hcd:
10 Dec 2004 USB 2.0 'Enhanced' Host Controller (EHCI) Driver
Checking module information ohci_hcd:
2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver
Checking module information uhci_hcd:
USB Universal Host Controller Interface driver

******** End System Information ********

Comment 2 Jeff Burke 2007-09-28 17:46:22 UTC
Here is a link to the successful install and reboot with the standard RHEL5.1
kernel:

http://rhts.lab.boston.redhat.com/cgi-bin/rhts/test_log.cgi?id=854730
You will need to use your Bugzilla login to view the above information.



Comment 3 Michal Schmidt 2007-10-02 16:25:10 UTC
The original kernel boots fine with this in the log:

Linux version 2.6.18-48.el5 (brewbuilder.redhat.com) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #1 SMP Mon Sep 17 17:26:31 EDT 2007
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
 BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000f57f6800 (usable)
 BIOS-e820: 00000000f57f6800 - 00000000f5800000 (ACPI data)
 BIOS-e820: 00000000fdc00000 - 00000000fdc01000 (reserved)
 BIOS-e820: 00000000fdc10000 - 00000000fdc11000 (reserved)
 BIOS-e820: 00000000fdc20000 - 00000000fdc21000 (reserved)
 BIOS-e820: 00000000fdc30000 - 00000000fdc31000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fec10000 - 00000000fec11000 (reserved)
 BIOS-e820: 00000000fec20000 - 00000000fec21000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 00000017fffff000 (usable)
Warning only 4GB will be used.
Use a PAE enabled kernel.
3200MB HIGHMEM available.
896MB LOWMEM available.
...

The machine has 96 GB RAM, but this kernel does not have PAE enabled, so it uses
only 4 GB. Works OK.
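
As a cross-check of the 96 GB figure, the usable ranges in the e820 map can be
summed directly. This is only an illustrative sketch in Python; the range values
are copied from the BIOS-e820 "(usable)" lines in the log, and the variable
names are made up for the example.

```python
# Usable ranges copied from the BIOS-e820 "(usable)" lines; end is exclusive.
USABLE = [
    (0x0000000000000000, 0x000000000009f400),
    (0x0000000000100000, 0x00000000f57f6800),
    (0x0000000100000000, 0x00000017fffff000),
]

GIB = 1 << 30

# Total usable RAM is the sum of the range lengths.
total_bytes = sum(end - start for start, end in USABLE)
# The highest usable address is what a PAE kernel has to be able to map.
top_of_ram = max(end for _, end in USABLE)

print(f"usable RAM:             {total_bytes / GIB:.2f} GiB")
print(f"highest usable address: {top_of_ram / GIB:.2f} GiB")
```

The top of RAM sits one 4 KiB page below 96 GiB, far beyond what a non-PAE
kernel can address, which matches the "only 4GB will be used" warning.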

On the other hand, with the -rt kernel there is:
...
97407MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000f4fa0
NX (Execute Disable) protection: active
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->   229376
  HighMem    229376 -> 25165823
early_node_map[1] active PFN ranges
    0:        0 -> 25165823
bootmem alloc of 1409286144 bytes failed!
...

The -rt kernel has PAE enabled, so it tries to use all of RAM. 96 GB is a bit
too much for a 32-bit kernel. I think mem_map is so big that it doesn't even fit
into lowmem. According to http://www.redhat.com/rhel/compare/ the maximum memory
supported by RHEL on i386 is 16 GB. For 96 GB RAM x86_64 is necessary.

It should not crash like that, but it's not a supported configuration => setting
low priority of the bug.
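
The size of the failed allocation is consistent with the mem_map explanation. A
rough sketch of the arithmetic in Python follows. The numbers are taken from the
-rt boot log; rounding the page count up to a multiple of 1024 pages is an
assumption made for this illustration, not something the log states.

```python
MIB = 1 << 20

FAILED_ALLOC = 1409286144  # "bootmem alloc of 1409286144 bytes failed!"
END_PFN = 25165823         # "HighMem    229376 -> 25165823"
LOWMEM_MB = 896            # "896MB LOWMEM available."

# Assumption: the page count is rounded up to the next multiple of 1024.
ALIGN = 1024
pages = -(-END_PFN // ALIGN) * ALIGN

# If the allocation is one descriptor per page, this is the descriptor size.
bytes_per_page = FAILED_ALLOC / pages
alloc_mib = FAILED_ALLOC / MIB

print(f"pages described:   {pages}")
print(f"bytes per page:    {bytes_per_page}")
print(f"allocation size:   {alloc_mib} MiB")
print(f"fits in lowmem:    {alloc_mib <= LOWMEM_MB}")
```

Under that assumption the request is 1344 MiB, i.e. 56 bytes for each of
25165824 pages, which cannot fit into 896 MB of lowmem.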

Comment 4 Michal Schmidt 2007-10-03 09:15:10 UTC
2.6.21-39.el5rtvanilla boots fine on the machine. However, /proc/meminfo shows:
LowTotal:       125196 kB
LowFree:         57704 kB
So it has only 122 MB of low memory.

The same kernel when booted with mem=16G:
LowTotal:       780556 kB
LowFree:        738916 kB
i.e. a much healthier 762 MB of low memory.

2.6.21-39.el5rt is also able to boot with mem=16G and has:
LowTotal:       677680 kB
LowFree:        626748 kB
So it has 661 MB of low memory. That's less than the non-rt kernel, but usable.

Why is there a difference? The memory map takes more space on -rt, because
struct page is bigger (56 bytes vs. 32 bytes). The difference comes from a
spinlock contained in struct page:
#if NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS
            spinlock_t ptl;
#endif

On non-rt, spinlock_t is a raw_spinlock_t which is 8 bytes. On -rt, it is a
mutex and takes 32 bytes.
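
To put numbers on this, the memory map size can be estimated from the two struct
page sizes quoted above. This is a back-of-the-envelope sketch in Python that
assumes 4 KiB pages and one struct page per page of RAM; the helper name
mem_map_mib is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages on i386
GIB = 1 << 30
MIB = 1 << 20

STRUCT_PAGE_RT = 56     # bytes, -rt kernel (from the comment above)
STRUCT_PAGE_NONRT = 32  # bytes, non-rt kernel (from the comment above)


def mem_map_mib(ram_bytes, struct_page_size):
    """Estimate the memory map size in MiB for a given amount of RAM."""
    return (ram_bytes // PAGE_SIZE) * struct_page_size / MIB


# With mem=16G: how much more lowmem does the -rt memory map consume?
extra_16g = mem_map_mib(16 * GIB, STRUCT_PAGE_RT) - mem_map_mib(16 * GIB, STRUCT_PAGE_NONRT)

# Observed LowTotal difference between the vanilla and -rt kernels at mem=16G.
observed_mib = (780556 - 677680) / 1024

print(f"estimated extra mem_map on -rt at 16G: {extra_16g} MiB")
print(f"observed LowTotal difference:          {observed_mib:.1f} MiB")
print(f"non-rt mem_map at 96G:                 {mem_map_mib(96 * GIB, STRUCT_PAGE_NONRT)} MiB")
print(f"-rt mem_map at 96G:                    {mem_map_mib(96 * GIB, STRUCT_PAGE_RT)} MiB")
```

The estimate of 96 MiB is close to the roughly 100 MB difference seen in
LowTotal. At 96 GB the non-rt map is about 768 MiB, which still fits under
896 MB and leaves little lowmem, while the -rt map is about 1344 MiB and does
not fit at all.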

Comment 5 Tim Burke 2007-10-04 15:13:54 UTC
Are you suggesting that we should have a documented maximum memory configuration
for x86?  If so, what max limit do you recommend?


Comment 6 Michal Schmidt 2007-10-05 06:59:33 UTC
We already have the maximum documented. The limit for RHEL-RT should be the same
as RHEL5's. For RHEL5 we officially support up to 16GB RAM on x86. RHEL3 and
RHEL4 supported up to 64GB, but only with the -hugemem kernel which was dropped
in RHEL5. Anyone with more than 16GB RAM should be running a 64-bit system now.

That said, I believe the kernel should protect itself from the crash and
artificially limit the amount of RAM it detects. I'll make a patch.
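
The idea of limiting detected RAM can be sketched as a simple clamp on the
highest page frame number. This is a Python illustration of the concept only,
not the actual patch (which is not shown in this report); clamp_max_pfn and the
constants are hypothetical names, and 4 KiB pages are assumed.

```python
PAGE_SIZE = 4096                # assumption: 4 KiB pages on i386
MAX_SUPPORTED_RAM = 16 * 2**30  # the 16GB x86 limit mentioned above


def clamp_max_pfn(detected_max_pfn, limit_bytes=MAX_SUPPORTED_RAM):
    """Hypothetical helper: cap the highest usable PFN at the supported limit."""
    return min(detected_max_pfn, limit_bytes // PAGE_SIZE)


# The machine from this report reports pages up to PFN 25165823 (about 96 GB).
print(clamp_max_pfn(25165823))
# A machine below the limit is left untouched.
print(clamp_max_pfn(1000000))
```

With such a cap, the kernel would behave as if booted with mem=16G on this
machine instead of panicking.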

Comment 7 Tim Burke 2007-10-23 21:23:08 UTC
Reiterated x86 max mem config in a release note. Closing as notabug.

Comment 12 Michal Schmidt 2007-10-26 09:22:38 UTC
Fixed by limit-i386-ram-to-16gb.patch in 2.6.21-46.el5rt.

