Bug 247963

Summary: Not all memory visible to system
Product: Red Hat Enterprise Linux 4 Reporter: Tomasz Jaszowski <tjaszowski>
Component: kernel Assignee: Neil Horman <nhorman>
Status: CLOSED CANTFIX QA Contact: Martin Jenner <mjenner>
Severity: high Docs Contact:
Priority: low    
Version: 4.4   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-10-04 17:14:55 UTC Type: ---

Description Tomasz Jaszowski 2007-07-12 10:43:56 UTC
Description of problem:
We have a few Dell PowerEdge servers with 4GB of RAM, but the OS can't use all
of this memory.

PowerEdge 6650
--------------------
6650_1:~> free
             total
Mem:       3375272

6650_1:~> lspci
00:00.0 Host bridge: Broadcom CMIC-HE (rev 22)
00:00.1 Host bridge: Broadcom CMIC-HE
00:00.2 Host bridge: Broadcom CMIC-HE
00:00.3 Host bridge: Broadcom CMIC-HE
00:04.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:05.0 Class ff00: Dell Remote Access Card III
00:05.1 Class ff00: Dell Remote Access Card III
00:05.2 Class ff00: Dell Remote Access Card III: BMC/SMIC device not present
00:0f.0 Host bridge: Broadcom CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: Broadcom CSB5 IDE Controller (rev 93)
00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 05)
00:0f.3 ISA bridge: Broadcom CSB5 LPC bridge
00:10.0 Host bridge: Broadcom CIOB30 (rev 03)
00:10.2 Host bridge: Broadcom CIOB30 (rev 03)
00:11.0 Host bridge: Broadcom CIOB30 (rev 03)
00:11.2 Host bridge: Broadcom CIOB30 (rev 03)
00:12.0 Host bridge: Broadcom CIOB30 (rev 03)
00:12.2 Host bridge: Broadcom CIOB30 (rev 03)
03:01.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
04:00.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
04:01.0 SCSI storage controller: QLogic Corp. ISP12160 Dual Channel Ultra3 SCSI Processor (rev 06)
05:00.0 RAID bus controller: American Megatrends Inc. MegaRAID (rev 20)
0a:01.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5700 Gigabit Ethernet (rev 14)
0a:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5700 Gigabit Ethernet (rev 14)
15:01.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
16:00.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
16:01.0 SCSI storage controller: QLogic Corp. ISP12160 Dual Channel Ultra3 SCSI Processor (rev 06)
17:00.0 RAID bus controller: American Megatrends Inc. MegaRAID (rev 20)


--------------------
6650_2 :~> free
             total
Mem:       3635360

6650_2:~> lspci
00:00.0 Host bridge: Broadcom CMIC-HE (rev 22)
00:00.1 Host bridge: Broadcom CMIC-HE
00:00.2 Host bridge: Broadcom CMIC-HE
00:00.3 Host bridge: Broadcom CMIC-HE
00:04.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:05.0 Class ff00: Dell Remote Access Card III
00:05.1 Class ff00: Dell Remote Access Card III
00:05.2 Class ff00: Dell Remote Access Card III: BMC/SMIC device not present
00:0f.0 Host bridge: Broadcom CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: Broadcom CSB5 IDE Controller (rev 93)
00:0f.3 ISA bridge: Broadcom CSB5 LPC bridge
00:10.0 Host bridge: Broadcom CIOB30 (rev 03)
00:10.2 Host bridge: Broadcom CIOB30 (rev 03)
00:11.0 Host bridge: Broadcom CIOB30 (rev 03)
00:11.2 Host bridge: Broadcom CIOB30 (rev 03)
00:12.0 Host bridge: Broadcom CIOB30 (rev 03)
00:12.2 Host bridge: Broadcom CIOB30 (rev 03)
03:01.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
04:00.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
04:01.0 SCSI storage controller: QLogic Corp. ISP12160 Dual Channel Ultra3 SCSI Processor (rev 06)
05:00.0 RAID bus controller: American Megatrends Inc. MegaRAID (rev 20)
0a:01.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5700 Gigabit Ethernet (rev 14)
0a:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5700 Gigabit Ethernet (rev 14)


PowerEdge 2650
--------------------
:~> free
             total
Mem:       3895444

:~> lspci
00:00.0 Host bridge: Broadcom CMIC-WS Host Bridge (GC-LE chipset) (rev 13)
00:00.1 Host bridge: Broadcom CMIC-WS Host Bridge (GC-LE chipset)
00:00.2 Host bridge: Broadcom CMIC-LE
00:04.0 Class ff00: Dell Embedded Remote Access or ERA/O
00:04.1 Class ff00: Dell Remote Access Card III
00:04.2 Class ff00: Dell Embedded Remote Access: BMC/SMIC device
00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 Host bridge: Broadcom CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: Broadcom CSB5 IDE Controller (rev 93)
00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 05)
00:0f.3 ISA bridge: Broadcom CSB5 LPC bridge
00:10.0 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 03)
00:10.2 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 03)
00:11.0 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 03)
00:11.2 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 03)
03:06.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
03:08.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
04:08.0 PCI bridge: Intel Corporation 80303 I/O Processor PCI-to-PCI Bridge (rev 01)
04:08.1 RAID bus controller: Dell PowerEdge Expandable RAID Controller 3/Di (rev 01)



Version-Release number of selected component (if applicable):
All servers run RHEL 4 Update 4 (RH4U4) with
2.6.9-42.ELsmp #1 SMP Sat Aug 12 09:39:11 CDT 2006 i686 i686 i386 GNU/Linux

How reproducible:
Use a Dell PowerEdge 2650 or 6650 with exactly 4GB of RAM.

Actual results:
Not all of the memory can be used by the system.

Expected results:
All memory should be accessible to the system.

Additional info:
When I removed some unused PCI-X cards, about 256MB of memory was freed (see
the difference between 6650_1 and 6650_2).
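
The difference can be checked directly from the `free` totals quoted above. A small Python sketch of that arithmetic follows. The helper names are made up for illustration, and the "missing" figure is only approximate, because the total reported by `free` also excludes memory the kernel reserves for itself.

```python
# `free` totals in KiB, copied from the listings above.
TOTALS_KIB = {
    "6650_1": 3375272,
    "6650_2": 3635360,
    "2650": 3895444,
}

def kib_to_mib(kib):
    """Convert KiB to MiB."""
    return kib / 1024

def missing_kib(total_kib, installed_gib=4):
    """Approximate shortfall against the nominally installed RAM (assumed 4 GiB)."""
    return installed_gib * 1024 * 1024 - total_kib

# Memory gained on 6650_2 after removing the unused PCI-X cards.
gained_kib = TOTALS_KIB["6650_2"] - TOTALS_KIB["6650_1"]
print(f"gained after removing cards: {kib_to_mib(gained_kib):.0f} MiB")

for host, total in TOTALS_KIB.items():
    print(f"{host}: roughly {kib_to_mib(missing_kib(total)):.0f} MiB not visible")
```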

I've found some hardware-related info:
Server Products
Not All Memory is Available after Installing 4GB or More of System Memory
(http://support.intel.com/support/motherboards/server/sb/CS-010458.htm)

Not All Physical Memory May Be Reported By The Operating System On Certain HP
ProLiant Servers 
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=PSD_EL041214_CW01

However, the only info I've found on Dell's sites suggests using the /3GB and
PAE switches with Windows... :)

Comment 1 Tomasz Jaszowski 2007-07-17 13:58:48 UTC
anything?

Comment 2 Tomasz Jaszowski 2007-07-23 13:21:37 UTC
It's becoming critical... We tried adding more memory, but some of it is still
missing (256MB to over 700MB, depending on the machine...).

Comment 3 Neil Horman 2007-08-08 15:16:56 UTC
Sounds like the common "PCI/video memory hole" not being remapped by the BIOS.
There is likely nothing that can be done about it from the OS standpoint. Please
provide the e820 map and the output of /proc/iomem, and we can confirm. Thanks!
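
For reference, the size of such a hole can be read off /proc/iomem as the gap between the top of the last "System RAM" range below 4GB and the 4GB boundary itself. The following is a rough Python sketch of that calculation, not an official tool. The function names are hypothetical, and the sample text is a shortened excerpt in the format of the PE6650 listing posted in comment 5.

```python
import re

FOUR_GIB = 1 << 32

# Top-level /proc/iomem entries start in column 0; nested entries are indented
# and therefore do not match this pattern.
IOMEM_LINE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+) : (.+)$")

def low_ram_top(iomem_text):
    """First address past the highest top-level 'System RAM' range below 4 GiB."""
    top = 0
    for line in iomem_text.splitlines():
        m = IOMEM_LINE.match(line)
        if m and m.group(3).strip() == "System RAM":
            end = int(m.group(2), 16) + 1  # ranges in /proc/iomem are inclusive
            if end <= FOUR_GIB:
                top = max(top, end)
    return top

def hole_below_4g(iomem_text):
    """Bytes of address space between the top of low RAM and 4 GiB."""
    return FOUR_GIB - low_ram_top(iomem_text)

# Shortened excerpt in the format of the PE6650 output (comment 5).
SAMPLE = """\
00000000-0009ffff : System RAM
000f0000-000fffff : System ROM
00100000-cffdffff : System RAM
  00100000-002d53e3 : Kernel code
cffe0000-cffefbff : ACPI Tables
e0000000-e7ffffff : PCI Bus #16
fff80000-ffffffff : reserved
"""

print(hole_below_4g(SAMPLE) / 2**20, "MiB of address space below 4 GiB not backed by RAM")
```

On a live system the same functions could be fed the contents of /proc/iomem instead of the sample string.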

Comment 4 Neil Horman 2007-08-29 14:33:52 UTC
ping.  Any update here?

Comment 5 Tomasz Jaszowski 2007-09-18 10:29:07 UTC
PE6650:
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000cffe0000 (usable)
 BIOS-e820: 00000000cffe0000 - 00000000cffefc00 (ACPI data)
 BIOS-e820: 00000000cffefc00 - 00000000cffff000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
2431MB HIGHMEM available.
896MB LOWMEM available.

:~> cat /proc/iomem
00000000-0009ffff : System RAM
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000c8000-000c93ff : Adapter ROM
000c9800-000cabff : Adapter ROM
000cb000-000cb5ff : Adapter ROM
000f0000-000fffff : System ROM
00100000-cffdffff : System RAM
  00100000-002d53e3 : Kernel code
  002d53e4-0039337f : Kernel data
cffe0000-cffefbff : ACPI Tables
cffefc00-cfffefff : reserved
e0000000-e7ffffff : PCI Bus #16
  e0000000-e7ffffff : PCI Bus #17
    e0000000-e7ffffff : 0000:17:00.0
      e0000000-e7ffffff : MegaRAID: LSI Logic Corporation
ed100000-ed3fffff : PCI Bus #16
  ed1ff000-ed1fffff : 0000:16:01.0
  ed300000-ed3fffff : PCI Bus #17
eff00000-eff0ffff : 0000:0a:02.0
  eff00000-eff0ffff : tg3
eff10000-eff1ffff : 0000:0a:01.0
  eff10000-eff1ffff : tg3
f0000000-f7ffffff : PCI Bus #04
  f0000000-f7ffffff : PCI Bus #05
    f0000000-f7ffffff : 0000:05:00.0
      f0000000-f7ffffff : MegaRAID: LSI Logic Corporation
fcd00000-fcffffff : PCI Bus #04
  fcdff000-fcdfffff : 0000:04:01.0
  fcf00000-fcffffff : PCI Bus #05
fd000000-fdffffff : 0000:00:04.0
fe100000-fe100fff : 0000:00:0f.2
  fe100000-fe100fff : ohci_hcd
fe101000-fe101fff : 0000:00:05.1
fe102000-fe102fff : 0000:00:04.0
feb00000-feb7ffff : 0000:00:05.1
feb80000-feb80fff : 0000:00:05.0
fec00000-fec0ffff : reserved
fee00000-fee0ffff : reserved
fff80000-ffffffff : reserved

PE2650:
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000effe0000 (usable)
 BIOS-e820: 00000000effe0000 - 00000000effefc00 (ACPI data)
 BIOS-e820: 00000000effefc00 - 00000000effff000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
2943MB HIGHMEM available.
896MB LOWMEM available.

:~> cat /proc/iomem
00000000-0009ffff : System RAM
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000c8000-000cbfff : Adapter ROM
000cc000-000cc5ff : Adapter ROM
000f0000-000fffff : System ROM
00100000-effdffff : System RAM
  00100000-002d53e3 : Kernel code
  002d53e4-0039337f : Kernel data
effe0000-effefbff : ACPI Tables
effefc00-efffefff : reserved
f0000000-f7ffffff : 0000:04:08.1
fcf00000-fcf0ffff : 0000:03:08.0
  fcf00000-fcf0ffff : tg3
fcf10000-fcf1ffff : 0000:03:06.0
  fcf10000-fcf1ffff : tg3
fd000000-fdffffff : 0000:00:0e.0
fe100000-fe100fff : 0000:00:0e.0
fe101000-fe101fff : 0000:00:04.1
feb00000-feb7ffff : 0000:00:04.1
feb80000-feb80fff : 0000:00:04.0
fec00000-fec0ffff : reserved
fee00000-fee0ffff : reserved
ff000000-ff000fff : 0000:00:0f.2
  ff000000-ff000fff : ohci_hcd
fff80000-ffffffff : reserved


PE1850 (all mem visible)

BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000bffc0000 (usable)
 BIOS-e820: 00000000bffc0000 - 00000000bffcfc00 (ACPI data)
 BIOS-e820: 00000000bffcfc00 - 00000000bffff000 (reserved)
 BIOS-e820: 00000000e0000000 - 00000000fec90000 (reserved)
 BIOS-e820: 00000000fed00000 - 00000000fed00400 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
4224MB HIGHMEM available.
896MB LOWMEM available.

:~> cd /opt/
Tue Sep 18 12:26:32 CEST 2007
grapp1:root:/opt> cat /proc/iomem
00000000-0009ffff : System RAM
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000cb000-000cbfff : Adapter ROM
000cc000-000ccfff : Adapter ROM
000cd000-000d0bff : Adapter ROM
000d1000-000d31ff : Adapter ROM
000d3800-000d3dff : Adapter ROM
000f0000-000fffff : System ROM
00100000-bffbffff : System RAM
  00100000-002d53e3 : Kernel code
  002d53e4-0039337f : Kernel data
bffc0000-bffcfbff : ACPI Tables
bffcfc00-bfffefff : reserved
c0000000-c00003ff : 0000:00:1f.1
c8000000-cfffffff : 0000:09:0d.0
d7f00000-d7f7ffff : 0000:09:05.1
d7fff000-d7ffffff : 0000:09:05.0
d8000000-d80fffff : PCI Bus #01
  d8000000-d80fffff : PCI Bus #02
    d80f0000-d80fffff : 0000:02:0e.0
      d80f0000-d80fffff : MegaRAID: LSI Logic Corporation
df5e0000-df5effff : 0000:09:0d.0
df5fec00-df5fecff : 0000:09:06.0
  df5fec00-df5fecff : SiI680
df5ff000-df5fffff : 0000:09:05.1
df700000-dfbfffff : PCI Bus #05
  df800000-df9fffff : PCI Bus #07
    df8e0000-df8fffff : 0000:07:08.0
      df8e0000-df8fffff : e1000
  dfa00000-dfbfffff : PCI Bus #06
    dfae0000-dfafffff : 0000:06:07.0
      dfae0000-dfafffff : e1000
dfc00000-dfefffff : PCI Bus #01
  dfd00000-dfefffff : PCI Bus #02
    dfde0000-dfdfffff : 0000:02:0e.0
      dfde0000-dfdfffff : MegaRAID: LSI Logic Corporation
dff00000-dff003ff : 0000:00:1d.7
  dff00000-dff003ff : ehci_hcd
e0000000-fec8ffff : reserved
fed00000-fed003ff : reserved
fee00000-fee0ffff : reserved
ffb00000-ffffffff : reserved



Comment 6 Neil Horman 2007-09-18 10:41:24 UTC
Ahh, I was close.  It does look like your BIOS has managed to remap the missing
RAM from where your BIOS hole is located; however, it has mapped it above the
4GB boundary, so your kernel can't make use of it.  You should be able to get it
back by using the PAE kernel variant, which can utilize memory above 4GB.
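
Whether a given e820 map exposes RAM above the 4GB boundary can be checked by summing the usable ranges on each side of it. Below is a minimal Python sketch, assuming the `BIOS-e820:` line format pasted in comment 5. `usable_split` is a hypothetical helper name, and the two sample maps contain only the usable lines from the PE6650 and PE1850 maps. Of the maps in comment 5, only the PE1850 one lists a usable range starting at 0x100000000.

```python
import re

FOUR_GIB = 1 << 32

E820_LINE = re.compile(
    r"BIOS-e820:\s+([0-9a-fA-F]+)\s+-\s+([0-9a-fA-F]+)\s+\(([^)]+)\)"
)

def usable_split(e820_text):
    """Return (bytes usable below 4 GiB, bytes usable at or above 4 GiB)."""
    below = above = 0
    for start, end, kind in E820_LINE.findall(e820_text):
        if kind != "usable":
            continue
        start, end = int(start, 16), int(end, 16)  # end is exclusive here
        # Clamp each range to either side of the boundary so that a range
        # straddling 4 GiB is counted partly on each side.
        below += min(end, FOUR_GIB) - min(start, FOUR_GIB)
        above += max(end, FOUR_GIB) - max(start, FOUR_GIB)
    return below, above

# Usable lines from the PE6650 map in comment 5.
PE6650 = """\
 BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000cffe0000 (usable)
"""

# Usable lines from the PE1850 map in comment 5.
PE1850 = """\
 BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000bffc0000 (usable)
 BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
"""

for name, text in (("PE6650", PE6650), ("PE1850", PE1850)):
    below, above = usable_split(text)
    print(f"{name}: {below / 2**20:.1f} MiB below 4 GiB, {above / 2**20:.1f} MiB above")
```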

Comment 7 Tomasz Jaszowski 2007-09-20 11:07:19 UTC
So I'll have to try the highmem kernel package.

Comment 8 Neil Horman 2007-09-20 11:56:03 UTC
yeah, the hugemem kernel should have the appropriate PAE configuration set, IIRC.

Comment 9 Tomasz Jaszowski 2007-09-25 05:09:36 UTC
Hmm, strange thing. During some maintenance, my colleague added some RAM to the
PE2650, and the system saw 5.5GB instead of 6GB. He didn't change the kernel or
anything else; he just put in an additional 2GB of memory, booted the machine,
and checked... Unfortunately, he forgot to save the e820 map and the output of
/proc/iomem :/

My question is: if the system was able to see 5.5GB of memory using the standard
smp kernel (not hugemem), why does it have a problem addressing 4GB (or the full
6GB)?

Comment 10 Neil Horman 2007-09-25 11:13:11 UTC
Not so strange at all.  The system BIOS remaps the way it presents memory to the
OS on boot-up.  Some BIOSes are intelligent about where they place memory and
whether or not they just let it get lost underneath I/O resources.  Adding more
memory likely caused the BIOS to place more memory in space accessible to the OS.

This does unfortunately suggest something troubling.  If you can see 5.5GB of
RAM, you may already be running a kernel that has PAE enabled (I'll need to go
check).  If that's the case, then your memory visibility lies exclusively in the
hands of the BIOS at this point, and there won't be anything we can do to claim
more than it offers via the e820 map.
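
One way to check for PAE is to look at the kernel build configuration for `CONFIG_X86_PAE=y` or `CONFIG_HIGHMEM64G=y`. Here is a rough Python sketch of that check. The helper name is made up, and the config file location (`/boot/config-<release>`) is an assumption about where the distribution installs it.

```python
import platform

def kernel_config_has_pae(config_text):
    """True if the build config enables PAE, directly or via 64GB highmem."""
    enabled = {
        line.strip()
        for line in config_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    return "CONFIG_X86_PAE=y" in enabled or "CONFIG_HIGHMEM64G=y" in enabled

if __name__ == "__main__":
    # Assumed location of the running kernel's build config.
    path = f"/boot/config-{platform.release()}"
    try:
        with open(path) as fh:
            print(path, "PAE enabled:", kernel_config_has_pae(fh.read()))
    except OSError as exc:
        print("could not read", path, "-", exc)
```

Note that the `pae` flag in /proc/cpuinfo only says the CPU supports PAE, not that the running kernel was built to use it.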


Comment 11 Tomasz Jaszowski 2007-10-04 15:41:25 UTC
We tried adding memory to this server, and the count was 5.5GB instead of 6GB.

root:> free
             total       used       free     shared    buffers     cached
Mem:       3375272    3359380      15892          0      11104    2836812
-/+ buffers/cache:     511464    2863808
Swap:      2008084     695516    1312568
Thu Oct  4 17:34:05 CEST 2007
root:> uname -a
Linux huge-db 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386 GNU/Linux


I think the smp kernel has PAE included...

Comment 12 Tomasz Jaszowski 2007-10-04 15:44:42 UTC
A little more info about the kernel:

Linux version 2.6.9-42.ELsmp (bhcompile.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-2)) #1 SMP Wed Jul 12 23:27:17 EDT 2006


Comment 13 Neil Horman 2007-10-04 17:14:55 UTC
Yeah, it would seem that this isn't a PAE issue, but rather a BIOS mapping
problem.  This is going to have to be a CANTFIX.  The best advice I can offer is
to go to the system vendor's web site and look for a BIOS update that improves
the way the BIOS maps physical RAM and I/O space.  Sorry.