Bug 239931 - 4GB memory problems
Summary: 4GB memory problems
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-05-12 17:52 UTC by Jussi Torhonen
Modified: 2007-11-30 22:12 UTC
CC List: 1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-06-07 18:22:57 UTC
Type: ---
Embargoed:


Attachments
kernel bootup messages when using 4GB memory (37.42 KB, text/plain), 2007-05-12 17:52 UTC, Jussi Torhonen
kernel bootup messages when using 2GB memory (38.69 KB, text/plain), 2007-05-12 17:55 UTC, Jussi Torhonen
/proc/cpuinfo when booted with 4GB (1.23 KB, text/plain), 2007-05-13 07:29 UTC, Jussi Torhonen
dmesg when booting with 4GB (27.35 KB, text/plain), 2007-05-13 07:29 UTC, Jussi Torhonen
free output when booting with 4GB (230 bytes, text/plain), 2007-05-13 07:30 UTC, Jussi Torhonen
/proc/interrupts when booting with 4GB (941 bytes, text/plain), 2007-05-13 07:30 UTC, Jussi Torhonen
/proc/iomem when booting with 4GB (1.63 KB, text/plain), 2007-05-13 07:31 UTC, Jussi Torhonen
lspci -vv when booting with 4GB (20.74 KB, text/plain), 2007-05-13 07:31 UTC, Jussi Torhonen
/proc/meminfo when booting with 4GB (725 bytes, text/plain), 2007-05-13 07:32 UTC, Jussi Torhonen

Description Jussi Torhonen 2007-05-12 17:52:50 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. boot the computer up with 2GB RAM with no problems
2. add memory to bring the total up to 4GB
3. see only part of the memory detected
  
Actual results:
Originally had 2GB RAM. After adding 2GB more memory, the computer boots up
extremely slowly. With kernel parameter 'mem=3G' or 'mem=4G' the computer
is fast, but the kernel detects only 3GB of the total memory.

Expected results:
Computer should detect 4GB RAM without such problems.

Additional info:
Motherboard Intel DG965WH with latest available BIOS 1679 [MQ96510J.86A]. Memory
originally 2 x 1GB DDR2 800MHz by Kingston (Valueram kit) and added similar 2 x
1GB extra kit to get 4GB. SATA connected CD/DVD drive. Two SATA-II hard disks by
Seagate. Distro FC7 test4 with all yum delivered updates installed.

http://www.intel.com/products/motherboard/dg965wh/
http://support.intel.com/support/motherboards/desktop/DG965WH/

Comment 1 Jussi Torhonen 2007-05-12 17:52:50 UTC
Created attachment 154587
kernel bootup messages when using 4GB memory

Comment 2 Jussi Torhonen 2007-05-12 17:55:20 UTC
Created attachment 154588
kernel bootup messages when using 2GB memory

Here's the original state. Computer detects 2GB RAM correctly without any
kernel parameters, boots up and works fast.

Comment 3 Jussi Torhonen 2007-05-12 18:00:03 UTC
So the distro is FC7 test 4 (x86_64) and the CPU is an Intel Core 2 Duo E6600.
The two SATA-II hard disks are configured as a software RAID1 mirror set.

Comment 4 Jussi Torhonen 2007-05-13 07:29:06 UTC
Created attachment 154601
/proc/cpuinfo when booted with 4GB

Comment 5 Jussi Torhonen 2007-05-13 07:29:40 UTC
Created attachment 154602
dmesg when booting with 4GB

Comment 6 Jussi Torhonen 2007-05-13 07:30:12 UTC
Created attachment 154603
free output when booting with 4GB

Comment 7 Jussi Torhonen 2007-05-13 07:30:37 UTC
Created attachment 154604
/proc/interrupts when booting with 4GB

Comment 8 Jussi Torhonen 2007-05-13 07:31:06 UTC
Created attachment 154605
/proc/iomem when booting with 4GB

Comment 9 Jussi Torhonen 2007-05-13 07:31:32 UTC
Created attachment 154606
lspci -vv when booting with 4GB

Comment 10 Jussi Torhonen 2007-05-13 07:32:05 UTC
Created attachment 154607
/proc/meminfo when booting with 4GB

Comment 11 Jussi Torhonen 2007-05-13 07:40:11 UTC
Added several attachments above. So, when booting the system with 4GB of RAM,
the kernel seems to detect it: free lists it all and /proc/meminfo looks ok. But
after the boot the system works extremely slowly. Booting the system up to single
user mode takes 5-10 minutes instead of the typical 30 seconds or so. As a result
the system is completely useless.

Looking forward to good tips to make it usable again. As said, the system runs
as fast as you can expect from the hardware when booted either with 2GB of RAM,
or with 4GB and kernel parameter 'mem=3G', 'mem=3584M', 'mem=4G' or 'mem=4096M',
although in the latter case only about 3GB of memory is detected. But booting
with 4GB RAM without any extra kernel parameters makes the system too slow to
do anything.


Comment 12 Chuck Ebbert 2007-05-14 22:04:21 UTC
The system BIOS should have a "memory remap" option.

Comment 13 Jussi Torhonen 2007-05-18 05:20:26 UTC
If you look at dmesg with 4GB, you see that the BIOS remaps one segment to
0x100000000 - 0x12c000000. It also says

Memory: 3942884k/4915200k available (2465k kernel code, 175460k reserved, 1445k
data, 332k init)

But I don't have 4915200k (4800M), just 4GB (4096M, 4194304k). Sure, that
mapped segment ends at 0x12c000000 (@4915200k, @4800M). Does '3942884k/4915200k
available' mean that the kernel thinks there's 4915200k of RAM, or that detected
RAM ends at 4915200k?
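
A quick check of the arithmetic in this comment, as a minimal sketch in plain
shell arithmetic (the helper names to_kib and to_mib are made up for
illustration). The end address of the remapped segment converts to exactly the
second figure in the "Memory:" line, which suggests that figure is the top of
the address range rather than the amount of installed RAM.

```sh
# Convert a byte address or size (hex or decimal) to KiB and MiB.
to_kib() { echo $(( $1 / 1024 )); }
to_mib() { echo $(( $1 / 1024 / 1024 )); }

end_of_remap=0x12c000000               # end of the last e820 "usable" range
installed_kib=$(( 4 * 1024 * 1024 ))   # 4 GiB of installed RAM, in KiB

echo "end of remapped segment: $(to_kib "$end_of_remap")k ($(to_mib "$end_of_remap")M)"
echo "installed RAM:           ${installed_kib}k"
```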

So the kernel detects 4GB, but why does the system run so slowly that I cannot
do anything with it?

If I use 'mem=4G', dmesg says

Memory: 3328924k/3398656k available (2328k kernel code, 68520k reserved, 1396k
data, 320k init)

That 3398656k equals 3319M or 0xcf700000, which is the end address of the last
usable memory segment below 4G. So the kernel ignores the remapped segment
completely.

 BIOS-e820: 0000000000000000 - 000000000008f000 (usable)
 BIOS-e820: 000000000008f000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cf58f000 (usable)
 BIOS-e820: 00000000cf58f000 - 00000000cf59c000 (reserved)
 BIOS-e820: 00000000cf59c000 - 00000000cf64d000 (usable)
 BIOS-e820: 00000000cf64d000 - 00000000cf6a5000 (ACPI NVS)
 BIOS-e820: 00000000cf6a5000 - 00000000cf6a8000 (ACPI data)
 BIOS-e820: 00000000cf6a8000 - 00000000cf6ef000 (ACPI NVS)
 BIOS-e820: 00000000cf6ef000 - 00000000cf6f1000 (ACPI data)
 BIOS-e820: 00000000cf6f1000 - 00000000cf6f2000 (usable)
 BIOS-e820: 00000000cf6f2000 - 00000000cf6ff000 (ACPI data)
 BIOS-e820: 00000000cf6ff000 - 00000000cf700000 (usable)
 BIOS-e820: 00000000cf700000 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 000000012c000000 (usable)
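
To see how much RAM the BIOS actually reports as usable, the ranges in the map
above can be summed. This is a minimal sketch, assuming the listing format
shown in this comment; the function name e820_usable_kib is made up for
illustration.

```sh
# Sum the sizes (in KiB) of all "(usable)" ranges in a BIOS-e820 listing
# read from stdin. Lines look like: " BIOS-e820: <start> - <end> (<type>)".
e820_usable_kib() {
    total=0
    while read -r tag start dash end type; do
        [ "$type" = "(usable)" ] || continue
        total=$(( $total + (0x$end - 0x$start) / 1024 ))
    done
    echo "$total"
}

e820_usable_kib <<'EOF'
 BIOS-e820: 0000000000000000 - 000000000008f000 (usable)
 BIOS-e820: 000000000008f000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cf58f000 (usable)
 BIOS-e820: 00000000cf58f000 - 00000000cf59c000 (reserved)
 BIOS-e820: 00000000cf59c000 - 00000000cf64d000 (usable)
 BIOS-e820: 00000000cf64d000 - 00000000cf6a5000 (ACPI NVS)
 BIOS-e820: 00000000cf6a5000 - 00000000cf6a8000 (ACPI data)
 BIOS-e820: 00000000cf6a8000 - 00000000cf6ef000 (ACPI NVS)
 BIOS-e820: 00000000cf6ef000 - 00000000cf6f1000 (ACPI data)
 BIOS-e820: 00000000cf6f1000 - 00000000cf6f2000 (usable)
 BIOS-e820: 00000000cf6f2000 - 00000000cf6ff000 (ACPI data)
 BIOS-e820: 00000000cf6ff000 - 00000000cf700000 (usable)
 BIOS-e820: 00000000cf700000 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 000000012c000000 (usable)
EOF
```

For this map the sum is 4118340 KiB (roughly 4022 MiB), of which 720896 KiB
(704 MiB) is the remapped segment above 4 GiB.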


Currently using kernel 2.6.21-1.3163.fc7

How can I make use of the last usable segment, 0x100000000-0x12c000000?


Comment 14 Jussi Torhonen 2007-05-18 15:45:12 UTC
Here's what hdparm -tT says about the disk i/o speed when booting with 4GB RAM
and 'mem=4096M' kernel parameter:

/dev/sda:
 Timing cached reads:   8160 MB in  2.00 seconds = 4085.10 MB/sec
 Timing buffered disk reads:  212 MB in  3.01 seconds =  70.43 MB/sec

/dev/sdb:
 Timing cached reads:   8220 MB in  2.00 seconds = 4115.00 MB/sec
 Timing buffered disk reads:  212 MB in  3.02 seconds =  70.27 MB/sec


Here's the speed when booting with 4GB RAM and without kernel parameters, when
the system is very slow and even booting up to single user mode takes
about 10 minutes:

/dev/sda:
 Timing cached reads:   114 MB in  2.01 seconds =  56.81 MB/sec
 Timing buffered disk reads:   52 MB in  3.04 seconds =  17.12 MB/sec

/dev/sdb:
 Timing cached reads:   114 MB in  2.02 seconds =  56.56 MB/sec
 Timing buffered disk reads:   52 MB in  3.04 seconds =  17.11 MB/sec
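
The size of the slowdown can be put in numbers with a small sketch (the helper
name slowdown is made up for illustration; the figures are the /dev/sda values
from the two runs above). Per the hdparm man page, the cached-read timing
reflects processor, cache and memory throughput rather than the disk, so a
large drop there points at memory access and not at the drives.

```sh
# Ratio of two throughput figures, rounded to one decimal place.
slowdown() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", fast / slow }'
}

echo "cached reads:        $(slowdown 4085.10 56.81)x slower"
echo "buffered disk reads: $(slowdown 70.43 17.12)x slower"
```

Cached reads come out about 72 times slower, while buffered disk reads are
only about 4 times slower.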

MoBo is an Intel DG965WH with the Intel G965 Express Chipset: Intel 82G965
Graphics and Memory Controller Hub (GMCH) and Intel 82801 I/O Controller Hub
(ICH8). Looks like an incompatibility issue with the kernel when the MoBo has
the full 4GB RAM installed and the MoBo BIOS maps the topmost memory segment at
0x100000000-0x12c000000?

Using kernel 2.6.21-1.3167.fc7 right now for the tests.

Just ask, if you need any further debug information. I'd really like to see that
unused 704MB segment in use.


Comment 15 Chuck Ebbert 2007-05-18 22:55:38 UTC
What is in the file /proc/mtrr in the two different cases?

Comment 16 Jussi Torhonen 2007-05-19 06:37:57 UTC
With 4GB installed, /proc/mtrr is identical whether I boot up to single user
mode without any kernel parameters or with 'mem=4096M':

reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg02: base=0xc0000000 (3072MB), size= 256MB: write-back, count=1
reg03: base=0xcf800000 (3320MB), size=   8MB: uncachable, count=1
reg04: base=0xcf700000 (3319MB), size=   1MB: uncachable, count=1
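
One way to read this listing is to ask which MTRRs cover a given physical
address. The sketch below does that for the five registers above; the function
name mtrr_types_for is made up for illustration, and the parsing assumes the
/proc/mtrr line format shown in this comment. An address that no register
covers falls back to the default memory type, which the BIOS usually sets to
uncachable; that is an assumption here, not something the listing itself shows.

```sh
# Print the types of all MTRRs covering address $1, or "uncovered" if none.
# Reads a /proc/mtrr-style listing from stdin.
mtrr_types_for() {
    addr=$(( $1 ))
    # Reduce each line to "base size_in_MB type".
    sed -n 's/^reg[0-9]*: base=\(0x[0-9a-f]*\) (.*), size= *\([0-9]*\)MB: \([a-z-]*\),.*/\1 \2 \3/p' | {
        found=
        while read -r base size_mb type; do
            start=$(( $base ))
            end=$(( $start + $size_mb * 1048576 ))
            if [ "$addr" -ge "$start" ] && [ "$addr" -lt "$end" ]; then
                found="${found:+$found }$type"
            fi
        done
        echo "${found:-uncovered}"
    }
}

mtrr_listing='reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg02: base=0xc0000000 (3072MB), size= 256MB: write-back, count=1
reg03: base=0xcf800000 (3320MB), size=   8MB: uncachable, count=1
reg04: base=0xcf700000 (3319MB), size=   1MB: uncachable, count=1'

# An address in low RAM, the start of the reserved area, and the start of
# the remapped segment above 4 GiB.
for a in 0x00100000 0xcf700000 0x100000000; do
    printf '%s: %s\n' "$a" "$(printf '%s\n' "$mtrr_listing" | mtrr_types_for "$a")"
done
```

With these registers, nothing covers 0x100000000, the start of the segment
that the E820 map reports as usable.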


Comment 17 Jussi Torhonen 2007-05-19 07:30:40 UTC
FC7 test4 ships with kernel 2.6.20-1.3104.fc7, so I made the same tests with
that one and got the same results as above. With 4GB installed the kernel
detects 4GB RAM but the system is unusably slow. With kernel parameter
mem=4096M the kernel detects only 3GB RAM but the system runs fast.

So even though this ticket has now been targeted to the devel version, the
problem also exists in FC7 test4 and it may exist in FC7 final as well. As a
workaround you can add kernel parameter mem=4G on mobos equipped with the Intel
G965 chipset to successfully install and run FC7, but then the memory is not
completely used. The same problem must be in the livecd version, even though I
have not tested it. The problem exists, and we look forward to getting a
patched kernel that correctly uses the whole 4GB of memory.

Actually that mobo could take more than 4GB of memory, but with DDR2 800MHz
modules 4GB is the maximum. Using slower memory modules (DDR2 533 or 667 MHz)
would allow a total of 8GB (4 x 2GB).


Comment 18 Jussi Torhonen 2007-05-21 18:06:13 UTC
Thanks to Linus, I hacked /proc/mtrr directly and got 4GB detected with the
following shell script:

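#!/bin/sh
# Add write-back MTRRs covering the remapped segment 0x100000000-0x12c000000.
# Three regions, 512MB + 128MB + 64MB = 704MB; must be run as root.
echo "base=0x100000000 size=0x20000000 type=write-back" >| /proc/mtrr
echo "base=0x120000000 size=0x8000000 type=write-back"  >| /proc/mtrr
echo "base=0x128000000 size=0x4000000 type=write-back"  >| /proc/mtrr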
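
The three regions in the script are not arbitrary: an MTRR region needs a
power-of-two size and a base aligned to that size, so the 704MB segment has to
be split. A minimal sketch of that split follows; the function name
mtrr_chunks is made up for illustration, and it only prints the lines rather
than writing them to /proc/mtrr.

```sh
# Split [base, base+size) into regions whose size is a power of two and
# whose base is a multiple of that size, taking the largest fit each time.
mtrr_chunks() {
    base=$(( $1 ))
    left=$(( $2 ))
    while [ "$left" -gt 0 ]; do
        chunk=1
        # Grow the chunk while it still fits and the base stays aligned.
        while [ $(( $chunk * 2 )) -le "$left" ] &&
              [ $(( $base % ($chunk * 2) )) -eq 0 ]; do
            chunk=$(( $chunk * 2 ))
        done
        printf 'base=0x%x size=0x%x type=write-back\n' "$base" "$chunk"
        base=$(( $base + $chunk ))
        left=$(( $left - $chunk ))
    done
}

# The remapped segment: 0x100000000 to 0x12c000000, i.e. 0x2c000000 bytes.
mtrr_chunks 0x100000000 0x2c000000
```

For this segment the sketch prints the same three base/size pairs as the
script. Each region uses up one MTRR register, and the number of registers is
limited, so a badly aligned segment may not be coverable this way.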


Now free says:

# free
             total       used       free     shared    buffers     cached
Mem:       3975016    1251468    2723548          0      26668     487580
-/+ buffers/cache:     737220    3237796
Swap:      2096376          0    2096376


And /proc/mtrr has:

$ cat /proc/mtrr 
reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg02: base=0xc0000000 (3072MB), size= 256MB: write-back, count=1
reg03: base=0xcf800000 (3320MB), size=   8MB: uncachable, count=1
reg04: base=0xcf700000 (3319MB), size=   1MB: uncachable, count=1
reg05: base=0x100000000 (4096MB), size= 512MB: write-back, count=1
reg06: base=0x120000000 (4608MB), size= 128MB: write-back, count=1
reg07: base=0x128000000 (4736MB), size=  64MB: write-back, count=1


So, is this a BIOS or a kernel problem? As shown in dmesg, the 704MB
(512+128+64) memory segment located at 0x100000000 is listed as usable in the
E820 information. Why doesn't the kernel then cover it when initializing
/proc/mtrr?


Comment 19 Chuck Ebbert 2007-05-21 20:16:07 UTC
I don't know how Linux sets up the MTRR values. It may be that the system
BIOS needs to do that. Maybe DaveJ knows...

Comment 20 Chuck Ebbert 2007-06-07 18:22:57 UTC
This is a system BIOS bug. Many systems have never been tested with >2GB of memory.

Possibly a BIOS update would fix it.

Comment 21 Todd Ignasiak 2007-06-09 19:27:20 UTC

This bug was introduced in Intel's BIOS 1676 update. It is common to many Intel
G965 based motherboards (listed in the release notes were: DP965LT, DG965SS,
DG965RY, DG965PZ, DG965OT, DG965MS, DG965MQ, DQ963FX, DQ963GS).

When using a 64 bit kernel, with 4GB of RAM, the system slows to a crawl.

The workaround is to revert to the previous BIOS, version 1669 from April 6, 2007.

The current version as of 6/9/07 is 1687, and it still has the 4GB+64bit
problem. This problem was also reported by 64bit Vista users, so it's not a
Linux kernel issue.

Comment 22 Jussi Torhonen 2007-06-10 15:59:23 UTC
Thanks, I can confirm that using the older system BIOS release 1669 did the
trick. The full 4 GB RAM is detected and the system runs just fine without the
MTRR trick introduced above. So, using a BIOS release newer than 1669, up to
the currently latest 1687, is not recommended. Hopefully some future BIOS
release will have this fixed.


Comment 23 Jussi Torhonen 2007-08-05 19:31:46 UTC
BIOS releases 1698 and 1699 for the Intel DG965WH motherboard (and, I guess,
the other mobo variants mentioned earlier) still have the same problem.
Although 1699 is the latest available BIOS, the older 1669 is still the
recommended release for users having 4 GB of physical RAM installed.


