Description of problem:
I have an x86_64 box with an Asus P5B-VM motherboard with the Intel G965 chipset. According to the
motherboard documentation the machine is supposed to support up to 8GiB RAM. The machine
operates flawlessly with 2 * 2048MiB PC2-6400 memory, but when I add another 2 * 1024MiB to the
empty slots, the regular kernel doesn't boot. (The xen kernel works, however.)
Version-Release number of selected component (if applicable):
BIOS version: 1001
How reproducible:
Always (if you have the right hardware :)
Steps to Reproduce:
1. Insert the memory, verify in the BIOS that 6GiB is indeed detected
2. Boot the kernel
Actual results:
The boot process does continue, but extremely slowly (several minutes to activate the LVM volumes), and
the "Starting udev:" step never seems to return. (I have only waited about 20 minutes, though.)
Expected results:
A startup at regular speed
Additional info:
The system works and detects the memory correctly when using the xen kernel (184.108.40.206-2.fc8xen)
Does this failure also happen with the kernel-PAE package?
Well, running an i686 kernel with an x86_64 distribution seems a bit problematic in its own right. I pulled
kernel-PAE-220.127.116.11-34 from the F8 i386 updates and installed it, but when booting I got five lines of
'request_module: runaway loop modprobe binfmt-464c'
Am I doing something wrong in an obvious way?
oh, I missed that this was on x86-64 (too early, not enough caffeine yet).
Ignore the PAE suggestion.
Could you grab the 2.6.25rc x86-64 kernel from rawhide and see if that's any
better?
If this is a fixed bug already, it may be something we can selectively backport
if we can pin point it to a single patch.
Upgrading to kernel-2.6.25-0.155.rc6.git8.fc9 does the trick.
So, if you have a collection of kernel packages between 18.104.22.168-34 and 2.6.25-0.155.rc6.git8 I'd be
happy to binary search for when the fix went in.
Adding Thomas and Ingo to the Cc. Perhaps they remember something in particular
that got merged which could be responsible for this.
You can find a ton of built rpms at
http://koji.fedoraproject.org/koji/packageinfo?packageID=8 though it looks like
a lot of the interim builds between .24 and .25rc3 got purged.
We might have to resort to hand building kernels, and using git-bisect to narrow
it down. Are you familiar with this process?
I haven't built kernels in a few years, so if you have a pointer to some document describing the current
state of affairs in kernel land it would be helpful.
I need to pick up my son at daycare now, but when I'm back I'll try out some pre-built kernels and
perhaps have a go at rolling my own :)
http://fedoraproject.org/wiki/BuildingUpstreamKernel should be useful.
If any parts are unclear, send me an email, and I'll walk you through the process.
Some casual testing indicates that the first .25rc1 kernel built
(kernel-2.6.25-0.33.rc1.fc9) works but all 2.6.24 kernels have the problem. I
have played around with git bisect between 2.6.24 and 2.6.25-rc1 and after one
iteration I'm down to ~3000 patches to test. (It took a while to figure out that
to find a fix, good means bad and bad means good)
Compiling takes a lot of time though. Are there any good, easy-to-use shortcuts?
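The "good means bad" inversion mentioned above can be avoided in newer Git releases (2.7 and later, as far as I know; not available when this thread was written) by renaming the bisect terms. What follows is only a minimal sketch in a throwaway repository: the file names "status" and "counter", the eight commits, and the grep check are all made up for illustration. For a real kernel the check would be "build, boot, and see whether the machine comes up".

```shell
#!/bin/sh
# Sketch: bisecting for the commit that FIXED something, using custom terms.
# Everything here (repo, file names, commit count) is hypothetical demo data.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Bisect Demo"

# Eight linear commits; the pretend fix lands in commit 5.
for i in 1 2 3 4 5 6 7 8; do
    if [ "$i" -lt 5 ]; then echo broken > status; else echo fixed > status; fi
    echo "$i" > counter
    git add status counter
    git commit -q -m "commit $i"
done
expected=$(git rev-list --reverse HEAD | sed -n 5p)

# "old" state is called broken, "new" state is called fixed.
git bisect start --term-old=broken --term-new=fixed >/dev/null
git bisect fixed HEAD >/dev/null
git bisect broken "$(git rev-list --max-parents=0 HEAD)" >/dev/null

# For 'git bisect run', exit 0 marks the old term (broken) and
# exit 1 marks the new term (fixed); grep -q gives exactly that here.
git bisect run sh -c 'grep -q broken status' >/dev/null

first_fixed=$(git rev-parse refs/bisect/fixed)
git bisect reset >/dev/null 2>&1
echo "first fixed commit: $(git log -1 --format=%s "$first_fixed")"
```

With the terms renamed, each test result is recorded under the name that matches what was observed, so there is no need to mentally swap good and bad at every step.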
A few (heh) iterations of git bisect point to commit
99fc8d424bc5d803fe92cad56c068fe64e73747a x86, 32-bit: trim memory not covered by
Judging from the commit description it seems like this is our thing. If at all
possible, it would be really nice if this fix could be backported to the next
errata F8 kernel. If testing is needed, I'm here.
Fixed in Linus tree:
excellent. Thanks for the great help debugging, Noa. And thanks Thomas for
pinpointing the solution so quickly. I'll get this diff into the next update.
Hmm. That seems to be dependent upon other changes in .25
Does this patch look ok? It's a combination of
76c324182bbd29dfe4298ca65efb15be18055df1, but just the setup_32.c changes on
top of .24
diff --git a/arch/x86/kernel/setup_32.c b/arch/x86/kernel/setup_32.c
index a441deb..1fc93de 100644
@@ -47,6 +47,7 @@
@@ -328,8 +329,6 @@ static unsigned long __init setup_memory(void)
min_low_pfn = PFN_UP(init_pg_tables_end);
max_low_pfn = find_max_low_pfn();
@@ -616,6 +615,12 @@ void __init setup_arch(char **cmdline_p)
strlcpy(command_line, boot_command_line, COMMAND_LINE_SIZE);
*cmdline_p = command_line;
+ /* update e820 for memory not covered by WB MTRRs */
+ if (mtrr_trim_uncached_memory(max_pfn))
max_low_pfn = setup_memory();
argh, this won't work at all, because we also don't have
99fc8d424bc5d803fe92cad56c068fe64e73747a in .24
Created attachment 299355 [details]
hopefully complete patchset
This should be all the dependent patches, all in one.
Noa, can you try this on top of v2.6.24?
You can do this with the git tree you already have by doing:
git reset v2.6.24
cat ~/mtrr.diff | patch -p1
and then building like you did the others.
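As a side note, a patch can be checked before it touches the tree, and backed out again afterwards. The sketch below shows this on a throwaway directory; the file src/file.c and the one-line fix.diff are invented for the example and have nothing to do with the real mtrr.diff.

```shell
#!/bin/sh
# Sketch: dry-run, apply, and reverse a -p1 patch on hypothetical demo files.
set -e
work=$(mktemp -d)
cd "$work"

# Build a tiny "before" (a/) and "after" (b/) tree and diff them.
mkdir -p a/src b/src
printf 'line one\nline two\n' > a/src/file.c
printf 'line one\nline 2\n'   > b/src/file.c
diff -u a/src/file.c b/src/file.c > fix.diff || true   # diff exits 1 when files differ

# The tree to be patched starts out identical to the "before" state.
mkdir tree
cp -r a/src tree/
cd tree

patch -p1 --dry-run < ../fix.diff >/dev/null   # check only, changes nothing
patch -p1 < ../fix.diff >/dev/null             # really apply
applied=$(cat src/file.c)

patch -p1 -R < ../fix.diff >/dev/null          # back the patch out again
reverted=$(cat src/file.c)

echo "after apply:  $applied"
echo "after revert: $reverted"
```

If the dry run reports rejected hunks, the tree is probably not at the version the patch was made against.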
It seems like your patch is missing the update_e820() function. Compilation
fails with this:
arch/x86/kernel/cpu/mtrr/main.c: In function ‘mtrr_trim_uncached_memory’:
arch/x86/kernel/cpu/mtrr/main.c:730: error: implicit declaration of function
make: *** [arch/x86/kernel/cpu/mtrr/main.o] Error 1
sorry for the confusion, Dave just pointed out to me that you run a 64-bit
kernel. So the commit that bisect pointed at changed something for
64-bit as well. I'll have a look.
Created attachment 299365 [details]
fixes. take 2.
This one contains a lot more changes, but should be more complete.
You can unapply the previous one with git diff | patch -p1 -R
before applying this one in the same manner as before.
The new version of the patch applied to a clean 2.6.24 compiles and the
resulting kernel fixes the booting problem. Good work!
awesome, I'll get that into a build.
Thanks again for your testing.
kernel-22.214.171.124-63.fc8 has been submitted as an update for Fedora 8
kernel-126.96.36.199-64.fc8 has been submitted as an update for Fedora 8
I just tried out kernel-188.8.131.52-64.fc8 from koji and it works beautifully with my 6 gigs of memory.
Thanks a lot Dave and others for the quick response to my report.
kernel-184.108.40.206-64.fc8 has been pushed to the Fedora 8 stable repository. If problems still persist, please make note of it in this bug report.