Description of problem:
I have an x86_64 box with an Asus P5B-VM motherboard with the Intel G965 chipset. According to the motherboard documentation, the machine is supposed to support up to 8GiB RAM. The machine operates flawlessly with 2 * 2048MiB PC2-6400 memory, but when I add another 2 * 1024MiB to the empty slots, the regular kernel doesn't boot. (The xen kernel works, however.)

Version-Release number of selected component (if applicable):
Bios version: 1001
kernel: 2.6.24.3-34.fc8

How reproducible:
Always (if you have the right hardware :)

Steps to Reproduce:
1. Insert the memory, verify in bios that 6GiB is indeed detected
2. Boot the kernel
3.

Actual results:
The boot process actually goes on, but extremely slowly (several minutes for activating lvm volumes), and "Starting udev:" never seems to return. (I have only waited about 20 minutes, though.)

Expected results:
A startup with regular speed

Additional info:
The system works and detects the memory correctly if using the xen kernel (2.6.21.7-2.fc8xen)
Does this failure also happen with the kernel-PAE package ?
Well, running an i686 kernel with an x86_64 distribution seems a bit problematic in its own right. I pulled kernel-PAE-2.6.24.3-34 from the F8 i386 updates and installed it, but when booting I got five lines of 'request_module: runaway loop modprobe binfmt-464c'. Am I doing something wrong in an obvious way?
Oh, I missed that this was on x86-64 (too early, not enough caffeine yet). Ignore the PAE suggestion. Could you grab the 2.6.25rc x86-64 kernel from rawhide and see if that's any better? If this is a bug that has already been fixed, it may be something we can selectively backport if we can pinpoint it to a single patch.
Upgrading to kernel-2.6.25-0.155.rc6.git8.fc9 does the trick. So, if you have a collection of kernel packages between 2.6.24.3-34 and 2.6.25-0.155.rc6.git8 I'd be happy to binary search for when the fix went in.
Adding Thomas and Ingo to the Cc. Perhaps they remember something in particular that got merged which could be responsible for this. You can find a ton of built rpms at http://koji.fedoraproject.org/koji/packageinfo?packageID=8 though it looks like a lot of the interim builds between .24 and .25rc3 got purged. We might have to resort to hand building kernels, and using git-bisect to narrow it down. Are you familiar with this process ?
I haven't built kernels in a few years, so if you have a pointer to some document describing the current state of affairs in kernel land it would be helpful. I need to pick up my son at daycare now, but when I'm back I'll try out some pre-built kernels and perhaps have a go at rolling my own :)
http://fedoraproject.org/wiki/BuildingUpstreamKernel should be useful. If any parts are unclear, send me an email, and I'll walk you through the process.
Some casual testing indicates that the first .25rc1 kernel built (kernel-2.6.25-0.33.rc1.fc9) works but all 2.6.24 kernels have the problem. I have played around with git bisect between 2.6.24 and 2.6.25-rc1, and after one iteration I'm down to ~3000 patches to test. (It took a while to figure out that, to find a fix, good means bad and bad means good.) Compiling takes a lot of time, though. Any good and easy-to-use shortcuts?
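Editor's note: the "good means bad" inversion described above can be avoided in newer versions of Git (2.7 and later, as far as I know), which let you name the two bisect states yourself. The following is only a toy sketch, not what was run in this report: the throwaway repository, the `state` and `counter` files, the commit messages, and the terms `broken` and `fixed` are all invented for the demonstration.

```shell
#!/bin/sh
# Toy demonstration of bisecting for the commit that FIXED a problem.
# Repo, file names and commit messages are hypothetical, made up for this demo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "bisect-demo@example.invalid"
git config user.name "Bisect Demo"

# Ten commits; the pretend fix lands in commit 7.
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -ge 7 ]; then echo fixed > state; else echo broken > state; fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done

# Name the terms after what we are hunting for, instead of good/bad.
git bisect start --term-old=broken --term-new=fixed >/dev/null
git bisect fixed HEAD >/dev/null
git bisect broken "$(git rev-list --max-parents=0 HEAD)" >/dev/null

# "git bisect run" treats exit 0 as the OLD state (here: broken) and
# exit 1..127 (except 125) as the NEW state (here: fixed).
git bisect run sh -c '! grep -q fixed state' >/dev/null

# After the run, refs/bisect/<new term> points at the first "new" commit.
first_fixed=$(git log -1 --format=%s refs/bisect/fixed)
echo "first fixed commit: $first_fixed"
git bisect reset >/dev/null
```

With a real kernel tree, the `sh -c` test would be replaced by building, booting and marking each step by hand with `git bisect broken` or `git bisect fixed`. On a Git without custom terms, the manual swap described in the comment above remains the way to go.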
A few (heh) iterations of git bisect point to commit
99fc8d424bc5d803fe92cad56c068fe64e73747a
x86, 32-bit: trim memory not covered by wb mtrrs
Judging from the commit description, this seems to be our thing. If at all possible, it would be really nice if this fix could be backported to the next errata F8 kernel. If testing is needed, I'm here.
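Editor's note: going only by the commit title, the idea appears to be that RAM lying above the highest write-back (WB) MTRR range is dropped from the memory map, since memory that is not covered would be accessed uncached, which would fit the very slow boot. The arithmetic can be sketched as below. This is not the kernel's code. The function name `highest_wb_end`, the input format, and every number are assumptions chosen for illustration; they are not measurements from the reporter's machine.

```shell
#!/bin/sh
# Sketch only: NOT the kernel's implementation.
# Input lines on stdin: "<base_MiB> <size_MiB> <type>" (hypothetical format).
# Prints the highest end address (in MiB) covered by a write-back range.
highest_wb_end() {
    awk '$3 == "write-back" { end = $1 + $2; if (end > top) top = end }
         END { print top + 0 }'
}

# Hypothetical layout: RAM ends at 6400 MiB because part of it is remapped
# above 4 GiB, while the write-back ranges stop at 6144 MiB.
ram_top_mib=6400
wb_top_mib=$(printf '%s\n' \
    '0 2048 write-back' \
    '2048 1024 write-back' \
    '4096 2048 write-back' | highest_wb_end)

# Anything between the end of WB coverage and the end of RAM would be trimmed.
if [ "$ram_top_mib" -gt "$wb_top_mib" ]; then
    trimmed_mib=$((ram_top_mib - wb_top_mib))
else
    trimmed_mib=0
fi
echo "write-back coverage ends at ${wb_top_mib} MiB, would trim ${trimmed_mib} MiB"
```

The sketch looks only at the highest covered address and ignores holes or overlapping uncachable ranges, so it is a simplification of whatever the real check does.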
Fixed in Linus tree: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=76c324182bbd29dfe4298ca65efb15be18055df1 Thanks, tglx
Excellent. Thanks for the great help debugging, Noa. And thanks, Thomas, for pinpointing the solution so quickly. I'll get this diff into the next update.
Hmm. That seems to be dependent upon other changes in .25. Does this patch look ok? It's a combination of 093af8d7f0ba3c6be1485973508584ef081e9f93 and 76c324182bbd29dfe4298ca65efb15be18055df1, but just the setup_32.c changes on top of .24
diff --git a/arch/x86/kernel/setup_32.c b/arch/x86/kernel/setup_32.c
index a441deb..1fc93de 100644
--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -47,6 +47,7 @@
 
 #include <video/edid.h>
 
+#include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <asm/e820.h>
 #include <asm/mpspec.h>
@@ -328,8 +329,6 @@ static unsigned long __init setup_memory(void)
 	 */
 	min_low_pfn = PFN_UP(init_pg_tables_end);
 
-	find_max_pfn();
-
 	max_low_pfn = find_max_low_pfn();
 
 #ifdef CONFIG_HIGHMEM
@@ -616,6 +615,12 @@ void __init setup_arch(char **cmdline_p)
 	strlcpy(command_line, boot_command_line, COMMAND_LINE_SIZE);
 	*cmdline_p = command_line;
 
+	/* update e820 for memory not covered by WB MTRRs */
+	find_max_pfn();
+	mtrr_bp_init();
+	if (mtrr_trim_uncached_memory(max_pfn))
+		find_max_pfn();
+
 	max_low_pfn = setup_memory();
 
 #ifdef CONFIG_VMI
argh, this won't work at all, because we also don't have 99fc8d424bc5d803fe92cad56c068fe64e73747a in .24
Created attachment 299355 [details] hopefully complete patchset
This should be all the dependent patches in one. Noa, can you try this on top of v2.6.24? You can do this with the git tree you already have by doing..
git reset v2.6.24
git checkout
cat ~/mtrr.diff | patch -p1
and then building like you did the others. Thanks
It seems like your patch is missing the update_e820() function. Compilation fails with this:
arch/x86/kernel/cpu/mtrr/main.c: In function ‘mtrr_trim_uncached_memory’:
arch/x86/kernel/cpu/mtrr/main.c:730: error: implicit declaration of function ‘update_e820’
make[3]: *** [arch/x86/kernel/cpu/mtrr/main.o] Error 1
Sorry for the confusion. Dave just pointed out to me that you run a 64-bit kernel. So the commit that bisect pointed at changed something for 64-bit as well. I'll have a look.
Created attachment 299365 [details] fixes. take 2.
This one contains a lot more changes, but should be more complete. You can unapply the previous one with
git diff | patch -p1 -R
before applying this one in the same manner as before.
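Editor's note: for anyone unsure about the apply and unapply steps, the round trip with `patch -p1` and `patch -p1 -R` can be tried safely on a scratch directory first. This is a minimal sketch under made-up assumptions: the directory layout (`a/`, `b/`, `tree/`), the file `src/file.txt`, and its contents are all invented and have nothing to do with the kernel patches attached here.

```shell
#!/bin/sh
# Toy round trip: apply a diff with "patch -p1", then undo it with "patch -p1 -R".
# Directory layout and file contents are hypothetical, invented for this demo.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p a/src b/src tree/src
printf 'line one\nline two\n' > a/src/file.txt
printf 'line one\nline 2 (patched)\n' > b/src/file.txt
cp a/src/file.txt tree/src/file.txt

# diff exits with status 1 when the files differ, so keep "set -e" from aborting.
diff -u a/src/file.txt b/src/file.txt > change.diff || true

cd tree
# Apply: -p1 strips the leading "a/" or "b/" path component.
patch -p1 < ../change.diff >/dev/null
after_apply=$(cat src/file.txt)
# Reverse: -R swaps old and new, restoring the pristine file.
patch -p1 -R < ../change.diff >/dev/null
after_revert=$(cat src/file.txt)
cd ..

echo "applied:"
echo "$after_apply"
echo "reverted:"
echo "$after_revert"
```

The `git diff | patch -p1 -R` form quoted above works on the same principle: `git diff` emits the uncommitted changes in the working tree as a diff, and reversing that diff returns the tree to its committed state.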
The new version of the patch applied to a clean 2.6.24 compiles and the resulting kernel fixes the booting problem. Good work!
Awesome, I'll get that into a build. Thanks again for your testing.
kernel-2.6.24.4-63.fc8 has been submitted as an update for Fedora 8
kernel-2.6.24.4-64.fc8 has been submitted as an update for Fedora 8
I just tried out kernel-2.6.24.4-64.fc8 from koji and it works beautifully with my 6 gigs of memory. Thanks a lot Dave and others for the quick response to my report.
kernel-2.6.24.4-64.fc8 has been pushed to the Fedora 8 stable repository. If problems still persist, please make note of it in this bug report.