From Bugzilla Helper: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030716 Description of problem: This is bug is against AS2.1 Update 3 beta1 (As far as I now beta I) The summit kernel will not boot on my system. It is an 4-way x440 with under 4 gig on memory. This disto installed ok but the summit kennel will not boot. It gets as far as bring up sshd and xinitd but it will not pass this. I am still able to ping to box but I can't ssh into it. The key board is non responsive. The last kernel I had access to e.29summit booted ok but was not stable on the system (it fell down after about an hour of stress test). I booted the up kernel but was unable to bulid a debug kernel (thanks alot for including kdb in you kenel tree). When I did a make modules_install i get this mess. (after bzImage and modules) [root@elm3a2 linux-2.4]# make modules_install make -C kernel modules_install make[1]: Entering directory `/usr/src/linux-2.4.9-e.31/kernel' make[1]: Nothing to be done for `modules_install'. make[1]: Leaving directory `/usr/src/linux-2.4.9-e.31/kernel' mkdir -p /lib/kernel/2.4.9-e.31custom/ install -m 755 ulib/libredhat-kernel.so.1.0.1 /lib/kernel/2.4.9-e.31custom/ install: cannot stat `ulib/libredhat-kernel.so.1.0.1': No such file or directorymake: *** [_modinst_kernel] Error 1 [root@elm3a2 linux-2.4]# so I am a little stuck for more info at this time. I need to build kernels. That is wher I am at with the AS2.1 beta 1 release. Version-Release number of selected component (if applicable): kernel-2.4.9-e.31 How reproducible: Always Steps to Reproduce: 1.Load AS2.1 Update 3 beta 1 2.Boot summit kernel 3.Observe the box not boot. Actual Results: Box hangs while bringing up xinitd. Expected Results: box should boot. Additional info: I will attach output files to this bug
Created attachment 96438 [details] var/log/messages This is the /var/log/messages from the system. The last kenel to boot is the up kernel. There are 3-4 bad boots in this log.
does this happen if you chkconfig --level 345 irqbalance off ? if so, then the kernel is bust in the irq affinity settings on summit; something that we almost never used before U3
If I run without the irqbalance off I am able to boot just file. Does the irqbalance demon use the /proc irqaffinity? If so there may be issues for clusterd apic boxes. I fixed this in 2.5 a while ago but I don't have a patch for this 2.4.9 kernel tree.
Sorry just read my comment. It should read. If I run with irqbalance off I am able to boot just fine. ....
the e.34 kernel works on the x440 in the test lab
Where can I get the e.34 kernel?
http://people.redhat.com/~jbaron/.private/testing/2.4.9-e.34/ we ended up making writes to smp_affinity be noops. any testing is appreciated. We also haven't seen any stability issues as hinted at above.
I'll give this kenel some testing and let you know if I have problems. Would you take a patch to fix clusered apic boxes and irq affinity? I did this work for v2.6 I should be able to fix it here as well.
definitely, we can queue those for QU4.
or should say Update 4.
Keith, do you have a patch for this for U4?
Created attachment 97544 [details] a patch to fix irq affinity for summint and the e.35 summit kernel
Sorry for the delay. This fixes the clusterd apic for the summit kernel. It is a simple fix. Apics in v2.4 and summit boxes can only only address one cpu at a time. You can only set affinity to one cpu at a time. The current irq deamon tries to mask to multiple cpus at a time. With summit this maps to getting masked to the first cpu in the mask. It looks like cpus (with you current irqdeamon) are only getting mapped to 1/2 the cpus. Let me know what you think.