From Bugzilla Helper:
User-Agent: Mozilla/4.75 [en] (X11; U; Linux 2.2.18pre11-va2.9-servers1 i686)

Description of problem:
Successfully installed 7.2 stock on Intel SRKA4 (quad Xeon 4U) with Mylex 1164 RAID controller as primary boot device. After installing, upgraded kernel from 2.4.7-10smp to 2.4.9-13smp. DAC960 detects controller properly, but when init starts, the kernel panics, printing this to the console:

Kernel panic: DAC960: SegmentNumber != SegmentCount

Have confirmed that this behavior exists when using a Mylex 2000 RAID controller as well. Have confirmed this behavior on several similar SRKA4's.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Install 7.2 stock on Intel SRKA4 (quad Xeon 4U) with Mylex 1164 or 2000 RAID controller as primary boot device.
2. Upgrade kernel to 2.4.9-13smp, reboot.

Actual Results:
During boot, DAC960 detects controller and drives, but when init starts, kernel panics, printing this to the console:

Kernel panic: DAC960: SegmentNumber != SegmentCount

Expected Results:
System should boot normally, as it does when using 2.4.7-10smp.

Additional info:
Am able to reproduce this behavior on multiple SRKA4s with similar configurations that are supported by 2.4.7-10smp. Am booting these machines with LILO, and not GRUB. Have confirmed that I configured lilo to use the correct initrd.
Tested similar software configuration on machine with Intel L440GX motherboard and Mylex 2000 as primary boot device. 2.4.9-13smp generates same kernel panic (DAC960: SegmentNumber != SegmentCount), although this error occurs later in the boot process: it occurs just after local filesystems are mounted. (By comparison, the SRKA4s kernel panic immediately after init starts.) To do: will test configuration with non-Intel motherboard. Will also try rolling back the DAC960 driver from July rev to February rev and building new initrd.
It would be interesting to boot with "mem=800M" to rule out highmem issues....
Test SRKA4s have 2GB to 4GB of physical RAM; test L440GX has 1GB of RAM. Using "mem=800M" during 2.4.9-13smp boot on both SRKA4 and L440GX prevents DAC960 kernel panic. Good call. What's our next step?
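
For anyone else trying to work out whether a box falls into the affected range, here is a minimal Python sketch. It is only an illustration, not part of the driver or of any fix. It assumes a 32-bit x86 kernel with the default memory split, where roughly the first 896MB is directly mapped and anything above is highmem. The function names and the sample MemTotal values are hypothetical; the text passed in is expected to look like the contents of /proc/meminfo.

```python
import re

# Assumption: 32-bit x86, default 3G/1G split, so about 896 MB of lowmem.
LOWMEM_LIMIT_KB = 896 * 1024


def mem_total_kb(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s*kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no MemTotal line found")
    return int(match.group(1))


def uses_highmem(meminfo_text):
    """True if the reported memory exceeds the assumed lowmem limit."""
    return mem_total_kb(meminfo_text) > LOWMEM_LIMIT_KB


# Hypothetical sample values: a 1GB machine booted normally,
# and the same machine booted with mem=800M.
normal_boot = "MemTotal:      1030000 kB\n"
limited_boot = "MemTotal:       800000 kB\n"
print(uses_highmem(normal_boot), uses_highmem(limited_boot))
```

On a live system the same functions can be fed the text read from /proc/meminfo. Note that MemTotal is somewhat smaller than the installed RAM, so a machine sitting right at the limit may need a closer look.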
My next step is to go look at the DAC960 source code ;( (and the diff against older kernels)
I have the exact same DAC960 kernel panic with my AccelRAID 150 on RH 7.1 with the kernel updates 2.4.9-6smp and 2.4.9-12smp. It occurs occasionally during heavy disk activity and during boot up (max uptime was about 10 days). 2.4.3-12 did not have this problem and I have since reverted back to it. Perhaps I'll try the "mem=800M" as suggested.
Could this be related to my problem (bug #56596)? I experience a failed init with the DAC960 driver on a single-processor system. Instead of a panic, I only have to deal with a failed fsck (sig 11). Everything seems fine when not using big memory (>896MB).
We're testing a fix right now, but since it includes some core block layer changes, I'd rather test it well before asking others to test it...
I am having the same issue here. We are using 2.4.9-13SGI_XFS_1.0.2 compiled from source with egcs 2.91.66, on a base Slackware 8.0 installation. I know this does not seem Red Hat related, but I thought that by posting here I could help narrow this problem down to something specific to this kernel. I get random crashes, and on rare occasions the boot reports that DAC960 message. The hardware is a dual SMP, Mylex 170 RAID5, 1GB RAM. I have disabled highmem and still got some crashes. I could not tell whether it is still the same problem (it didn't show any message on boot after I disabled highmem). I can see the last post here is 3 months old. I know Red Hat has released an errata kernel (2.4.9-21) with a DAC960 driver upgrade. Does this solve the problem? Can someone who has this problem upgrade to that kernel and test it?
The 2.4.9-21 kernel is supposed to fix this (it does in our lab). dizzy: you have bigger problems since egcs will miscompile the DAC960 driver..... the MINIMUM compiler for 2.4 kernels is gcc 2.95.3 (as per Documentation/Changes); egcs just miscompiles too much code.
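
As a quick illustration of the compiler check described above, here is a minimal Python sketch that compares a compiler version string against the gcc 2.95.3 minimum. The helper names are hypothetical, and it assumes the first dotted number in the string is the compiler version (true for "egcs-2.91.66", but not guaranteed for every compiler banner).

```python
import re

# Minimum compiler for 2.4 kernels, as stated in the comment above.
MINIMUM_GCC = (2, 95, 3)


def parse_version(text):
    """Extract the first x.y or x.y.z version number from a string."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    # A missing third component (e.g. "2.96") is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def meets_minimum(version_text, minimum=MINIMUM_GCC):
    """True if the version in version_text is at least the minimum."""
    return parse_version(version_text) >= minimum


print(meets_minimum("egcs-2.91.66"))
print(meets_minimum("2.95.3"))
```

Versions are compared as integer tuples rather than strings, so that 2.95.3 sorts correctly against 2.91.66.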
arjanv: :) You are right. It's just that SGI recommends using egcs for compiling their kernel; even the binary kernel they provide, which went through QA, was compiled with egcs (AFAIK). Anyway, don't you think the errata notes for that kernel should list this bug ID among the solved bugs? :)