Description of problem: I have problems with a Serial ATA RAID controller (aacraid module) on the latest kernel. The machine hangs with a kernel panic (see attached log). The Adaptec card has the latest BIOS from their web page. Can somebody tell me whether this is a hardware problem or a bug in the Fedora kernels? Does anybody have a similar hardware configuration that works without problems?
Version-Release number of selected component (if applicable): kernel-smp-2.6.16-1.2133_FC5, but also older ones (the Fedora Core 4 kernels I ran before had the same problem).
How reproducible: It hangs periodically, but not on any particular action. It now hangs approximately 2 times daily. Previously, on Fedora Core 4, it hung once a week.
If you need more information, I can send it.
Created attachment 131345: kernel panic message
I am also seeing this on an IBM eServer xSeries 260 with an IBM ServeRAID 8i with the latest firmware from IBM. I am running Fedora Core 5 x86_64 with kernel 2.6.17-1.2157_FC5. The kernel panics after anywhere from 15 minutes to 6 hours, depending on when disk I/O increases. I have now moved the server to a different machine with exactly the same hardware configuration to rule out a hardware fault on the first machine. The server has four dual-core Intel(R) Xeon(TM) MP 3.66GHz CPUs with 8 GB of RAM. I am running six ~146 GB SAS drives in a hardware RAID 10 configuration.
01:02.0 RAID bus controller: Adaptec AAC-RAID (rev 02)
        Subsystem: IBM ServeRAID 8i
        Flags: bus master, stepping, 66MHz, medium devsel, latency 240, IRQ 169
        Memory at eb000000 (64-bit, non-prefetchable) [size=2M]
        Memory at eb200000 (32-bit, non-prefetchable) [size=2M]
        Memory at d0000000 (32-bit, prefetchable) [size=256M]
        [virtual] Expansion ROM at e8020000 [disabled] [size=32K]
        Capabilities: [c0] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
        Capabilities: [e0] PCI-X non-bridge device
Jan, with FC4, was this the same panic or a different one?
Hi, I can't tell you exactly. This RAID worked fine on Fedora Core 4 and also with some kernels from Fedora Core 5. But maybe traffic on this server was lower then, or there may be other reasons (for example a hardware problem). The RAID card has been removed from my server and we are trying to file a warranty claim for it. Without the RAID card the server works without problems.
Based on the log, I would say this is a bug in the kernel, or more precisely a problem with the driver. The latest mainline kernel (or the FC6 rawhide kernel) might have this fixed. I would recommend using the rawhide kernel, but not yet: there are still some instability issues with the rawhide kernels that could bite you. In the meantime, please give the mainline kernel (http://www.kernel.org) a spin to see if it works there.
I am unable to test it now, because the card is no longer in the server. The server is at a hosting provider and I have no direct access. After each hang I have to go there and restart it immediately. :(
A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem.
This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks' time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug.
In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable. If you have been affected by this problem, please check that you have only one version of device-mapper & lvm2 installed. See bug 207474 for further details.
If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613.
If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem.
Thank you.
Thank you for the help. The kernel was updated 6 days ago after an unexpected reboot (I am now using the panic=30 kernel parameter). It has been running for 6 days without problems, but the bug had been occurring less often lately. If you can, please leave this bug open for approximately 2 weeks from now. If there are no problems in those 2 weeks, I think this bug is solved.
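For anyone else relying on panic=30 on a remotely hosted box: the boot parameter sets how many seconds the kernel waits before rebooting after a panic, and on Linux the active value is exposed through procfs at /proc/sys/kernel/panic. The following is only a minimal sketch for checking that the setting took effect; the function names are made up for illustration and it assumes a Linux system with /proc mounted.

```python
from pathlib import Path

# Standard procfs location of the kernel.panic sysctl on Linux.
PANIC_SYSCTL = Path("/proc/sys/kernel/panic")


def read_panic_timeout(path=PANIC_SYSCTL):
    """Return the panic reboot timeout in seconds, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe_panic_timeout(seconds):
    """Turn the raw sysctl value into a human-readable statement."""
    if seconds is None:
        return "panic timeout could not be read"
    if seconds == 0:
        # 0 means the kernel waits forever after a panic.
        return "kernel will not reboot automatically after a panic"
    if seconds < 0:
        return "kernel will reboot immediately after a panic"
    return f"kernel will reboot {seconds} seconds after a panic"


if __name__ == "__main__":
    print(describe_panic_timeout(read_panic_timeout()))
```

With panic=30 on the kernel command line, the file should contain 30. A value of 0 means the machine will stay hung after a panic until somebody restarts it by hand.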
My system has been up for 22 days now.
[ondrejj@ns ~]$ uptime
 13:58:32 up 22 days, 3:40, 3 users, load average: 4.29, 4.73, 4.16
I think there is no similar problem now. You can close this bug. Thank you again. :)