Bug 450444
Summary: aac_srb: aac_fib_send failed with status: 8195
Product: [Fedora] Fedora
Component: kernel
Version: 10
Hardware: x86_64
OS: Linux
Status: CLOSED WONTFIX
Severity: high
Priority: low
Reporter: Trevin Beattie <trevin>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: amigo03, andriusb, davidt, dhuff, drees76, jlawson-redhat, mccomb, mpagano, paul.boin, ServeRAIDDriver, tbeattie, thenzl
Doc Type: Bug Fix
Last Closed: 2009-12-18 06:12:26 UTC
Description
Trevin Beattie
2008-06-08 15:26:19 UTC
Clarification: the long pause occurs after the lines: "detecting hardware... waiting for hardware to initialize" It is as early as this point that the "aac_srb: aac_fib_send failed with status: 8195" messages start appearing.

If I reboot the install DVD and add the "noprobe" kernel option, and then select the proper drivers when prompted (libata, sata_nv, etc.), it will detect my regular SATA drives and let me continue the installation.

After the installation completes and I reboot the system, the startup sequence hangs for a minute or so at "Starting udev: _", and eventually complains: "Wait timeout. Will continue in the backgroun[FAILED]". This is followed by: "Setting up Logical Volume Management: No volume groups found". The system detects my regular SATA drives at least, but the kernel log is filling up with "aac_srb: aac_fib_send failed with status: 8195" messages at the rate of over 3,400 lines per minute!

Can you compare the system logs from the older kernel with the new one? Does the driver print any additional/different messages when it loads, other than the repeating of the above message?

Created attachment 308888 [details]
Complete 2.6.22.5-49.fc6 kernel log for a short session
My current installation is Fedora Core 6 with kernel 2.6.22.5-49.fc6 (preserved
on a different partition). It's difficult to compare the message logs
directly; the syslog output looks vastly different between the two kernels, and
messages specifically from aacraid aren't explicitly labeled. I think the
aacraid driver output can be distilled down to this:
kernel: Adaptec aacraid driver (1.1-5[2437]-mh4)
kernel: AAC0: kernel 4.2-0[7349] Dec 11 2004
kernel: AAC0: monitor 4.2-0[7349]
kernel: AAC0: bios 4.2-0[7349]
kernel: AAC0: Non-DASD support enabled.
kernel: AAC0: 64bit support enabled.
kernel: AAC0: 64 Bit DAC enabled
kernel: scsi4 : aacraid
kernel: scsi 4:0:0:0: Direct-Access Adaptec Linux V1.0 PQ: 0 ANSI: 2
kernel: sd 4:0:0:0: [sdc] 430657536 512-byte hardware sectors (220497 MB)
kernel: sd 4:0:0:0: [sdc] Assuming Write Enabled
kernel: sd 4:0:0:0: [sdc] Assuming drive cache: write through
kernel: sd 4:0:0:0: [sdc] 430657536 512-byte hardware sectors (220497 MB)
kernel: sd 4:0:0:0: [sdc] Assuming Write Enabled
kernel: sd 4:0:0:0: [sdc] Assuming drive cache: write through
kernel: sdc: sdc1
kernel: sd 4:0:0:0: [sdc] Attached SCSI removable disk
kernel: scsi 4:1:0:0: Direct-Access MAXTOR ATLAS10K5_73WLS JNZ3 PQ: 0 ANSI: 3
kernel: scsi 4:1:1:0: Direct-Access MAXTOR ATLAS10K5_73WLS JNZ3 PQ: 0 ANSI: 3
kernel: scsi 4:1:2:0: Direct-Access MAXTOR ATLAS10K5_73WLS JNZ3 PQ: 0 ANSI: 3
kernel: scsi 4:1:3:0: Direct-Access MAXTOR ATLAS10K5_73WLS JNZ3 PQ: 0 ANSI: 3
kernel: sd 4:0:0:0: Attached scsi generic sg2 type 0
kernel: scsi 4:1:0:0: Attached scsi generic sg3 type 0
kernel: scsi 4:1:1:0: Attached scsi generic sg4 type 0
kernel: scsi 4:1:2:0: Attached scsi generic sg5 type 0
kernel: scsi 4:1:3:0: Attached scsi generic sg6 type 0
kernel: AAC1: kernel 5.2-0[15323] Sep 21 2007
kernel: AAC1: monitor 5.2-0[15323]
kernel: AAC1: bios 5.2-0[15323]
kernel: AAC1: serial 1644d4
kernel: AAC1: Non-DASD support enabled.
kernel: AAC1: 64bit support enabled.
kernel: AAC1: 64 Bit DAC enabled
kernel: scsi5 : aacraid
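For anyone who wants to repeat this kind of distillation on their own logs, here is a minimal Python sketch. The pattern and the name `aacraid_lines` are assumptions made for illustration, based only on the prefixes visible in the messages quoted in this report; it will not catch the related `scsi` and `sd` lines, which were picked out by hand above.

```python
import re

# Hypothetical pattern: the driver name, per-adapter "AACn:" prefixes,
# and function-style "aac_..." prefixes seen in the logs quoted here.
AACRAID_PATTERN = re.compile(r"aacraid|\bAAC\d+:|\baac_\w+")


def aacraid_lines(lines):
    """Return the log lines that mention the aacraid driver."""
    return [line.rstrip("\n") for line in lines if AACRAID_PATTERN.search(line)]


sample = [
    "kernel: Adaptec aacraid driver (1.1-5[2437]-mh4)",
    "kernel: AAC0: 64 Bit DAC enabled",
    "kernel: sd 4:0:0:0: [sdc] Assuming Write Enabled",
    "kernel: aac_srb: aac_fib_send failed with status: 8195",
]

for line in aacraid_lines(sample):
    print(line)
```

The function accepts any iterable of strings, so an open log file can be passed in directly.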
Created attachment 308889 [details]
2.6.25-14.fc9 kernel log with duplicates filtered out
The syslog output from the FC9 kernel has absolutely NO messages that can be
identified as being from aacraid other than the infinitely repeating error:
kernel: aac_srb: aac_fib_send failed with status: 8195
[message repeats 2287 times]
which shows up at line 8 of the boot session.
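The duplicate filtering used for this attachment can be approximated with a short Python sketch. This is not the tool that produced the attached log; the function name `collapse_repeats` is hypothetical, and it assumes timestamps have already been stripped so that repeated messages are byte-for-byte identical. The count printed here is the number of additional copies after the first one, which may differ from how syslog counts.

```python
from itertools import groupby


def collapse_repeats(lines):
    """Collapse runs of identical consecutive lines into one line plus a count."""
    out = []
    for text, run in groupby(line.rstrip("\n") for line in lines):
        count = sum(1 for _ in run)
        out.append(text)
        if count > 1:
            # Report how many further copies followed the first one.
            out.append(f"[message repeats {count - 1} times]")
    return out


log = [
    "kernel: scsi5 : aacraid",
    "kernel: aac_srb: aac_fib_send failed with status: 8195",
    "kernel: aac_srb: aac_fib_send failed with status: 8195",
    "kernel: aac_srb: aac_fib_send failed with status: 8195",
]

for line in collapse_repeats(log):
    print(line)
```

Only consecutive duplicates are merged, so the order of distinct messages is preserved.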
Full kernel logs from startup to shutdown for both FC6 and FC9 attached above.

Found another user with the same problem on Gentoo Forums: http://forums.gentoo.org/viewtopic-p-5077382.html?sid=a51c3a0fba6aa854c0b49b8fae5cc15a
He also has a 64-bit system and Adaptec 2120S.

I think we can isolate this bug to 64-bit code. Based on a comment about a recent patch to the aacraid driver: http://www.spinics.net/lists/linux-scsi/msg26480.html
I decided to try booting the 32-bit FC9 install DVD. The 32-bit driver loaded properly and detected both of my RAID cards.

I just ran into this bug with the recent RHEL 4 update 7 kernel update:
kernel-smp-2.6.9-78.EL
So currently I'm forced to stay with the last working kernel:
kernel-smp-2.6.9-67.0.20.EL
RAID card is: Adaptec 2120S SCSI RAID SGL ULTRA 320 with 7349 firmware version. I'm running the 32-bit kernel.

I have reproduced this problem on two different Dell PowerEdge 2650 servers running the 32-bit version of CentOS 5.2 (kernel-PAE-2.6.18-92.el5.i686.rpm), which should be roughly comparable to RHEL 5 update 2. The two servers had the same RAID controller, but slightly different BIOS versions, both of which triggered the repeating "aac_fib_send failed" errors:
Adaptec Dell Perc 3/Di BIOS 2.7-1 build 3170
Adaptec Dell Perc 3/Di BIOS 2.8-0 build 6082
After updating both to the latest version from Dell's website, the problem seems to no longer occur under my limited testing so far:
Adaptec Dell Perc 3/Di BIOS 2.8-1 build 7692

Jeff, are the 2650's RAID controllers integrated into the motherboard or are they expansion cards?

Yes, the RAID controllers are integrated. I have one more Dell PowerEdge 2650 that is reproducing this problem, on which I have not yet upgraded the firmware. I can leave this last system on the older firmware for a few more days if anyone has any other data-collection steps they'd like to try; otherwise I will upgrade its RAID firmware too.

Jeff, could you give me a hint how I can update the RAID controller?

Go to the website of your RAID controller's manufacturer and see if they have any updates for your model. My Dell PowerEdge 2650 had a Windows utility that created two floppy disks containing the automatic updater. If your RAID controller is integrated then go to the motherboard manufacturer's website. Again, I'm not certain whether this particular bug only occurs because of an old firmware issue, but I haven't seen the problem after updating two of my systems.

I have the same problem. I upgraded my kernel with up2date to 2.6.9-78.0.1.ELsmp and cannot boot properly. I am getting the following error messages:
aac_srb: aac_fib_send failed with status: 8195
aac_srb: aac_fib_send failed with status: 8195
aac_srb: aac_fib_send failed with status: 8195
aac_srb: aac_fib_send failed with status: 8195
aac_srb: aac_fib_send failed with status: 8195
I also have the Adaptec 2120S installed and the firmware was 8205. I upgraded to the latest firmware, 8208, but it did not make any difference. Unfortunately, for some reason GRUB does not list any other kernels that I can boot into, so I am stuck. Does anyone know how I can get around this issue without doing a full rebuild of the server? Thanks.

See this post on LKML for another similar issue: http://marc.info/?l=linux-kernel&m=122166454808377&w=2
The same bug is also filed for RHEL 5 under Bug #453472.

mccomb: You might try booting with one of the two options:
aacraid.dacmode=0
or
mem=4G
I pulled the options from the kernel commit which introduced this change: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=94cf6ba11b

(In reply to comment #8)
> I just ran into this bug with the recent RHEL 4 update 7 kernel update.
> kernel-smp-2.6.9-78.EL
David and others, I think that you are probably using RHEL 4.7 and not Fedora (this is a Fedora bug), so I'm adding you to the cc: list on BZ#457552.

The bug still exists in the Fedora 10 pre-release: kernel-2.6.27.4-68.fc10.x86_64.
I've also verified that it only happens when I have my Adaptec 2120S controller installed. If that card is removed, aacraid properly detects my remaining 3085 controller. The new 32-bit kernel still boots normally with the 2120S controller. Could the patch mentioned in bug #453472 for EL 5.2 be applied to Fedora? (We're only 9 patch levels ahead...)

This message is a reminder that Fedora 10 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 10. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 10 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.