Bug 808880
Summary: | ACPI/IRQ assignment regression in kernels > 2.6.40 (Asus M2N-LR + 3Ware-9xxx)
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 15
Hardware: | x86_64
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | high
Priority: | unspecified
Reporter: | Solomon Peachy <pizza>
Assignee: | Kernel Maintainer List <kernel-maint>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | gansalmon, itamar, jonathan, kernel-maint, kevin.hobbs.1, madhu.chinakonda, sassmann
Doc Type: | Bug Fix
Last Closed: | 2012-07-11 17:51:27 UTC
Description
Solomon Peachy
2012-04-01 13:21:36 UTC
Created attachment 574347 [details]
kernel log of successful 2.6.40.6 bootup.
Successful bootup with 2.6.40.6, included for reference. I'm still trying to get a kernel boot log with the newer (failing) kernels.
Created attachment 574348 [details]
output of lspci -v
Created attachment 574350 [details]
dmidecode output
I grabbed kernel-2.6.43.1-2.fc15.x86_64.rpm out of Koji, and it also failed in the same way. However, I was finally able to get a kernel log -- I had to wait until the kernel finished trying to probe every LUN on the 3Ware cards.

The basic problem is that for whatever reason, the scsi bus probes aren't succeeding. Further investigation shows that the 3Ware cards' IRQ assignments are all wonky; they're being routed to legacy IRQs.

2.6.40 (good):

3ware 9000 Storage Controller device driver for Linux v2.26.02.014.
ACPI: PCI Interrupt Link [LNEC] enabled at IRQ 18
3w-9xxx 0000:03:00.0: PCI INT A -> Link[LNEC] -> GSI 18 (level, low) -> IRQ 18
scsi2 : 3ware 9000 Storage Controller
3w-9xxx: scsi2: Found a 3ware 9000 Storage Controller at 0xefdff000, IRQ: 18.
3w-9xxx: scsi2: Firmware FE9X 3.08.00.029, BIOS BE9X 3.10.00.003, Ports: 8.
3w-9xxx 0000:03:04.0: PCI INT A -> Link[LNEC] -> GSI 18 (level, low) -> IRQ 18
scsi 2:0:0:0: Direct-Access AMCC 9550SXU-8L DISK 3.08 PQ: 0 ANSI: 5
scsi 2:0:1:0: Direct-Access AMCC 9550SXU-8L DISK 3.08 PQ: 0 ANSI: 5
scsi7 : 3ware 9000 Storage Controller
3w-9xxx: scsi7: Found a 3ware 9000 Storage Controller at 0xefdfe000, IRQ: 18.
3w-9xxx: scsi7: Firmware FE9X 3.08.00.029, BIOS BE9X 3.10.00.003, Ports: 4.
scsi 7:0:0:0: Direct-Access AMCC 9550SX-4LP DISK 3.08 PQ: 0 ANSI: 5

2.6.43 (bad; 2.6.41 and 2.6.42 are also bad):

3ware 9000 Storage Controller device driver for Linux v2.26.02.014.
3w-9xxx 0000:03:00.0: PCI IRQ 0 -> rerouted to legacy IRQ 16
ACPI: Invalid index 16
3w-9xxx 0000:03:00.0: PCI INT A: no GSI - using ISA IRQ 14
scsi4 : 3ware 9000 Storage Controller
3w-9xxx: scsi4: Found a 3ware 9000 Storage Controller at 0xefdff000, IRQ: 14.
3w-9xxx: scsi4: Firmware FE9X 3.08.00.029, BIOS BE9X 3.10.00.003, Ports: 8.
3w-9xxx 0000:03:04.0: PCI IRQ 0 -> rerouted to legacy IRQ 16
ACPI: Invalid index 16
3w-9xxx 0000:03:04.0: PCI INT A: no GSI - using ISA IRQ 14
scsi8 : 3ware 9000 Storage Controller
3w-9xxx: scsi8: Found a 3ware 9000 Storage Controller at 0xefdfe000, IRQ: 14.
3w-9xxx: scsi8: Firmware FE9X 3.08.00.029, BIOS BE9X 3.10.00.003, Ports: 4.
scsi: waiting for bus probes to complete ...
scsi 4:0:0:0: WARNING: (0x06:0x002C): Command (0x12) timed out, resetting card.
scsi 8:0:0:0: WARNING: (0x06:0x002C): Command (0x12) timed out, resetting card.
scsi 4:0:0:0: WARNING: (0x06:0x002C): Command (0x0) timed out, resetting card.
scsi 8:0:0:0: WARNING: (0x06:0x002C): Command (0x0) timed out, resetting card.
scsi 4:0:0:0: Device offlined - not ready after error recovery
scsi 8:0:0:0: Device offlined - not ready after error recovery
[repeat above six lines fifteen more times, once for each LUN]

Created attachment 576032 [details]
dmesg log of failed 2.6.43.1-2 bootup
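For anyone comparing their own boot logs against the two excerpts above, the following is a minimal sketch of how the good and bad cases could be told apart automatically. It assumes the log lines have exactly the format quoted in this report; the function name `classify_irq_routing` and the labels `"gsi"` and `"isa-fallback"` are hypothetical and are not part of any kernel tooling.

```python
import re

# Patterns derived only from the 3w-9xxx lines quoted in this report.
# Other kernel versions may word these messages differently.
GSI_RE = re.compile(
    r"3w-9xxx (\S+): PCI INT (\w) -> Link\[(\w+)\] -> GSI (\d+) .*-> IRQ (\d+)"
)
ISA_RE = re.compile(
    r"3w-9xxx (\S+): PCI INT (\w): no GSI - using ISA IRQ (\d+)"
)


def classify_irq_routing(log_text):
    """Map each 3w-9xxx PCI address to (routing_kind, irq).

    routing_kind is "gsi" when the interrupt was routed through an ACPI
    link (the 2.6.40 behaviour in this report) and "isa-fallback" when the
    kernel reported "no GSI - using ISA IRQ" (the failing behaviour).
    """
    routing = {}
    for line in log_text.splitlines():
        match = GSI_RE.search(line)
        if match:
            routing[match.group(1)] = ("gsi", int(match.group(5)))
            continue
        match = ISA_RE.search(line)
        if match:
            routing[match.group(1)] = ("isa-fallback", int(match.group(3)))
    return routing


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): dmesg | python classify_irq.py
    for address, (kind, irq) in sorted(classify_irq_routing(sys.stdin.read()).items()):
        print(f"{address}: {kind} IRQ {irq}")
```

Any controller reported as `isa-fallback` matches the failing pattern described in this bug; the sketch only classifies log text and does not query the hardware.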
Created attachment 580730 [details]
PRT.patch
Solomon,
here's a patch that should apply on top of a 3.3 kernel. Please provide the dmesg output of a boot with this patch applied. Thanks!
Fedora 15 has reached its end of life as of June 26, 2012. As a result, we will not be fixing any remaining bugs found in Fedora 15.

In the event that you have upgraded to a newer release and the bug you reported is still present, please reopen the bug and set the version field to the newest release you have encountered the issue with. Before doing so, please ensure you are testing the latest kernel update in that release and attach any new and relevant information you may have gathered.

Thank you for taking the time to file a report. We hope newer versions of Fedora suit your needs.

This problem occurs for me while booting the Fedora 17 netinstall disk. In order to observe the problem I have to append modprobe.blacklist=3w_9xxx to the kernel command line. Then, when I modprobe the 3w_9xxx module, the tail of dmesg for the controller is:

[ 1036.768034] scsi 4:0:11:0: WARNING: (0x06:0x002C): Command (0x12) timed out, resetting card.
[ 1042.934020] 3w-9xxx: scsi4: AEN: INFO (0x04:0x0029): Verify started:unit=0.
[ 1063.139023] scsi 4:0:11:0: WARNING: (0x06:0x002C): Command (0x0) timed out, resetting card.
[ 1084.304031] scsi 4:0:11:0: Device offlined - not ready after error recovery
[ 1105.760032] scsi 4:0:12:0: WARNING: (0x06:0x002C): Command (0x12) timed out, resetting card.
[ 1116.924023] 3w-9xxx: scsi4: AEN: INFO (0x04:0x0029): Verify started:unit=0.
[ 1137.129013] scsi 4:0:12:0: WARNING: (0x06:0x002C): Command (0x0) timed out, resetting card.
[ 1158.243050] scsi 4:0:12:0: Device offlined - not ready after error recovery
[ 1179.744049] scsi 4:0:13:0: WARNING: (0x06:0x002C): Command (0x12) timed out, resetting card.
[ 1190.959020] 3w-9xxx: scsi4: AEN: INFO (0x04:0x0029): Verify started:unit=0.
[ 1211.164023] scsi 4:0:13:0: WARNING: (0x06:0x002C): Command (0x0) timed out, resetting card.
[ 1227.329039] scsi 4:0:13:0: Device offlined - not ready after error recovery
[ 1248.736029] scsi 4:0:14:0: WARNING: (0x06:0x002C): Command (0x12) timed out, resetting card.
[ 1254.902019] 3w-9xxx: scsi4: AEN: INFO (0x04:0x0029): Verify started:unit=0.
[ 1275.107060] scsi 4:0:14:0: WARNING: (0x06:0x002C): Command (0x0) timed out, resetting card.
[ 1296.272027] scsi 4:0:14:0: Device offlined - not ready after error recovery
[ 1317.728037] scsi 4:0:15:0: WARNING: (0x06:0x002C): Command (0x12) timed out, resetting card.
[ 1328.841022] 3w-9xxx: scsi4: AEN: INFO (0x04:0x0029): Verify started:unit=0.
[ 1349.046058] scsi 4:0:15:0: WARNING: (0x06:0x002C): Command (0x0) timed out, resetting card.
[ 1370.160016] scsi 4:0:15:0: Device offlined - not ready after error recovery
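The dmesg tail above repeats the same timeout/offline sequence for every LUN, which is tedious to read by eye. A rough sketch of how it could be condensed into a per-device summary follows. It assumes the message wording shown in this report; `summarize_scsi_failures` is a hypothetical helper name, not an existing tool.

```python
import re
from collections import defaultdict

# Matches the two per-device messages quoted in this report:
#   scsi H:C:T:L: WARNING: (0x06:0x002C): Command (0xNN) timed out, ...
#   scsi H:C:T:L: Device offlined - not ready after error recovery
EVENT_RE = re.compile(
    r"scsi (\d+:\d+:\d+:\d+): "
    r"(?:WARNING: \(0x06:0x002C\): Command \((0x[0-9a-fA-F]+)\) timed out"
    r"|(Device offlined))"
)


def summarize_scsi_failures(log_text):
    """Return {scsi_address: {"timeouts": [opcodes], "offlined": bool}}.

    Opcodes are kept as the strings printed by the driver (for example
    "0x12" and "0x0"). finditer is used so the function also works on
    logs whose line breaks were lost when pasted.
    """
    summary = defaultdict(lambda: {"timeouts": [], "offlined": False})
    for match in EVENT_RE.finditer(log_text):
        address, opcode, offlined = match.groups()
        if opcode:
            summary[address]["timeouts"].append(opcode)
        if offlined:
            summary[address]["offlined"] = True
    return dict(summary)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): dmesg | python summarize_scsi.py
    for address, info in sorted(summarize_scsi_failures(sys.stdin.read()).items()):
        state = "offlined" if info["offlined"] else "online"
        print(f"{address}: timeouts={info['timeouts']} {state}")
```

The summary only restates what the log already says; it does not diagnose why the commands time out.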