Bug 462425
| Field | Value |
|---|---|
| Summary | Kernel 2.6.26.3-29.fc9.x86_64 drive goes offline |
| Product | [Fedora] Fedora |
| Component | kernel |
| Version | 9 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | urgent |
| Priority | medium |
| Reporter | Brian Rademacher <rad> |
| Assignee | Jeff Garzik <jgarzik> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | brian.mosher, dave, emcnabb, erwan, fdor6, fujisan43, gijsbert.wiesenekker, herrold, jpiszcz, kernel-maint, mathguthrie, peterm, qr7atgwu, rad, rainer.traut, redhat, scott |
| Doc Type | Bug Fix |
| Last Closed | 2009-07-10 00:30:51 UTC |
Created attachment 316813
/var/log/messages
Crashed again (without all of the nasty trace info this time since I caught it right away) during my scheduled rdiff backup (no additional disk IO this time as before). I went back to kernel 2.6.25.14-108.fc9.x86_64 and completed the same rdiff backup with no problem. Here is dmesg output this time:
Sep 16 12:37:43 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Sep 16 12:37:43 radfiles kernel: ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out
Sep 16 12:37:43 radfiles kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 16 12:37:43 radfiles kernel: ata1.00: status: { DRDY }
Sep 16 12:37:43 radfiles kernel: ata1: hard resetting link
Sep 16 12:37:43 radfiles kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 16 12:37:43 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 16 12:37:43 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 16 12:37:43 radfiles kernel: ata1.00: configured for UDMA/133
Sep 16 12:37:43 radfiles kernel: ata1: EH complete
Sep 16 12:37:43 radfiles kernel: sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Sep 16 12:37:43 radfiles kernel: sd 0:0:0:0: [sda] Write Protect is off
Sep 16 12:37:43 radfiles kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 16 12:39:15 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Sep 16 12:39:15 radfiles kernel: ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out
Sep 16 12:39:15 radfiles kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 16 12:39:15 radfiles kernel: ata1.00: status: { DRDY }
Sep 16 12:39:15 radfiles kernel: ata1: hard resetting link
Sep 16 12:39:15 radfiles kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 16 12:39:16 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 16 12:39:16 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 16 12:39:16 radfiles kernel: ata1.00: configured for UDMA/133
Sep 16 12:39:16 radfiles kernel: ata1: EH complete
Sep 16 12:39:16 radfiles kernel: sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Sep 16 12:39:16 radfiles kernel: sd 0:0:0:0: [sda] Write Protect is off
Sep 16 12:39:16 radfiles kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Also, just for reference, kernel 2.6.25.14-108.fc9.x86_64 has NCQ enabled for sata_mv, which is relatively new, but functioning under that kernel:
ata1.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata2.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata3.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata4.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata5.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
I can get the same error under kernel 2.6.25.14-108.fc9.x86_64 now that I have added a 5th drive to the RAID array, but it only shows up during a RAID resync or about once every few hours. With only 4 drives, it showed up only under a resync and never under regular operation. With kernel 2.6.26.3-29.fc9.x86_64 I'm lucky if I can boot... It resets the bus every few minutes on average. With only 4 drives I was OK until heavy IO (like the backup mentioned in the bug), but with 5 it's unusable.
Hi Brian, I have about 10 computers with very different hardware running Fedora 9 (i386 and x86_64), all with the latest kernel. All are OK except one: a computer on which I have used ext4. With the 2.6.25.14-108 version it boots cleanly, but with 2.6.26.3-29 my "/" partition is invisible. Searching the web, I have seen that 2.6.26 requires a patch to use ext4 partitions:
"2008-08-20: The 2.6.26-ext4-7 patchset has been released. People who are using ext4 with 2.6.26 should really take this patch. 2008-07-15: Delayed allocation has been merged into Linus's ext4 git tree! We have started maintaining patches against the latest 2.6 mainline kernel to make it easier for people to try out ext4." (http://ext4.wiki.kernel.org/index.php/Main_Page)
In your first message we can read "Sep 15 12:26:40 radfiles kernel: Modules linked in: ext4dev", and you said you have 4 ext3 drives that work OK while the 5th, with an ext4 partition, is bad, so I think that was your initial problem. For me, all I have to do is use 2.6.25 until the Fedora team releases another 2.6.26 build. Perhaps you can try the patch I cited above.
The only thing I use ext4 for is a terabyte backup drive, so it only mounts during the backup process and unmounts otherwise. The failure occurs at any time during heavy IO (ext4 aside), which is why I was seeing it during backups. I don't think that ext4dev should be interacting with anything when the drive isn't even mounted, but for now I have removed the module just to see if anything changes. I'll skip tomorrow's backup and see how it goes. If it works, I'll then try the patchset. Thanks for the idea! I'll try anything at this point...
It didn't work... A few more updates:
- smartctl reported that /dev/hda is fine, through 2 long tests and 3 short.
- Disabling smartd didn't help.
- Disabling NCQ didn't help; it just changed the error from NCQ to DMA.
- Manually failing sda and later sde and going back to 4 drives (much less IO) worked fine, also showing that sda likely isn't the problem.
SATA so reassigning to Jeff. Looks like another case of the bug Mark Lord fixed, which I think is queued for .27
Any idea where that bug/patch might be? I'm getting about 6 or so of these lockups a day, so I wouldn't mind trying to push my own fix a little early...
I look forward to this patch as well; do you have a link to it? I also use the Intel e1000e driver, so I'd prefer the standalone patch vs. moving to 2.6.27.
[420781.333179] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[420781.333189] ata6.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
[420781.333190] res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[420781.333194] ata6.00: status: { DRDY }
[420781.333200] ata6: hard resetting link
[420781.638589] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[420781.662166] ata6.00: configured for UDMA/133
[420781.662166] ata6: EH complete
[420781.662989] sd 5:0:0:0: [sdf] 586072368 512-byte hardware sectors (300069 MB)
[420781.669416] sd 5:0:0:0: [sdf] Write Protect is off
[420781.669416] sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
[420781.669416] sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[469680.004637] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[469680.004648] ata2.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
[469680.004649] res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[469680.004654] ata2.00: status: { DRDY }
[469680.004660] ata2: hard resetting link
[469680.309567] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[469680.333461] ata2.00: configured for UDMA/133
[469680.333477] ata2: EH complete
[469680.333461] sd 1:0:0:0: [sdb] 586072368 512-byte hardware sectors (300069 MB)
[469680.340461] sd 1:0:0:0: [sdb] Write Protect is off
[469680.340461] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[469680.345461] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
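For anyone comparing how often each port resets across kernels or drive counts, here is a minimal sketch (not posted by any reporter in this thread) that tallies the "hard resetting link" lines per port from pasted dmesg or /var/log/messages text. The function name `count_link_resets` is made up for illustration, and the pattern assumes the message wording shown in the logs above.

```python
import re
from collections import Counter

# Matches the libata line logged when error handling resets a port.
# Works with or without a syslog or [timestamp] prefix in front of it.
# "ata6.00: ..." device lines do not match, only "ata6: ..." port lines.
RESET_RE = re.compile(r"\b(ata\d+): hard resetting link")


def count_link_resets(log_text):
    """Return a Counter mapping port name (e.g. 'ata6') to reset count."""
    return Counter(RESET_RE.findall(log_text))


if __name__ == "__main__":
    sample = (
        "[420781.333200] ata6: hard resetting link\n"
        "[469680.004660] ata2: hard resetting link\n"
        "Sep 16 12:37:43 radfiles kernel: ata1: hard resetting link\n"
    )
    for port, resets in sorted(count_link_resets(sample).items()):
        print(port, resets)
```

Feeding it a whole log file (for example the text of /var/log/messages) gives a quick per-port count, which may help show whether resets cluster on one port or are spread across the controller.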
(In reply to comment #8)
> SATA so reassigning to Jeff. Looks like another case of the bug Mark Lord fixed
> which I think is queued for .27
Alan, in the meantime, is there something I can change/disable to return stability to my server (kernel options, libata options in modprobe.conf, etc.)? I'd be willing to take a huge performance hit for stability...
I can't find that fix either.
I manually failed one of my RAID-5 drives, which has brought back stability to the system. Other than the performance hit and living on the edge of catastrophe if another HD fails, it's working. Certainly not a fix, but for now better than the constant freezing...
I found a workaround (that works for me at least): disabling the drive write cache on all RAID member drives with hdparm -W0 seems to work. Maybe this is a clue for diagnosing as well. I didn't mention it above, but I have my RAID mounted with data=writeback, if that could be having an effect.
This may be all for naught if it's truly fixed in .27 anyway. I'll be looking forward to the F9 .27 kernel update if/when it comes...
(In reply to comment #14)
> This may be all for naught if it's truly fixed in .27 anyway. I'll be looking
> forward to the F9 .27 kernel update if/when it comes...
https://admin.fedoraproject.org/updates/kernel-2.6.27.4-19.fc9
Woo hoo! I shall test when it hits the testing repo...
With 2.6.27.4 (Vanilla) the problem still occurs. Justin.
[198231.048036] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[198231.048045] ata5.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
[198231.048046] res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[198231.048050] ata5.00: status: { DRDY }
[198231.048054] ata5: hard resetting link
[198231.353033] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[198231.377941] ata5.00: configured for UDMA/133
[198231.377954] ata5: EH complete
[198231.378140] sd 4:0:0:0: [sde] 586072368 512-byte hardware sectors (300069 MB)
[198231.385337] sd 4:0:0:0: [sde] Write Protect is off
[198231.385344] sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
[198231.385383] sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
$ uname -a
Linux box 2.6.27.4 #1 SMP Sun Oct 26 04:46:17 EDT 2008 x86_64 GNU/Linux
Justin, did you test disabling write caching on the drives themselves to see what happens? I have been running that way since I posted that workaround, with no trouble under 2.6.26.6-79.fc9.x86_64. I'm just wondering if we are experiencing the same problem with the same workaround. That may help with future debugging of this issue...
I have just turned off the cache on all of the drives now and will see if this problem recurs. Justin.
I used hdparm -W0 /dev/sda etc. to turn it off; is that the method you used (in case variance matters)?
That's exactly what I did...
I am still trying to reproduce it with the cache off; so far, I have not had any luck.
Can you test 2.6.27.4: https://admin.fedoraproject.org/updates/kernel-2.6.27.4-24.fc9
Brian, I believe that was directed at you. BTW, so far you're correct, turning the cache off seems to fix the problem, but whose problem is it? The kernel's? Western Digital's? Intel/chipset? Is there an RPM for 2.6.27.4 somewhere yet (and the dependencies)? Much easier to test that way.
I haven't seen it hit the testing repo yet... I think we're on to something with this write caching thing. Mine is still stable, and I'm running 5 Seagate 7200.10 drives, so different from your WD setup... As I recall, my chipset/hardware is quite a bit different as well:
00:02.0 PCI bridge: ALi Corporation M5249 HTT to PCI Bridge
00:03.0 ISA bridge: ALi Corporation M1563 HyperTransport South Bridge (rev 20)
00:03.1 Bridge: ALi Corporation M7101 Power Management Controller [PMU]
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
00:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
00:0e.0 IDE interface: ALi Corporation M5229 IDE (rev c5)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:07.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
02:03.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
03:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
03:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
Happened again, this time with the cache OFF:
Nov 6 01:20:07 p34 kernel: [639232.946183] ata13.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 6 01:20:07 p34 kernel: [639232.946193] ata13.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in
Nov 6 01:20:07 p34 kernel: [639232.946195] res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov 6 01:20:07 p34 kernel: [639232.946200] ata13.00: status: { DRDY }
Nov 6 01:20:07 p34 kernel: [639232.946206] ata13: hard resetting link
Nov 6 01:20:08 p34 kernel: [639233.403168] ata13: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov 6 01:20:08 p34 kernel: [639233.440207] ata13.00: configured for UDMA/133
Nov 6 01:20:08 p34 kernel: [639233.449851] sd 12:0:0:0: [sdi] Write Protect is off
Nov 6 01:20:08 p34 kernel: [639233.449858] sd 12:0:0:0: [sdi] Mode Sense: 00 3a 00 00
Nov 6 01:20:08 p34 kernel: [639233.476367] sd 12:0:0:0: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
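The "cmd" lines in the logs posted so far do not all time out on the same command, which is easy to miss when scanning them by eye. As a rough sketch, assuming the first hex byte after "cmd" is the ATA command opcode and the second is the feature register (which is how libata appears to print the taskfile), the opcode can be pulled out and named. The lookup table covers only the three opcodes that appear in this thread, named per the ATA command set; `decode_cmd` is a hypothetical helper, not part of any kernel tool.

```python
import re

# Only the opcodes that show up in this thread; deliberately incomplete.
ATA_COMMANDS = {
    0x61: "WRITE FPDMA QUEUED",  # NCQ write
    0xB0: "SMART",
    0xEC: "IDENTIFY DEVICE",
}

# "cmd 61/08:..." -> opcode 0x61, feature 0x08 (assumed field order).
# \s+ also tolerates a line break between "cmd" and the bytes.
CMD_RE = re.compile(r"cmd\s+([0-9a-f]{2})/([0-9a-f]{2}):")


def decode_cmd(line):
    """Return (opcode, feature, name) for a libata 'cmd' line, else None."""
    match = CMD_RE.search(line)
    if match is None:
        return None
    opcode = int(match.group(1), 16)
    feature = int(match.group(2), 16)
    return opcode, feature, ATA_COMMANDS.get(opcode, "unknown")


if __name__ == "__main__":
    for line in (
        "ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out",
        "ata6.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0",
        "ata13.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in",
    ):
        print(decode_cmd(line))
```

Run against the logs above, this names the timed-out command in the sata_mv reports as an NCQ write (0x61) and in the ata5/ata6/ata13 reports as SMART (0xb0) or IDENTIFY DEVICE (0xec).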
Well mine didn't take long! Two freezes right on boot with 2.6.27.4-19.fc9.x86_64 #1 SMP Thu Oct 30 19:30:01 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux...
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out
res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: hard resetting link
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: max_sectors limited to 256 for NCQ
ata1.00: max_sectors limited to 256 for NCQ
ata1.00: configured for UDMA/133
ata1: EH complete
sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out
res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: hard resetting link
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: max_sectors limited to 256 for NCQ
ata1.00: max_sectors limited to 256 for NCQ
ata1.00: configured for UDMA/133
ata1: EH complete
I turned off write caching, which I assume will work based on my previous experience...
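For reference, the workaround amounts to running hdparm -W0 once per RAID member drive. A minimal sketch of scripting that follows; the helper name `disable_write_cache` and the device list are made up for illustration, and it defaults to a dry run so nothing is changed until the commands have been reviewed.

```python
import subprocess


def disable_write_cache(devices, dry_run=True):
    """Turn off the drive write cache with 'hdparm -W0' on each device.

    With dry_run=True (the default) nothing is executed; the commands that
    would be run are only returned so they can be checked first.
    """
    commands = [["hdparm", "-W0", dev] for dev in devices]
    if not dry_run:
        for cmd in commands:
            # Needs root; raises CalledProcessError if hdparm reports failure.
            subprocess.run(cmd, check=True)
    return commands


# Hypothetical member list; substitute the drives in your own array.
for cmd in disable_write_cache(["/dev/sda", "/dev/sdb"]):
    print(" ".join(cmd))
```

Calling it with dry_run=False applies the setting. The setting may not survive a power cycle on every drive, so it may need to be reapplied at boot; whether it has taken effect can be checked against the "Write cache: disabled" line that sd prints in dmesg.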
Running a 7-disk RAID 5 array with the following card: SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
I saw discussion of this on the linux-kernel mailing list, and someone mentioned they were seeing my same issue with the Super Micro AOC-SAT2-MV8. That's also the card I'm using. The file system is XFS. On heavy transfers I'm seeing a lot of this. I've been getting it since late August. Not going to lie, I'm using Ubuntu. See my initial bug report here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/263160/
If you read down, you'll see I _WAS_ using a RHEL-based distro (2.6.18 32-bit) just fine, and then I moved to Ubuntu (2.6.27.2 64-bit) and started getting these issues.
-- Since this posting, I've upgraded to 2.6.27-7 and it's now gotten so bad that it's desync'd my RAID on a transfer. I'm now worried about losing the data and have completely disconnected the drives. I'm not going to risk a rebuild without these issues fixed. I really wish we could figure this out after 2 months of reported problems. I'm not sure if the Red Hat Bugzilla is the right place to report this, but if someone replies I'll provide any information that I can.
dmesg:
[11285.918535] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11285.918567] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11285.918568] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11285.918619] ata9.00: status: { DRDY }
[11285.918635] ata9: hard resetting link
[11286.420039] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11286.460065] ata9.00: max_sectors limited to 256 for NCQ
[11286.520054] ata9.00: max_sectors limited to 256 for NCQ
[11286.520059] ata9.00: configured for UDMA/133
[11286.520077] ata9: EH complete
[11286.520119] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
[11286.520132] sd 8:0:0:0: [sdd] Write Protect is off
[11286.520134] sd 8:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[11286.520154] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11326.988529] ata8.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11326.988554] ata8.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11326.988555] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11326.988606] ata8.00: status: { DRDY }
[11326.988623] ata8: hard resetting link
[11327.500037] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11327.580053] ata8.00: max_sectors limited to 256 for NCQ
[11327.657199] ata8.00: max_sectors limited to 256 for NCQ
[11327.657202] ata8.00: configured for UDMA/133
[11327.657207] ata8: EH complete
[11327.657257] sd 7:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
[11327.657272] sd 7:0:0:0: [sdc] Write Protect is off
[11327.657273] sd 7:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[11327.657296] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11377.938532] ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11377.938557] ata7.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11377.938558] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11377.938608] ata7.00: status: { DRDY }
[11377.938624] ata7: hard resetting link
[11378.440037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11378.520056] ata7.00: max_sectors limited to 256 for NCQ
[11378.600065] ata7.00: max_sectors limited to 256 for NCQ
[11378.600068] ata7.00: configured for UDMA/133
[11378.600073] ata7: EH complete
[11378.600120] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[11378.600133] sd 6:0:0:0: [sdb] Write Protect is off
[11378.600135] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[11378.600155] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11711.718523] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11711.718548] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11711.718549] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11711.718600] ata9.00: status: { DRDY }
[11711.718616] ata9: hard resetting link
[11712.220041] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11712.260058] ata9.00: max_sectors limited to 256 for NCQ
[11712.320057] ata9.00: max_sectors limited to 256 for NCQ
[11712.320066] ata9.00: configured for UDMA/133
[11712.320072] ata9: EH complete
[11712.320112] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
[11712.320125] sd 8:0:0:0: [sdd] Write Protect is off
[11712.320127] sd 8:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[11712.320148] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11849.328524] ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11849.328549] ata7.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11849.328549] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11849.328600] ata7.00: status: { DRDY }
[11849.328617] ata7: hard resetting link
[11849.830037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11849.910070] ata7.00: max_sectors limited to 256 for NCQ
[11849.990053] ata7.00: max_sectors limited to 256 for NCQ
[11849.990057] ata7.00: configured for UDMA/133
[11849.990069] ata7: EH complete
[11849.990109] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[11849.990123] sd 6:0:0:0: [sdb] Write Protect is off
[11849.990125] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[11849.990147] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11909.629773] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11909.629797] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11909.629798] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11909.629849] ata9.00: status: { DRDY }
[11909.629865] ata9: hard resetting link
[11910.131295] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11910.180068] ata9.00: max_sectors limited to 256 for NCQ
[11910.231316] ata9.00: max_sectors limited to 256 for NCQ
[11910.231319] ata9.00: configured for UDMA/133
[11910.231327] ata9: EH complete
[11910.231381] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
[11910.231394] sd 8:0:0:0: [sdd] Write Protect is off
[11910.231396] sd 8:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[11910.231417] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11996.729773] ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11996.729797] ata7.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11996.729798] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11996.729848] ata7.00: status: { DRDY }
[11996.729865] ata7: hard resetting link
[11997.231291] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11997.311308] ata7.00: max_sectors limited to 256 for NCQ
[11997.391306] ata7.00: max_sectors limited to 256 for NCQ
[11997.391316] ata7.00: configured for UDMA/133
[11997.391322] ata7: EH complete
[11997.391366] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[11997.391378] sd 6:0:0:0: [sdb] Write Protect is off
[11997.391380] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[11997.391400] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FU
/var/log/messages:
Aug 30 20:12:43 isis kernel: [11285.918635] ata9: hard resetting link
Aug 30 20:12:43 isis kernel: [11286.420039] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:12:43 isis kernel: [11286.460065] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:12:43 isis kernel: [11286.520054] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:12:43 isis kernel: [11286.520059] ata9.00: configured for UDMA/133
Aug 30 20:12:43 isis kernel: [11286.520077] ata9: EH complete
Aug 30 20:12:43 isis kernel: [11286.520119] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:12:43 isis kernel: [11286.520132] sd 8:0:0:0: [sdd] Write Protect is off
Aug 30 20:12:43 isis kernel: [11286.520154] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:13:24 isis kernel: [11326.988623] ata8: hard resetting link
Aug 30 20:13:24 isis kernel: [11327.500037] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:13:24 isis kernel: [11327.580053] ata8.00: max_sectors limited to 256 for NCQ
Aug 30 20:13:24 isis kernel: [11327.657199] ata8.00: max_sectors limited to 256 for NCQ
Aug 30 20:13:24 isis kernel: [11327.657202] ata8.00: configured for UDMA/133
Aug 30 20:13:24 isis kernel: [11327.657207] ata8: EH complete
Aug 30 20:13:24 isis kernel: [11327.657257] sd 7:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:13:24 isis kernel: [11327.657272] sd 7:0:0:0: [sdc] Write Protect is off
Aug 30 20:13:24 isis kernel: [11327.657296] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:14:15 isis kernel: [11377.938624] ata7: hard resetting link
Aug 30 20:14:15 isis kernel: [11378.440037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:14:15 isis kernel: [11378.520056] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:14:15 isis kernel: [11378.600065] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:14:15 isis kernel: [11378.600068] ata7.00: configured for UDMA/133
Aug 30 20:14:15 isis kernel: [11378.600073] ata7: EH complete
Aug 30 20:14:15 isis kernel: [11378.600120] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:14:15 isis kernel: [11378.600133] sd 6:0:0:0: [sdb] Write Protect is off
Aug 30 20:14:15 isis kernel: [11378.600155] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:19:48 isis kernel: [11711.718616] ata9: hard resetting link
Aug 30 20:19:49 isis kernel: [11712.220041] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:19:49 isis kernel: [11712.260058] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:19:49 isis kernel: [11712.320057] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:19:49 isis kernel: [11712.320066] ata9.00: configured for UDMA/133
Aug 30 20:19:49 isis kernel: [11712.320072] ata9: EH complete
Aug 30 20:19:49 isis kernel: [11712.320112] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:19:49 isis kernel: [11712.320125] sd 8:0:0:0: [sdd] Write Protect is off
Aug 30 20:19:49 isis kernel: [11712.320148] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:22:06 isis kernel: [11849.328617] ata7: hard resetting link
Aug 30 20:22:06 isis kernel: [11849.830037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:22:06 isis kernel: [11849.910070] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:22:07 isis kernel: [11849.990053] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:22:07 isis kernel: [11849.990057] ata7.00: configured for UDMA/133
Aug 30 20:22:07 isis kernel: [11849.990069] ata7: EH complete
Aug 30 20:22:07 isis kernel: [11849.990109] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:22:07 isis kernel: [11849.990123] sd 6:0:0:0: [sdb] Write Protect is off
Aug 30 20:22:07 isis kernel: [11849.990147] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:23:06 isis kernel: [11909.629865] ata9: hard resetting link
Aug 30 20:23:07 isis kernel: [11910.131295] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:23:07 isis kernel: [11910.180068] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:23:07 isis kernel: [11910.231316] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:23:07 isis kernel: [11910.231319] ata9.00: configured for UDMA/133
Aug 30 20:23:07 isis kernel: [11910.231327] ata9: EH complete
Aug 30 20:23:07 isis kernel: [11910.231381] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:23:07 isis kernel: [11910.231394] sd 8:0:0:0: [sdd] Write Protect is off
Aug 30 20:23:07 isis kernel: [11910.231417] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:24:33 isis kernel: [11996.729865] ata7: hard resetting link
Aug 30 20:24:34 isis kernel: [11997.231291] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:24:34 isis kernel: [11997.311308] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:24:34 isis kernel: [11997.391306] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:24:34 isis kernel: [11997.391316] ata7.00: configured for UDMA/133
Aug 30 20:24:34 isis kernel: [11997.391322] ata7: EH complete
Aug 30 20:24:34 isis kernel: [11997.391366] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:24:34 isis kernel: [11997.391378] sd 6:0:0:0: [sdb] Write Protect is off
Aug 30 20:24:34 isis kernel: [11997.391400] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
I've replaced the card and cables and I'm still getting the issue.
This card and RAID were working on RHEL last week (2.6.18 32-bit). Replaced OS (Ubuntu 64-bit), CPU (Core 2 Duo), mobo (ASUS P5K Pro). I'm really at a loss here, not sure what else to do. I stressed the other components of the system in Windows and they seemed fine. Not sure if it's the card or something with the newer kernels.
I think this problem tends to get ignored because there are so many things that can cause it (bad drives, cables, power supplies, or any combination thereof). Even with this bug, you can see that in my case disabling write caching solves the problem (not a great solution mind you, but a workaround for now), yet it didn't help Justin. BTW, disabling write caching under the new kernel works for me, as with the older kernel. It seems that the one thing we do have in common is a larger than average number of drives in RAID. I have the least at 5, you have 7, and Justin 10 I believe... When I had 4, it was difficult to get this problem to show except under heavy IO. With 5, I can simply boot...
The write cache hack around is really only relevant to that specific type of drive (and at this point appears to be a bug in the drive itself).
If it were a bug in the drive itself, wouldn't it show under most all write conditions/kernels? I never even saw this under a 4 drive RAID 5 until later kernel revisions. It was completely stable otherwise. Adding the 5th disk is what sent it over the edge with any kernel...
Not sure if you took the time to read my post on the Ubuntu bug tracker, but I'm getting the errors on both WDC and Seagate drives. Given a thread back in September about this on the linux-kernel mailing list and another reference to the MV88SX6081 8-port SATA II PCI-X Controller (Super Micro AOC-SAT2-MV8), I was leaning towards that being the cause...
That is another possibility (the 88SX6081 controller), although that isn't what Justin is using. Justin's problem seems hard to create, whereas mine and yours is hard to avoid (based on your "...it's now gotten so bad that it's desync'd my RAID on a transfer..."). Could be two different issues, but glad you see it with different drives...
Just tested 2.6.27.5-37.fc9.x86_64 and same thing...
"If it were a bug in the drive itself, wouldn't it show under most all write conditions/kernels" From past experience of drive firmware funnies, probably not. If they were simple to cause, the vendor would have discovered them before shipping product. Also, btw, I don't see any reason to believe the various bugs muddled together here are at all connected.
Searching on the controller and "frozen", I found an interesting comment from Mark Lord, where he said this in response to freezing issues with the Marvell controller: "My recollection is that the worst errata are for the 60x1 chips on PCI-X." (which happens to be my situation). He also mentioned that he was going to be resuming work on sata_mv as of October 28th. Original post here: http://webui.sourcelabs.com/kernel/issues/10321 Can someone who knows him point him in this direction while he is working on incorporating errata into the driver? I'd hate to miss out on an opportunity to get this resolved!
I think I found his email address (at least it didn't bounce yet), so we'll see...
I did a clean install of F10 and still see the same problem. It also has the same solution of disabling write cache. I see this under F10/ext4 now though: kernel: JBD: barrier-based sync failed on md3:8 - disabling barriers
So I disabled them in fstab for now. Not to mix that in with this bug though... I'm sure that is likely something else...
Pretty fed up with people saying this could be so many different issues. So much so that I finally decided to risk my data to prove it... Read the following.
***___This has got to be the card / chipset / sata_mv driver._____***
Short and simple version of my issues:
- This does not depend on drive types
- Appears to be caused by MV88SX6081 chipset
- Could be a problem in SATA_MV driver
- I need replacement controller suggestions
Details to all non believers (it’s not a power / hardware issue):
I moved 5 of the 7 drives to my onboard controller (have 6 sata ports on the mobo, last was used by the system drive).
Left 2 of the western digital drives on the MV88SX6081 8-port SATA II:
- sdg
- sdh
On the advice of some people through email, I unplugged everything that wasn't needed. They assumed that it could have been power, given the number of drives I had in the machine. What was left on a Corsair TX750W power supply:
- mobo (c2d, 4gb ram)
- 7 sata raid drives - spread across multiple power supply rails
- 1 sata system drive
- Super Micro SAT2-MV8 (MV88SX6081 8-port SATA II)
- intel pcie 10/100/1000 network card
Then I replaced the SATA cables one more time with old cables I knew worked. I also threw in a brand new controller card (have a few spares lying around).
I brought everything up and upgraded to:
Then I started to rebuild the raid. Everything went fine, no freezes.
**This was the first indication that this only happens under heavy load on multiple ports as has been brought up before.
So then I started copying data over. About 180 GB in, the card hard reset both of the drives attached to it and knocked them both out of the raid.
**This was also significantly different from before when I was utilizing all the ports as it seemed to work great for quite some time, it wasn't until I was well into the process that the card finally gave up.
See the attached dmesg and /var/log/messages. This is the 2nd time I’ve had this card degrade my raid and almost give me a heart attack.
The cards are going in the trash at this point. I'm open to suggestions for a possible replacement. I don’t need a hardware raid card, just a decent controller with great *nix support and lots of ports.
::sigh:: I don’t know who to contact but this is the end of the line for me with this controller and hopefully my issues.
Attempting to get my data back as we speak with 2 failed drives in a raid 5... wonderful times.
dmesg of the event:
[ 1061.040118] md: recovery of RAID array md1
[ 1061.040120] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 1061.040122] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[ 1061.040126] md: using 128k window, over a total of 488383744 blocks.
[11208.852220] md: md1: recovery done.
[11209.020072] RAID5 conf printout:
[11209.020076] --- rd:7 wd:7
[11209.020079] disk 0, o:1, dev:sdd1
[11209.020080] disk 1, o:1, dev:sdb1
[11209.020081] disk 2, o:1, dev:sdh1
[11209.020082] disk 3, o:1, dev:sdc1
[11209.020083] disk 4, o:1, dev:sdf1
[11209.020084] disk 5, o:1, dev:sde1
[11209.020085] disk 6, o:1, dev:sdg1
[19844.431690] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[19844.433148] SGI XFS Quota Management subsystem
[19844.442507] Filesystem "md1": Disabling barriers, trial barrier write failed
[19844.442658] XFS mounting filesystem md1
[19844.893398] Ending clean XFS mount for filesystem: md1
[27027.170016] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[27027.170041] ata5.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[27027.170041] res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[27027.170083] ata5.00: status: { DRDY }
[27027.170099] ata5: hard resetting link
[27027.680034] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[27027.720050] ata5.00: max_sectors limited to 256 for NCQ
[27027.780047] ata5.00: max_sectors limited to 256 for NCQ
[27027.780050] ata5.00: configured for UDMA/133
[27027.780055] end_request: I/O error, dev sdg, sector 73
[27027.780073] md: super_written gets error=-5, uptodate=0
[27027.780076] raid5: Disk failure on sdg1, disabling device.
[27027.780077] raid5: Operation continuing on 6 devices.
[27027.780117] ata5: EH complete
[27027.780674] sd 4:0:0:0: [sdg] 976773168 512-byte hardware sectors (500108 MB)
[27027.780800] sd 4:0:0:0: [sdg] Write Protect is off
[27027.780803] sd 4:0:0:0: [sdg] Mode Sense: 00 3a 00 00
[27027.781038] sd 4:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[27057.930015] ata12.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[27057.930039] ata12.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[27057.930040] res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[27057.930081] ata12.00: status: { DRDY }
[27057.930098] ata12: hard resetting link
[27058.440033] ata12: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[27058.480049] ata12.00: max_sectors limited to 256 for NCQ
[27058.540047] ata12.00: max_sectors limited to 256 for NCQ
[27058.540050] ata12.00: configured for UDMA/133
[27058.540055] end_request: I/O error, dev sdh, sector 71
[27058.540072] md: super_written gets error=-5, uptodate=0
[27058.540075] raid5: Disk failure on sdh1, disabling device.
[27058.540076] raid5: Operation continuing on 5 devices.
[27058.540113] ata12: EH complete
[27058.540754] sd 11:0:0:0: [sdh] 976773168 512-byte hardware sectors (500108 MB)
[27058.540879] sd 11:0:0:0: [sdh] Write Protect is off
[27058.540882] sd 11:0:0:0: [sdh] Mode Sense: 00 3a 00 00
[27058.541070] sd 11:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[27058.584017] RAID5 conf printout:
[27058.584020] --- rd:7 wd:5
[27058.584022] disk 0, o:1, dev:sdd1
[27058.584023] disk 1, o:1, dev:sdb1
[27058.584024] disk 2, o:0, dev:sdh1
[27058.584025] disk 3, o:1, dev:sdc1
[27058.584027] disk 4, o:1, dev:sdf1
[27058.584028] disk 5, o:1, dev:sde1
[27058.584029] disk 6, o:0, dev:sdg1
[27061.521245] BUG: soft lockup - CPU#1 stuck for 61s! [smbd:28171]
[27061.521251] Modules linked in: xfs aes_x86_64 aes_generic ecb crypto_blkcipher ecryptfs ipv6 af_packet iptable_filter ip_tables x_tables ac sbp2 parport_pc lp parport loop psmouse pcspkr serio_raw iTCO_wdt iTCO_vendor_support evdev button intel_agp snd_hda_intel snd_pcm shpchp snd_timer pci_hotplug snd soundcore snd_page_alloc ext3 jbd mbcache sd_mod crc_t10dif sg pata_acpi pata_marvell usbhid hid ohci1394 ieee1394 sata_mv ata_generic ata_piix libata scsi_mod dock sky2 e1000e ehci_hcd uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_log dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
[27061.521251] CPU 1:
[27061.521251] Modules linked in: xfs aes_x86_64 aes_generic ecb crypto_blkcipher ecryptfs ipv6 af_packet iptable_filter ip_tables x_tables ac sbp2 parport_pc lp parport loop psmouse pcspkr serio_raw iTCO_wdt iTCO_vendor_support evdev button intel_agp snd_hda_intel snd_pcm shpchp snd_timer pci_hotplug snd soundcore snd_page_alloc ext3 jbd mbcache sd_mod crc_t10dif sg pata_acpi pata_marvell usbhid hid ohci1394 ieee1394 sata_mv ata_generic ata_piix libata scsi_mod dock sky2 e1000e ehci_hcd uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_log dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
[27061.521251] Pid: 28171, comm: smbd Not tainted 2.6.27-9-server #1
[27061.521251] RIP: 0010:[<ffffffff802abf0c>] [<ffffffff802abf0c>] find_get_pages+0x6c/0x110
[27061.521251] RSP: 0018:ffff880129453358 EFLAGS: 00000246
[27061.521251] RAX: ffff880128d89330 RBX: ffff880129453398 RCX: 0000000000000002
[27061.521251] RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffffe200022e9e80
[27061.521251] RBP: ffff880129453308 R08: ffffe200009df6c8 R09: 0000000000000005
[27061.521251] R10: 0000000000000037 R11: 00000000001c5778 R12: ffffffff802b6b29
[27061.521251] R13: ffff880123a107d0 R14: ffffe20001c6f6c0 R15: 0000000000000286
[27061.521251] FS: 00007fb72cdf6700(0000) GS:ffff88012fc02980(0000) knlGS:0000000000000000
[27061.521251] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[27061.521251] CR2: 00007f1648629000 CR3: 000000012956d000 CR4: 00000000000006e0
[27061.521251] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[27061.521251] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[27061.521251]
[27061.521251] Call Trace:
[27061.521251] [<ffffffff802abee3>] ? find_get_pages+0x43/0x110
[27061.521251] [<ffffffff802b6984>] ? pagevec_lookup+0x24/0x30
[27061.521251] [<ffffffffa04e100d>] ? xfs_cluster_write+0xad/0x180 [xfs]
[27061.521251] [<ffffffffa04e1578>] ? xfs_page_state_convert+0x498/0x760 [xfs]
[27061.521251] [<ffffffffa04e19a1>] ? xfs_vm_writepage+0x71/0x120 [xfs]
[27061.521251] [<ffffffff802b9274>] ? pageout+0x124/0x270
[27061.521251] [<ffffffff802ab06a>] ? page_waitqueue+0xa/0x90
[27061.521251] [<ffffffff802b986d>] ? shrink_page_list+0x34d/0x530
[27061.521251] [<ffffffff802b8e49>] ? __isolate_lru_page+0x79/0xb0
[27061.521251] [<ffffffff802b8f0a>] ? isolate_lru_pages+0x8a/0x220
[27061.521251] [<ffffffff802b9bf2>] ? shrink_inactive_list+0x1a2/0x4b0
[27061.521251] [<ffffffff802b9f7b>] ? shrink_zone+0x7b/0x160
[27061.521251] [<ffffffff802ba0ed>] ? shrink_zones+0x8d/0x150
[27061.521251] [<ffffffff802ba236>] ? do_try_to_free_pages+0x86/0x2e0
[27061.521251] [<ffffffff802ba587>] ? try_to_free_pages+0x67/0x70
[27061.521251] [<ffffffff802b90a0>] ? isolate_pages_global+0x0/0x50
[27061.521251] [<ffffffff802b28b1>] ? __alloc_pages_internal+0x241/0x510
[27061.521251] [<ffffffff802d565d>] ? alloc_pages_current+0xad/0x110
[27061.521251] [<ffffffff802ac477>] ? __page_cache_alloc+0x67/0x80
[27061.521251] [<ffffffff802ad0b3>] ? __grab_cache_page+0x63/0xb0
[27061.521251] [<ffffffff80316a59>] ? block_write_begin+0x89/0xf0
[27061.521251] [<ffffffffa04e04ca>] ? xfs_vm_write_begin+0x2a/0x30 [xfs]
[27061.521251] [<ffffffffa04e0040>] ? xfs_get_blocks+0x0/0x20 [xfs]
[27061.521251] [<ffffffff802ab7ac>] ? generic_perform_write+0xbc/0x1c0
[27061.521251] [<ffffffff802ad512>] ? generic_file_buffered_write+0x92/0x170
[27061.521251] [<ffffffffa04e92d3>] ? xfs_write+0x6b3/0x9b0 [xfs]
[27061.521251] [<ffffffff80385a69>] ? apparmor_socket_recvmsg+0x19/0x20
[27061.521251] [<ffffffff803aaf70>] ? memset_c+0x20/0x30
[27061.521251] [<ffffffffa04e4c88>] ? xfs_file_aio_write+0x58/0x60 [xfs]
[27061.521251] [<ffffffff802e9559>] ? do_sync_write+0xf9/0x140
[27061.521251] [<ffffffff802e9699>] ? do_sync_read+0xf9/0x140
[27061.521251] [<ffffffff80266fb0>] ? autoremove_wake_function+0x0/0x40
[27061.521251] [<ffffffff80386821>] ? aa_file_permission+0x21/0xf0
[27061.521251] [<ffffffff80386948>] ? apparmor_file_permission+0x28/0x30
[27061.521251] [<ffffffff803613e6>] ? security_file_permission+0x16/0x20
[27061.521251] [<ffffffff802e9c1b>] ? vfs_write+0xcb/0x130
[27061.521251] [<ffffffff802e9d1a>] ? sys_pwrite64+0x9a/0xa0
[27061.521251] [<ffffffff8021285a>] ? system_call_fastpath+0x16/0x1b
[27061.521251]
[27095.080066] RAID5 conf printout:
[27095.080071] --- rd:7 wd:5
[27095.080074] disk 0, o:1, dev:sdd1
[27095.080076] disk 1, o:1, dev:sdb1
[27095.080077] disk 2, o:0, dev:sdh1
[27095.080079] disk 3, o:1, dev:sdc1
[27095.080080] disk 4, o:1, dev:sdf1
[27095.080082] disk 5, o:1, dev:sde1
[27095.080090] RAID5 conf printout:
[27095.080091] --- rd:7 wd:5
[27095.080092] disk 0, o:1, dev:sdd1
[27095.080093] disk 1, o:1, dev:sdb1
[27095.080094] disk 2, o:0, dev:sdh1
[27095.080095] disk 3, o:1, dev:sdc1
[27095.080097] disk 4, o:1, dev:sdf1
[27095.080098] disk 5, o:1, dev:sde1
[27095.140011] RAID5 conf printout:
[27095.140017] --- rd:7 wd:5
[27095.140019] disk 0, o:1, dev:sdd1
[27095.140022] disk 1, o:1, dev:sdb1
[27095.140024] disk 3, o:1, dev:sdc1
[27095.140026] disk 4, o:1, dev:sdf1
[27095.140027] disk 5, o:1, dev:sde1
[27095.140511] Buffer I/O error on device md1, logical block 455870845
[27095.140545] lost page write due to I/O error on md1
[27095.140550] Buffer I/O error on device md1, logical block 455870846
[27095.140567] lost page write due to I/O error on md1
[27095.140569] Buffer I/O error on device md1, logical block 455870847
[27095.140585] lost page write due to I/O error on md1
[27095.140587] Buffer I/O error on device md1, logical block 455870848
[27095.140604] lost page write due to I/O error on md1
[27095.140606] Buffer I/O error on device md1, logical block 455870849
[27095.140622] lost page write due to I/O error on md1
[27095.140624] Buffer I/O error on device md1, logical block 455870850
[27095.140641] lost page write due to I/O error on md1
[27095.140642] Buffer I/O error on device md1, logical block 455870851
[27095.140659] lost page write due to I/O error on md1
[27095.140661] Buffer I/O error on device md1, logical block 455870852
[27095.140677] lost page write due to I/O error on md1
[27095.140679] Buffer I/O error on device md1, logical block 455870853
[27095.140696] lost page write due to I/O error on md1
[27095.140697] Buffer I/O error on device md1, logical block 455870854
[27095.140714] lost page write due to I/O error on md1
[27095.141327] I/O error in filesystem ("md1") meta-data dev md1 block 0xaeaa9810 ("xlog_iodone") error 5 buf count 12288
[27095.141359] xfs_force_shutdown(md1,0x2) called from line 1056 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_log.c. Return address = 0xffffffffa04c80d3
[27095.141380] Filesystem "md1": Log I/O Error Detected. Shutting down filesystem: md1
[27095.141407] Please umount the filesystem, and rectify the problem(s)
[27100.140015] Filesystem "md1": xfs_log_force: error 5 returned.
[27113.440011] Filesystem "md1": xfs_log_force: error 5 returned.
[27143.440010] Filesystem "md1": xfs_log_force: error 5 returned.
[27173.440009] Filesystem "md1": xfs_log_force: error 5 returned.
[27203.440012] Filesystem "md1": xfs_log_force: error 5 returned.
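For anyone trying to read the `cmd` lines in the dmesg output above: the first byte before the slash is the ATA command opcode. Below is a minimal Python sketch for decoding it. The helper name `decode_cmd_line` and the small opcode table are illustrative assumptions of mine, not part of libata or any tool mentioned in this thread. For the two timeouts above it reports FLUSH CACHE EXT (0xea), which suggests the drives timed out on a cache flush rather than on a data transfer.

```python
import re

# Small lookup of ATA command opcodes that show up in this thread.
# The table is deliberately incomplete; it is only an illustration.
ATA_OPCODES = {
    0x60: "READ FPDMA QUEUED (NCQ read)",
    0x61: "WRITE FPDMA QUEUED (NCQ write)",
    0xCA: "WRITE DMA",
    0xEA: "FLUSH CACHE EXT",
    0xEC: "IDENTIFY DEVICE",
}

# Matches e.g. "ata5.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0"
CMD_RE = re.compile(r"(ata\d+\.\d+): cmd ([0-9a-f]{2})/")


def decode_cmd_line(line):
    """Return (device, opcode name) for a libata 'cmd' line, or None."""
    match = CMD_RE.search(line)
    if match is None:
        return None
    opcode = int(match.group(2), 16)
    name = ATA_OPCODES.get(opcode, "unknown opcode 0x%02x" % opcode)
    return match.group(1), name


if __name__ == "__main__":
    sample = [
        "[27027.170041] ata5.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0",
        "[27057.930039] ata12.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0",
    ]
    for line in sample:
        print(decode_cmd_line(line))
```

Lines that are not `cmd` lines (resets, md messages) simply return `None`, so the whole dmesg can be fed through it.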
/var/log/messages:
Nov 30 18:39:24 isis kernel: [ 1061.040118] md: recovery of RAID array md1
Nov 30 18:39:24 isis kernel: [ 1061.040120] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Nov 30 18:39:24 isis kernel: [ 1061.040122] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Nov 30 18:39:24 isis kernel: [ 1061.040126] md: using 128k window, over a total of 488383744 blocks.
Nov 30 19:02:08 isis -- MARK --
Nov 30 19:22:08 isis -- MARK --
Nov 30 19:42:08 isis -- MARK --
Nov 30 20:02:08 isis -- MARK --
Nov 30 20:22:08 isis -- MARK --
Nov 30 20:42:08 isis -- MARK --
Nov 30 21:02:08 isis -- MARK --
Nov 30 21:22:08 isis -- MARK --
Nov 30 21:28:32 isis kernel: [11208.852220] md: md1: recovery done.
Nov 30 21:28:32 isis kernel: [11209.020072] RAID5 conf printout:
Nov 30 21:28:32 isis kernel: [11209.020076] --- rd:7 wd:7
Nov 30 21:28:32 isis kernel: [11209.020079] disk 0, o:1, dev:sdd1
Nov 30 21:28:32 isis kernel: [11209.020080] disk 1, o:1, dev:sdb1
Nov 30 21:28:32 isis kernel: [11209.020081] disk 2, o:1, dev:sdh1
Nov 30 21:28:32 isis kernel: [11209.020082] disk 3, o:1, dev:sdc1
Nov 30 21:28:32 isis kernel: [11209.020083] disk 4, o:1, dev:sdf1
Nov 30 21:28:32 isis kernel: [11209.020084] disk 5, o:1, dev:sde1
Nov 30 21:28:32 isis kernel: [11209.020085] disk 6, o:1, dev:sdg1
Nov 30 21:42:08 isis -- MARK --
Nov 30 22:02:08 isis -- MARK --
Nov 30 22:22:08 isis -- MARK --
Nov 30 22:42:08 isis -- MARK --
Nov 30 23:02:08 isis -- MARK --
Nov 30 23:22:08 isis -- MARK --
Nov 30 23:42:08 isis -- MARK --
Nov 30 23:52:27 isis kernel: [19844.431690] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
Nov 30 23:52:27 isis kernel: [19844.433148] SGI XFS Quota Management subsystem
Nov 30 23:52:27 isis kernel: [19844.442507] Filesystem "md1": Disabling barriers, trial barrier write failed
Nov 30 23:52:27 isis kernel: [19844.442658] XFS mounting filesystem md1
Dec 1 00:22:08 isis -- MARK --
Dec 1 00:42:08 isis -- MARK --
Dec 1 01:02:08 isis -- MARK --
Dec 1 01:22:08 isis -- MARK --
Dec 1 01:42:08 isis -- MARK --
Dec 1 01:52:10 isis kernel: [27027.170099] ata5: hard resetting link
Dec 1 01:52:10 isis kernel: [27027.680034] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 1 01:52:11 isis kernel: [27027.720050] ata5.00: max_sectors limited to 256 for NCQ
Dec 1 01:52:11 isis kernel: [27027.780047] ata5.00: max_sectors limited to 256 for NCQ
Dec 1 01:52:11 isis kernel: [27027.780050] ata5.00: configured for UDMA/133
Dec 1 01:52:11 isis kernel: [27027.780073] md: super_written gets error=-5, uptodate=0
Dec 1 01:52:11 isis kernel: [27027.780117] ata5: EH complete
Dec 1 01:52:11 isis kernel: [27027.780674] sd 4:0:0:0: [sdg] 976773168 512-byte hardware sectors (500108 MB)
Dec 1 01:52:11 isis kernel: [27027.780800] sd 4:0:0:0: [sdg] Write Protect is off
Dec 1 01:52:11 isis kernel: [27027.781038] sd 4:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 1 01:52:41 isis kernel: [27057.930098] ata12: hard resetting link
Dec 1 01:52:41 isis kernel: [27058.440033] ata12: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 1 01:52:41 isis kernel: [27058.480049] ata12.00: max_sectors limited to 256 for NCQ
Dec 1 01:52:41 isis kernel: [27058.540047] ata12.00: max_sectors limited to 256 for NCQ
Dec 1 01:52:41 isis kernel: [27058.540050] ata12.00: configured for UDMA/133
Dec 1 01:52:41 isis kernel: [27058.540072] md: super_written gets error=-5, uptodate=0
Dec 1 01:52:41 isis kernel: [27058.540113] ata12: EH complete
Dec 1 01:52:41 isis kernel: [27058.540754] sd 11:0:0:0: [sdh] 976773168 512-byte hardware sectors (500108 MB)
Dec 1 01:52:41 isis kernel: [27058.540879] sd 11:0:0:0: [sdh] Write Protect is off
Dec 1 01:52:41 isis kernel: [27058.541070] sd 11:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 1 01:52:41 isis kernel: [27058.584017] RAID5 conf printout:
Dec 1 01:52:41 isis kernel: [27058.584020] --- rd:7 wd:5
Dec 1 01:52:41 isis kernel: [27058.584022] disk 0, o:1, dev:sdd1
Dec 1 01:52:41 isis kernel: [27058.584023] disk 1, o:1, dev:sdb1
Dec 1 01:52:41 isis kernel: [27058.584024] disk 2, o:0, dev:sdh1
Dec 1 01:52:41 isis kernel: [27058.584025] disk 3, o:1, dev:sdc1
Dec 1 01:52:41 isis kernel: [27058.584027] disk 4, o:1, dev:sdf1
Dec 1 01:52:41 isis kernel: [27058.584028] disk 5, o:1, dev:sde1
Dec 1 01:52:41 isis kernel: [27058.584029] disk 6, o:0, dev:sdg1
Dec 1 01:52:44 isis kernel: [27061.521251] Modules linked in: xfs aes_x86_64 aes_generic ecb crypto_blkcipher ecryptfs ipv6 af_packet iptable_filter ip_tables x_tables ac sbp2 parport_pc lp parport loop psmouse pcspkr serio_raw iTCO_wdt iTCO_vendor_support evdev button intel_agp snd_hda_intel snd_pcm shpchp snd_timer pci_hotplug snd soundcore snd_page_alloc ext3 jbd mbcache sd_mod crc_t10dif sg pata_acpi pata_marvell usbhid hid ohci1394 ieee1394 sata_mv ata_generic ata_piix libata scsi_mod dock sky2 e1000e ehci_hcd uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_log dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
Dec 1 01:52:44 isis kernel: [27061.521251] CPU 1:
Dec 1 01:52:44 isis kernel: [27061.521251] Modules linked in: xfs aes_x86_64 aes_generic ecb crypto_blkcipher ecryptfs ipv6 af_packet iptable_filter ip_tables x_tables ac sbp2 parport_pc lp parport loop psmouse pcspkr serio_raw iTCO_wdt iTCO_vendor_support evdev button intel_agp snd_hda_intel snd_pcm shpchp snd_timer pci_hotplug snd soundcore snd_page_alloc ext3 jbd mbcache sd_mod crc_t10dif sg pata_acpi pata_marvell usbhid hid ohci1394 ieee1394 sata_mv ata_generic ata_piix libata scsi_mod dock sky2 e1000e ehci_hcd uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_log dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
Dec 1 01:52:44 isis kernel: [27061.521251] Pid: 28171, comm: smbd Not tainted 2.6.27-9-server #1
Dec 1 01:52:44 isis kernel: [27061.521251] RIP: 0010:[<ffffffff802abf0c>] [<ffffffff802abf0c>] find_get_pages+0x6c/0x110
Dec 1 01:52:44 isis kernel: [27061.521251] RSP: 0018:ffff880129453358 EFLAGS: 00000246
Dec 1 01:52:44 isis kernel: [27061.521251] RAX: ffff880128d89330 RBX: ffff880129453398 RCX: 0000000000000002
Dec 1 01:52:44 isis kernel: [27061.521251] RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffffe200022e9e80
Dec 1 01:52:44 isis kernel: [27061.521251] RBP: ffff880129453308 R08: ffffe200009df6c8 R09: 0000000000000005
Dec 1 01:52:44 isis kernel: [27061.521251] R10: 0000000000000037 R11: 00000000001c5778 R12: ffffffff802b6b29
Dec 1 01:52:44 isis kernel: [27061.521251] R13: ffff880123a107d0 R14: ffffe20001c6f6c0 R15: 0000000000000286
Dec 1 01:52:44 isis kernel: [27061.521251] FS: 00007fb72cdf6700(0000) GS:ffff88012fc02980(0000) knlGS:0000000000000000
Dec 1 01:52:44 isis kernel: [27061.521251] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 1 01:52:44 isis kernel: [27061.521251] CR2: 00007f1648629000 CR3: 000000012956d000 CR4: 00000000000006e0
Dec 1 01:52:44 isis kernel: [27061.521251] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 1 01:52:44 isis kernel: [27061.521251] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 1 01:52:44 isis kernel: [27061.521251]
Dec 1 01:52:44 isis kernel: [27061.521251] Call Trace:
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802abee3>] ? find_get_pages+0x43/0x110
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802b6984>] ? pagevec_lookup+0x24/0x30
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffffa04e100d>] ? xfs_cluster_write+0xad/0x180 [xfs]
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffffa04e1578>] ? xfs_page_state_convert+0x498/0x760 [xfs]
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffffa04e19a1>] ? xfs_vm_writepage+0x71/0x120 [xfs]
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802b9274>] ? pageout+0x124/0x270
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802ab06a>] ? page_waitqueue+0xa/0x90
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802b986d>] ? shrink_page_list+0x34d/0x530
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802b8e49>] ? __isolate_lru_page+0x79/0xb0
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802b8f0a>] ? isolate_lru_pages+0x8a/0x220
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802b9bf2>] ? shrink_inactive_list+0x1a2/0x4b0
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802b9f7b>] ? shrink_zone+0x7b/0x160
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802ba0ed>] ? shrink_zones+0x8d/0x150
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802ba236>] ? do_try_to_free_pages+0x86/0x2e0
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802ba587>] ? try_to_free_pages+0x67/0x70
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802b90a0>] ? isolate_pages_global+0x0/0x50
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802b28b1>] ? __alloc_pages_internal+0x241/0x510
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802d565d>] ? alloc_pages_current+0xad/0x110
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802ac477>] ? __page_cache_alloc+0x67/0x80
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802ad0b3>] ? __grab_cache_page+0x63/0xb0
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff80316a59>] ? block_write_begin+0x89/0xf0
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffffa04e04ca>] ? xfs_vm_write_begin+0x2a/0x30 [xfs]
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffffa04e0040>] ? xfs_get_blocks+0x0/0x20 [xfs]
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802ab7ac>] ? generic_perform_write+0xbc/0x1c0
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802ad512>] ? generic_file_buffered_write+0x92/0x170
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffffa04e92d3>] ? xfs_write+0x6b3/0x9b0 [xfs]
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff80385a69>] ? apparmor_socket_recvmsg+0x19/0x20
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff803aaf70>] ? memset_c+0x20/0x30
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffffa04e4c88>] ? xfs_file_aio_write+0x58/0x60 [xfs]
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802e9559>] ? do_sync_write+0xf9/0x140
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802e9699>] ? do_sync_read+0xf9/0x140
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff80266fb0>] ? autoremove_wake_function+0x0/0x40
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff80386821>] ? aa_file_permission+0x21/0xf0
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff80386948>] ? apparmor_file_permission+0x28/0x30
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff803613e6>] ? security_file_permission+0x16/0x20
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802e9c1b>] ? vfs_write+0xcb/0x130
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff802e9d1a>] ? sys_pwrite64+0x9a/0xa0
Dec 1 01:52:44 isis kernel: [27061.521251] [<ffffffff8021285a>] ? system_call_fastpath+0x16/0x1b
Dec 1 01:52:44 isis kernel: [27061.521251]
Dec 1 01:53:18 isis kernel: [27095.080066] RAID5 conf printout:
Dec 1 01:53:18 isis kernel: [27095.080071] --- rd:7 wd:5
Dec 1 01:53:18 isis kernel: [27095.080074] disk 0, o:1, dev:sdd1
Dec 1 01:53:18 isis kernel: [27095.080076] disk 1, o:1, dev:sdb1
Dec 1 01:53:18 isis kernel: [27095.080077] disk 2, o:0, dev:sdh1
Dec 1 01:53:18 isis kernel: [27095.080079] disk 3, o:1, dev:sdc1
Dec 1 01:53:18 isis kernel: [27095.080080] disk 4, o:1, dev:sdf1
Dec 1 01:53:18 isis kernel: [27095.080082] disk 5, o:1, dev:sde1
Dec 1 01:53:18 isis kernel: [27095.080090] RAID5 conf printout:
Dec 1 01:53:18 isis kernel: [27095.080091] --- rd:7 wd:5
Dec 1 01:53:18 isis kernel: [27095.080092] disk 0, o:1, dev:sdd1
Dec 1 01:53:18 isis kernel: [27095.080093] disk 1, o:1, dev:sdb1
Dec 1 01:53:18 isis kernel: [27095.080094] disk 2, o:0, dev:sdh1
Dec 1 01:53:18 isis kernel: [27095.080095] disk 3, o:1, dev:sdc1
Dec 1 01:53:18 isis kernel: [27095.080097] disk 4, o:1, dev:sdf1
Dec 1 01:53:18 isis kernel: [27095.080098] disk 5, o:1, dev:sde1
Dec 1 01:53:18 isis kernel: [27095.140011] RAID5 conf printout:
Dec 1 01:53:18 isis kernel: [27095.140017] --- rd:7 wd:5
Dec 1 01:53:18 isis kernel: [27095.140019] disk 0, o:1, dev:sdd1
Dec 1 01:53:18 isis kernel: [27095.140022] disk 1, o:1, dev:sdb1
Dec 1 01:53:18 isis kernel: [27095.140024] disk 3, o:1, dev:sdc1
Dec 1 01:53:18 isis kernel: [27095.140026] disk 4, o:1, dev:sdf1
Dec 1 01:53:18 isis kernel: [27095.140027] disk 5, o:1, dev:sde1
Dec 1 01:53:18 isis kernel: [27095.140545] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.140567] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.140585] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.140604] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.140622] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.140641] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.140659] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.140677] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.140696] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.140714] lost page write due to I/O error on md1
Dec 1 01:53:18 isis kernel: [27095.141359] xfs_force_shutdown(md1,0x2) called from line 1056 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_log.c. Return address = 0xffffffffa04c80d3
Dec 1 01:53:23 isis kernel: [27100.140015] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:53:36 isis kernel: [27113.440011] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:54:06 isis kernel: [27143.440010] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:54:36 isis kernel: [27173.440009] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:55:06 isis kernel: [27203.440012] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:55:36 isis kernel: [27233.440011] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:56:06 isis kernel: [27263.440011] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:56:36 isis kernel: [27293.440010] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:57:06 isis kernel: [27323.440016] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:57:36 isis kernel: [27353.440015] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:58:06 isis kernel: [27383.440015] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 01:58:36 isis kernel: [27413.440016] Filesystem "md1": xfs_log_force: error 5 returned.
(this continues for a while)
Dec 1 02:12:06 isis kernel: [28223.440015] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:12:36 isis kernel: [28253.440013] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:13:06 isis kernel: [28283.440014] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:13:36 isis kernel: [28313.440013] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:06 isis kernel: [28343.440013] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:36 isis kernel: [28373.440012] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:59 isis kernel: [28395.820448] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:59 isis kernel: [28395.820456] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:59 isis kernel: [28395.820462] xfs_force_shutdown(md1,0x1) called from line 420 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_rw.c. Return address = 0xffffffffa04decc3
Dec 1 02:14:59 isis kernel: [28395.820466] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:59 isis kernel: [28395.820468] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:59 isis kernel: [28395.820471] xfs_force_shutdown(md1,0x1) called from line 420 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_rw.c. Return address = 0xffffffffa04decc3
Dec 1 02:14:59 isis kernel: [28396.669470] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:59 isis kernel: [28396.669487] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:59 isis kernel: [28396.669517] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:59 isis kernel: [28396.669525] Filesystem "md1": xfs_log_force: error 5 returned.
Dec 1 02:14:59 isis kernel: [28396.669635] Filesystem "md1": xfs_log_force: error 5 returned.
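The `--- rd:7 wd:5` lines in the RAID5 conf printouts above are what tell you the array is gone and not just degraded. Here is a rough Python sketch of that reasoning. It assumes `rd` is the number of member disks the array is configured for and `wd` is the number still working (that is my reading of the md output, not something confirmed in this thread), and the function name `raid5_state` is made up for illustration. RAID 5 survives one missing member, so anything beyond that is a failed array.

```python
import re

# Matches the summary line of a "RAID5 conf printout", e.g. "--- rd:7 wd:5"
CONF_RE = re.compile(r"--- rd:(\d+) wd:(\d+)")


def raid5_state(line):
    """Classify a RAID5 conf summary line as clean, degraded, or failed.

    Assumption: rd = configured member disks, wd = working member disks.
    Returns None for lines that are not a conf summary.
    """
    match = CONF_RE.search(line)
    if match is None:
        return None
    raid_disks = int(match.group(1))
    working_disks = int(match.group(2))
    missing = raid_disks - working_disks
    if missing == 0:
        return "clean"
    if missing == 1:
        # RAID 5 tolerates exactly one missing member.
        return "degraded"
    return "failed"


if __name__ == "__main__":
    for line in ("[11209.020076] --- rd:7 wd:7", "[27058.584020] --- rd:7 wd:5"):
        print(line, "->", raid5_state(line))
```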
Sorry, upgraded to:
Linux isis 2.6.27-9-server #1 SMP Thu Nov 20 22:56:07 UTC 2008 x86_64 GNU/Linux
Yes... still using Ubuntu. I realize this isn't strictly a Red Hat issue, but I hope this is shedding some light on other people's problems here.
I gave up as well and bought a 3ware controller. I will install it today or tomorrow. One thing I noticed, though: since I disabled all the SMART tests, the hddtemp daemons, and anything else that queries the disks regularly (besides just having SMART 'monitor' the statistics), I have not had a repeat event yet. But I also had the same problem you did: two disks dropped out of my raid5 and everything went bye bye. I had most of it backed up elsewhere, but yeah, I got sick of it too.
Justin.
I received an email from Mark Lord, who said that he would likely be implementing more Marvell errata before Christmas. Don't know how long it would take to hit a Fedora update after that, but this is good news!
Still not sure about your problem, Justin, but I hope the new controller works for ya... Hate to see all those good 10k drives go to waste (what do you use that thing for anyway?)
I am back on my raptor150s for now. I just like/prefer fast disk/access time.
Disabling write caching on the drives apparently does not entirely resolve this issue. I got it again last night:
ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
ata2.00: cmd 60/18:00:b3:f8:ba/00:00:00:00:00/40 tag 0 ncq 12288 in
ata2.00: status: { DRDY }
ata2: hard resetting link
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: max_sectors limited to 256 for NCQ
ata2.00: max_sectors limited to 256 for NCQ
ata2.00: configured for UDMA/133
ata2: EH complete
I'll take one a month over one every few minutes though.
We'll just have to see how Mark's errata implementation goes...
I replaced my (12) Velociraptors with (12) Raptor150s, not a single error. I suggest (if you can) trying other drives.
I'm seeing the same errors on a Fujitsu Siemens Econel 50 server on EL5 U2 running kernel 2.6.18-92.1.22.el5. It ran EL4 for two years without problems. HW: Intel ICH6R in AHCI mode
My comment only applies indirectly ... I'm running RHEL 4, kernel 2.6.9-67.0.15.EL and recently got:
Dec 28 06:31:02 forest kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Dec 28 06:31:02 forest kernel: ata1.00: cmd ca/00:10:76:0c:43/00:00:00:00:00/e0 tag 0 cdb 0x0 data 8192 out
Dec 28 06:31:02 forest kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 28 06:31:09 forest kernel: ata1: port is slow to respond, please be patient (Status 0xd0)
Dec 28 06:31:32 forest kernel: ata1: port failed to respond (30 secs, Status 0xd0)
Dec 28 06:31:32 forest kernel: ata1: soft resetting port
Dec 28 06:32:02 forest kernel: ata1.00: qc timeout (cmd 0xec)
Dec 28 06:32:02 forest kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Dec 28 06:32:02 forest kernel: ata1.00: revalidation failed (errno=-5)
Dec 28 06:32:02 forest kernel: ata1: failed to recover some devices, retrying in 5 secs
Dec 28 06:32:14 forest kernel: ata1: port is slow to respond, please be patient (Status 0xd0)
Dec 28 06:32:37 forest kernel: ata1: port failed to respond (30 secs, Status 0xd0)
Dec 28 06:32:37 forest kernel: ata1: soft resetting port
Dec 28 06:33:07 forest kernel: ata1.00: qc timeout (cmd 0xec)
Dec 28 06:33:07 forest kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Dec 28 06:33:07 forest kernel: ata1.00: revalidation failed (errno=-5)
Dec 28 06:33:07 forest kernel: ata1: failed to recover some devices, retrying in 5 secs
Dec 28 06:33:19 forest kernel: ata1: port is slow to respond, please be patient (Status 0xd0)
Dec 28 06:33:42 forest kernel: ata1: port failed to respond (30 secs, Status 0xd0)
Dec 28 06:33:42 forest kernel: ata1: soft resetting port
Dec 28 06:34:13 forest kernel: ata1.00: qc timeout (cmd 0xec)
Dec 28 06:34:13 forest kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Dec 28 06:34:13 forest kernel: ata1.00: revalidation failed (errno=-5)
Dec 28 06:34:13 forest kernel: ata1.00: disabled
Dec 28 06:34:13 forest kernel: ata1: EH complete
This is just one disk, no RAID. ... since I rebooted on the 28th, everything has been fine. I will receive a brand new disk today (the other one was almost new), run the complete Seagate diagnostics on it, then replace the root disk and run the complete diagnostics on the old disk, but I doubt it's the disk that's the problem here.
MB: Intel S5000PSL
ata1: SATA max UDMA/133 cmd 0x40C8 ctl 0x40E6 bmdma 0x40A0 irq 193
ata1.00: ATA-7, max UDMA/133, 488397168 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : ata_piix
Vendor: ATA Model: ST3250410AS Rev: 3.AA
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
This is just to say that the problem might apply to older kernels as well.
Your trace is fairly clear:
The drive stops responding
We notice the timeout
It reports 0xD0 (busy)
We reset it
We ask it to identify
It's still wedged.
Difficult to see how that can be a kernel problem when the drive won't respond to a reset. Could be the PSU - that has been an issue with some systems - but it could also be that the drive firmware went castors up.
The original bug at the top of this report was fixed in 2.6.26.xx --> this was the mv_qc_defer() bug that Tejun found way back then.
The other reports also on this bug are for different problems, yet to be sorted out. There do seem to be a number of "timeouts" reported here and elsewhere, with the ATA opcode often being an NCQ R/W ("FPDMA") command, or a "FLUSH_CACHE_EXT" command.
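For anyone sorting their own logs by opcode, here is a minimal sketch of decoding the "cmd" lines and the "Status 0xd0" values quoted in this bug. It assumes the usual libata layout of the cmd line (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device). The helper names decode_cmd and decode_status are made up for illustration and are not part of any kernel tool, and the opcode table covers only the commands that appear in this thread.

```python
import re

# Opcodes seen in this thread, named per the ATA command set.
ATA_OPCODES = {
    0x60: "READ FPDMA QUEUED",
    0x61: "WRITE FPDMA QUEUED",
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
    0xEA: "FLUSH CACHE EXT",
    0xEC: "IDENTIFY DEVICE",
}
NCQ_OPCODES = {0x60, 0x61}
LBA28_OPCODES = {0xC8, 0xCA}

# ATA status register bits, most significant first.
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
               (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR")]

# Twelve two-digit hex fields separated by '/' or ':' (assumed layout).
CMD_RE = re.compile(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def decode_cmd(line):
    """Decode the taskfile from a libata 'cmd' log line, or return None."""
    match = CMD_RE.search(line)
    if match is None:
        return None
    (command, feature, nsect, lbal, lbam, lbah,
     hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah,
     device) = (int(x, 16) for x in re.split(r"[/:]", match.group(1)))
    low24 = lbah << 16 | lbam << 8 | lbal
    if command in NCQ_OPCODES:
        # NCQ commands carry the sector count in the feature fields.
        sectors = hob_feature << 8 | feature
        lba = hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24 | low24
    elif command in LBA28_OPCODES:
        # 28-bit commands keep LBA bits 27:24 in the device register.
        # (A count of 0, meaning 256 sectors, is not special-cased here.)
        sectors = nsect
        lba = (device & 0x0F) << 24 | low24
    else:
        sectors = lba = None
    return {"opcode": command,
            "name": ATA_OPCODES.get(command, "unknown"),
            "sectors": sectors,
            "lba": lba}


def decode_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for bit, name in STATUS_BITS if status & bit]


if __name__ == "__main__":
    print(decode_cmd("ata2.00: cmd 61/08:00:cb:d5:42/00:00:25:00:00/40 tag 0 ncq 4096 out"))
    print(decode_status(0xD0))
```

Decoded this way, the 61/08:... line is an 8-sector (4096 byte) NCQ write, which agrees with the "ncq 4096 out" suffix the kernel prints, and 0xd0 has BSY set, i.e. the drive is reporting busy.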
Apart from that, there's not a lot of useful information yet. I need to see specific kernel versions (kernel.org, not vendor kernels), and to know the exact drive models and PCI bus type (e.g. is the 6081 card in a 133MHz/64-bit PCI-X slot, or a 33MHz/32-bit PCI slot, or a ...). These chips have a number of quirks that are specific to particular bus types.
Scream now, and you'll be heard!
-Mark
(room goes silent - Marvell owners bow down in the presence of Mark Lord)
Mine is all the same - Here are the last 3 errors I got:
Jan 11 14:12:56 radfiles kernel: ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Jan 11 14:12:56 radfiles kernel: ata2.00: cmd 61/08:00:cb:d5:42/00:00:25:00:00/40 tag 0 ncq 4096 out
Jan 11 14:12:56 radfiles kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 14:12:56 radfiles kernel: ata2.00: status: { DRDY }
Jan 11 14:12:56 radfiles kernel: ata2: hard resetting link
Jan 11 14:12:56 radfiles kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 11 14:12:56 radfiles kernel: ata2.00: max_sectors limited to 256 for NCQ
Jan 11 14:12:56 radfiles kernel: ata2.00: max_sectors limited to 256 for NCQ
Jan 11 14:12:56 radfiles kernel: ata2.00: configured for UDMA/133
Jan 11 14:12:56 radfiles kernel: ata2: EH complete
Jan 11 14:12:56 radfiles kernel: sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
Jan 11 14:12:56 radfiles kernel: sd 1:0:0:0: [sdb] Write Protect is off
Jan 11 14:12:56 radfiles kernel: sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Jan 11 14:15:02 radfiles kernel: ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Jan 11 14:15:02 radfiles kernel: ata2.00: cmd 61/08:00:cb:d5:42/00:00:25:00:00/40 tag 0 ncq 4096 out
Jan 11 14:15:02 radfiles kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 14:15:02 radfiles kernel: ata2.00: status: { DRDY }
Jan 11 14:15:02 radfiles kernel: ata2: hard resetting link
Jan 11 14:15:03 radfiles kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 11 14:15:03 radfiles kernel: ata2.00: max_sectors limited to 256 for NCQ
Jan 11 14:15:03 radfiles kernel: ata2.00: max_sectors limited to 256 for NCQ
Jan 11 14:15:03 radfiles kernel: ata2.00: configured for UDMA/133
Jan 11 14:15:03 radfiles kernel: ata2: EH complete
Jan 11 14:15:03 radfiles kernel: sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
Jan 11 14:15:03 radfiles kernel: sd 1:0:0:0: [sdb] Write Protect is off
Jan 11 14:15:03 radfiles kernel: sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Jan 11 14:26:03 radfiles kernel: ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Jan 11 14:26:03 radfiles kernel: ata2.00: cmd 60/08:00:3b:aa:47/00:00:00:00:00/40 tag 0 ncq 4096 in
Jan 11 14:26:03 radfiles kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 14:26:03 radfiles kernel: ata2.00: status: { DRDY }
Jan 11 14:26:03 radfiles kernel: ata2: hard resetting link
Jan 11 14:26:03 radfiles kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 11 14:26:03 radfiles kernel: ata2.00: max_sectors limited to 256 for NCQ
Jan 11 14:26:03 radfiles kernel: ata2.00: max_sectors limited to 256 for NCQ
Jan 11 14:26:03 radfiles kernel: ata2.00: configured for UDMA/133
Jan 11 14:26:03 radfiles kernel: ata2: EH complete
Jan 11 14:26:03 radfiles kernel: sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
Jan 11 14:26:03 radfiles kernel: sd 1:0:0:0: [sdb] Write Protect is off
Jan 11 14:26:03 radfiles kernel: sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
uname -a:
Linux radfiles.net 2.6.27.9-159.fc10.x86_64 #1 SMP Tue Dec 16 14:47:52 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
(is there something more I can do here to get you more specific information?)
lspci -vv:
00:02.0 PCI bridge: ALi Corporation M5249 HTT to PCI Bridge (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
I/O behind bridge: 0000d000-0000dfff
Memory behind bridge: fb000000-fcffffff
Prefetchable memory behind bridge: e2000000-e20fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA+ MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [b0] HyperTransport: Slave or Primary Interface
Command: BaseUnitID=3 UnitCnt=1 MastHost- DefDir- DUL-
Link Control 0: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO- <CRCErr=0 IsocEn- LSEn- ExtCTL- 64b-
Link Config 0: MLWI=8bit DwFcIn- MLWO=8bit DwFcOut- LWI=8bit DwFcInEn- LWO=8bit DwFcOutEn-
Link Control 1: CFlE- CST- CFE- <LkFail+ Init- EOC+ TXO+ <CRCErr=0 IsocEn- LSEn- ExtCTL- 64b-
Link Config 1: MLWI=8bit DwFcIn- MLWO=8bit DwFcOut- LWI=8bit DwFcInEn- LWO=8bit DwFcOutEn-
Revision ID: 1.04
Link Frequency 0: 200MHz
Link Error 0: <Prot- <Ovfl- <EOC- CTLTm-
Link Frequency Capability 0: 200MHz+ 300MHz+ 400MHz+ 500MHz- 600MHz- 800MHz- 1.0GHz- 1.2GHz- 1.4GHz- 1.6GHz- Vend-
Feature Capability: IsocFC- LDTSTOP+ CRCTM- ECTLT- 64bA- UIDRD-
Link Frequency 1: 200MHz
Link Error 1: <Prot- <Ovfl- <EOC- CTLTm-
Link Frequency Capability 1: 200MHz- 300MHz- 400MHz- 500MHz- 600MHz- 800MHz- 1.0GHz- 1.2GHz- 1.4GHz- 1.6GHz- Vend-
Error Handling: PFlE- OFlE- PFE- OFE- EOCFE- RFE- CRCFE- SERRFE- CF- RE- PNFE- ONFE- EOCNFE- RNFE- CRCNFE- SERRNFE-
Prefetchable memory behind bridge Upper: 00-00
Bus Number: 00
Capabilities: [f0] HyperTransport: Interrupt Discovery and Configuration
Kernel modules: shpchp
00:03.0 ISA bridge: ALi Corporation M1563 HyperTransport South Bridge (rev 20)
Subsystem: Device 19d5:2203
Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0 (250ns min, 6000ns max)
00:03.1 Bridge: ALi Corporation M7101 Power Management Controller [PMU]
Subsystem: Device 19d5:2203
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Kernel modules: alim7101_wdt
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32
Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
I/O behind bridge: 0000e000-0000efff
Memory behind bridge: fd000000-fd0fffff
Secondary status: 66MHz+ FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [a0] PCI-X bridge device
Secondary Status: 64bit+ 133MHz+ SCD- USC- SCO- SRD- Freq=100MHz
Status: Dev=00:0a.0 64bit+ 133MHz+ SCD- USC- SCO- SRD-
Upstream: Capacity=14 CommitmentLimit=65535
Downstream: Capacity=2 CommitmentLimit=65535
Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration
Capabilities: [c0] HyperTransport: Slave or Primary Interface
!!! Possibly incomplete decoding
Command: BaseUnitID=10 UnitCnt=2 MastHost- DefDir-
Link Control 0: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO- <CRCErr=0
Link Config 0: MLWI=16bit MLWO=16bit LWI=16bit LWO=16bit
Link Control 1: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO- <CRCErr=0
Link Config 1: MLWI=8bit MLWO=8bit LWI=8bit LWO=8bit
Revision ID: 1.02
Kernel modules: shpchp
00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
Subsystem: Device 19d5:2203
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Region 0: Memory at febfe000 (64-bit, non-prefetchable) [size=4K]
00:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32
Bus: primary=00, secondary=03, subordinate=03, sec-latency=32
Memory behind bridge: fd100000-fd1fffff
Prefetchable memory behind bridge: 00000000e2100000-00000000e21fffff
Secondary status: 66MHz+ FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [a0] PCI-X bridge device
Secondary Status: 64bit+ 133MHz+ SCD- USC- SCO- SRD- Freq=100MHz
Status: Dev=00:0b.0 64bit+ 133MHz+ SCD- USC- SCO- SRD-
Upstream: Capacity=14 CommitmentLimit=65535
Downstream: Capacity=2 CommitmentLimit=65535
Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration
Kernel modules: shpchp
00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
Subsystem: Device 19d5:2203
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Region 0: Memory at febff000 (64-bit, non-prefetchable) [size=4K]
00:0e.0 IDE interface: ALi Corporation M5229 IDE (rev c5) (prog-if fa)
Subsystem: Device 19d5:2203
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32
Interrupt: pin A routed to IRQ 19
Region 0: [virtual] Memory at 000001f0 (32-bit, non-prefetchable) [disabled] [size=8]
Region 1: [virtual] Memory at 000003f0 (type 3, non-prefetchable) [disabled] [size=1]
Region 2: [virtual] Memory at 00000170 (32-bit, non-prefetchable) [disabled] [size=8]
Region 3: [virtual] Memory at 00000370 (type 3, non-prefetchable) [disabled] [size=1]
Region 4: I/O ports at f000 [size=16]
Kernel driver in use: pata_ali
Kernel modules: pata_ali, pata_acpi, ata_generic
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Capabilities: [80] HyperTransport: Host or Secondary Interface
!!! Possibly incomplete decoding
Command: WarmRst+ DblEnd-
Link Control: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO- <CRCErr=0
Link Config: MLWI=16bit MLWO=16bit LWI=16bit LWO=16bit
Revision ID: 1.02
Capabilities: [a0] HyperTransport: Host or Secondary Interface
!!! Possibly incomplete decoding
Command: WarmRst+ DblEnd-
Link Control: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO- <CRCErr=0
Link Config: MLWI=16bit MLWO=16bit LWI=16bit LWO=16bit
Revision ID: 1.02
Capabilities: [c0] HyperTransport: Host or Secondary Interface
!!! Possibly incomplete decoding
Command: WarmRst+ DblEnd-
Link Control: CFlE- CST- CFE- <LkFail+ Init- EOC+ TXO+ <CRCErr=0
Link Config: MLWI=16bit MLWO=16bit LWI=N/C LWO=N/C
Revision ID: 1.02
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Kernel driver in use: k8temp
Kernel modules: k8temp
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Capabilities: [80] HyperTransport: Host or Secondary Interface
!!! Possibly incomplete decoding
Command: WarmRst+ DblEnd-
Link Control: CFlE- CST- CFE- <LkFail+ Init- EOC+ TXO+ <CRCErr=0
Link Config: MLWI=16bit MLWO=16bit LWI=N/C LWO=N/C
Revision ID: 1.02
Capabilities: [a0] HyperTransport: Host or Secondary Interface
!!! Possibly incomplete decoding
Command: WarmRst+ DblEnd-
Link Control: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO- <CRCErr=0
Link Config: MLWI=16bit MLWO=16bit LWI=16bit LWO=16bit
Revision ID: 1.02
Capabilities: [c0] HyperTransport: Host or Secondary Interface
!!! Possibly incomplete decoding
Command: WarmRst+ DblEnd-
Link Control: CFlE- CST- CFE- <LkFail+ Init- EOC+ TXO+ <CRCErr=0
Link Config: MLWI=16bit MLWO=16bit LWI=N/C LWO=N/C
Revision ID: 1.02
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Kernel driver in use: k8temp
Kernel modules: k8temp
01:07.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA controller])
Subsystem: ATI Technologies Inc Rage XL
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32 (2000ns min), Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 7
Region 0: Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
Region 1: I/O ports at d000 [size=256]
Region 2: Memory at fc020000 (32-bit, non-prefetchable) [size=4K]
[virtual] Expansion ROM at e2000000 [disabled] [size=128K]
Capabilities: [5c] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Kernel modules: atyfb
02:03.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
Subsystem: Marvell Technology Group Ltd. Device 11ab
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 26
Region 0: Memory at fd000000 (64-bit, non-prefetchable) [size=1M]
Region 2: I/O ports at e000 [size=256]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Count=1/1 Enable-
Address: 0000000000000000 Data: 0000
Capabilities: [60] PCI-X non-bridge device
Command: DPERE- ERO- RBC=512 OST=4
Status: Dev=02:03.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=4 DMCRS=8 RSCEM- 266MHz- 533MHz-
Kernel driver in use: sata_mv
Kernel modules: sata_mv
03:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
Subsystem: ABIT Computer Corp. Device 2202
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32 (16000ns min), Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 31
Region 0: Memory at fd100000 (64-bit, non-prefetchable) [size=64K]
Region 2: Memory at fd110000 (64-bit, non-prefetchable) [size=64K]
[virtual] Expansion ROM at e2100000 [disabled] [size=64K]
Capabilities: [40] PCI-X non-bridge device
Command: DPERE- ERO+ RBC=512 OST=1
Status: Dev=03:04.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
Capabilities: [48] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] Vital Product Data <?>
Capabilities: [58] Message Signalled Interrupts: Mask- 64bit+ Count=1/8 Enable-
Address: 24100073000144a4 Data: 10d0
Kernel driver in use: tg3
Kernel modules: tg3
03:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
Subsystem: ABIT Computer Corp. Device 2202
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32 (16000ns min), Cache Line Size: 32 bytes
Interrupt: pin B routed to IRQ 28
Region 0: Memory at fd120000 (64-bit, non-prefetchable) [size=64K]
Region 2: Memory at fd130000 (64-bit, non-prefetchable) [size=64K]
[virtual] Expansion ROM at e2110000 [disabled] [size=64K]
Capabilities: [40] PCI-X non-bridge device
Command: DPERE- ERO- RBC=2048 OST=1
Status: Dev=03:04.1 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
Capabilities: [48] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable+ DSel=0 DScale=1 PME-
Capabilities: [50] Vital Product Data <?>
Capabilities: [58] Message Signalled Interrupts: Mask- 64bit+ Count=1/8 Enable-
Address: 2c02d024720c49a0 Data: 5103
Kernel driver in use: tg3
Kernel modules: tg3
(write caching forced off on all drives using hdparm)
/dev/sda:
Model=ST3320620AS , FwRev=3.AAM , SerialNo= 5QF3T3XP
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=?16?
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=625142448
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=no WriteCache=disabled
Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7
/dev/sdb:
Model=ST3320620AS , FwRev=3.AAM , SerialNo= 5QF3V2C3
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=?16?
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=625142448
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=no WriteCache=disabled
Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7
/dev/sdc:
Model=ST3320620AS , FwRev=3.AAM , SerialNo= 5QF3T3YM
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=?16?
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=625142448
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=no WriteCache=disabled
Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7
/dev/sdd:
Model=ST3320620AS , FwRev=3.AAM , SerialNo= 5QF3RA0R
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=?16?
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=625142448
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=no WriteCache=disabled
Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7
/dev/sde:
Model=ST3320620AS , FwRev=3.AAM , SerialNo= 9QFAH509
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=?16?
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=625142448
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=no WriteCache=disabled
Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7
/proc/mdstat:
md2 : active raid1 sdc2[0] sdd2[1]
1052160 blocks [2/2] [UU]
md0 : active raid1 sda1[0] sde1[4](S) sdd1[3] sdc1[2] sdb1[1]
64128 blocks [4/4] [UUUU]
md1 : active raid1 sda2[0] sde2[2](S) sdb2[1]
1052160 blocks [2/2] [UU]
md3 : active raid5 sda3[0] sde3[4] sdd3[3] sdc3[2] sdb3[1]
1245807616 blocks level 5, 256k chunk, algorithm 2 [5/5] [UUUUU]
(part of dmesg showing sata_mv ver)
sata_mv 0000:02:03.0: version 1.24
sata_mv 0000:02:03.0: PCI INT A -> GSI 26 (level, low) -> IRQ 26
sata_mv 0000:02:03.0: Gen-II 32 slots 8 ports SCSI mode IRQ via INTx
scsi0 : sata_mv
scsi1 : sata_mv
scsi2 : sata_mv
scsi3 : sata_mv
scsi4 : sata_mv
scsi5 : sata_mv
scsi6 : sata_mv
scsi7 : sata_mv
ata1: SATA max UDMA/133 mmio m1048576@0xfd000000 port 0xfd022000 irq 26
ata2: SATA max UDMA/133 mmio m1048576@0xfd000000 port 0xfd024000 irq 26
ata3: SATA max UDMA/133 mmio m1048576@0xfd000000 port 0xfd026000 irq 26
ata4: SATA max UDMA/133 mmio m1048576@0xfd000000 port 0xfd028000 irq 26
ata5: SATA max UDMA/133 mmio m1048576@0xfd000000 port 0xfd032000 irq 26
ata6: SATA max UDMA/133 mmio m1048576@0xfd000000 port 0xfd034000 irq 26
ata7: SATA max UDMA/133 mmio m1048576@0xfd000000 port 0xfd036000 irq 26
ata8: SATA max UDMA/133 mmio m1048576@0xfd000000 port 0xfd038000 irq 26
As I said in my email to you, let me know if there is anything I can do to assist. I can only imagine how difficult things like this are to track down...
That's great information, thanks.
Now, there may be multiple issues here, but I have found one possible cause of the reported behaviour. Brian's info above indicates that we are losing an NCQ interrupt somehow, from time to time. So I spent this afternoon nitpicking and bitpicking through the interrupt code in sata_mv.c, and I believe I found a race on the hc_irq_cause register.
The code was "helpfully" attempting to use read-modify-write to clear individual port bits there, but this is impossible to do in a race-free fashion. So.. the obvious fix is to just write the bits being cleared, without touching anything else. This will also be faster, since no read is required or desired. I really don't see a downside, as long as it actually works! :)
Patch to be attached here for trial use only. I still need to run it past Marvell as well as the linux-ide development list.
Cheers
Created attachment 328914 [details]
Patch for 2.6.28: sata_mv: remove update races from hc_irq_cause register
Try and report back. This bug should be affecting all users of sata_mv, so anyone on the wire could help by testing it and posting results here.
Thanks
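To make the read-modify-write race described above concrete, here is a toy simulation. It is only a sketch and is not the patch itself. It assumes the cause register behaves as "write 0 to clear, write 1 leaves the bit alone", and the class and function names (CauseRegister, clear_rmw, clear_write_only) are invented for the illustration. The hw_event callback stands in for the hardware raising another port's bit at the worst possible moment.

```python
PORT0_IRQ = 0x1  # hypothetical bit positions, for illustration only
PORT1_IRQ = 0x2
MASK32 = 0xFFFFFFFF


class CauseRegister:
    """Toy interrupt-cause register: writing 0 clears a bit, 1 keeps it."""

    def __init__(self):
        self.value = 0

    def hw_set(self, bits):
        # The hardware latches a new interrupt cause.
        self.value |= bits

    def read(self):
        return self.value

    def write(self, data):
        # Bits written as 0 are cleared, bits written as 1 are untouched.
        self.value &= data


def clear_rmw(reg, bits, hw_event=None):
    """Racy: read, (hardware may set a new bit here), then write back."""
    snapshot = reg.read()
    if hw_event:
        hw_event()
    # The new bit was 0 in the snapshot, so this write clears it too.
    reg.write(snapshot & ~bits)


def clear_write_only(reg, bits, hw_event=None):
    """Race-free: write 1s everywhere except the bits being cleared."""
    if hw_event:
        hw_event()
    reg.write(~bits & MASK32)


if __name__ == "__main__":
    racy = CauseRegister()
    racy.hw_set(PORT0_IRQ)
    clear_rmw(racy, PORT0_IRQ, lambda: racy.hw_set(PORT1_IRQ))
    print("read-modify-write leaves:", hex(racy.read()))  # port 1 event lost

    safe = CauseRegister()
    safe.hw_set(PORT0_IRQ)
    clear_write_only(safe, PORT0_IRQ, lambda: safe.hw_set(PORT1_IRQ))
    print("write-only leaves:", hex(safe.read()))  # port 1 event kept
```

In the racy variant the port 1 bit that arrived between the read and the write is wiped out along with port 0, which is what a lost interrupt looks like from the driver's side. The write-only variant cannot lose it, because it never writes a 0 to a bit it did not mean to clear.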
Okay, FOUND IT!
But first.. a very important question: Has anyone ever seen the timeouts on ports 4,5,6,7 of the 6081? My theory is that this only ever happens on ports 0,1,2,3 -- because that's where I've finally found the bug.
So, please:
(1) tell me if ports 4,5,6,7 have ever given you timeout grief (check your logs if need be, this is important). Thanks.
(2) regardless, apply the next patch I'm about to attach, which fixes incorrect use of port numbers on the 6081 chip.
(3) run with the patch applied, and report back ASAP.
Once I hear from you folks, I'll feed the patch upstream/backstream, as this is a rather important fix.
Thanks.
Created attachment 329048 [details]
sata_mv: Fix timeouts on Marvell 6081 ports 0..3.
This patch should fix the remaining "timeout" issues for Marvell 6081 chipset users. Please apply and report back ASAP.
Thanks
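For readers wondering why ports 0..3 and 4..7 can behave differently, the sata_mv dmesg excerpt earlier in this bug shows the eight ports laid out as two groups of four: ata1..ata4 at 0xfd022000..0xfd028000 and ata5..ata8 at 0xfd032000..0xfd038000. The sketch below only reproduces that layout. The offsets are inferred from those logged addresses, the assumption that ataN corresponds to port N-1 is mine, and the function names are hypothetical. It is not a description of what the patch changes.

```python
MMIO_BASE = 0xFD000000   # "mmio m1048576@0xfd000000" in the dmesg excerpt
PORTS_PER_HC = 4         # inferred: two groups of four ports
HC0_OFFSET = 0x20000     # inferred from ata1 at 0xfd022000
HC_STRIDE = 0x10000      # inferred from ata5 at 0xfd032000
FIRST_PORT_OFFSET = 0x2000
PORT_STRIDE = 0x2000


def split_port(port):
    """Split a global port number 0..7 into (group, port within group)."""
    return port // PORTS_PER_HC, port % PORTS_PER_HC


def port_base(port):
    """Compute the per-port register base logged by the driver."""
    hc, hardport = split_port(port)
    return (MMIO_BASE + HC0_OFFSET + hc * HC_STRIDE
            + FIRST_PORT_OFFSET + hardport * PORT_STRIDE)


if __name__ == "__main__":
    for port in range(8):
        hc, hardport = split_port(port)
        print(f"ata{port + 1}: group {hc}, local port {hardport}, "
              f"base {port_base(port):#x}")
```

The point of the split is that anything kept per group of four ports has to be indexed with the local port number (0..3), not the global one (0..7). Mixing the two up is the general kind of mistake that "incorrect use of port numbers" suggests, although the exact change is in the attached patch, not here.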
By the way, I also suspect that timeouts NEVER happen when:
(1) there are no drives on ports 0..3, OR
(2) there are no drives on ports 4..7.
So if only half of the chip is in use, either the upper or lower half, this bug is probably never seen.
Cheers
Old patch didn't work - Failed on boot:
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
ata1.00: cmd 61/08:00:cb:d5:42/00:00:25:00:00/40 tag 0 ncq 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: hard resetting link
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: max_sectors limited to 256 for NCQ
ata1.00: max_sectors limited to 256 for NCQ
ata1.00: configured for UDMA/133
ata1: EH complete
And to answer your question (1) - NEVER, and I have ports 0-5 filled, with 0-4
comprising the same software RAID array...
Testing new patch now!
I hate to speak prematurely, but IT WORKS!!! No errors, and I've tried copying quite a bit of data (let alone all of the other server stuff going on in the background), and NOTHING. This is with write caching enabled, which before would cause errors very frequently. Although it is early in the testing, I feel very confident that this is the fix, based on how quickly I could get it to fail before...
Thank you, thank you, thank you! (and to Harri Olin on the dev mailing list who mentioned the port issue - that was apparently the key). It's really nice to see a lengthy bug come together like this and result in something so positive...
BTW, I tested only with the new patch and not along with the "remove update races from hc_irq_cause register" patch...
That's fine. The first patch does not fix the problem, but merely speeds up your system by a fraction of a percent. :)
-ml
@mlord Hi, just joining the party here... I too was seeing this error:
[ 105.430353] sda:<3>ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[ 135.842355] ata1.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in
[ 135.842355] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 135.846352] ata1.00: status: { DRDY }
[ 135.850353] ata1: hard resetting link
..on a Sun x4500 with a Marvell MV88SX6081 controller. Your first "sata_mv_fix_hc_irq_cause_race" allowed me to boot successfully.
Uptime is 1.5 days on 2.6.28 with only your patch applied. Thanks!
-sp
Okay, we have lots of confirmations of success now (using only the second patch from me), on the 6081 chipset as well as for the 508x 8-port controller. I believe this bugzilla entry belongs to Jeff Garzik, so he can take it from here.
Cheers
Mark
Hello,
It sounds like the 2.6.18-92 series is affected by, at least, the timeouts on ports 0..3, as it runs sata_mv 1.01 (backported from 2.6.24). Is there any plan to backport the fix to the 2.6.18-92 series?
Sincerely,
And RHEL backports maybe? :-)
Patch is in the queue for 2.6.27.15
Hey all
I'm here looking for a solution to the same or a similar issue.
ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata4.01: cmd c8/00:08:c7:d8:ba/00:00:00:00:00/f1 tag 0 dma 4096 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.01: status: { DRDY }
ata4: soft resetting link
ata4.01: configured for UDMA/133
ata4: EH complete
These are not the system volumes (they are on different file systems).
This device was working properly until this issue appeared. The only things that have changed are an updated kernel and a new USB device (Lexmark printer) that I had plugged in.
The GUI reported the free space on the devices.
I logged out and back in, and the system froze.
The system would not shut down in this state.
Hard reset: powered down and removed the added USB printer. **I had noted the system was booting slower than previously**
The system came up as normal.
All devices can be mounted and used as required.
Sooooo it appears that the issue may be with the USB, which is nothing new; this board and chipset are not the best (wonky, to say the least).
ASUS P5LD2
00:00.0 Host bridge: Intel Corporation 82945G/GZ/P/PL Memory Controller Hub (rev 02)
Subsystem: Intel Corporation Unknown device 2580
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
Latency: 0
Capabilities: <access denied>
00:01.0 PCI bridge: Intel Corporation 82945G/GZ/P/PL PCI Express Root Port (rev 02) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR+ <PERR- INTx-
Latency: 0, Cache Line Size: 16 bytes
Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
I/O behind bridge: 0000e000-0000efff
Memory behind bridge: cff00000-cfffffff
Prefetchable memory behind bridge: 00000000d0000000-00000000dfffffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA+ MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
Kernel driver in use: pcieport-driver
Kernel modules: shpchp
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
Subsystem: ASUSTeK Computer Inc. Unknown device 8237
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 16 bytes
Interrupt: pin A routed to IRQ 19
Region 0: Memory at cfcf8000 (64-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: HDA Intel
Kernel modules: snd-hda-intel
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 16 bytes
Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: 0000d000-0000dfff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
Kernel driver in use: pcieport-driver
Kernel modules: shpchp
00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 (rev 01) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 16 bytes
Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
I/O behind bridge: 0000c000-0000cfff
Memory behind bridge: cfe00000-cfefffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
Kernel driver in use: pcieport-driver
Kernel modules: shpchp
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 01) (prog-if 00 [UHCI])
Subsystem: ASUSTeK Computer Inc. P5KPL-VM Motherboard
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 20
Region 4: I/O ports at 7000 [size=32]
Kernel driver in use: uhci_hcd
Kernel modules: uhci-hcd
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 01) (prog-if 00 [UHCI])
Subsystem: ASUSTeK Computer Inc. P5KPL-VM Motherboard
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 17
Region 4: I/O ports at 7400 [size=32]
Kernel driver in use: uhci_hcd
Kernel modules: uhci-hcd
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 01) (prog-if 00 [UHCI])
Subsystem: ASUSTeK Computer Inc. P5KPL-VM Motherboard
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin C routed to IRQ 18
Region 4: I/O ports at 7800 [size=32]
Kernel driver in use: uhci_hcd
Kernel modules: uhci-hcd
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 01) (prog-if 00 [UHCI])
Subsystem: ASUSTeK Computer Inc. P5KPL-VM Motherboard
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin D routed to IRQ 19
Region 4: I/O ports at 8000 [size=32]
Kernel driver in use: uhci_hcd
Kernel modules: uhci-hcd
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01) (prog-if 20 [EHCI])
Subsystem: ASUSTeK Computer Inc. P5KPL-VM Motherboard
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 20
Region 0: Memory at cfcff800 (32-bit, non-prefetchable) [size=1K]
Capabilities: <access denied>
Kernel driver in use: ehci_hcd
Kernel modules: ehci-hcd
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1) (prog-if 01 [Subtractive decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
I/O behind bridge: 0000a000-0000bfff
Memory behind bridge: cfd00000-cfdfffff
Prefetchable memory behind bridge: 00000000cc000000-00000000cc0fffff
Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01)
Subsystem: ASUSTeK Computer Inc. P5KPL-VM Motherboard
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Capabilities: <access denied>
Kernel modules: iTCO_wdt, intel-rng
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01) (prog-if 8a [Master SecP PriP])
Subsystem: ASUSTeK Computer Inc. P5KPL-VM Motherboard
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 22
Region 0: I/O ports at 01f0 [size=8]
Region 1: I/O ports at 03f4 [size=1]
Region 2: I/O ports at 0170 [size=8]
Region 3: I/O ports at 0374 [size=1]
Region 4: I/O ports at ffa0 [size=16]
Kernel driver in use: ata_piix
Kernel modules: ata_generic, ata_piix, pata_acpi
00:1f.2 IDE interface: Intel Corporation 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (rev 01) (prog-if 8f [Master SecP SecO PriP PriO])
Subsystem: ASUSTeK Computer Inc. Unknown device 2601
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 23
Region 0: I/O ports at 9800 [size=8]
Region 1: I/O ports at 9400 [size=4]
Region 2: I/O ports at 9000 [size=8]
Region 3: I/O ports at 8800 [size=4]
Region 4: I/O ports at 8400 [size=16]
Region 5: Memory at cfcffc00 (32-bit, non-prefetchable) [size=1K]
Capabilities: <access denied>
Kernel driver in use: ata_piix
Kernel modules: ata_generic, ata_piix, pata_acpi
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
Subsystem: ASUSTeK Computer Inc. P5KPL-VM Motherboard
Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin B routed to IRQ 23
Region 4: I/O ports at 0400 [size=32]
Kernel driver in use: i801_smbus
Kernel modules: i2c-i801
01:03.0 Mass storage controller: Integrated Technology Express, Inc. ITE 8211F Single Channel UDMA 133 (rev 11)
Subsystem: ASUSTeK Computer Inc. P5GD1-VW Mainboard
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 64 (2000ns min, 2000ns max)
Interrupt: pin A routed to IRQ 20
Region 0: I/O ports at b800 [size=8]
Region 1: I/O ports at b400 [size=4]
Region 2: I/O ports at b000 [size=8]
Region 3: I/O ports at a800 [size=4]
Region 4: I/O ports at a400 [size=16]
Expansion ROM at cc000000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: pata_it821x
Kernel modules: pata_it821x
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 19)
Subsystem: ASUSTeK Computer Inc. Marvell 88E8053 Gigabit Ethernet controller PCIe (Asus)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 16 bytes
Interrupt: pin A routed to IRQ 19
Region 0: Memory at cfefc000 (64-bit, non-prefetchable) [size=16K]
Region 2: I/O ports at c800 [size=256]
Expansion ROM at cfec0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: sky2
Kernel modules: sky2
04:00.0 VGA compatible controller: ATI Technologies Inc RV515 PRO [Radeon X1300/X1550 Series] (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. EAX1300PRO/TD/256M
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 16 bytes
Interrupt: pin A routed to IRQ 5
Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
Region 2: Memory at cffe0000 (64-bit, non-prefetchable) [size=64K]
Region 4: I/O ports at e000 [size=256]
Expansion ROM at cffc0000 [disabled] [size=128K]
Capabilities: <access denied>
04:00.1 Display controller: ATI Technologies Inc RV515 PRO [Radeon X1300/X1550 Series] (Secondary)
Subsystem: ASUSTeK Computer Inc. Unknown device 0143
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 16 bytes
Region 0: Memory at cfff0000 (64-bit, non-prefetchable) [size=64K]
Capabilities: <access denied>
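
Since the report above suspects the USB side, one quick check is which devices share an interrupt line. The following is a minimal Python sketch that groups the devices in pasted `lspci -v` style output by the IRQ on their "Interrupt: pin X routed to IRQ N" lines. The function name `devices_by_irq` is made up for illustration, and the parser assumes only the two line shapes visible in the listing above. In that listing, IRQ 20 is reported for 00:1d.0 (UHCI #1), 00:1d.7 (EHCI) and 01:03.0 (the ITE 8211F using pata_it821x). Shared IRQs are normal on this kind of hardware, so this is a hint about where to look, not a diagnosis.

```python
import re
from collections import defaultdict

# A device header line, e.g. "00:1d.0 USB Controller: Intel Corporation ..."
DEVICE_RE = re.compile(r"^(?P<slot>[0-9a-f]{1,2}:[0-9a-f]{2}\.[0-7]) (?P<desc>.+)$")

# An interrupt line, e.g. "Interrupt: pin A routed to IRQ 20"
IRQ_RE = re.compile(r"Interrupt: pin [A-D] routed to IRQ (?P<irq>\d+)")


def devices_by_irq(lspci_text):
    """Map each IRQ number to the list of PCI slots that report it."""
    by_irq = defaultdict(list)
    current_slot = None
    for line in lspci_text.splitlines():
        stripped = line.strip()
        header = DEVICE_RE.match(stripped)
        if header:
            current_slot = header.group("slot")
            continue
        irq = IRQ_RE.search(stripped)
        if irq and current_slot is not None:
            by_irq[int(irq.group("irq"))].append(current_slot)
    return dict(by_irq)


def shared_irqs(lspci_text):
    """Keep only the IRQs reported by more than one device."""
    return {
        irq: slots
        for irq, slots in devices_by_irq(lspci_text).items()
        if len(slots) > 1
    }
```

Feeding the listing above through `shared_irqs` is enough to see which controllers sit on the same line as the disk controller that logged the timeouts.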
dmesg
Initializing cgroup subsys cpuset
Linux version 2.6.27.12-78.2.8.fc9.x86_64 (mockbuild@) (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) ) #1 SMP Mon Jan 19 19:25:03 EST 2009
Command line: ro root=/dev/VolGroup00/LogVol00 vga=791
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
Centaur CentaurHauls
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000c7f90000 (usable)
BIOS-e820: 00000000c7f90000 - 00000000c7f9e000 (ACPI data)
BIOS-e820: 00000000c7f9e000 - 00000000c7fe0000 (ACPI NVS)
BIOS-e820: 00000000c7fe0000 - 00000000c8000000 (reserved)
BIOS-e820: 00000000ffb80000 - 0000000100000000 (reserved)
DMI 2.4 present.
AMI BIOS detected: BIOS may corrupt low RAM, working it around.
last_pfn = 0xc7f90 max_arch_pfn = 0x3ffffffff
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
init_memory_mapping
0000000000 - 00c7e00000 page 2M
00c7e00000 - 00c7f90000 page 4k
kernel direct mapping tables up to c7f90000 @ 10000-16000
last_map_addr: c7f90000 end: c7f90000
RAMDISK: 37c31000 - 37fefa2c
ACPI: RSDP 000FACA0, 0024 (r2 ACPIAM)
ACPI: XSDT C7F90100, 004C (r1 ������ �������� 7000720 MSFT 97)
ACPI: FACP C7F90290, 00F4 (r3 A_M_I_ OEMFACP 7000720 MSFT 97)
ACPI: DSDT C7F90590, 8391 (r1 A0227 A0227000 0 INTL 20051117)
ACPI: FACS C7F9E000, 0040
ACPI: APIC C7F90390, 0080 (r1 A_M_I_ OEMAPIC 7000720 MSFT 97)
ACPI: SLIC C7F90410, 0176 (r1 ������ �������� 7000720 MSFT 97)
ACPI: OEMB C7F9E040, 0066 (r1 A_M_I_ AMI_OEM 7000720 MSFT 97)
ACPI: MCFG C7F98930, 003C (r1 A_M_I_ OEMMCFG 7000720 MSFT 97)
No NUMA configuration found
Faking a node at 0000000000000000-00000000c7f90000
Bootmem setup node 0 0000000000000000-00000000c7f90000
NODE_DATA [0000000000014000 - 0000000000028fff]
bootmap [0000000000029000 - 0000000000041ff7] pages 19
(6 early reservations) ==> bootmem [0000000000 - 00c7f90000]
#0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
#1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
#2 [0000200000 - 0000972d2c] TEXT DATA BSS ==> [0000200000 - 0000972d2c]
#3 [0037c31000 - 0037fefa2c] RAMDISK ==> [0037c31000 - 0037fefa2c]
#4 [000009fc00 - 0000100000] BIOS reserved ==> [000009fc00 - 0000100000]
#5 [0000010000 - 0000014000] PGTABLE ==> [0000010000 - 0000014000]
found SMP MP-table at [ffff8800000ff780] 000ff780
[ffffe20000000000-ffffe20002bfffff] PMD -> [ffff880001200000-ffff880003dfffff] on node 0
Zone PFN ranges:
DMA 0x00000010 -> 0x00001000
DMA32 0x00001000 -> 0x00100000
Normal 0x00100000 -> 0x00100000
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
0: 0x00000010 -> 0x0000009f
0: 0x00000100 -> 0x000c7f90
On node 0 totalpages: 818975
DMA zone: 1916 pages, LIFO batch:0
DMA32 zone: 803849 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 0, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
SMP: Allowing 4 CPUs, 2 hotplug CPUs
PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000e4000
PM: Registered nosave memory: 00000000000e4000 - 0000000000100000
Allocating PCI resources starting at cc000000 (gap: c8000000:37b80000)
PERCPU: Allocating 64928 bytes of per cpu data
NR_CPUS: 64, nr_cpu_ids: 4, nr_node_ids 1
Built 1 zonelists in Node order, mobility grouping on. Total pages: 805765
Policy zone: DMA32
Kernel command line: ro root=/dev/VolGroup00/LogVol00 vga=791
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
TSC: PIT calibration confirmed by PMTIMER.
TSC: using PMTIMER calibration value
Detected 2424.936 MHz processor.
Console: colour dummy device 80x25
console [tty0] enabled
Checking aperture...
No AGP bridge found
Calgary: detecting Calgary via BIOS EBDA area
Calgary: Unable to locate Rio Grande table in EBDA - bailing!
Memory: 3218840k/3276352k available (2850k kernel code, 57060k reserved, 1581k data, 1268k init)
CPA: page pool initialized 1 of 1 pages preallocated
SLUB: Genslabs=13, HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
Calibrating delay loop (skipped), value calculated using timer frequency.. 4849.87 BogoMIPS (lpj=2424936)
Security Framework initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Mount-cache hash table entries: 256
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
Initializing cgroup subsys devices
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU 0/0 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM2)
using mwait in idle threads.
ACPI: Core revision 20080609
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 02
Using local APIC timer interrupts.
APIC timer calibration result 21651235
Detected 21.651 MHz APIC timer.
Booting processor 1/1 ip 6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 4849.88 BogoMIPS (lpj=2424941)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU 1/1 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU1: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
CPU1: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz stepping 02
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
Total of 2 processors activated (9699.75 BogoMIPS).
sizeof(vma)=176 bytes
sizeof(page)=56 bytes
sizeof(inode)=560 bytes
sizeof(dentry)=208 bytes
sizeof(ext3inode)=760 bytes
sizeof(buffer_head)=104 bytes
sizeof(skbuff)=232 bytes
sizeof(task_struct)=5856 bytes
CPU0 attaching sched-domain:
domain 0: span 0-1 level MC
groups: 0 1
domain 1: span 0-1 level NODE
groups: 0-1
CPU1 attaching sched-domain:
domain 0: span 0-1 level MC
groups: 1 0
domain 1: span 0-1 level NODE
groups: 0-1
net_namespace: 1552 bytes
Booting paravirtualized kernel on bare hardware
Time: 14:57:21 Date: 02/18/09
NET: Registered protocol family 16
No dock devices found.
ACPI: bus type pci registered
PCI: Found Intel Corporation 945G/GZ/P/PL Express Memory Controller Hub without MMCONFIG support.
PCI: Using configuration type 1 for base access
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S1 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
pci 0000:00:01.0: PME# disabled
PCI: 0000:00:1b.0 reg 10 64bit mmio: [cfcf8000, cfcfbfff]
pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
pci 0000:00:1b.0: PME# disabled
pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.0: PME# disabled
pci 0000:00:1c.3: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.3: PME# disabled
PCI: 0000:00:1d.0 reg 20 io port: [7000, 701f]
PCI: 0000:00:1d.1 reg 20 io port: [7400, 741f]
PCI: 0000:00:1d.2 reg 20 io port: [7800, 781f]
PCI: 0000:00:1d.3 reg 20 io port: [8000, 801f]
PCI: 0000:00:1d.7 reg 10 32bit mmio: [cfcff800, cfcffbff]
pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1d.7: PME# disabled
pci 0000:00:1f.0: Force enabled HPET at 0xfed00000
pci 0000:00:1f.0: quirk: region 0800-087f claimed by ICH6 ACPI/GPIO/TCO
pci 0000:00:1f.0: quirk: region 0480-04bf claimed by ICH6 GPIO
PCI: 0000:00:1f.1 reg 10 io port: [0, 7]
PCI: 0000:00:1f.1 reg 14 io port: [0, 3]
PCI: 0000:00:1f.1 reg 18 io port: [0, 7]
PCI: 0000:00:1f.1 reg 1c io port: [0, 3]
PCI: 0000:00:1f.1 reg 20 io port: [ffa0, ffaf]
PCI: 0000:00:1f.2 reg 10 io port: [9800, 9807]
PCI: 0000:00:1f.2 reg 14 io port: [9400, 9403]
PCI: 0000:00:1f.2 reg 18 io port: [9000, 9007]
PCI: 0000:00:1f.2 reg 1c io port: [8800, 8803]
PCI: 0000:00:1f.2 reg 20 io port: [8400, 840f]
PCI: 0000:00:1f.2 reg 24 32bit mmio: [cfcffc00, cfcfffff]
pci 0000:00:1f.2: PME# supported from D3hot
pci 0000:00:1f.2: PME# disabled
PCI: 0000:00:1f.3 reg 20 io port: [400, 41f]
PCI: 0000:04:00.0 reg 10 64bit mmio: [d0000000, dfffffff]
PCI: 0000:04:00.0 reg 18 64bit mmio: [cffe0000, cffeffff]
PCI: 0000:04:00.0 reg 20 io port: [e000, e0ff]
PCI: 0000:04:00.0 reg 30 32bit mmio: [cffc0000, cffdffff]
pci 0000:04:00.0: supports D1
pci 0000:04:00.0: supports D2
PCI: 0000:04:00.1 reg 10 64bit mmio: [cfff0000, cfffffff]
pci 0000:04:00.1: supports D1
pci 0000:04:00.1: supports D2
Pre-1.1 PCIe device detected, disable ASPM for 0000:00:01.0. It can be enabled forcedly with 'pcie_aspm=force'
PCI: bridge 0000:00:01.0 io port: [e000, efff]
PCI: bridge 0000:00:01.0 32bit mmio: [cff00000, cfffffff]
PCI: bridge 0000:00:01.0 64bit mmio pref: [d0000000, dfffffff]
PCI: bridge 0000:00:1c.0 io port: [d000, dfff]
PCI: 0000:02:00.0 reg 10 64bit mmio: [cfefc000, cfefffff]
PCI: 0000:02:00.0 reg 18 io port: [c800, c8ff]
PCI: 0000:02:00.0 reg 30 32bit mmio: [cfec0000, cfedffff]
pci 0000:02:00.0: supports D1
pci 0000:02:00.0: supports D2
pci 0000:02:00.0: PME# supported from D0 D1 D2 D3hot D3cold
pci 0000:02:00.0: PME# disabled
Pre-1.1 PCIe device detected, disable ASPM for 0000:00:1c.3. It can be enabled forcedly with 'pcie_aspm=force'
PCI: bridge 0000:00:1c.3 io port: [c000, cfff]
PCI: bridge 0000:00:1c.3 32bit mmio: [cfe00000, cfefffff]
PCI: 0000:01:03.0 reg 10 io port: [b800, b807]
PCI: 0000:01:03.0 reg 14 io port: [b400, b403]
PCI: 0000:01:03.0 reg 18 io port: [b000, b007]
PCI: 0000:01:03.0 reg 1c io port: [a800, a803]
PCI: 0000:01:03.0 reg 20 io port: [a400, a40f]
PCI: 0000:01:03.0 reg 30 32bit mmio: [cfde0000, cfdfffff]
pci 0000:00:1e.0: transparent bridge
PCI: bridge 0000:00:1e.0 io port: [a000, bfff]
PCI: bridge 0000:00:1e.0 32bit mmio: [cfd00000, cfdfffff]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P3._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P4._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P7._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 *4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 *7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs *3 4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI Warning (tbutils-0217): Incorrect checksum in table [OEMB] - 72, should be 49 [20080609]
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 14 devices
ACPI: ACPI bus type pnp unregistered
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
PCI-GART: No AMD northbridge found.
hpet clockevent registered
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
tracer: 1286 pages allocated for 65536 entries of 80 bytes
actual entries 65586
ACPI: RTC can wake from S4
system 00:01: iomem range 0xfed13000-0xfed19fff has been reserved
system 00:07: ioport range 0x290-0x297 has been reserved
system 00:08: ioport range 0x4d0-0x4d1 has been reserved
system 00:08: ioport range 0x800-0x87f has been reserved
system 00:08: ioport range 0x480-0x4bf has been reserved
system 00:08: ioport range 0x900-0x91f has been reserved
system 00:08: iomem range 0xfed1c000-0xfed1ffff has been reserved
system 00:08: iomem range 0xfed20000-0xfed8ffff has been reserved
system 00:08: iomem range 0xffb00000-0xffbfffff could not be reserved
system 00:08: iomem range 0xfff00000-0xffffffff could not be reserved
system 00:09: iomem range 0xfec00000-0xfec00fff has been reserved
system 00:09: iomem range 0xfee00000-0xfee00fff has been reserved
system 00:0c: iomem range 0xf0000000-0xf3ffffff has been reserved
system 00:0d: iomem range 0x0-0x9ffff could not be reserved
system 00:0d: iomem range 0xc0000-0xdffff has been reserved
system 00:0d: iomem range 0xe0000-0xfffff could not be reserved
system 00:0d: iomem range 0x100000-0xc7ffffff could not be reserved
pci 0000:00:01.0: PCI bridge, secondary bus 0000:04
pci 0000:00:01.0: IO window: 0xe000-0xefff
pci 0000:00:01.0: MEM window: 0xcff00000-0xcfffffff
pci 0000:00:01.0: PREFETCH window: 0x000000d0000000-0x000000dfffffff
pci 0000:00:1c.0: PCI bridge, secondary bus 0000:03
pci 0000:00:1c.0: IO window: 0xd000-0xdfff
pci 0000:00:1c.0: MEM window: disabled
pci 0000:00:1c.0: PREFETCH window: disabled
pci 0000:00:1c.3: PCI bridge, secondary bus 0000:02
pci 0000:00:1c.3: IO window: 0xc000-0xcfff
pci 0000:00:1c.3: MEM window: 0xcfe00000-0xcfefffff
pci 0000:00:1c.3: PREFETCH window: disabled
pci 0000:00:1e.0: PCI bridge, secondary bus 0000:01
pci 0000:00:1e.0: IO window: 0xa000-0xbfff
pci 0000:00:1e.0: MEM window: 0xcfd00000-0xcfdfffff
pci 0000:00:1e.0: PREFETCH window: 0x000000cc000000-0x000000cc0fffff
pci 0000:00:01.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:01.0: setting latency timer to 64
pci 0000:00:1c.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1c.0: setting latency timer to 64
pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
pci 0000:00:1c.3: setting latency timer to 64
pci 0000:00:1e.0: setting latency timer to 64
bus: 00 index 0 io port: [0, ffff]
bus: 00 index 1 mmio: [0, ffffffffffffffff]
bus: 04 index 0 io port: [e000, efff]
bus: 04 index 1 mmio: [cff00000, cfffffff]
bus: 04 index 2 mmio: [d0000000, dfffffff]
bus: 04 index 3 mmio: [0, 0]
bus: 03 index 0 io port: [d000, dfff]
bus: 03 index 1 mmio: [0, 0]
bus: 03 index 2 mmio: [0, 0]
bus: 03 index 3 mmio: [0, 0]
bus: 02 index 0 io port: [c000, cfff]
bus: 02 index 1 mmio: [cfe00000, cfefffff]
bus: 02 index 2 mmio: [0, 0]
bus: 02 index 3 mmio: [0, 0]
bus: 01 index 0 io port: [a000, bfff]
bus: 01 index 1 mmio: [cfd00000, cfdfffff]
bus: 01 index 2 mmio: [cc000000, cc0fffff]
bus: 01 index 3 io port: [0, ffff]
bus: 01 index 4 mmio: [0, ffffffffffffffff]
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 524288 bind 65536)
TCP reno registered
NET: Registered protocol family 1
checking if image is initramfs... it is
Freeing initrd memory: 3834k freed
audit: initializing netlink socket (disabled)
type=2000 audit(1234969041.423:1): initialized
HugeTLB registered 2 MB page size, pre-allocated 0 pages
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 6294
SELinux: Registering netfilter hooks
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci 0000:04:00.0: Boot video device
pcieport-driver 0000:00:01.0: setting latency timer to 64
pcieport-driver 0000:00:01.0: found MSI capability
pci_express 0000:00:01.0:pcie00: allocate port service
pcieport-driver 0000:00:1c.0: setting latency timer to 64
pcieport-driver 0000:00:1c.0: found MSI capability
pci_express 0000:00:1c.0:pcie00: allocate port service
pci_express 0000:00:1c.0:pcie02: allocate port service
pcieport-driver 0000:00:1c.3: setting latency timer to 64
pcieport-driver 0000:00:1c.3: found MSI capability
pci_express 0000:00:1c.3:pcie00: allocate port service
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
vesafb: framebuffer at 0xd0000000, mapped to 0xffffc20001080000, using 3072k, total 16384k
vesafb: mode is 1024x768x16, linelength=2048, pages=9
vesafb: scrolling: redraw
vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0
Console: switching to colour frame buffer device 128x48
fb0: VESA VGA frame buffer device
input: Power Button (FF) as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
ACPI: Power Button (FF) [PWRF]
input: Power Button (CM) as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input1
ACPI: Power Button (CM) [PWRB]
ACPI: SSDT C7F9E0B0, 01C6 (r1 AMI CPU1PM 1 INTL 20051117)
processor ACPI0007:00: registered as cooling_device0
ACPI: Processor [CPU1] (supports 8 throttling states)
ACPI: SSDT C7F9E280, 013A (r1 AMI CPU2PM 1 INTL 20051117)
processor ACPI0007:01: registered as cooling_device1
ACPI: Processor [CPU2] (supports 8 throttling states)
Non-volatile memory driver v1.2
Linux agpgart interface v0.103
Serial: 8250/16550 driver4 ports, IRQ sharing enabled
brd: module loaded
input: Macintosh mouse button emulation as /devices/virtual/input/input2
PNP: No PS/2 controller found. Probing ports directly.
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
rtc0: alarms up to one month, hpet irqs
cpuidle: using governor ladder
cpuidle: using governor menu
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 17
registered taskstats version 1
Magic number: 13:254:991
Freeing unused kernel memory: 1268k freed
Write protecting the kernel read-only data: 4060k
Switched to high resolution mode on CPU 1
Switched to high resolution mode on CPU 0
ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 20 (level, low) -> IRQ 20
ehci_hcd 0000:00:1d.7: setting latency timer to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: debug port 1
ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported
ehci_hcd 0000:00:1d.7: irq 20, io mem 0xcfcff800
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: EHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.27.12-78.2.8.fc9.x86_64 ehci_hcd
usb usb1: SerialNumber: 0000:00:1d.7
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
USB Universal Host Controller Interface driver v3.0
uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
uhci_hcd 0000:00:1d.0: setting latency timer to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.0: irq 20, io base 0x00007000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
usb usb2: New USB device found, idVendor=1d6b, idProduct=0001
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: UHCI Host Controller
usb usb2: Manufacturer: Linux 2.6.27.12-78.2.8.fc9.x86_64 uhci_hcd
usb usb2: SerialNumber: 0000:00:1d.0
uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
uhci_hcd 0000:00:1d.1: setting latency timer to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.1: irq 17, io base 0x00007400
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
usb usb3: New USB device found, idVendor=1d6b, idProduct=0001
usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb3: Product: UHCI Host Controller
usb usb3: Manufacturer: Linux 2.6.27.12-78.2.8.fc9.x86_64 uhci_hcd
usb usb3: SerialNumber: 0000:00:1d.1
uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
uhci_hcd 0000:00:1d.2: setting latency timer to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1d.2: irq 18, io base 0x00007800
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
usb 3-2: new low speed USB device using uhci_hcd and address 2
usb usb4: New USB device found, idVendor=1d6b, idProduct=0001
usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb4: Product: UHCI Host Controller
usb usb4: Manufacturer: Linux 2.6.27.12-78.2.8.fc9.x86_64 uhci_hcd
usb usb4: SerialNumber: 0000:00:1d.2
uhci_hcd 0000:00:1d.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1d.3: setting latency timer to 64
uhci_hcd 0000:00:1d.3: UHCI Host Controller
uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1d.3: irq 19, io base 0x00008000
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
usb 3-2: configuration #1 chosen from 1 choice
input: Logitech USB Receiver as /devices/pci0000:00/0000:00:1d.1/usb3/3-2/3-2:1.0/input/input3
usb usb5: New USB device found, idVendor=1d6b, idProduct=0001
usb usb5: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb5: Product: UHCI Host Controller
usb usb5: Manufacturer: Linux 2.6.27.12-78.2.8.fc9.x86_64 uhci_hcd
usb usb5: SerialNumber: 0000:00:1d.3
input,hidraw0: USB HID v1.10 Keyboard [Logitech USB Receiver] on usb-0000:00:1d.1-2
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.14.0-ioctl (2008-04-23) initialised: dm-devel
input: Logitech USB Receiver as /devices/pci0000:00/0000:00:1d.1/usb3/3-2/3-2:1.1/input/input4
input,hidraw1: USB HID v1.10 Mouse [Logitech USB Receiver] on usb-0000:00:1d.1-2
usb 3-2: New USB device found, idVendor=046d, idProduct=c505
usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 3-2: Product: USB Receiver
usb 3-2: Manufacturer: Logitech
SCSI subsystem initialized
Driver 'sd' needs updating - please use bus_type methods
libata version 3.00 loaded.
pata_acpi 0000:00:1f.1: PCI INT A -> GSI 22 (level, low) -> IRQ 22
pata_acpi 0000:00:1f.1: setting latency timer to 64
pata_acpi 0000:00:1f.1: PCI INT A disabled
pata_acpi 0000:00:1f.2: PCI INT B -> GSI 23 (level, low) -> IRQ 23
pata_acpi 0000:00:1f.2: setting latency timer to 64
pata_acpi 0000:00:1f.2: PCI INT B disabled
ata_piix 0000:00:1f.1: version 2.12
ata_piix 0000:00:1f.1: PCI INT A -> GSI 22 (level, low) -> IRQ 22
ata_piix 0000:00:1f.1: setting latency timer to 64
scsi0 : ata_piix
scsi1 : ata_piix
ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
ata1.00: ATA-7: WDC WD2500JB-00REA0, 20.00K20, max UDMA/100
ata1.00: 488397168 sectors, multi 16: LBA48
ata1.01: ATAPI: HL-DT-ST DVDRAM GSA-H42N, RL01, max UDMA/66
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/66
isa bounce pool size: 16 pages
scsi 0:0:0:0: Direct-Access ATA WDC WD2500JB-00R 20.0 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2
sd 0:0:0:0: [sda] Attached SCSI disk
scsi 0:0:1:0: CD-ROM HL-DT-ST DVDRAM GSA-H42N RL01 PQ: 0 ANSI: 5
ata_piix 0000:00:1f.2: PCI INT B -> GSI 23 (level, low) -> IRQ 23
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ata_piix 0000:00:1f.2: setting latency timer to 64
scsi2 : ata_piix
scsi3 : ata_piix
ata3: SATA max UDMA/133 cmd 0x9800 ctl 0x9400 bmdma 0x8400 irq 23
ata4: SATA max UDMA/133 cmd 0x9000 ctl 0x8800 bmdma 0x8408 irq 23
ata4.01: ATA-7: ST3320620AS, 3.AAK, max UDMA/133
ata4.01: 625142448 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata4.01: configured for UDMA/133
scsi 3:0:1:0: Direct-Access ATA ST3320620AS 3.AA PQ: 0 ANSI: 5
sd 3:0:1:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
sd 3:0:1:0: [sdb] Write Protect is off
sd 3:0:1:0: [sdb] Mode Sense: 00 3a 00 00
sd 3:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:1:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
sd 3:0:1:0: [sdb] Write Protect is off
sd 3:0:1:0: [sdb] Mode Sense: 00 3a 00 00
sd 3:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdb: sdb1 sdb2
sd 3:0:1:0: [sdb] Attached SCSI disk
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
type=1404 audit(1234969056.599:2): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295
SELinux: 8192 avtab hash slots, 177506 rules.
SELinux: 8192 avtab hash slots, 177506 rules.
SELinux: 8 users, 12 roles, 2428 types, 118 bools, 1 sens, 1024 cats
SELinux: 73 classes, 177506 rules
SELinux: Completing initialization.
SELinux: Setting up existing superblocks.
SELinux: initialized (dev dm-1, type ext3), uses xattr
SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses genfs_contexts
SELinux: initialized (dev devpts, type devpts), uses transition SIDs
SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
type=1403 audit(1234969056.909:3): policy loaded auid=4294967295 ses=4294967295
sky2 0000:02:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
sky2 0000:02:00.0: setting latency timer to 64
sky2 0000:02:00.0: v1.22 addr 0xcfefc000 irq 19 Yukon-2 EC rev 2
sky2 eth0: addr 00:18:f3:1a:33:c9
intel_rng: FWH not detected
sd 0:0:0:0: Attached scsi generic sg0 type 0
scsi 0:0:1:0: Attached scsi generic sg1 type 5
sd 3:0:1:0: Attached scsi generic sg2 type 0
Driver 'sr' needs updating - please use bus_type methods
sr0: scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 0:0:1:0: Attached scsi CD-ROM sr0
pata_it821x 0000:01:03.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
pata_it821x: controller in pass through mode.
pata_it821x 0000:01:03.0: setting latency timer to 64
scsi4 : pata_it821x
scsi5 : pata_it821x
ata5: PATA max UDMA/133 cmd 0xb800 ctl 0xb400 bmdma 0xa400 irq 20
ata6: PATA max UDMA/133 cmd 0xb000 ctl 0xa800 bmdma 0xa408 irq 20
iTCO_vendor_support: vendor-support=0
iTCO_wdt: Intel TCO WatchDog Timer Driver v1.03 (30-Apr-2008)
iTCO_wdt: Found a ICH7 or ICH7R TCO device (Version=2, TCOBASE=0x0860)
iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
gameport: NS558 PnP Gameport is pnp00:0a/gameport0, io 0x200, speed 826kHz
input: PC Speaker as /devices/platform/pcspkr/input/input5
i801_smbus 0000:00:1f.3: PCI INT B -> GSI 23 (level, low) -> IRQ 23
ACPI: I/O resource 0000:00:1f.3 [0x400-0x41f] conflicts with ACPI region SMRG [0x400-0x40f]
ACPI: Device needs an ACPI driver
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
HDA Intel 0000:00:1b.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
HDA Intel 0000:00:1b.0: setting latency timer to 64
hda_codec: Unknown model for ALC883, trying auto-probe from BIOS...
ALSA sound/pci/hda/hda_codec.c:3021: autoconfig: line_outs=4 (0x14/0x15/0x16/0x17/0x0)
ALSA sound/pci/hda/hda_codec.c:3025: speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
ALSA sound/pci/hda/hda_codec.c:3029: hp_outs=1 (0x1b/0x0/0x0/0x0/0x0)
ALSA sound/pci/hda/hda_codec.c:3030: mono: mono_out=0x0
ALSA sound/pci/hda/hda_codec.c:3038: inputs: mic=0x18, fmic=0x19, line=0x1a, fline=0x0, cd=0x0, aux=0x0
device-mapper: multipath: version 1.0.5 loaded
EXT3 FS on dm-1, internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev dm-2, type ext3), uses xattr
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev sda1, type ext3), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
Adding 2031608k swap on /dev/mapper/VolGroup00-LogVol01. Priority:-1 extents:1 across:2031608k
SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
IA-32 Microcode Update Driver: v1.14a <tigran.co.uk>
firmware: requesting intel-ucode/06-0f-02
firmware: requesting intel-ucode/06-0f-02
microcode: CPU0 updated from revision 0x51 to 0x5a, date = 09262007
microcode: CPU1 updated from revision 0x51 to 0x5a, date = 09262007
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
ip6_tables: (C) 2000-2006 Netfilter Core Team
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Plase use
nf_conntrack.acct=1 kernel paramater, acct=1 nf_conntrack module option or
sysctl net.netfilter.nf_conntrack_acct=1 to enable it.
ip_tables: (C) 2000-2006 Netfilter Core Team
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
SELinux: initialized (dev rpc_pipefs, type rpc_pipefs), uses genfs_contexts
warning: `dbus-daemon' uses deprecated v2 capabilities in a way that may be insecure.
sky2 eth0: enabling interface
ADDRCONF(NETDEV_UP): eth0: link is not ready
sky2 eth0: Link is up at 100 Mbps, full duplex, flow control both
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
vboxdrv: Trying to deactivate the NMI watchdog permanently...
vboxdrv: Successfully done.
vboxdrv: Found 2 processor cores.
VBoxDrv: dbg - g_abExecMemory=ffffffffa038c180
vboxdrv: fAsync=0 offMin=0x2d1 offMax=0x1195
vboxdrv: TSC mode is 'synchronous', kernel timer mode is 'normal'.
vboxdrv: Successfully loaded version 2.1.2 (interface 0x000a0009).
VBoxNetFlt: dbg - g_abExecMemory=ffffffffa0526f60
eth0: no IPv6 routers present
fuse init (API version 7.9)
SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
SELinux: initialized (dev sdb1, type fuseblk), uses genfs_contexts
SELinux: initialized (dev sdb2, type fuseblk), uses genfs_contexts
This may not be very helpful, as the suspected device has been removed.
If required, I can recreate the issue and submit the info.
An ATA driver timeout
ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata4.01: cmd c8/00:08:c7:d8:ba/00:00:00:00:00/f1 tag 0 dma 4096 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
is a VERY generic diagnostic message. It could mean anything, and should not be assumed to be associated with any particular bug.
All that means is that a timeout occurred, for unknown reasons.
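To illustrate how generic the message is, the reports in this thread can be reduced to (port, command, reason) triples. This is only a rough sketch: the opcode table covers just the commands that show up in this bug, the names are taken from the ATA command set, and `summarize` is a made-up helper name, not part of any tool.

```python
import re

# Opcodes seen in this thread; names per the ATA command set.
OPCODES = {
    0x25: "READ DMA EXT",
    0x35: "WRITE DMA EXT",
    0x61: "WRITE FPDMA QUEUED",
    0xA0: "PACKET",
    0xC8: "READ DMA",
    0xEC: "IDENTIFY DEVICE",
}

CMD_RE = re.compile(r"(ata\d+(?:\.\d+)?): cmd ([0-9a-f]{2})/")
RES_RE = re.compile(r"res [0-9a-f]{2}/.*Emask (0x[0-9a-f]+) \(([^)]+)\)")


def summarize(lines):
    """Pair each libata 'cmd' line with the 'res' line that follows it."""
    reports = []
    pending = None
    for line in lines:
        m = CMD_RE.search(line)
        if m:
            pending = (m.group(1), int(m.group(2), 16))
            continue
        m = RES_RE.search(line)
        if m and pending:
            port, opcode = pending
            # Fall back to the raw hex opcode when the name is not in the table.
            reports.append((port, OPCODES.get(opcode, hex(opcode)), m.group(2)))
            pending = None
    return reports
```

Feeding it the excerpts from different reporters shows the same "timeout" reason attached to different commands on different controllers, which is why the message alone does not identify a bug.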
@mlord - just some info you may find useful:
I patched a vanilla stable branch (2.6.28.3) with only the patch that you posted on 1/14/2009:
"sata_mv_fix_timeouts_on_Marvell_6081_ports_0..3"
Current uptime is 24 days! I've hit this x4500 with very heavy disk and NFS I/O pretty consistently.
My machine has the following components:
SATA
----
0b:01.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
Subsystem: Marvell Technology Group Ltd. Device 11ab
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 64, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 76
Region 0: Memory at fe000000 (64-bit, non-prefetchable) [size=1M]
Region 2: I/O ports at dc00 [size=256]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
Address: 0000000000000000 Data: 0000
Capabilities: [60] PCI-X non-bridge device
Command: DPERE- ERO- RBC=512 OST=4
Status: Dev=0b:01.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=4 DMCRS=8 RSCEM- 266MHz- 533MHz-
Kernel driver in use: sata_mv
HDDs (48)
---------
Device Model: HITACHI HUA7210SASUN1.0T 0830GPLE8E
Serial Number: GTE002PAKPLE8E
Firmware Version: GKAOA90A
User Capacity: 1,000,204,886,016 bytes
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 1
Please let me know if you need more data. Is this patch queued up for a merge into the stable branch yet?
Never mind on that last question, I see it was merged in 2.6.28.4. ;-)
Cheers,
Scott
It's also now in the latest 2.6.27 kernels.
Dunno if/when it will appear in a RedHat / Fedora kernel.
That part is up to Jeff G., I think.
Cheers
(In reply to comment #71)
> It's also now in the latest 2.6.27 kernels.
> Dunno if/when it will appear in a RedHat / Fedora kernel.
> That part is up to Jeff G., I think.
> Cheers
Hopefully soon! Getting the source and compiling my own driver and redoing initrd on every kernel update is getting a bit old... 2.6.27.12-170.2.5.fc10 just came out today, and nothing yet :(
Mark, do you know if your patch could cause the filesystem to disappear? Since this patch (now running kernel 2.6.27.19-170.2.35.fc10.x86_64), I've had two major system crashes where the filesystem just vanishes (I'm guessing). The hardware itself is still active (ie - it isn't locking up), but the system becomes totally unresponsive to logins, http requests, etc., and nothing is logged, which is why I'm guessing that the filesystem is going offline...
No, it would not cause that to happen.
Cheers
I thought not...I'll start looking at other things...
I am having exactly the same problem with an XFS filesystem residing on a Samsung HD103UJ but with a different controller:
00:1f.2 IDE interface: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 4 port SATA IDE Controller (rev 02) (prog-if 8f [Master SecP SecO PriP PriO])
Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer Device 3116
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 19
Region 0: I/O ports at f900 [size=8]
Region 1: I/O ports at f800 [size=4]
Region 2: I/O ports at f700 [size=8]
Region 3: I/O ports at f600 [size=4]
Region 4: I/O ports at f500 [size=16]
Region 5: I/O ports at f400 [size=16]
Capabilities: [70] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [b0] Vendor Specific Information <?>
Kernel driver in use: ata_piix
Kernel modules: ata_generic, pata_acpi
00:1f.5 IDE interface: Intel Corporation 82801I (ICH9 Family) 2 port SATA IDE Controller (rev 02) (prog-if 85 [Master SecO PriO])
Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer Device 3116
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 19
Region 0: I/O ports at f200 [size=8]
Region 1: I/O ports at f100 [size=4]
Region 2: I/O ports at f000 [size=8]
Region 3: I/O ports at ef00 [size=4]
Region 4: I/O ports at ee00 [size=16]
Region 5: I/O ports at ed00 [size=16]
Capabilities: [70] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [b0] Vendor Specific Information <?>
Kernel driver in use: ata_piix
Kernel modules: ata_generic, pata_acpi
/var/log/messages shows the error:
Apr 13 21:47:19 xpcsp35p2p131 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 13 21:47:19 xpcsp35p2p131 kernel: ata1.00: cmd 35/00:00:bf:95:f7/00:04:62:00:00/e0 tag 0 dma 524288 out
Apr 13 21:47:19 xpcsp35p2p131 kernel: res 40/00:02:00:08:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
Apr 13 21:47:19 xpcsp35p2p131 kernel: ata1.00: status: { DRDY }
Apr 13 21:47:19 xpcsp35p2p131 kernel: ata1: hard resetting link
Apr 13 21:47:25 xpcsp35p2p131 kernel: ata1: link is slow to respond, please be patient (ready=0)
Apr 13 21:47:29 xpcsp35p2p131 kernel: ata1: SRST failed (errno=-16)
Apr 13 21:47:29 xpcsp35p2p131 kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 13 21:47:34 xpcsp35p2p131 kernel: ata1.00: qc timeout (cmd 0xec)
Apr 13 21:47:34 xpcsp35p2p131 kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Apr 13 21:47:34 xpcsp35p2p131 kernel: ata1.00: revalidation failed (errno=-5)
Apr 13 21:47:34 xpcsp35p2p131 kernel: ata1: hard resetting link
Apr 13 21:47:40 xpcsp35p2p131 kernel: ata1: link is slow to respond, please be patient (ready=0)
Apr 13 21:47:44 xpcsp35p2p131 kernel: ata1: SRST failed (errno=-16)
Apr 13 21:47:44 xpcsp35p2p131 kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 13 21:47:54 xpcsp35p2p131 kernel: ata1.00: qc timeout (cmd 0xec)
Apr 13 21:47:54 xpcsp35p2p131 kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Apr 13 21:47:54 xpcsp35p2p131 kernel: ata1.00: revalidation failed (errno=-5)
Apr 13 21:47:54 xpcsp35p2p131 kernel: ata1: limiting SATA link speed to 1.5 Gbps
Apr 13 21:47:54 xpcsp35p2p131 kernel: ata1: hard resetting link
Apr 13 21:48:00 xpcsp35p2p131 kernel: ata1: link is slow to respond, please be patient (ready=0)
Apr 13 21:48:05 xpcsp35p2p131 kernel: ata1: SRST failed (errno=-16)
Apr 13 21:48:05 xpcsp35p2p131 kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr 13 21:48:35 xpcsp35p2p131 kernel: ata1.00: qc timeout (cmd 0xec)
Apr 13 21:48:35 xpcsp35p2p131 kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Apr 13 21:48:35 xpcsp35p2p131 kernel: ata1.00: revalidation failed (errno=-5)
Apr 13 21:48:35 xpcsp35p2p131 kernel: ata1.00: disabled
Apr 13 21:48:35 xpcsp35p2p131 kernel: ata1.01: failed to set xfermode (err_mask=0x40)
Apr 13 21:48:35 xpcsp35p2p131 kernel: ata1: hard resetting link
Apr 13 21:48:36 xpcsp35p2p131 ntpd[2540]: kernel time sync status change 0001
Apr 13 21:48:40 xpcsp35p2p131 kernel: ata1: link is slow to respond, please be patient (ready=0)
Apr 13 21:48:45 xpcsp35p2p131 kernel: ata1: SRST failed (errno=-16)
Apr 13 21:48:45 xpcsp35p2p131 kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr 13 21:48:45 xpcsp35p2p131 kernel: ata1.01: configured for UDMA/100
Apr 13 21:48:45 xpcsp35p2p131 kernel: ata1: EH complete
Apr 13 21:48:45 xpcsp35p2p131 kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Apr 13 21:48:45 xpcsp35p2p131 kernel: end_request: I/O error, dev sda, sector 1660392895
Kernel is 2.6.27.21-170.2.56.fc10.x86_64
Would this controller need a similar patch?
Regards,
Gijsbert
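The `cmd` line in the log above can be decoded to cross-check it against the failing sector reported by `end_request`. The following is a minimal sketch, assuming the libata field order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device and handling only the plain DMA read/write opcodes. NCQ commands such as 0x61 carry the sector count in a different field and are deliberately not covered. The name `decode_cmd` is made up for illustration.

```python
import re

# Matches "cmd XX/aa:bb:cc:dd:ee/ff:gg:hh:ii:jj/DD" as printed by libata.
CMD_RE = re.compile(
    r"cmd ([0-9a-f]{2})/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/"
    r"((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/([0-9a-f]{2})"
)

LBA48_OPCODES = {0x25, 0x35}  # READ DMA EXT, WRITE DMA EXT
LBA28_OPCODES = {0xC8, 0xCA}  # READ DMA, WRITE DMA


def decode_cmd(line, sector_size=512):
    """Return opcode, starting LBA, sector count and byte count for one cmd line."""
    m = CMD_RE.search(line)
    if m is None:
        raise ValueError("no libata cmd field found")
    opcode = int(m.group(1), 16)
    _feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in m.group(2).split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in m.group(3).split(":")
    )
    device = int(m.group(4), 16)

    low = lbal | lbam << 8 | lbah << 16
    if opcode in LBA48_OPCODES:
        # 48-bit LBA: the "hob" (high order byte) registers hold bits 24..47.
        lba = low | hob_lbal << 24 | hob_lbam << 32 | hob_lbah << 40
        sectors = (hob_nsect << 8 | nsect) or 65536  # a count of 0 means 65536
    elif opcode in LBA28_OPCODES:
        # 28-bit LBA: bits 24..27 live in the low nibble of the device register.
        lba = low | (device & 0x0F) << 24
        sectors = nsect or 256  # a count of 0 means 256
    else:
        raise ValueError(f"opcode {opcode:#04x} is not handled by this sketch")
    return {"opcode": opcode, "lba": lba, "sectors": sectors,
            "bytes": sectors * sector_size}
```

For the `cmd 35/...` line above this yields LBA 1660392895 and 524288 bytes, which agrees with the `end_request` sector and the `dma 524288 out` annotation in the same log.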
FYI, one of the workarounds I found on the internet was to insert a CD into the DVD drive (see also https://bugs.launchpad.net/ubuntu/+bug/104581), and indeed this seems to work! Any ideas why?
Regards,
Gijsbert
FYI: Patch is in intrepid-proposed: https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-14.33
See my esteemed colleague's notes for enabling here: http://ubuntuforums.org/showthread.php?t=1145513
This problem is getting quite annoying. I am also getting it now on my cluster nodes with entirely different hardware and an XFS filesystem residing on an SSD:
lspci -vv
00:1f.2 IDE interface: Intel Corporation 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (rev 01) (prog-if 8f [Master SecP SecO PriP PriO])
Subsystem: ASUSTeK Computer Inc. P5KPL-VM Motherboard
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 19
Region 0: I/O ports at c080 [size=8]
Region 1: I/O ports at c000 [size=4]
Region 2: I/O ports at bc00 [size=8]
Region 3: I/O ports at b880 [size=4]
Region 4: I/O ports at b800 [size=16]
Capabilities: [70] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: ata_piix
Kernel modules: ata_generic, pata_acpi
/var/log/messages:
May 6 09:39:54 nodep141 kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 6 09:39:54 nodep141 kernel: ata4.00: cmd c8/00:08:1f:cc:d6/00:00:00:00:00/e0 tag 0 dma 4096 in
May 6 09:39:54 nodep141 kernel: res 40/00:02:00:08:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
May 6 09:39:54 nodep141 kernel: ata4.00: status: { DRDY }
May 6 09:39:54 nodep141 kernel: ata4: soft resetting link
May 6 09:39:59 nodep141 kernel: ata4.00: qc timeout (cmd 0xec)
May 6 09:39:59 nodep141 kernel: ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
May 6 09:39:59 nodep141 kernel: ata4.00: revalidation failed (errno=-5)
May 6 09:40:04 nodep141 kernel: ata4: link is slow to respond, please be patient (ready=0)
May 6 09:40:09 nodep141 kernel: ata4: device not ready (errno=-16), forcing hardreset
May 6 09:40:09 nodep141 kernel: ata4: soft resetting link
May 6 09:40:14 nodep141 kernel: ata4: link is slow to respond, please be patient (ready=0)
May 6 09:40:19 nodep141 kernel: ata4: SRST failed (errno=-16)
May 6 09:40:19 nodep141 kernel: ata4: soft resetting link
May 6 09:40:25 nodep141 kernel: ata4: link is slow to respond, please be patient (ready=0)
May 6 09:40:29 nodep141 kernel: ata4: SRST failed (errno=-16)
May 6 09:40:29 nodep141 kernel: ata4: soft resetting link
May 6 09:40:35 nodep141 kernel: ata4: link is slow to respond, please be patient (ready=0)
May 6 09:41:04 nodep141 kernel: ata4: SRST failed (errno=-16)
May 6 09:41:04 nodep141 kernel: ata4: soft resetting link
May 6 09:41:09 nodep141 kernel: ata4: SRST failed (errno=-16)
May 6 09:41:09 nodep141 kernel: ata4: reset failed, giving up
May 6 09:41:09 nodep141 kernel: ata4.00: disabled
May 6 09:41:09 nodep141 kernel: ata4.01: disabled
May 6 09:41:14 nodep141 kernel: ata4: link is slow to respond, please be patient (ready=0)
May 6 09:41:20 nodep141 kernel: ata4: device not ready (errno=-16), forcing hardreset
May 6 09:41:20 nodep141 kernel: ata4: soft resetting link
May 6 09:41:25 nodep141 kernel: ata4: link is slow to respond, please be patient (ready=0)
May 6 09:41:30 nodep141 kernel: ata4: SRST failed (errno=-16)
May 6 09:41:30 nodep141 kernel: ata4: soft resetting link
May 6 09:41:35 nodep141 kernel: ata4: link is slow to respond, please be patient (ready=0)
May 6 09:41:40 nodep141 kernel: ata4: SRST failed (errno=-16)
May 6 09:41:40 nodep141 kernel: ata4: soft resetting link
May 6 09:41:45 nodep141 kernel: ata4: link is slow to respond, please be patient (ready=0)
May 6 09:42:15 nodep141 kernel: ata4: SRST failed (errno=-16)
May 6 09:42:15 nodep141 kernel: ata4: soft resetting link
May 6 09:42:20 nodep141 kernel: ata4: SRST failed (errno=-16)
May 6 09:42:20 nodep141 kernel: ata4: reset failed, giving up
May 6 09:42:20 nodep141 kernel: ata4: EH complete
May 6 09:42:20 nodep141 kernel: sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
May 6 09:42:20 nodep141 kernel: end_request: I/O error, dev sdb, sector 14076959
uname -a:
Linux nodep141 2.6.27.21-170.2.56.fc10.x86_64 #1 SMP Mon Mar 23 23:08:10 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
Has the problem been fixed in this kernel version?
Regards,
Gijsbert
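The timestamps in the log above also show how long the port was stuck in error handling. A small sketch of that arithmetic follows, assuming classic syslog lines that start with "Mon D HH:MM:SS" and using the first "exception Emask" line and the next "EH complete" line as the boundaries. Syslog lines carry no year, so an arbitrary placeholder year is used and a log that crosses a year boundary is not handled. `eh_duration_seconds` is a hypothetical helper name.

```python
from datetime import datetime


def parse_stamp(line):
    """Parse the leading syslog timestamp; 2000 is only a placeholder year."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"2000 {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def eh_duration_seconds(lines):
    """Seconds from the first libata exception to the following 'EH complete'."""
    start = None
    for line in lines:
        if start is None and "exception Emask" in line:
            start = parse_stamp(line)
        elif start is not None and "EH complete" in line:
            return (parse_stamp(line) - start).total_seconds()
    return None  # no complete exception/EH pair found
```

Applied to the nodep141 excerpt (09:39:54 to 09:42:20) this gives 146 seconds during which I/O to the disk was blocked before it was finally reported as failed.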
I had the same kind of message:
May 3 07:49:47 localhost kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 3 07:49:47 localhost kernel: ata1.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
May 3 07:49:47 localhost kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 3 07:49:47 localhost kernel: res 51/20:03:00:00:00/00:00:00:00:00/a0 Emask 0x3 (HSM violation)
May 3 07:49:47 localhost kernel: ata1.00: status: { DRDY ERR }
May 3 07:49:47 localhost kernel: ata1: soft resetting link
May 3 07:49:47 localhost kernel: ata1.00: configured for UDMA/33
May 3 07:49:47 localhost kernel: ata1: EH complete
and apparently, adding 'acpi=off noapic' to the kernel line in /etc/grub.conf seems to have solved the problem for me.
kernel /vmlinuz-2.6.27.21-170.2.56.fc10.i686 ro root=/dev/VolGroup00/LogVol00 rhgb quiet vga=792 acpi=off noapic
source:
http://forums.fedoraforum.org/showthread.php?t=213585
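For anyone who wants to try the same workaround on several machines, the edit to the kernel line can be scripted. This is only a sketch under assumptions: it expects GRUB legacy syntax where each boot entry has a line starting with `kernel`, it works on the file contents as a string and leaves reading and writing the file to the caller, and `add_kernel_params` is a made-up name. Note that `acpi=off` and `noapic` disable ACPI and the I/O APIC entirely, so this is a workaround rather than a fix.

```python
def add_kernel_params(grub_conf_text, params):
    """Append any missing boot parameters to every 'kernel' line."""
    out = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        if words and words[0] == "kernel":
            # Only add parameters that are not already on the line.
            missing = [p for p in params if p not in words]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        out.append(line)
    return "\n".join(out) + "\n"
```

Running it twice on the same text leaves the second result unchanged, so it is safe to re-apply after a kernel update adds a new entry.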
This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Great thread and thank you Mark for providing a patch! I'm running RHEL5 with kernel 2.6.18-128.1.14.el5 with 2 PCI-X cards containing the Marvell chipset and am currently experiencing the exact same symptoms. I found the sata_mv.c file and edited the line in question and rebooted. Unfortunately doing just that didn't solve the problem so I believe I missed a critical step. Do I need to recompile the kernel itself or anything else in order to take advantage of this patch/bug fix? Yes, I'm fairly new to Linux troubleshooting, so any advice with regards to implementing the fix is greatly appreciated as it doesn't seem to be fixed in the latest Red Hat update.
Regards,
Dave
Following up on the comment from Bug Zapper I now notice that this thread applies to Fedora Core 9. I was experiencing this problem on Fedora Core 10, so could this bug be assigned to Fedora Core 10?
Regards,
Gijsbert
(In reply to comment #83)
> Following up on the comment from Bug Zapper I now notice that this thread
> applies to Fedora Core 9. I was experiencing this problem on Fedora Core 10, so
> could this bug be assigned to Fedora Core 10?
The original bug was fixed in 2.6.27.15.
The bug is still in fedora 15.
My system has:
- Card: Conceptronic Serial ATA & IDE Combo Card. (pci card)
- Chip: VIA Technologies, Inc. VT6421 IDE RAID Controller (rev 50).
- O.S.: Fedora release 15 (Lovelock).
- Kernel: 2.6.38.7-30.fc15.i686 (32 bits)
I'm sure my SATA disk drive is OK (I've tested it with another controller and no errors appear). So the problem is in the controller hardware or in the controller driver. I bet it's in the controller driver.
The error log is similar to the already posted ones:
---------------
[ 1885.024110] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 1885.024128] ata5.00: failed command: READ DMA EXT
[ 1885.024145] ata5.00: cmd 25/00:00:80:e7:1c/00:02:04:00:00/e0 tag 0 dma 262144 in
[ 1885.024148] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1885.024155] ata5.00: status: { DRDY }
[ 1885.024169] ata5: hard resetting link
[ 1885.329091] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 1885.448805] ata5.00: configured for UDMA/133
[ 1885.448818] ata5.00: device reported invalid CHS sector 0
[ 1885.448840] ata5: EH complete
[ 3123.040076] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 3123.040088] ata5.00: failed command: READ DMA
[ 3123.040103] ata5.00: cmd c8/00:00:80:f6:72/00:00:00:00:00/e2 tag 0 dma 131072 in
[ 3123.040107] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 3123.040113] ata5.00: status: { DRDY }
[ 3123.040128] ata5: hard resetting link
[ 3123.347077] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 3123.475194] ata5.00: configured for UDMA/133
[ 3123.475216] ata5.00: device reported invalid CHS sector 0
[ 3123.475261] ata5: EH complete
---------------
As you can see, the communication freezes, so a hard reset is launched and the link is re-established. No data is corrupted, but the computer is frozen until the link is reset. Same as posted by others.
Any news on this bug? Is it going to be fixed? Is there a workaround to keep working decently until it's fixed?
Thanks
(P.S. Please reopen this bug, update the bug product version to "fedora 15", and add the 32-bit version to the bug platforms)
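The `SStatus` and `SControl` values printed after each reset in the logs in this thread are hexadecimal SATA register dumps and can be read directly. Below is a small sketch, assuming the standard SATA register layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) and covering only the two link generations that appear here. The function names are made up for illustration.

```python
GENERATIONS = {1: "1.5 Gbps", 2: "3.0 Gbps"}


def split_fields(value):
    """Split an SStatus/SControl value into its DET, SPD and IPM nibbles."""
    return {"det": value & 0xF, "spd": (value >> 4) & 0xF, "ipm": (value >> 8) & 0xF}


def link_speed(sstatus):
    """Negotiated speed from SStatus.SPD; 0 means no speed was negotiated."""
    spd = split_fields(sstatus)["spd"]
    return GENERATIONS.get(spd, "none" if spd == 0 else f"unknown ({spd})")


def speed_limit(scontrol):
    """Highest speed the host allows from SControl.SPD; 0 means no limit."""
    spd = split_fields(scontrol)["spd"]
    return GENERATIONS.get(spd, "no limit" if spd == 0 else f"unknown ({spd})")
```

The logs print the values without a `0x` prefix, so parse them with `int("113", 16)`. `SStatus 113 SControl 310` then reads as a link running at 1.5 Gbps with the host limiting the speed to 1.5 Gbps, while `SStatus 123 SControl 300` is a 3.0 Gbps link with no limit set.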
FYI, I switched from Fedora to CentOS a couple of years ago because I needed GFS2 support on my cluster nodes, but initially got the same error frequently. However, the frequency has gone down with every kernel update over the years, and the error hardly ever occurs with the current CentOS kernel (2.6.18-238.9.1.el5), though it still does now and then (say once every two months on one of the cluster nodes). So you might give CentOS a try to see if that helps.
Regards,
Gijsbert
(In reply to comment #85)
> (P.S. Please reopen this bug, update bug product version to "fedora 15", and
> add the 32-bit version to the bug platforms)
Please open a new bug against F15, since your errors are not the same as the ones reported here and there are 86 comments on this bug that we would have to wade through when working on it.
(In reply to comment #87)
> Please open a new bug against F15, since your errors are not the same as the
> ones reported here and there are 86 comments on this bug that we would have to
> wade through when working on it.
Done. Bug 718475
I recently upgraded from kernel 2.6.25.14-108.fc9.x86_64 to 2.6.26.3-29.fc9.x86_64 and ended up with 323 MB of error logs (see the messages.zip attachment) related to at least one of my drives going offline. Prior to the kernel update, I never had a problem. This occurred during very heavy disk activity: I was running an rdiff backup and transferring a large file locally over gigabit Ethernet via Samba, on top of all the normal server activity (httpd, sendmail, spamassassin, etc.).
Drives are 4 Seagate 320 GB 7200.10 (ST3320620AS) drives in software RAID-5/ext3. The backup was going to a single Samsung Spinpoint F1 (HD103UJ) 1 TB drive/ext4. mcelog is clean, and always has been, and the server has been stable through various Fedora releases...
Controller for all drives is:
02:03.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
    Subsystem: Marvell Technology Group Ltd. Unknown device 11ab
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 32, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 26
    Region 0: Memory at fd000000 (64-bit, non-prefetchable) [size=1M]
    Region 2: I/O ports at e000 [size=256]
    Capabilities: [40] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Address: 0000000000000000 Data: 0000
    Capabilities: [60] PCI-X non-bridge device
        Command: DPERE- ERO- RBC=512 OST=4
        Status: Dev=02:03.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=4 DMCRS=8 RSCEM- 266MHz- 533MHz-
    Kernel driver in use: sata_mv
    Kernel modules: sata_mv
Extra things I have in rc.local:
/sbin/hdparm -B1 /dev/sde
echo 128 > /sys/block/sda/queue/max_sectors_kb
echo 128 > /sys/block/sdb/queue/max_sectors_kb
echo 128 > /sys/block/sdc/queue/max_sectors_kb
echo 128 > /sys/block/sdd/queue/max_sectors_kb
echo 16384 > /sys/block/md4/md/stripe_cache_size
blockdev --setra 4096 /dev/sda
blockdev --setra 4096 /dev/sdb
blockdev --setra 4096 /dev/sdc
blockdev --setra 4096 /dev/sdd
blockdev --setra 4096 /dev/sde
blockdev --setra 32768 /dev/md4
The section of the message log where things start to go bad and the first trace occurs:
Sep 15 12:22:20 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Sep 15 12:22:20 radfiles kernel: ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out
Sep 15 12:22:20 radfiles kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 15 12:22:20 radfiles kernel: ata1.00: status: { DRDY }
Sep 15 12:22:20 radfiles kernel: ata1: hard resetting link
Sep 15 12:22:20 radfiles kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 15 12:22:21 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 15 12:22:21 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 15 12:22:21 radfiles kernel: ata1.00: configured for UDMA/133
Sep 15 12:22:21 radfiles kernel: ata1: EH complete
Sep 15 12:22:21 radfiles kernel: sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Sep 15 12:22:21 radfiles kernel: sd 0:0:0:0: [sda] Write Protect is off
Sep 15 12:22:21 radfiles kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 15 12:24:36 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Sep 15 12:24:36 radfiles kernel: ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out
Sep 15 12:24:36 radfiles kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 15 12:24:36 radfiles kernel: ata1.00: status: { DRDY }
Sep 15 12:24:36 radfiles kernel: ata1: hard resetting link
Sep 15 12:24:37 radfiles kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 15 12:24:37 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 15 12:24:37 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 15 12:24:37 radfiles kernel: ata1.00: configured for UDMA/133
Sep 15 12:24:37 radfiles kernel: ata1: EH complete
Sep 15 12:24:37 radfiles kernel: sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Sep 15 12:24:37 radfiles kernel: sd 0:0:0:0: [sda] Write Protect is off
Sep 15 12:24:37 radfiles kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 15 12:25:38 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Sep 15 12:25:38 radfiles kernel: ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out
Sep 15 12:25:38 radfiles kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 15 12:25:38 radfiles kernel: ata1.00: status: { DRDY }
Sep 15 12:25:38 radfiles kernel: ata1: hard resetting link
Sep 15 12:25:38 radfiles kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 15 12:25:39 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 15 12:25:39 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 15 12:25:39 radfiles kernel: ata1.00: configured for UDMA/133
Sep 15 12:25:39 radfiles kernel: ata1: EH complete
Sep 15 12:25:39 radfiles kernel: sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Sep 15 12:25:39 radfiles kernel: sd 0:0:0:0: [sda] Write Protect is off
Sep 15 12:25:39 radfiles kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 15 12:26:39 radfiles kernel: ata1.00: NCQ disabled due to excessive errors
Sep 15 12:26:39 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Sep 15 12:26:39 radfiles kernel: ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out
Sep 15 12:26:39 radfiles kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 15 12:26:39 radfiles kernel: ata1.00: status: { DRDY }
Sep 15 12:26:39 radfiles kernel: ata1: hard resetting link
Sep 15 12:26:40 radfiles kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 15 12:26:40 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 15 12:26:40 radfiles kernel: ata1.00: max_sectors limited to 256 for NCQ
Sep 15 12:26:40 radfiles kernel: ata1.00: configured for UDMA/133
Sep 15 12:26:40 radfiles kernel: ata1: EH complete
Sep 15 12:26:40 radfiles kernel: sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Sep 15 12:26:40 radfiles kernel: sd 0:0:0:0: [sda] Write Protect is off
Sep 15 12:26:40 radfiles kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 15 12:26:40 radfiles kernel: ------------[ cut here ]------------
Sep 15 12:26:40 radfiles kernel: WARNING: at drivers/ata/libata-core.c:4752 ata_qc_issue+0x41/0x2aa [libata]() (Not tainted)
Sep 15 12:26:40 radfiles kernel: Modules linked in: ext4dev jbd2 crc16 dm_mirror dm_log dm_mod sr_mod cdrom pata_acpi floppy pcspkr sg tg3 k8temp hwmon pata_ali ata_generic raid1 shpchp sata_mv libata sd_mod scsi_mod raid456 async_xor async_memcpy async_tx xor ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Sep 15 12:26:40 radfiles kernel: Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.26.3-29.fc9.x86_64 #1
Sep 15 12:26:40 radfiles kernel:
Sep 15 12:26:40 radfiles kernel: Call Trace:
Sep 15 12:26:40 radfiles kernel: <IRQ> [<ffffffff81036db7>] warn_on_slowpath+0x60/0xa3
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008c0e2>] ? :scsi_mod:scsi_sg_alloc+0x43/0x45
Sep 15 12:26:40 radfiles kernel: [<ffffffff81140246>] ? __sg_alloc_table+0x6d/0xf1
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008c051>] ? :scsi_mod:scsi_init_sgtable+0x96/0x9f
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008c2b6>] ? :scsi_mod:scsi_init_io+0x22/0xcc
Sep 15 12:26:40 radfiles kernel: [<ffffffffa00b73e8>] ? :libata:ata_build_rw_tf+0x19d/0x250
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008c3e9>] ? :scsi_mod:scsi_setup_fs_cmnd+0x89/0x91
Sep 15 12:26:40 radfiles kernel: [<ffffffffa00b92d0>] :libata:ata_qc_issue+0x41/0x2aa
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008668b>] ? :scsi_mod:scsi_done+0x0/0x21
Sep 15 12:26:40 radfiles kernel: [<ffffffffa00be504>] :libata:ata_scsi_translate+0x11f/0x155
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008668b>] ? :scsi_mod:scsi_done+0x0/0x21
Sep 15 12:26:40 radfiles kernel: [<ffffffffa00c05f3>] :libata:ata_scsi_queuecmd+0x17d/0x1cd
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008699b>] :scsi_mod:scsi_dispatch_cmd+0x1cd/0x259
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008bcc3>] :scsi_mod:scsi_request_fn+0x320/0x3ff
Sep 15 12:26:40 radfiles kernel: [<ffffffff811298ef>] __blk_run_queue+0x7d/0xf6
Sep 15 12:26:40 radfiles kernel: [<ffffffff81129989>] blk_run_queue+0x21/0x35
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008b367>] :scsi_mod:scsi_run_queue+0x279/0x2ac
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008bfab>] :scsi_mod:scsi_next_command+0x36/0x46
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008c161>] :scsi_mod:scsi_end_request+0x7d/0x90
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008c721>] :scsi_mod:scsi_io_completion+0x1b3/0x3b0
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008663b>] :scsi_mod:scsi_finish_command+0xce/0xd7
Sep 15 12:26:40 radfiles kernel: [<ffffffffa008cb8f>] :scsi_mod:scsi_softirq_done+0xe4/0xed
Sep 15 12:26:40 radfiles kernel: [<ffffffff81128005>] blk_done_softirq+0x77/0x87
Sep 15 12:26:40 radfiles kernel: [<ffffffff8103bfaa>] __do_softirq+0x6d/0xe1
Sep 15 12:26:40 radfiles kernel: [<ffffffff8100d4ec>] call_softirq+0x1c/0x28
Sep 15 12:26:40 radfiles kernel: <EOI> [<ffffffff8100e770>] do_softirq+0x44/0x8b
Sep 15 12:26:40 radfiles kernel: [<ffffffff8103bbdf>] ksoftirqd+0x58/0xcf
Sep 15 12:26:40 radfiles kernel: [<ffffffff8103bb87>] ? ksoftirqd+0x0/0xcf
Sep 15 12:26:40 radfiles kernel: [<ffffffff81049baf>] kthread+0x49/0x76
Sep 15 12:26:40 radfiles kernel: [<ffffffff8100d148>] child_rip+0xa/0x12
Sep 15 12:26:40 radfiles kernel: [<ffffffff81049b66>] ? kthread+0x0/0x76
Sep 15 12:26:40 radfiles kernel: [<ffffffff8100d13e>] ? child_rip+0x0/0x12
Sep 15 12:26:40 radfiles kernel:
Sep 15 12:26:40 radfiles kernel: ---[ end trace e87afce5152dfd41 ]---
Sep 15 12:26:40 radfiles kernel: ------------[ cut here ]------------
I don't think this is a bad drive issue, but I can also provide smartctl info if it helps. I'm not sure whether this is a libata/sata_mv issue or something else. I'm still running the same kernel for more testing, as it didn't seem to cause any data corruption that I'm aware of, but the system did hang for quite a while...
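For anyone trying to work out which sectors the timed-out commands were addressing, the `cmd` field in the libata messages can be decoded by hand. Below is a minimal sketch, assuming the twelve bytes are printed in the order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, and that NCQ commands carry their sector count in the feature registers. The helper name `decode_cmd` and the short opcode table are illustrative only, and a 512-byte sector size is assumed. The decoded byte counts can be cross-checked against the "ncq 4096" and "dma 131072" sizes printed on the same log lines.

```python
import re

# A few opcodes from the ATA command set; this table is deliberately short.
ATA_COMMANDS = {
    0x25: "READ DMA EXT",
    0x35: "WRITE DMA EXT",
    0x60: "READ FPDMA QUEUED",
    0x61: "WRITE FPDMA QUEUED",
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
}
NCQ_COMMANDS = {0x60, 0x61}
LBA48_COMMANDS = {0x25, 0x35, 0x60, 0x61}

# Twelve hex bytes separated by '/' or ':' after the word "cmd".
CMD_RE = re.compile(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def decode_cmd(line, sector_size=512):
    """Decode the taskfile bytes of a libata 'cmd' line (assumed layout)."""
    match = CMD_RE.search(line)
    if match is None:
        raise ValueError("no libata cmd field found")
    (command, feature, nsect, lbal, lbam, lbah,
     hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah, device) = (
        int(byte, 16) for byte in re.split(r"[/:]", match.group(1)))

    low24 = (lbah << 16) | (lbam << 8) | lbal
    if command in LBA48_COMMANDS:
        lba = (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24) | low24
        if command in NCQ_COMMANDS:
            count = (hob_feature << 8) | feature  # NCQ: count is in feature
        else:
            count = (hob_nsect << 8) | nsect
        count = count or 65536  # a count of 0 means the maximum
    else:
        lba = ((device & 0x0F) << 24) | low24  # 28-bit addressing
        count = nsect or 256  # a count of 0 means 256 sectors

    return {
        "name": ATA_COMMANDS.get(command, "unknown (0x%02x)" % command),
        "lba": lba,
        "sectors": count,
        "bytes": count * sector_size,
    }
```

For example, `decode_cmd("ata1.00: cmd 61/08:00:08:d6:42/00:00:25:00:00/40 tag 0 ncq 4096 out")` decodes the write that keeps timing out in the log above, and the same function handles the READ DMA line reported later in this thread.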