+++ This bug was initially created as a clone of Bug #232811 +++

Description of problem:
After running kernel-2.6.20-1.2925 for a while (at random intervals), the system freezes. The keyboard and mouse stop responding, but the screen display does not disappear. It is really "freezing".

Version-Release number of selected component (if applicable):
kernel-2.6.20-1.2925

How reproducible:
always

Steps to Reproduce:
1. Run kernel-2.6.20-1.2925 for a while

Actual results:
The system freezes.

Expected results:

Additional info:
http://smolt.fedoraproject.org/show?UUID=e5b52a3c-03b9-4f38-a4df-14f1f872e389

-- Additional comment from cebbert on 2007-03-19 14:07 EST --
Can you post the log from when you boot? Just post the contents of /var/log/dmesg

-- Additional comment from hanpingtian on 2007-03-20 07:00 EST --
Created an attachment (id=150474)
/var/log/dmesg

-- Additional comment from scott-bugzilla on 2007-03-20 18:29 EST --
I have also been experiencing system freezes on this kernel with exactly the same symptoms, but with i686. Screen freezes, pings stop, no message to screen, nothing in the logs.

I have been running 2.6.18-1.2798.fc6-i686 since late Feb with no issues, upgraded 24 hours ago to 2.6.20-1.2925 and have had 5 or 6 hangs since. System checks out with Memtest. Booting back to 2.6.18 fixes it.

The freezes seem to coincide with heavy IO on my 8 disk RAID 5 stripe on a Supermicro sata_mv card. If I don't attempt to rebuild the array the system will stay up for several hours. Rebuilding the array under 2.6.20-1.2925 will never complete and often the system will not even complete booting. Again 2.6.18-1.2798 is fine.

-- Additional comment from cebbert on 2007-03-21 11:14 EST --
Can you post the exact models of your disk drives?
Lines in kernel log should look something like this:

scsi 0:0:0:0: Direct-Access ATA SAMSUNG HD160JJ/ ZM10 PQ: 0 ANSI: 5

Also, can you post whether NCQ was enabled for each drive, for example:

ata1.00: ATA-7, max UDMA7, 312500000 sectors: LBA48 NCQ (depth 31/32)

-- Additional comment from misek on 2007-03-21 19:09 EST --
Similar problems here (FC 5 with kernel-2.6.20-1.2300.fc5 and sata_nv). Log shows:

kernel: ata2: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400
kernel: ata2: CPB 0: ctl_flags 0x1f, resp_flags 0x0
kernel: ata2: CPB 1: ctl_flags 0x1f, resp_flags 0x1
kernel: ata2: CPB 2: ctl_flags 0x1f, resp_flags 0x1
kernel: ata2: CPB 3: ctl_flags 0x1f, resp_flags 0x1
kernel: ata2: CPB 4: ctl_flags 0x1f, resp_flags 0x1
kernel: ata2: CPB 5: ctl_flags 0x1f, resp_flags 0x1
kernel: ata2: CPB 6: ctl_flags 0x1f, resp_flags 0x1
kernel: ata2: Resetting port
kernel: ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
kernel: ata2.00: cmd 61/08:00:cd:e3:50/00:00:09:00:00/40 tag 0 cdb 0x0 data 4096 out
kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
kernel: ata2: soft resetting port
kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 3
kernel: ata2.00: configured for UDMA/133
kernel: ata2: EH complete

kernel: scsi 0:0:0:0: Direct-Access ATA ST380817AS 3.42 PQ: 0 ANSI: 5
kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
kernel: ata2.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 31/32)

The problems appeared just with this latest kernel update.

-- Additional comment from hanpingtian on 2007-03-22 10:07 EST --
scsi 2:0:0:0: Direct-Access ATA WDC WD1600JS-22M 02.0 PQ: 0 ANSI: 5

And it seems there is no "NCQ" in the logs.

-- Additional comment from cebbert on 2007-03-22 10:11 EST --
(In reply to comment #0)
>
> Additional info:
> http://smolt.fedoraproject.org/show?UUID=e5b52a3c-03b9-4f38-a4df-14f1f872e389

Can you resend the smolt info while running the new kernel?
The driver info has almost certainly changed.

-- Additional comment from cebbert on 2007-03-22 10:32 EST --
Created an attachment (id=150666)
smolt info from hanpingtian using 2.6.18

-- Additional comment from cebbert on 2007-03-22 12:11 EST --
(In reply to comment #3)
> I have also been experiencing system freezes on this kernel with exactly the
> same symptoms, but with i686. Screen freezes, Pings stop, no message to screen,
> nothing in the logs.
> The freezes seem to coincide with heavy IO on my 8 disk RAID 5 stripe on a
> Supermicro sata_mv card. If I don't attempt to rebuild the array the system
> will stay up for several hours. Rebuilding the array under 2.6.20-1.2925 will
> never complete and often the system will not even complete booting. Again
> 2.6.18-1.2798 is fine.

This is a separate bug. Please file a new bugzilla report so we can track it properly.

-- Additional comment from hanpingtian on 2007-03-23 09:33 EST --
Created an attachment (id=150755)
smolt profile with kernel-2.6.20-1.2925

-- Additional comment from hanpingtian on 2007-03-23 09:34 EST --
Created an attachment (id=150756)
smolt profile with kernel-2.6.20-1.2925

-- Additional comment from cebbert on 2007-03-23 17:49 EST --
Okay, it is still using sata_sil for the hard drives.

-- Additional comment from cebbert on 2007-03-26 10:41 EST --
Test kernels (1.2937) for this issue are at:
http://people.redhat.com/cebbert
Please test and report back.

-- Additional comment from pasik on 2007-03-26 11:27 EST --
Hmm.. 2.6.18 and 2.6.19 fc6 xen kernels work OK for me, but 2.6.20 froze after a while (from a couple of seconds to some minutes).. Now this 1.2937 crashes immediately during the bootup.. :(

Anything I can try?

-- Additional comment from pasik on 2007-03-26 11:34 EST --
I tried 1.2937 again and on the second and third try it booted ok.. I wonder what happened on the first try.. did the system reboot itself while booting the kernel?
Now let's see if 1.2937 actually stays up and doesn't crash by itself like 1.2933 did. My hardware is Intel P4 with i955x chipset, ahci sata disks.

-- Additional comment from pasik on 2007-03-26 11:58 EST --
No win.. the server is rebooting itself every 1-10 mins with 1.2937.. the server is idle when that happens (or maybe md-raid1 reconstruction running, but nothing else).

-- Additional comment from cebbert on 2007-03-26 12:01 EST --
(In reply to comment #16)
> No win.. the server is rebooting itself every 1-10 mins with 1.2937.. the server
> is idle when that happens (or maybe md-raid1 reconstruction running, but nothing
> else).
>
>

Please report a separate bug for this, as it involves Xen.

-- Additional comment from pasik on 2007-03-26 12:40 EST --
Done: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=234008

-- Additional comment from pasik on 2007-03-27 02:51 EST --
Also related?: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=233918

-- Additional comment from cebbert on 2007-03-27 10:26 EST --
Can someone who originally reported this bug please test kernel 2937 or greater? The Xen problem is a completely different bug.

-- Additional comment from hanpingtian on 2007-03-28 09:00 EST --
(In reply to comment #20)
> Can someone who originally reported this bug please test kernel 2937 or greater?
> The Xen problem is a completely different bug.
>

I am testing it now ....
One question: there is no such package as "kmod-fglrx.2.6.20-1.2937", but my X-window is still running, could you tell me why?

-- Additional comment from cebbert on 2007-03-28 09:30 EST --
(In reply to comment #21)
> I am testing it now ....
> One question: there is no such package as "kmod-fglrx.2.6.20-1.2937", but my
> X-window is still running, could you tell me why?

I was wondering about that myself...

-- Additional comment from hanpingtian on 2007-03-28 10:05 EST --
It froze just now...... Before that, I was running yum. It blocked on a futex and I killed it.
And then, after just a short while, the system froze.

-- Additional comment from djuran on 2007-04-03 12:53 EST --
I've just tested with 2.6.20-1.2940.fc6 and it works considerably better than 2933, but still not perfectly. With 2933 the computer locked up completely and a power cycle (reset was not enough) was required to obtain access to the SATA disk again. Performing the same operation with 2940, the system became unresponsive for a while and then the messages below showed up in the syslog, but the machine recovered.

Apr 3 19:10:22 localhost kernel: ata3: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400 Apr 3 19:10:22 localhost kernel: ata3: CPB 0: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 1: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 2: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 3: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 4: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 5: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 6: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 7: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 8: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 9: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:22 localhost kernel: ata3: CPB 10: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:30 localhost kernel: ata3: CPB 11: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:30 localhost kernel: ata3: CPB 12: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 13: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 14: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 15: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 16: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 17:
ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 18: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 19: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 20: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 21: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 22: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 23: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 24: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 25: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 26: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 27: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 28: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 29: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: CPB 30: ctl_flags 0x1f, resp_flags 0x2 Apr 3 19:10:31 localhost kernel: ata3: Resetting port Apr 3 19:10:32 localhost kernel: ata3.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:00:8d:fb:39/01:00:0f:00:00/40 tag 0 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 60/00:08:75:99:59/02:00:0e:00:00/40 tag 1 cdb 0x0 data 262144 in Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/80:10:55:14:3a/01:00:0f:00:00/40 tag 2 cdb 0x0 data 196608 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:18:dd:15:3a/01:00:0f:00:00/40 tag 3 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:20:85:10:3a/01:00:0f:00:00/40 tag 4 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:28:c5:17:3a/01:00:0f:00:00/40 tag 5 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:30:ad:19:3a/01:00:0f:00:00/40 tag 6 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:38:35:23:3a/01:00:0f:00:00/40 tag 7 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:40:1d:25:3a/01:00:0f:00:00/40 tag 8 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:48:b5:0c:3a/01:00:0f:00:00/40 tag 9 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:50:05:27:3a/01:00:0f:00:00/40 tag 10 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:58:65:f2:39/01:00:0f:00:00/40 tag 11 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/80:60:4d:f4:39/01:00:0f:00:00/40 tag 12 cdb 0x0 data 196608 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 60/10:68:b5:d9:97/00:00:03:00:00/40 tag 13 cdb 
0x0 data 8192 in Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:70:75:fd:39/01:00:0f:00:00/40 tag 14 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:78:5d:ff:39/01:00:0f:00:00/40 tag 15 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:80:9d:0e:3a/01:00:0f:00:00/40 tag 16 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:88:6d:12:3a/01:00:0f:00:00/40 tag 17 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:90:2d:03:3a/01:00:0f:00:00/40 tag 18 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:98:fd:06:3a/01:00:0f:00:00/40 tag 19 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:a0:e5:08:3a/01:00:0f:00:00/40 tag 20 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:a8:a5:f9:39/01:00:0f:00:00/40 tag 21 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:b0:cd:0a:3a/01:00:0f:00:00/40 tag 22 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: 
ata3.00: cmd 61/e8:b8:95:1b:3a/01:00:0f:00:00/40 tag 23 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:c0:7d:1d:3a/01:00:0f:00:00/40 tag 24 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:c8:65:1f:3a/01:00:0f:00:00/40 tag 25 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:d0:bd:f7:39/01:00:0f:00:00/40 tag 26 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:d8:4d:21:3a/01:00:0f:00:00/40 tag 27 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 60/08:e0:3d:da:97/00:00:03:00:00/40 tag 28 cdb 0x0 data 4096 in Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 60/10:e8:bd:ff:97/00:00:03:00:00/40 tag 29 cdb 0x0 data 8192 in Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3.00: cmd 61/e8:f0:d5:f5:39/01:00:0f:00:00/40 tag 30 cdb 0x0 data 249856 out Apr 3 19:10:32 localhost kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Apr 3 19:10:32 localhost kernel: ata3: soft resetting port Apr 3 19:10:32 localhost kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300) Apr 3 19:10:32 localhost kernel: ata3.00: configured for UDMA/133 Apr 3 19:10:32 localhost kernel: ata3: EH complete Apr 3 19:10:32 localhost kernel: SCSI device sda: 398297088 512-byte hdwr sectors (203928 MB) Apr 3 19:10:32 localhost kernel: 
sda: Write Protect is off Apr 3 19:10:32 localhost kernel: SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

-- Additional comment from djuran on 2007-04-13 05:00 EST --
Is this the correct place to suggest additions to the ata_device_blacklist in drivers/ata/libata-core.c? If so I'd suggest adding the following entry there:

{ "Maxtor 6B200M0", "BANC", ATA_HORKAGE_NONCQ }

With this entry, my computer works fine again and the drive no longer locks the machine up under load.

-- Additional comment from djuran on 2007-04-13 08:35 EST --
"works fine" turned out to be a bit of an exaggeration: under heavy I/O the machine hard-locked and needed a power cycle to recover )-:
I'm now running 2.6.20-1.2944.fc6 with the parameter "adma=0" passed to the sata_nv module and this seems (so far) to work fine...

-- Additional comment from cebbert on 2007-04-19 11:32 EST --
kernel 2944 has the latest NCQ blacklist from 2.6.21

-- Additional comment from misek on 2007-04-19 17:01 EST --
For me, 2944 shows the same problems as the previous kernels. Maybe the sata_nv fix from the latest kernels would help.

-- Additional comment from hancockr on 2007-04-19 19:17 EST --
It doesn't look like David Juran's issue is a problem with the driver. The CPB response flags indicate 0x2 which means the controller has sent the command to the device and is waiting for it to indicate completion, obviously it never did. Quite likely NCQ does not work properly on that drive and it needs to be added to the NCQ blacklist.

Disabling ADMA also disables NCQ so it is not surprising that it also stops the problem from showing up.

-- Additional comment from cebbert on 2007-04-19 19:38 EST --
(In reply to comment #29)
> It doesn't look like David Juran's issue is a problem with the driver. The CPB
> response flags indicate 0x2 which means the controller has sent the command to
> the device and is waiting for it to indicate completion, obviously it never did.
> Quite likely NCQ does not work properly on that drive and it needs to be added
> to the NCQ blacklist.
>

But David says he added the drive to the blacklist himself and that didn't fix the problem. Maybe he didn't add it properly?

-- Additional comment from hancockr on 2007-04-20 02:05 EST --
I don't think the firmware part of the line he mentioned he added is correct. The SCSI layer lists only the first 4 characters of the firmware string, but the actual ATA string is longer; you need the full string (from hdparm -I, for example).

-- Additional comment from djuran on 2007-04-23 13:21 EST --
D'Oh! So the firmware revision should be "BANC1BM0". I'll re-enable adma and try this for a few days and let you know how it fares...

-- Additional comment from hancockr on 2007-04-23 18:41 EST --
If the blacklist entry has been recognized properly you should see "NCQ (not used)" instead of "NCQ (depth 31/32)".

-- Additional comment from djuran on 2007-04-26 14:53 EST --
It seems my drive is more messed up than it has any kind of right to be. To find out the model and revision, I inserted the following printk calls into ata_device_blacklisted:

printk(KERN_NOTICE "modellen ar: XXX%sXXX\n",model_num);
printk(KERN_NOTICE "revisionen ar: XXX%sXXX\n",model_rev);

and this is what I got in dmesg:

modellen ar: XXXMaxtor 6B200M0XXX
revisionen ar: XXXBANC1BM0Maxtor 6<C0>^E^?^?XXX

There seem to be some non-printable characters in model_rev! Maybe it would make sense to just blacklist the entire model regardless of revision, i.e.

{ "Maxtor 6B200M0", NULL, ATA_HORKAGE_NONCQ }

-- Additional comment from hanpingtian on 2007-05-13 08:06 EST --
Any updates? The kernel-2.6.20-1.2948.fc6.x86_64 doesn't fix this problem ....

-- Additional comment from hanpingtian on 2007-06-08 09:25 EST --
kernel-2.6.20-1.2952.fc6.x86_64 failed. Any updates?

-- Additional comment from hanpingtian on 2007-06-17 08:20 EST --
kernel-2.6.21-1.3194.fc7 and kernel-2.6.21-1.3228.fc7 both failed in Fedora 7.
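The revision mismatch discussed in the comments above (a blacklist entry written with the four-character "BANC" against a drive whose full firmware string is "BANC1BM0") can be illustrated with a small user-space sketch in C. This is not the libata code: the names sketch_blacklist_entry, sketch_field_to_cstring, sketch_lookup and sketch_demo, and the SKETCH_HORKAGE_NONCQ value, are made up for illustration. It also assumes the revision arrives as a fixed-width, space-padded 8-byte field with no terminating NUL, which would be consistent with the extra bytes printed after "BANC1BM0" but is not confirmed anywhere in this report. A NULL revision is treated as "any revision", as in the { "Maxtor 6B200M0", NULL, ATA_HORKAGE_NONCQ } suggestion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag value for this sketch; not the kernel's definition. */
#define SKETCH_HORKAGE_NONCQ 0x1UL

struct sketch_blacklist_entry {
    const char *model;   /* exact model string */
    const char *rev;     /* exact firmware revision, or NULL for "any" */
    unsigned long flags;
};

/*
 * Copy a fixed-width field of `width` bytes into `dst` (which must hold
 * at least width + 1 bytes), drop trailing padding, and always
 * NUL-terminate so the result is safe to print with "%s".
 */
static void sketch_field_to_cstring(char *dst, const char *field, size_t width)
{
    size_t len = width;

    while (len > 0 && (field[len - 1] == ' ' || field[len - 1] == '\0'))
        len--;
    memcpy(dst, field, len);
    dst[len] = '\0';
}

/* Return the flags of the first matching entry, or 0 if none matches. */
static unsigned long sketch_lookup(const struct sketch_blacklist_entry *table,
                                   const char *model, const char *rev)
{
    for (; table->model != NULL; table++) {
        if (strcmp(table->model, model) != 0)
            continue;
        if (table->rev == NULL || strcmp(table->rev, rev) == 0)
            return table->flags;
    }
    return 0;
}

/* Walks through the example from this thread; returns 0 if all checks hold. */
static int sketch_demo(void)
{
    static const struct sketch_blacklist_entry short_rev[] = {
        { "Maxtor 6B200M0", "BANC", SKETCH_HORKAGE_NONCQ },
        { NULL, NULL, 0 }
    };
    static const struct sketch_blacklist_entry any_rev[] = {
        { "Maxtor 6B200M0", NULL, SKETCH_HORKAGE_NONCQ },
        { NULL, NULL, 0 }
    };
    char rev[8 + 1];

    /* Only the first 8 bytes belong to the revision field. */
    sketch_field_to_cstring(rev, "BANC1BM0Maxtor 6", 8);

    assert(strcmp(rev, "BANC1BM0") == 0);
    /* The 4-character revision does not match the full string. */
    assert(sketch_lookup(short_rev, "Maxtor 6B200M0", rev) == 0);
    /* A NULL revision matches whatever the drive reports. */
    assert(sketch_lookup(any_rev, "Maxtor 6B200M0", rev) == SKETCH_HORKAGE_NONCQ);
    return 0;
}
```

Calling sketch_demo() exercises the three cases from the thread: the revision is cut to its 8-byte field, the "BANC" entry fails to match it, and the NULL-revision entry matches. Whether the real lookup behaves the same way on a given kernel should be checked against that kernel's drivers/ata/libata-core.c.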
-- Additional comment from hanpingtian on 2007-07-19 09:02 EST --
Any updates? Why does kernel-2.6.18-1.2798 have no such problem while all the updated kernels do?

-- Additional comment from jwilson on 2007-07-23 11:30 EST --
(In reply to comment #38)
> Any updates? Why does kernel-2.6.18-1.2798 have no such problem while all the
> updated kernels do?

Hard to say without having your exact system in front of us here. All these kernels along the way work for the vast majority of users. Have you tried the recently pushed 2.6.22.1-based kernels yet?

-- Additional comment from hanpingtian on 2007-07-23 23:42 EST --
> Hard to say without having your exact system in front of us here. All these

Do you need any more info? Is there something I can do?

> kernels along the way work for the vast majority of users. Have you tried the
> recently pushed 2.6.22.1-based kernels yet?

I will try it later.

-- Additional comment from hanpingtian on 2007-07-24 08:42 EST --
kernel-2.6.22.1-27.fc7.x86_64 fails also ...
The newest kernel kernel-2.6.22.1-33.fc7.x86_64 fails also. I have to clone it to F7 from fc6.
(In reply to comment #1)
> The newest kernel kernel-2.6.22.1-33.fc7.x86_64 fails also. I have to clone it
> to F7 from fc6.

Does adding "pci=nomsi,nommconf" to the kernel command line help, or did we try that already?
(In reply to comment #2)
> (In reply to comment #1)
> > The newest kernel kernel-2.6.22.1-33.fc7.x86_64 fails also. I have to clone it
> > to F7 from fc6.
>
> Does adding "pci=nomsi,nommconf" to the kernel command line help, or did we try
> that already?

No, I hadn't added those command line options. And I have switched to i386 release now.
On my system it seems this bug is solved by kernel-2.6.22.1-41.fc7, although I'm not sure what was changed. BTW, I haven't tried the pci=nomsi,nommconf option.
Hello, I'm reviewing this bug as part of the kernel bug triage project, an attempt to isolate current bugs in the fedora kernel. http://fedoraproject.org/wiki/KernelBugTriage I am closing this bug as it appears resolved. If I have erred, please accept my profuse apologies and re-open and I will attempt to assist in its resolution. Cheers Chris