Description of problem:
After the disc has been in use for a while (it holds the /var/log and /VM Linux partitions, the latter for VirtualBox discs, and a couple of Windows NTFS partitions), the Linux partitions show as read-only, but the whole disc actually becomes completely unavailable. Disk Utility recognizes that there is a 1TB disc but provides no information about it. This happens only when I use Linux and the disc more intensively, such as copying larger files (a couple of hundred MB) to it and running virtual machines from virtual discs stored on it.

The mainboard is an ASRock Z68 Extreme4, and the disc is connected to one of the two SATA 3 Marvell SE9120 connectors.

lspci:
02:00.0 SATA controller: Marvell Technology Group Ltd. Device 9120 (rev 12)

The disc itself should be fine. I have run self-tests a couple of times with the Fedora Disk Utility and on Windows with HD Tune, and they show no problems. The S.M.A.R.T. status is always fine. Since this never happens with Windows, could that exclude a hardware failure? Besides that, I noticed that other people have problems with Marvell controllers and Linux too.

Version-Release number of selected component (if applicable):

How reproducible:
On my machine it happens randomly, usually within the first hour when the disc is used intensively.

Steps to Reproduce:
1. Boot Fedora 15
2. Use the Marvell SATA 3 connector for some time

Actual results:
Disc becomes unavailable

Expected results:
Working disc

Additional info:
I never got any logs, since the /var/log partition is on the disc. I will move it and check again. Last time (yesterday) tty1 showed for a second and I saw the messages you'll find below. I got them from dmesg.

[ 324.799501] TCP lp registered
[ 3315.178540] EXT4-fs (sdd5): recovery complete
[ 3315.178805] EXT4-fs (sdd5): mounted filesystem with ordered data mode.
Opts: (null) [ 3315.178813] SELinux: initialized (dev sdd5, type ext4), uses xattr [ 3369.312792] ata7.00: exception Emask 0x0 SAct 0x1f SErr 0x0 action 0x6 frozen [ 3369.312795] ata7.00: failed command: READ FPDMA QUEUED [ 3369.312798] ata7.00: cmd 60/00:00:ce:44:8a/01:00:4d:00:00/40 tag 0 ncq 131072 in [ 3369.312799] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) [ 3369.312800] ata7.00: status: { DRDY } [ 3369.312802] ata7.00: failed command: READ FPDMA QUEUED [ 3369.312812] ata7.00: cmd 60/00:08:ce:43:8a/01:00:4d:00:00/40 tag 1 ncq 131072 in [ 3369.312813] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [ 3369.312815] ata7.00: status: { DRDY } [ 3369.312816] ata7.00: failed command: WRITE FPDMA QUEUED [ 3369.312819] ata7.00: cmd 61/08:10:48:c4:77/00:00:39:00:00/40 tag 2 ncq 4096 out [ 3369.312820] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [ 3369.312821] ata7.00: status: { DRDY } [ 3369.312822] ata7.00: failed command: WRITE FPDMA QUEUED [ 3369.312825] ata7.00: cmd 61/08:18:0e:89:40/00:00:60:00:00/40 tag 3 ncq 4096 out [ 3369.312826] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [ 3369.312827] ata7.00: status: { DRDY } [ 3369.312829] ata7.00: failed command: WRITE FPDMA QUEUED [ 3369.312832] ata7.00: cmd 61/08:20:16:89:40/00:00:60:00:00/40 tag 4 ncq 4096 out [ 3369.312832] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [ 3369.312834] ata7.00: status: { DRDY } [ 3369.312838] ata7: hard resetting link [ 3369.771467] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 330) [ 3374.759167] ata7.00: qc timeout (cmd 0xec) [ 3374.759182] ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4) [ 3374.759185] ata7.00: revalidation failed (errno=-5) [ 3374.759192] ata7: hard resetting link [ 3384.731673] ata7: softreset failed (1st FIS failed) [ 3384.731679] ata7: hard resetting link [ 3394.703204] ata7: softreset failed (1st FIS failed) [ 3394.703211] ata7: hard resetting link [ 3429.603480] ata7: softreset 
failed (1st FIS failed) [ 3429.603487] ata7: limiting SATA link speed to 3.0 Gbps [ 3429.603490] ata7: hard resetting link [ 3434.590222] ata7: softreset failed (1st FIS failed) [ 3434.590228] ata7: reset failed, giving up [ 3434.590231] ata7.00: disabled [ 3434.590236] ata7.00: device reported invalid CHS sector 0 [ 3434.590240] ata7.00: device reported invalid CHS sector 0 [ 3434.590243] ata7.00: device reported invalid CHS sector 0 [ 3434.590245] ata7.00: device reported invalid CHS sector 0 [ 3434.590248] ata7.00: device reported invalid CHS sector 0 [ 3434.590263] ata7: hard resetting link [ 3444.561733] ata7: softreset failed (1st FIS failed) [ 3444.561738] ata7: hard resetting link [ 3454.534249] ata7: softreset failed (1st FIS failed) [ 3454.534255] ata7: hard resetting link [ 3489.434539] ata7: softreset failed (1st FIS failed) [ 3489.434546] ata7: limiting SATA link speed to 1.5 Gbps [ 3489.434548] ata7: hard resetting link [ 3494.421287] ata7: softreset failed (1st FIS failed) [ 3494.421292] ata7: reset failed, giving up [ 3494.421310] ata7: EH complete [ 3494.421330] sd 6:0:0:0: [sdd] Unhandled error code [ 3494.421340] sd 6:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 3494.421342] sd 6:0:0:0: [sdd] CDB: Write(10): 2a 00 60 40 89 16 00 00 08 00 [ 3494.421346] end_request: I/O error, dev sdd, sector 1614842134 [ 3494.421353] sd 6:0:0:0: [sdd] Unhandled error code [ 3494.421355] sd 6:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 3494.421357] sd 6:0:0:0: [sdd] CDB: Write(10): 2a 00 60 40 89 0e 00 00 08 00 [ 3494.421368] end_request: I/O error, dev sdd, sector 1614842126 [ 3494.421372] Aborting journal on device sdd4-8. 
[ 3494.421374] sd 6:0:0:0: [sdd] Unhandled error code [ 3494.421375] sd 6:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 3494.421377] sd 6:0:0:0: [sdd] CDB: Write(10): 2a 00 39 77 c4 48 00 00 08 00 [ 3494.421381] end_request: I/O error, dev sdd, sector 964150344 [ 3494.421383] Buffer I/O error on device sdd5, logical block 39878656 [ 3494.421385] lost page write due to I/O error on sdd5 [ 3494.421394] JBD2: I/O error detected when updating journal superblock for sdd5-8. [ 3494.421396] sd 6:0:0:0: [sdd] Unhandled error code [ 3494.421398] sd 6:0:0:0: [sdd] Unhandled error code [ 3494.421399] sd 6:0:0:0: [sdd] [ 3494.421401] sd 6:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 3494.421403] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 3494.421405] sd 6:0:0:0: [sdd] CDB: [ 3494.421407] sd 6:0:0:0: [sdd] CDB: Read(10)Write(10):: 28 2a 00 00 4d 60 8a 40 43 88 ce 4e 00 00 01 00 00 08 00 00 [ 3494.421415] [ 3494.421416] end_request: I/O error, dev sdd, sector 1300906958 [ 3494.421418] end_request: I/O error, dev sdd, sector 1614841934 [ 3494.421420] Buffer I/O error on device sdd4, logical block 39354368 [ 3494.421422] lost page write due to I/O error on sdd4 [ 3494.421428] sd 6:0:0:0: [sdd] Unhandled error code [ 3494.421429] sd 6:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 3494.421432] sd 6:0:0:0: [sdd] CDB: Read(10): 28 00 4d 8a 44 [ 3494.421435] JBD2: I/O error detected when updating journal superblock for sdd4-8. [ 3494.421437] ce 00 01 00 00 [ 3494.421439] end_request: I/O error, dev sdd, sector 1300907214 [ 3494.421470] sd 6:0:0:0: [sdd] READ CAPACITY(16) failed [ 3494.421471] sd 6:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 3494.421473] sd 6:0:0:0: [sdd] Sense not available. [ 3494.421488] sd 6:0:0:0: [sdd] READ CAPACITY failed [ 3494.421489] sd 6:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 3494.421490] sd 6:0:0:0: [sdd] Sense not available. 
[ 3494.421512] sd 6:0:0:0: [sdd] Asking for cache data failed [ 3494.421513] sd 6:0:0:0: [sdd] Assuming drive cache: write through [ 3494.421516] sdd: detected capacity change from 1000204886016 to 0 [ 3494.429595] Aborting journal on device sdd3-8. [ 3494.429604] JBD2: I/O error detected when updating journal superblock for sdd3-8. [ 3494.429615] EXT4-fs error (device sdd3): ext4_journal_start_sb:260: Detected aborted journal [ 3494.429619] EXT4-fs (sdd3): Remounting filesystem read-only [ 3494.429620] EXT4-fs (sdd3): previous I/O error to superblock detected [ 3494.429623] JBD2: Detected IO errors while flushing file data on sdd3-8 [ 3494.521799] EXT4-fs (sdd5): delayed block allocation failed for inode 12 at logical offset 32768 with max blocks 2048 with error -5 [ 3494.521801] EXT4-fs (sdd5): This should not happen!! Data will be lost [ 3494.521802] [ 3494.522380] EXT4-fs (sdd5): delayed block allocation failed for inode 12 at logical offset 32768 with max blocks 2048 with error -5 [ 3494.522382] EXT4-fs (sdd5): This should not happen!! Data will be lost [ 3494.522383] [ 3494.522394] ------------[ cut here ]------------ [ 3494.522415] kernel BUG at fs/ext4/inode.c:2188! 
[ 3494.522432] invalid opcode: 0000 [#1] SMP [ 3494.522450] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map [ 3494.522477] CPU 7 [ 3494.522485] Modules linked in: tcp_lp fuse ppdev parport_pc lp parport vboxnetadp vboxnetflt 8021q garp stp llc vboxdrv cpufreq_ondemand acpi_cpufreq freq_table mperf sco bnep l2cap bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables coretemp snd_hda_codec_hdmi snd_hda_codec_realtek usblp microcode joydev snd_hda_intel i2c_i801 tg3 serio_raw snd_hda_codec iTCO_wdt snd_hwdep iTCO_vendor_support snd_seq snd_seq_device shpchp snd_pcm xhci_hcd snd_timer wmi snd soundcore snd_page_alloc ipv6 firewire_ohci firewire_core crc_itu_t i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan] [ 3494.522754] [ 3494.522761] Pid: 2572, comm: flush-8:48 Not tainted 2.6.38.8-35.fc15.x86_64 #1 To Be Filled By O.E.M. To Be Filled By O.E.M./Z68 Extreme4 [ 3494.522810] RIP: 0010:[<ffffffff81194de3>] [<ffffffff81194de3>] ext4_da_block_invalidatepages+0x7a/0xec [ 3494.522848] RSP: 0018:ffff880228dd18a0 EFLAGS: 00010246 [ 3494.522867] RAX: 0040000000000024 RBX: 00000000000087ff RCX: 000000000000000e [ 3494.522892] RDX: 000000000000000e RSI: ffff880228dd18b0 RDI: ffffea00056770d8 [ 3494.522917] RBP: ffff880228dd1950 R08: ffff880228dd17e0 R09: 0000000000000002 [ 3494.522942] R10: 000000000000800e R11: ffff88019c005b20 R12: ffff8801fcac2638 [ 3494.522967] R13: ffffea0005676b98 R14: 0000000000000000 R15: 000000000000000e [ 3494.523000] FS: 0000000000000000(0000) GS:ffff88001efc0000(0000) knlGS:0000000000000000 [ 3494.523028] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 3494.523048] CR2: 00007f563639d000 CR3: 0000000001a03000 CR4: 00000000000406e0 [ 3494.523072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3494.523096] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 3494.523121] Process flush-8:48 (pid: 2572, threadinfo 
ffff880228dd0000, task ffff88022bd78000) [ 3494.523150] Stack: [ 3494.523157] 000000000000000e ffff880228dd18b0 ffffea0005676b98 ffffea0005676b60 [ 3494.523188] ffffea0005676b28 ffffea0005676af0 ffffea0005676ab8 ffffea0005676a80 [ 3494.523217] ffffea0005676f50 ffffea0005676f88 ffffea0005676fc0 ffffea0005676ff8 [ 3494.523246] Call Trace: [ 3494.523258] [<ffffffff8119a225>] mpage_da_map_and_submit+0x1fa/0x2e0 [ 3494.523281] [<ffffffff8119a3d9>] mpage_add_bh_to_extent+0xce/0xdd [ 3494.523303] [<ffffffff8119a640>] write_cache_pages_da+0x258/0x346 [ 3494.523326] [<ffffffff81111763>] ? kmem_cache_alloc+0x90/0x105 [ 3494.523347] [<ffffffff8119aa23>] ext4_da_writepages+0x2f5/0x4e9 [ 3494.523370] [<ffffffff810e05d5>] do_writepages+0x21/0x2a [ 3494.523390] [<ffffffff8113e392>] writeback_single_inode+0x96/0x194 [ 3494.523413] [<ffffffff8113e6e7>] writeback_sb_inodes+0xa1/0x12b [ 3494.523435] [<ffffffff8113f4fc>] writeback_inodes_wb+0x163/0x175 [ 3494.523457] [<ffffffff8113f74d>] wb_writeback+0x23f/0x35a [ 3494.523477] [<ffffffff81080b7b>] ? arch_local_irq_save+0x15/0x1b [ 3494.523499] [<ffffffff8113f9ab>] wb_do_writeback+0x143/0x19d [ 3494.523521] [<ffffffff814771eb>] ? schedule_timeout+0xb0/0xde [ 3494.523542] [<ffffffff8113fa8d>] bdi_writeback_thread+0x88/0x1e5 [ 3494.523564] [<ffffffff8113fa05>] ? bdi_writeback_thread+0x0/0x1e5 [ 3494.523586] [<ffffffff8106ebaf>] kthread+0x84/0x8c [ 3494.523605] [<ffffffff8100a9e4>] kernel_thread_helper+0x4/0x10 [ 3494.523626] [<ffffffff8106eb2b>] ? kthread+0x0/0x8c [ 3494.523644] [<ffffffff8100a9e0>] ? 
kernel_thread_helper+0x0/0x10 [ 3494.523664] Code: ea 4c 89 e6 e8 18 c1 f4 ff 85 c0 41 89 c7 74 7b 45 31 f6 eb 3e 4e 8b ac f5 60 ff ff ff 49 39 5d 20 77 35 49 8b 45 00 a8 01 75 02 <0f> 0b 49 8b 45 00 49 ff c6 f6 c4 20 74 02 0f 0b 31 f6 4c 89 ef [ 3494.523794] RIP [<ffffffff81194de3>] ext4_da_block_invalidatepages+0x7a/0xec [ 3494.523820] RSP <ffff880228dd18a0> [ 3494.536954] JBD2: Detected IO errors while flushing file data on sdd5-8 [ 3494.536985] Aborting journal on device sdd5-8. [ 3494.537014] JBD2: I/O error detected when updating journal superblock for sdd5-8. [ 3494.584228] ---[ end trace b36f588e61c2cf6a ]---
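As a cross-check on the log above, the `CDB:` bytes and the `end_request` sector numbers can be reconciled by hand. The following is a minimal Python sketch, assuming the standard 10-byte READ(10)/WRITE(10) layout (opcode, flags, 4-byte big-endian LBA, group, 2-byte big-endian transfer length, control). The helper name `decode_cdb10` is made up for illustration and is not part of any existing tool.

```python
def decode_cdb10(cdb_hex):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcodes = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    return {
        "op": opcodes.get(cdb[0], hex(cdb[0])),
        # Bytes 2..5 carry the starting logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7..8 carry the number of blocks to transfer, big-endian.
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


# CDB copied from the first "Unhandled error code" entry in the dmesg output.
print(decode_cdb10("2a 00 60 40 89 16 00 00 08 00"))
```

The decoded LBA, 1614842134, is the same sector that the following `end_request: I/O error` line reports, and 8 blocks of 512 bytes correspond to the 4096-byte NCQ writes that timed out earlier in the log.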
And yes, I forgot: the kernel is 2.6.38.8-35.fc15.x86_64, and on the second Marvell connector I have an old SATA I disc with Ubuntu 10.04 on it, which I now use in VirtualBox. I have never had any problems with it.
I am sorry, I also forgot to mention that the disc in question is a WD WD1002FAEX 1TB SATA 3 capable disc.
Some more info from lspci:
02:00.0 SATA controller [0106]: Marvell Technology Group Ltd. Device [1b4b:9120] (rev 12)
        Subsystem: ASRock Incorporation Device [1849:9120]
        Kernel driver in use: ahci

from lsmod:
Module                  Size     Used by
tcp_lp                  2183     0
fuse                    62289    3
ppdev                   7836     0
parport_pc              21216    0
lp                      9725     0
parport                 32438    3 ppdev,parport_pc,lp
vboxnetadp              5658     0
vboxnetflt              17806    0
8021q                   18739    0
garp                    6087     1 8021q
stp                     1951     1 garp
llc                     4716     2 garp,stp
vboxdrv                 1789313  2 vboxnetadp,vboxnetflt
cpufreq_ondemand        9466     8
acpi_cpufreq            7001     1
freq_table              3963     2 cpufreq_ondemand,acpi_cpufreq
mperf                   1505     1 acpi_cpufreq
sco                     16268    2
bnep                    14899    2
l2cap                   52225    3 bnep
bluetooth               91191    5 sco,bnep,l2cap
rfkill                  16552    2 bluetooth
ip6t_REJECT             4048     2
nf_conntrack_ipv6       7978     1
nf_defrag_ipv6          9531     1 nf_conntrack_ipv6
ip6table_filter         1695     1
ip6_tables              16850    1 ip6table_filter
coretemp                5771     0
snd_hda_codec_hdmi      22998    1
snd_hda_codec_realtek   325262   1
usblp                   10814    0
microcode               18117    0
snd_hda_intel           23660    2
joydev                  9651     0
snd_hda_codec           80838    3 snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_hda_intel
snd_hwdep               6368     1 snd_hda_codec
snd_seq                 52438    0
tg3                     106353   0
i2c_i801                9213     0
snd_seq_device          6001     1 snd_seq
serio_raw               4426     0
iTCO_wdt                11480    0
iTCO_vendor_support     2634     1 iTCO_wdt
snd_pcm                 78484    3 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
shpchp                  24582    0
snd_timer               19593    2 snd_seq,snd_pcm
wmi                     9105     0
snd                     62670    13 snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_seq,snd_seq_device,snd_pcm,snd_timer
xhci_hcd                103527   0
soundcore               6299     1 snd
snd_page_alloc          7431     2 snd_hda_intel,snd_pcm
ipv6                    282108   47 ip6t_REJECT,nf_conntrack_ipv6,nf_defrag_ipv6
firewire_ohci           25351    0
firewire_core           48190    1 firewire_ohci
crc_itu_t               1587     1 firewire_core
i915                    347062   3
drm_kms_helper          27515    1 i915
drm                     187984   4 i915,drm_kms_helper
i2c_algo_bit            5014     1 i915
i2c_core                25468    5 i2c_i801,i915,drm_kms_helper,drm,i2c_algo_bit
video                   12432    1 i915
I have booted with the kernel command-line parameter ahci.marvell_enable=1. The machine has been running for about 27 hours and everything seems to be working fine. Should I change the status to closed?
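For anyone repeating this test, it may help to confirm that the parameter really reached the kernel before drawing conclusions from the uptime. The following is a minimal Python sketch that parses a kernel command line into a dictionary. The helper name `parse_cmdline` and the sample command line are hypothetical; only `ahci.marvell_enable=1` and the kernel version come from this report.

```python
import os


def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict (flags map to None)."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Hypothetical sample; on a running system read /proc/cmdline instead.
sample = "BOOT_IMAGE=/vmlinuz-2.6.38.8-35.fc15.x86_64 ro quiet ahci.marvell_enable=1"
print(parse_cmdline(sample).get("ahci.marvell_enable"))

if os.path.exists("/proc/cmdline"):
    with open("/proc/cmdline") as handle:
        live = parse_cmdline(handle.read())
    print("marvell_enable on this boot:", live.get("ahci.marvell_enable"))
```

This simple split does not handle quoted values containing spaces, which is enough for a parameter of this form.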
It happened again. I am now using the 2.6.40-4.fc15.x86_64 kernel. A Windows VM was running at the time. The system locked up for a minute or two when I tried to access a file owned by root (I was a normal user) on an external disk (this one is also connected to the Marvell SATA 3 connector). Afterwards I noticed that the VM had crashed, and rsyslogd, unable to write to /var/log, was using more than 90% CPU.
We've seen a lot of random corruption problems with the VirtualBox modules loaded. If you can't reproduce the problem without them, I recommend reporting it to the VirtualBox developers.
Which version of VirtualBox do you have installed?
(In reply to comment #8)
> Which version of VirtualBox do you have installed?

Hello Frank, I am always up to date, so at the moment it is 4.1.4. The problem doesn't happen only when VirtualBox is running (the module is loaded, however); it sometimes happens during heavy or intense writes. It never happens when I am using only the VM with physical disk access (same controller, SATA I disk).
Sorry for the English in my last comment. I was talking on the phone at the same time... So my VirtualBox is always up to date, and the problem also happens when I am not using VirtualBox.
I am having similar (if not the same) problems in Fedora 16. I have the same hard drive that Denis mentioned in Comment #2 (WD1002FAEX-00Y9A0). My system also has a Marvell SATA controller, but it has an Intel one as well, and I am not sure which one the hard drive is connected to:
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller (rev 05)
0a:00.0 SATA controller: Marvell Technology Group Ltd. Device 9172 (rev 11)
I am on Fedora 16 now too. The problem persists. The log messages are the same, but it happens less often. When it does happen, though, it sometimes (in ~50% of cases) seems to cause the system to freeze.
With kernel-3.1.9-1.fc16.x86_64 from the @updates-testing repo the problem happens remarkably more often. I don't think I ever make it to an hour when using a VM residing on that disk.
[mass update] kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository. Please retest with this update.
I switched to Ubuntu 12.04 about a week ago, and this bug was one of the reasons. The second one (not relevant to this bug at all, I know) is that it is an LTS edition and as such more appropriate for my needs. I was just about to report here that this issue seems to be nonexistent on Ubuntu. I still have my Fedora installation, so I'll try to check the 3.3 kernel in the next few days and report back.
Hi, I did some I/O tests, such as copying tens of gigabytes to the disk and shredding gigabytes of other files at the same time while watching a movie, and everything looks fine. I cannot see anything suspicious in /var/log/messages either. Until now, I was never able to stress this disk like this on Fedora 15-16 without it going read-only and showing all the symptoms mentioned above. I am glad that this issue seems to be resolved! Thanks Denis
Forgot to mention, kernel used was 3.3.0-4.fc16.x86_64.
Sh*t. It just happened again. There are too many errors in the /var/log/messages file, so I uploaded it to box.net. Link: http://www.box.com/s/4eb1b71e45c6c8df1b80 I think the errors concerning this start at line 3797.
Just to say that it still happens to me too, with a 3.3.0-4 kernel.
I too have been seeing this on and off for many months. No special kernel params are passed at boot. I have 5x 3TB drives across 3 different controllers as /dev/md0. Copying over NFS to /dev/md0 will reproduce the error. Sometimes it takes minutes, other times a few hours, but it usually happens within 90 minutes. A local dd from disk to disk will also reproduce the error.

I have tried:
Moving disks around/swapping connectors
Replacing SATA cables
Splitting the load over multiple power supplies
Motherboard and controller BIOS updates
Not using mdadm
Not using the Marvell 88SE9123 SATA3 PCIe add-in controller or mdadm (just using jbod/lvm over SB7x0/88SE9128)

dmesg excerpts:
Linux version 3.4.4-4.fc16.x86_64
Memory: 16345368k/17563648k available
CPU0: AMD Phenom(tm) II X6 1090T Processor stepping 00
Total of 6 processors activated (38569.86 BogoMIPS).
###Motherboard = Gigabyte GA-790FXTA-UD5
###6x SB7x0 SATA2 -
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-8: WDC WD30EZRX-00MMMB0, 80.00A80, max UDMA/133
ata1.00: ATA-8: OCZ-VERTEX2, 1.35, max UDMA/133 ####BOOT/SYS
ata2.00: ATA-8: OCZ-AGILITY3, 2.15, max UDMA/133
ata4.00: ATA-8: ST31500341AS, CC1H, max UDMA/133
ata6.00: ATA-8: ST31500341AS, CC1H, max UDMA/133
###2x Marvell 88SE9128 SATA3
ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata7.00: ATA-8: Hitachi HDS723030ALA640, MKAOA3B0, max UDMA/133
ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata8.00: ATA-8: Hitachi HDS723030ALA640, MKAOA580, max UDMA/133
###2x JMB362 eSATA (nothing connected)
###2x Marvell 2x
ata14.00: ATAPI: MARVELL VIRTUALL, 1.09, max UDMA/66
ata17: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata17.00: ATA-9: WDC WD30EFRX-68AX9N0, 80.00A80, max
UDMA/133 ata18: SATA link up 3.0 Gbps (SStatus 123 SControl 300) ata18.00: ATA-9: WDC WD30EFRX-68AX9N0, 80.00A80, max UDMA/133 ----------------- Selected output of lspci -v -------------- 00:11.0 SATA controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (prog-if 01 [AHCI 1.0]) Subsystem: Giga-byte Technology Device b002 Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 22 I/O ports at ff00 [size=8] I/O ports at fe00 [size=4] I/O ports at fd00 [size=8] I/O ports at fc00 [size=4] I/O ports at fb00 [size=16] Memory at fe02f000 (32-bit, non-prefetchable) [size=1K] Capabilities: [60] Power Management version 2 Capabilities: [70] SATA HBA v1.0 Kernel driver in use: ahci 03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9128 PCIe SATA 6 Gb/s RAID controller (rev 11) (prog-if 01 [AHCI 1.0]) Subsystem: Giga-byte Technology Device b000 Flags: bus master, fast devsel, latency 0, IRQ 51 I/O ports at 9f00 [size=8] I/O ports at 9e00 [size=4] I/O ports at 9d00 [size=8] I/O ports at 9c00 [size=4] I/O ports at 9b00 [size=16] Memory at fd3ff000 (32-bit, non-prefetchable) [size=2K] [virtual] Expansion ROM at fd200000 [disabled] [size=64K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit- Capabilities: [70] Express Legacy Endpoint, MSI 00 Capabilities: [100] Advanced Error Reporting Kernel driver in use: ahci 07:00.1 IDE interface: JMicron Technology Corp. JMB362/JMB363 Serial ATA Controller (rev 03) (prog-if 85 [Master SecO PriO]) Subsystem: Giga-byte Technology GA-EP45-DS5 Motherboard Flags: bus master, fast devsel, latency 0, IRQ 19 I/O ports at ef00 [size=8] I/O ports at ee00 [size=4] I/O ports at ed00 [size=8] I/O ports at ec00 [size=4] I/O ports at eb00 [size=16] Capabilities: [68] Power Management version 2 Kernel driver in use: pata_jmicron Kernel modules: ata_generic, pata_acpi, pata_jmicron 0b:00.0 SATA controller: Marvell Technology Group Ltd. 
88SE9123 PCIe SATA 6.0 Gb/s controller (rev 10) (prog-if 01 [AHCI 1.0]) Subsystem: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s controller Flags: bus master, fast devsel, latency 0, IRQ 52 I/O ports at cf00 [size=8] I/O ports at ce00 [size=4] I/O ports at cd00 [size=8] I/O ports at cc00 [size=4] I/O ports at cb00 [size=16] Memory at fd8ff000 (32-bit, non-prefetchable) [size=2K] [virtual] Expansion ROM at fd600000 [disabled] [size=64K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit- Capabilities: [70] Express Legacy Endpoint, MSI 00 Capabilities: [100] Advanced Error Reporting Kernel driver in use: ahci ---------------- snippet of /var/log/messages when an error occurs Jul 24 08:55:03 ned kernel: [ 6024.833459] ata17.00: qc timeout (cmd 0xec) Jul 24 08:55:03 ned kernel: [ 6024.834549] ata18.00: qc timeout (cmd 0xec) Jul 24 08:55:04 ned kernel: [ 6025.332520] ata17.00: failed to IDENTIFY (I/O error, err_mask=0x4) Jul 24 08:55:04 ned kernel: [ 6025.332530] ata17.00: revalidation failed (errno=-5) Jul 24 08:55:04 ned kernel: [ 6025.332542] ata17: hard resetting link Jul 24 08:55:04 ned kernel: [ 6025.333558] ata18.00: failed to IDENTIFY (I/O error, err_mask=0x4) Jul 24 08:55:04 ned kernel: [ 6025.333569] ata18.00: revalidation failed (errno=-5) Jul 24 08:55:04 ned kernel: [ 6025.333580] ata18: hard resetting link Jul 24 08:55:05 ned kernel: [ 6026.136113] ata17: SATA link up 3.0 Gbps (SStatus 123 SControl 300) Jul 24 08:55:05 ned kernel: [ 6026.137109] ata18: SATA link up 3.0 Gbps (SStatus 123 SControl 300) Jul 24 08:55:15 ned kernel: [ 6036.118009] ata17.00: qc timeout (cmd 0xec) Jul 24 08:55:15 ned kernel: [ 6036.118953] ata18.00: qc timeout (cmd 0xec) Jul 24 08:55:15 ned kernel: [ 6036.617010] ata17.00: failed to IDENTIFY (I/O error, err_mask=0x4) Jul 24 08:55:15 ned kernel: [ 6036.617020] ata17.00: revalidation failed (errno=-5) Jul 24 08:55:15 ned kernel: [ 6036.617030] ata17: limiting SATA link 
speed to 1.5 Gbps Jul 24 08:55:15 ned kernel: [ 6036.617040] ata17: hard resetting link Jul 24 08:55:15 ned kernel: [ 6036.618041] ata18.00: failed to IDENTIFY (I/O error, err_mask=0x4) Jul 24 08:55:15 ned kernel: [ 6036.618053] ata18.00: revalidation failed (errno=-5) Jul 24 08:55:15 ned kernel: [ 6036.618062] ata18: limiting SATA link speed to 1.5 Gbps Jul 24 08:55:15 ned kernel: [ 6036.618073] ata18: hard resetting link Jul 24 08:55:16 ned kernel: [ 6037.420588] ata17: SATA link up 3.0 Gbps (SStatus 123 SControl 310) Jul 24 08:55:16 ned kernel: [ 6037.421675] ata18: SATA link up 3.0 Gbps (SStatus 123 SControl 310) Jul 24 08:55:46 ned kernel: [ 6067.366174] ata17.00: qc timeout (cmd 0xec) Jul 24 08:55:46 ned kernel: [ 6067.367195] ata18.00: qc timeout (cmd 0xec) Jul 24 08:55:46 ned kernel: [ 6067.865265] ata17.00: failed to IDENTIFY (I/O error, err_mask=0x4) Jul 24 08:55:46 ned kernel: [ 6067.865276] ata17.00: revalidation failed (errno=-5) Jul 24 08:55:46 ned kernel: [ 6067.865284] ata17.00: disabled Jul 24 08:55:46 ned kernel: [ 6067.865304] ata17.00: device reported invalid CHS sector 0 Jul 24 08:55:46 ned kernel: [ 6067.866283] ata18.00: failed to IDENTIFY (I/O error, err_mask=0x4) Jul 24 08:55:46 ned kernel: [ 6067.866293] ata18.00: revalidation failed (errno=-5) Jul 24 08:55:46 ned kernel: [ 6067.866300] ata18.00: disabled Jul 24 08:55:46 ned kernel: [ 6067.866319] ata18.00: device reported invalid CHS sector 0 Jul 24 08:55:47 ned kernel: [ 6068.364351] ata17: hard resetting link Jul 24 08:55:47 ned kernel: [ 6068.365364] ata18: hard resetting link Jul 24 08:55:48 ned kernel: [ 6069.167939] ata17: SATA link up 3.0 Gbps (SStatus 123 SControl 310) Jul 24 08:55:48 ned kernel: [ 6069.169061] ata18: SATA link up 3.0 Gbps (SStatus 123 SControl 310) Jul 24 08:55:48 ned kernel: [ 6069.666995] ata17: EH complete Jul 24 08:55:48 ned kernel: [ 6069.667072] sd 16:0:0:0: [sdh] Unhandled error code Jul 24 08:55:48 ned kernel: [ 6069.667079] sd 16:0:0:0: [sdh] Result: 
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Jul 24 08:55:48 ned kernel: [ 6069.667089] sd 16:0:0:0: [sdh] CDB: Write(10): 2a 00 20 7b 38 00 00 04 00 00 Jul 24 08:55:48 ned kernel: [ 6069.667110] end_request: I/O error, dev sdh, sector 544946176 Jul 24 08:55:48 ned kernel: [ 6069.667180] md/raid:md0: Disk failure on sdh1, disabling device. Jul 24 08:55:48 ned kernel: [ 6069.667184] md/raid:md0: Operation continuing on 4 devices. Jul 24 08:55:48 ned kernel: [ 6069.668012] ata18: EH complete Jul 24 08:55:48 ned kernel: [ 6069.668129] sd 17:0:0:0: [sdi] Unhandled error code Jul 24 08:55:48 ned kernel: [ 6069.668135] sd 17:0:0:0: [sdi] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Jul 24 08:55:48 ned kernel: [ 6069.668145] sd 17:0:0:0: [sdi] CDB: Write(10): 2a 00 20 7b 3c 00 00 04 00 00 Jul 24 08:55:48 ned kernel: [ 6069.668166] end_request: I/O error, dev sdi, sector 544947200 Jul 24 08:55:48 ned kernel: [ 6069.668306] sd 17:0:0:0: [sdi] Unhandled error code Jul 24 08:55:48 ned kernel: [ 6069.668312] sd 17:0:0:0: [sdi] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Jul 24 08:55:48 ned kernel: [ 6069.668320] sd 17:0:0:0: [sdi] CDB: Write(10): 2a 00 00 00 08 08 00 00 01 00 Jul 24 08:55:48 ned kernel: [ 6069.668338] end_request: I/O error, dev sdi, sector 2056 Jul 24 08:55:48 ned kernel: [ 6069.668345] end_request: I/O error, dev sdi, sector 2056 Jul 24 08:55:48 ned kernel: [ 6069.668351] md: super_written gets error=-5, uptodate=0 Jul 24 08:55:48 ned kernel: [ 6069.668360] md/raid:md0: Disk failure on sdi1, disabling device. Jul 24 08:55:48 ned kernel: [ 6069.668363] md/raid:md0: Operation continuing on 3 devices. Jul 24 08:55:48 ned kernel: [ 6069.893225] Buffer I/O error on device md0, logical block 272341760 Jul 24 08:55:48 ned kernel: [ 6069.893238] Buffer I/O error on device md0, logical block 272341761 ####lots more of the above 2 lines, different blocks. 
Jul 24 08:58:52 ned kernel: [ 6252.686622] sd 17:0:0:0: [sdi] Unhandled error code Jul 24 08:58:52 ned kernel: [ 6252.686632] sd 17:0:0:0: [sdi] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Jul 24 08:58:52 ned kernel: [ 6252.686643] sd 17:0:0:0: [sdi] CDB: Read(16): 88 00 00 00 00 01 5d 50 a3 88 00 00 00 18 00 00 Jul 24 08:58:52 ned kernel: [ 6252.686668] end_request: I/O error, dev sdi, sector 5860533128 Jul 24 08:58:52 ned kernel: [ 6252.686677] Buffer I/O error on device sdi, logical block 732566641 ###and so on
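The `SStatus` and `SControl` numbers in the link messages above are hexadecimal register dumps, and decoding them makes the speed changes easier to follow. The following is a minimal Python sketch, assuming the usual SATA register layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The function names are illustrative only.

```python
SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def split_fields(text):
    """Split a hex register value into its DET, SPD and IPM nibbles."""
    value = int(text, 16)
    return {"det": value & 0xF, "spd": (value >> 4) & 0xF, "ipm": (value >> 8) & 0xF}


def describe_sstatus(text):
    """Describe the negotiated link state reported in SStatus."""
    fields = split_fields(text)
    if fields["det"] != 3:
        return "link down"
    return "link up at " + SPEEDS.get(fields["spd"], "unknown speed")


def describe_scontrol(text):
    """Describe the speed limit requested by the host in SControl."""
    spd = split_fields(text)["spd"]
    if spd == 0:
        return "no speed limit"
    return "limited to " + SPEEDS.get(spd, "unknown speed")


for sstatus, scontrol in [("133", "300"), ("123", "300"), ("123", "310")]:
    print(sstatus, scontrol, "->", describe_sstatus(sstatus), "/", describe_scontrol(scontrol))
```

Read this way, the `SStatus 123 SControl 310` lines in the excerpt show a link still reported at 3.0 Gbps after the kernel has requested a 1.5 Gbps limit.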
@Denis Controller #1: ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x3f impl SATA mode ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio slum part ems apst scsi0 : ahci scsi1 : ahci scsi2 : ahci scsi3 : ahci scsi4 : ahci scsi5 : ahci ata1: SATA max UDMA/133 abar m2048@0xfbe05000 port 0xfbe05100 irq 53 ata2: SATA max UDMA/133 abar m2048@0xfbe05000 port 0xfbe05180 irq 53 ata3: SATA max UDMA/133 abar m2048@0xfbe05000 port 0xfbe05200 irq 53 ata4: SATA max UDMA/133 abar m2048@0xfbe05000 port 0xfbe05280 irq 53 ata5: SATA max UDMA/133 abar m2048@0xfbe05000 port 0xfbe05300 irq 53 ata6: SATA max UDMA/133 abar m2048@0xfbe05000 port 0xfbe05380 irq 53 ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) ata5: SATA link down (SStatus 0 SControl 300) ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata1.00: ATA-8: INTEL SSDSC2MH120A2, PPG4, max UDMA/133 ata1.00: 234441648 sectors, multi 16: LBA48 NCQ (depth 31/32), AA ata1.00: configured for UDMA/133 scsi 0:0:0:0: Direct-Access ATA INTEL SSDSC2MH12 PPG4 PQ: 0 ANSI: 5 sd 0:0:0:0: Attached scsi generic sg0 type 0 sd 0:0:0:0: [sda] 234441648 512-byte logical blocks: (120 GB/111 GiB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sda: sda1 sd 0:0:0:0: [sda] Attached SCSI disk ata2.00: ATA-9: M4-CT064M4SSD2, 0001, max UDMA/100 ata2.00: 125045424 sectors, multi 16: LBA48 NCQ (depth 31/32), AA ata2.00: configured for UDMA/100 scsi 1:0:0:0: Direct-Access ATA M4-CT064M4SSD2 0001 PQ: 0 ANSI: 5 sd 1:0:0:0: Attached scsi generic sg1 type 0 sd 1:0:0:0: [sdb] 125045424 512-byte logical blocks: (64.0 GB/59.6 GiB) sd 1:0:0:0: [sdb] Write Protect is off sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdb: sdb1 sd 1:0:0:0: [sdb] Attached SCSI 
disk
ata3.00: ATA-6: WDC WD2500JD-00HBB0, 08.02D08, max UDMA/133
ata3.00: 488397168 sectors, multi 16: LBA48
ata3.00: configured for UDMA/133
scsi 2:0:0:0: Direct-Access ATA WDC WD2500JD-00H 08.0 PQ: 0 ANSI: 5
sd 2:0:0:0: Attached scsi generic sg2 type 0
sd 2:0:0:0: [sdc] 488397168 512-byte logical blocks: (250 GB/232 GiB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdc: sdc1 sdc2 sdc3 < sdc5 sdc6 sdc7 > sdc4
sd 2:0:0:0: [sdc] Attached SCSI disk
ata4.00: ATA-8: WDC WD15EARS-00MVWB0, 51.0AB51, max UDMA/133
ata4.00: 2930277168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
ata4.00: configured for UDMA/133
scsi 3:0:0:0: Direct-Access ATA WDC WD15EARS-00M 51.0 PQ: 0 ANSI: 5
sd 3:0:0:0: [sdd] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
sd 3:0:0:0: Attached scsi generic sg3 type 0
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdd: sdd1 sdd2 sdd3 sdd4 sdd5
sd 3:0:0:0: [sdd] Attached SCSI disk
ata5.00: (EMPTY)
ata6.00: ATAPI: ATAPI iHAS122, ZL0C, max UDMA/100
ata6.00: configured for UDMA/100
scsi 5:0:0:0: CD-ROM ATAPI iHAS122 ZL0C PQ: 0 ANSI: 5
sr0: scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray
cdrom: Uniform CD-ROM driver Revision: 3.20
sr 5:0:0:0: Attached scsi generic sg4 type 5

Controller #2:
ahci 0000:02:00.0: AHCI 0001.0000 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
ahci 0000:02:00.0: flags: 64bit ncq sntf led only pmp fbs pio slum part sxs
scsi6 : ahci
scsi7 : ahci
ata7: SATA max UDMA/133 abar m2048@0xfbd10000 port 0xfbd10100 irq 54
ata8: SATA max UDMA/133 abar m2048@0xfbd10000 port 0xfbd10180 irq 54
ata7: SATA link down (SStatus 0 SControl 330)
ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 330)
ata7: (EMPTY)
ata8.00: ATA-8: WDC WD1002FAEX-00Z3A0, 05.01D05, max UDMA/133
ata8.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
ata8.00: configured for UDMA/133
scsi 7:0:0:0: Direct-Access ATA WDC WD1002FAEX-0 05.0 PQ: 0 ANSI: 5
sd 7:0:0:0: [sde] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
sd 7:0:0:0: [sde] Write Protect is off
sd 7:0:0:0: Attached scsi generic sg5 type 0
sd 7:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sde: sde1 sde2 sde3 sde4 sde5 sde6
sd 7:0:0:0: [sde] Attached SCSI disk

Error:
ata8.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x6 frozen
ata8.00: failed command: WRITE FPDMA QUEUED
ata8.00: cmd 61/00:00:48:9c:84/04:00:2e:00:00/40 tag 0 ncq 524288 out
         res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata8.00: status: { DRDY }
<ERROR REPEATS SEVERAL TIMES, SAME BLOCK>
ata8: hard resetting link
ata8: softreset failed (1st FIS failed)
<ERROR REPEATS SEVERAL TIMES, 3 TIMES>
ata8: limiting SATA link speed to 3.0 Gbps
ata8: hard resetting link
ata8: softreset failed (1st FIS failed)
ata8: reset failed, giving up
ata8.00: disabled
ata8.00: device reported invalid CHS sector 0
<MULTIPLE TIMES>
ata8.00: device reported invalid CHS sector 0
ata8: EH complete
<THE HARD DISK IS 'GONE'>
sd 7:0:0:0: [sde] Unhandled error code
sd 7:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 7:0:0:0: [sde] CDB: Write(10): 2a 00 2e 84 fc 48 00 04 00 00
end_request: I/O error, dev sde, sector 780467272
Buffer I/O error on device sde5, logical block 16918399
EXT4-fs warning (device sde5): ext4_end_bio:243: I/O error writing to inode 11534341 (offset 1198522368 size 524288 starting block 97558537)
sd 7:0:0:0: [sde] Unhandled error code
sd 7:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 7:0:0:0: [sde] CDB: Write(10): 2a 00 2e 84 f8 48 00 04 00 00
end_request: I/O error, dev sde, sector 780466248
lost page write due to I/O error on sde5
lost page write due to I/O error on sde5
EXT4-fs (sde5): I/O error while writing superblock
JBD2: I/O error detected when updating journal superblock for sde5-8.
EXT4-fs error (device sde5): ext4_journal_start_sb:327: Detected aborted journal
EXT4-fs (sde5): Remounting filesystem read-only
EXT4-fs (sde5): previous I/O error to superblock detected
EXT4-fs error (device sde5): ext4_journal_start_sb:327: Detected aborted journal
-------------------------
Z68 Extreme4 has BIOS 2.20 (7/12/2012)
02:00.0 SATA controller: Marvell Technology Group Ltd. Device 9120 (rev 12)
-------------------------

Please provide the output of the following (change sde to the correct drive, and remove the drive serial number from the output):
# hdparm -I /dev/sde
# smartctl -x /dev/sde
# lspci -nnvvv -s 02:00.0
# lspci -nnvvv -s 00:1f.2

Do you always experience it with the WDC WD1002FAEX-00Z3A0? If you put another disk on any of the Marvell controller ports, does it still happen?
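For anyone gathering the requested output, the four commands can be collected in one pass. This is only a rough sketch, not an official reporting tool: the helper names `redact_serial`, `run_redacted`, and `collect_report` and the `DRIVE` variable are made up for illustration. It assumes `hdparm`, `smartctl`, and `lspci` are installed and that the serial appears on a line containing `Serial Number:`; other identifying fields (for example a WWN) are not handled and should be checked by hand before posting.

```shell
#!/bin/sh
# Hypothetical helper script; set DRIVE to the affected disk before running.
DRIVE="${DRIVE:-/dev/sde}"

# Blank out whatever follows "Serial Number:" on any line of stdin.
redact_serial() {
    sed -E 's/(Serial Number:[[:space:]]*).*/\1[removed]/'
}

# Print the command as a header, then its output with the serial removed.
run_redacted() {
    echo "# $*"
    "$@" 2>&1 | redact_serial
    echo
}

# Run the four commands requested in this bug (PCI addresses as given above).
collect_report() {
    run_redacted hdparm -I "$DRIVE"
    run_redacted smartctl -x "$DRIVE"
    run_redacted lspci -nnvvv -s 02:00.0
    run_redacted lspci -nnvvv -s 00:1f.2
}
```

To use it, source the file as root and run `DRIVE=/dev/sdX collect_report > report.txt`, then review `report.txt` before attaching it.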
I reported having this issue with the WD1002FAEX-00Y9A0 hard drive (comment #11). I then tried a Seagate hard drive (also 1 TB), and the problem persisted. The issue did not appear when the hard drives were connected to another controller, so it was likely a problem with the Marvell controller. However, since I updated the BIOS of my motherboard (an ASUS P8P67), the problem has disappeared. Hope this helps some of you too.
# Mass update to all open bugs. Kernel 3.6.2-1.fc16 has just been pushed to updates. This update is a significant rebase from the previous version. Please retest with this kernel, and let us know if your problem has been fixed. In the event that you have upgraded to a newer release and the bug you reported is still present, please change the version field to the newest release you have encountered the issue with. Before doing so, please ensure you are testing the latest kernel update in that release and attach any new and relevant information you may have gathered. If you are not the original bug reporter and you still experience this bug, please file a new report, as it is possible that you may be seeing a different problem. (Please don't clone this bug, a fresh bug referencing this bug in the comment is sufficient).
With no response, we are closing this bug under the assumption that it is no longer an issue. If you still experience this bug, please feel free to reopen the bug report.