Red Hat Bugzilla – Bug 587664
Partition locks after heavy writing
Last modified: 2010-04-30 17:57:30 EDT
Description of problem:
When performing a task that involves heavy writing (for example, a backup), the task aborts with a "Read-only file system" message after a gigabyte or so has been written. The error occurs with tar, cp and rsync, on both ext3 and ext4 filesystems.
After the error occurs the partition is no longer usable: it cannot be mounted, and gparted doesn't see the device. A reboot allows the partition to be mounted again, and fsck shows no errors. I have seen this on two partitions on two different physical drives.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Perform a task involving a lot of writing such as rsync
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write failed on "/media/oslash/Cimbaoth/Cimbaoth/jjmcd-091022.tar.gz": Read-only file system (30)
rsync error: error in file IO (code 11) at receiver.c(302) [receiver=3.0.7]
rsync: connection unexpectedly closed (34 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [sender=3.0.7]
Similar errors occur with cp and tar.
dmesg will contain the errors that caused the filesystem to go readonly; can you attach that please?
Given that ext4 & ext3 both fail, I'm wondering if it could be an IO error due to a hardware problem, although if you've seen it on 2 different devices...
Anyway, dmesg after the error should offer more clues. Please just attach the whole thing, to avoid editing out anything relevant.
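For anyone skimming a long dmesg before attaching it, a minimal sketch of a filter follows. The pattern list is an assumption about what storage trouble commonly looks like in kernel logs, not something taken from this report, and the function name `storage_errors` is hypothetical. The full, unedited dmesg is still what should be attached.

```python
import re

# Assumed substrings that often accompany storage trouble in kernel logs.
# This list is illustrative, not exhaustive, and not taken from this bug.
PATTERNS = [
    r"I/O error",
    r"Remounting filesystem read-only",
    r"EXT[34]-fs error",
]


def storage_errors(log_text):
    """Return the lines of log_text that match any assumed pattern."""
    matcher = re.compile("|".join(PATTERNS))
    return [line for line in log_text.splitlines() if matcher.search(line)]
```

One way to use it would be to pass in the output of `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`. An empty result does not prove the log is clean, only that none of the assumed patterns appeared.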
Created attachment 410571 [details]
dmesg immediately after error
Previously the failure would occur after less than 2G of transfer. This time I got more like 5G before the failure.
I don't see any messages related to the filesystem going readonly in dmesg, or any other storage errors for that matter.
Which device is causing the problem?
I initially saw the error on /dev/sdd1
Since I was concerned about that drive as a result of the messages, I started
backing things up to /dev/sdc2 with the same result.
Once I saw it on another partition, I began to get suspicious of F13. The
drives had previously performed without issue on F10, and this error began
shortly after doing a clean install of F13. /dev/sdd was formatted on F10.
/dev/sdd may have been F10 or possibly earlier. The OS itself is running from
Sorry, /dev/sdc may have been F10 or earlier; sdd was definitely F10.
The thing is, if ext3 or ext4 -really- went readonly as rsync is saying, it would have been due to some error that would show up in the logs. I'm stumped about why we don't see that.
You say the partitions aren't even visible after this happens?
> Cannot be mounted, gparted doesn't see device.
What happens when you try to mount it? What does the kernel say when you try?
Something else is going on here ...
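One way to check whether the kernel really flipped the filesystem to read-only is to look at the mount options in /proc/mounts right after the failure. The sketch below parses that file's text (fields: device, mount point, type, options, dump, pass). The function names are hypothetical, and the mount point `/media/oslash` used in the usage note is only inferred from the rsync path above.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point, or None if it is not mounted."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last match: a later mount shadows an earlier one.
            found = fields[3].split(",")
    return found


def is_read_only(mounts_text, mount_point):
    """True if mount_point is mounted and carries the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options
```

For example, `is_read_only(open("/proc/mounts").read(), "/media/oslash")` returning False while writes still fail with "Read-only file system" would suggest the error is not coming from an ordinary ext3/ext4 remount.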
When you try a mount, it says:
mount: you must specify the filesystem type
This occurs with any partition on the device once the error has occurred.
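Since every partition on the device is affected, it may help to check whether the kernel still lists the disk at all. A rough sketch, assuming the usual /proc/partitions layout (a header row, a blank line, then rows of major, minor, #blocks, name); the function name `visible_partitions` is hypothetical.

```python
def visible_partitions(partitions_text):
    """Return the set of device names listed in /proc/partitions text."""
    names = set()
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows have four fields and start with a numeric major number;
        # the header row starts with the word "major" and is skipped.
        if len(fields) == 4 and fields[0].isdigit():
            names.add(fields[3])
    return names
```

Comparing `visible_partitions(open("/proc/partitions").read())` before and after the failure would show whether sdd and sdd1 drop out entirely, which points at the device or controller rather than the filesystem.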
I am about 99% convinced that it is hardware, and that its occurrence when I installed F13 is just a coincidence.
Since F10 had worked well before, I burned a Live F10 CD and tried the same copy from it, and it failed (the message was slightly different but seemed to be telling me the same thing).
I then did essentially the same copy on F13 to a partition on a drive on a different controller, and it went OK. That is only one success, but since the lone success is a different controller, that makes the controller pretty suspicious.
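To confirm which drives share a controller, the resolved sysfs path of each disk can be inspected. This is only a sketch: it assumes the disks hang off PCI controllers, so that the last PCI address in the resolved /sys/block/<disk> path identifies the controller. The function name `controller_of` is hypothetical.

```python
import re

# PCI address in domain:bus:device.function form, e.g. 0000:00:1f.2
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(sysfs_path):
    """Return the last PCI address found in a resolved sysfs path, or None."""
    matches = PCI_ADDR.findall(sysfs_path)
    return matches[-1] if matches else None
```

Calling `controller_of(os.path.realpath("/sys/block/sdd"))` for each disk and grouping by the result would show whether sdc and sdd sit on one controller and the drive that copied successfully on another.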
I'm going to go ahead and close this as NOTABUG. It will take me a bit to round up a new controller, but since most of the system is on another controller I can limp along for a while.
Thanks for the help and sorry about the false alarm.
Ok, thanks for the followup, and good luck with the hardware :)