Description of problem:
Getting 60000+ "kernel: bio too big device md3" messages; rpm also complains. This is due to:
Running root filesystem on luks.
This luks is on md3.
This md3 is raid1 on sata sda3 and on USB sdb3.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Set up such a system; the errors appear during writes.
60000 messages in my current /var/log/messages:
Apr 29 10:36:17 host0 kernel: bio too big device md3 (248 > 240)
rpmdb: fsync: Input/output error
error: db4 error(5) from db->sync: Input/output error
Googling shows this is a known upstream problem with USB disks, as the kernel cannot split large I/O requests.
Still, in such a case I am curious how a problem with only one of the raid1 disks can propagate through raid1 up to the application level.
Google says it corrupts data; to me even the USB partition seems perfectly valid (though I did no artificial test beyond boot/fsck on it).
It was happening even on F10; I am just filing it now against the fresh kernel.
It looks like this is really a dm problem. Maybe dm should limit itself to 128 sectors no matter what the underlying device says it can handle?
Was the USB-disk hot-added to md-raid1?
If so, it is a known problem and cannot be fixed without redesigning the bio interface.
(In reply to comment #2)
> Was the USB-disk hot-added to md-raid1?
Yes, it occurs at least if the USB disk _was_ hot-added.
(Whether it does / does not also occur when it gets found while assembling the md device I do not know.)
See also upstream bug
(In reply to comment #2)
> Was the USB-disk hot-added to md-raid1?
> If so, it is known problem and cannot be fixed unless redesigning the bio
How about never letting dm use more than 240 sectors even if the device will allow more?
I get the exact same error message "kernel: bio too big device md2 (248 > 240)" on my Fedora 10 box (kernel 126.96.36.199-52.fc10.x86_64). This is a raid1 with 2 or 3 drives (2 SATA partitions which are always present, plus one sometimes-present USB partition used for occasional mirror backups). This error shows up even after failing/removing the USB partition, so it seems the lower value sticks around somewhere in the md driver. Thankfully it doesn't seem to cause any data corruption...
Oh, I should also point out that the exact same setup of a raid1 md1 does not suffer this problem (direct ext3 on top of it, for /boot), while for raid1 md2 there is a luks-md2 layer, on top of which there is an ext3 partition (/). So, in line with the bug linked above, this does seem to be specific to the layering.
dmsetup reload luks-md2 --table "`dmsetup table --showkeys luks-md2`"
dmsetup suspend luks-md2; dmsetup resume luks-md2
This does appear to fix the issue, although it's pretty scary (md2 is the root fs, so this only works because of caching).
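For anyone hitting this: the per-request limit each block device advertises can be inspected through sysfs before hot-adding a mismatched leg. A minimal sketch, assuming the standard `queue/max_sectors_kb` attribute; the device names below are just examples, adjust for your system:

```shell
#!/bin/sh
# Print the per-request size limit each block device advertises.
# Device names are examples; the guard skips devices that are absent.
for d in sda sdb md2 md3; do
    f="/sys/block/$d/queue/max_sectors_kb"
    if [ -r "$f" ]; then
        echo "$d: $(cat "$f") KiB per request"
    fi
done
```

If the USB disk reports a smaller value than the array's existing leg, hot-adding it is what sets up the mismatch described in this bug.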
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.
More information and reason for this action is here:
FYI still happens with Fedora 12
There will be no fix. It is a design problem and it can't be fixed unless the bio interface is redesigned and rewritten (and I don't know whether Jens Axboe would allow that). If you want to avoid it, don't run raid1 across different device types.
Well... even just hardcoding md never to report that it supports more than 240 sectors would be a certain kind of fix. Alternatively, if dm refetched the value from the base device on receiving an error (does it even see the error?), the problem would happen only rarely...
If we hardcode it at 240 sectors, it will still fail with some device that has lower limits (e.g. raid1 on floppies :)
Refetching the values would make the error less likely, but would not really fix it.
The problem is this:
1. some code builds a bio, adding pages to it until it reaches the raid1 array's sector limit
2. some other code hot-adds a new leg with smaller I/O limits to that raid1 array
3. the bio is submitted to raid1
4. the bio is already built; it can't be shrunk or split, and it will certainly fail when submitted to the new raid1 leg
--- this can't simply be fixed, it needs a redesign.
If we applied the workaround and refetched limits, we would turn the reproducible failure described above into a racy, random failure. A racy failure is worse than a reproducible one.
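The sequence above can be sketched as a toy userspace model (illustrative Python only, not kernel code; the class and device names are hypothetical):

```python
# Toy model of the race: a bio is sized against the array's current
# limit, then a leg with a smaller limit is hot-added before submission.

class Leg:
    def __init__(self, name, max_sectors):
        self.name = name
        self.max_sectors = max_sectors

class Raid1:
    def __init__(self, legs):
        self.legs = legs

    def max_sectors(self):
        # The array advertises the smallest limit among its current legs.
        return min(l.max_sectors for l in self.legs)

    def submit(self, bio_sectors):
        # Every leg must accept the whole bio; it cannot be split.
        for l in self.legs:
            if bio_sectors > l.max_sectors:
                raise IOError("bio too big device md3 (%d > %d)"
                              % (bio_sectors, l.max_sectors))
        return "ok"

md3 = Raid1([Leg("sda3", 248)])
bio_sectors = md3.max_sectors()     # step 1: bio built up to current limit (248)
md3.legs.append(Leg("sdb3", 240))   # step 2: USB leg with smaller limit hot-added
try:
    md3.submit(bio_sectors)         # steps 3-4: already-built bio now fails
except IOError as e:
    print(e)                        # prints: bio too big device md3 (248 > 240)
```

Refetching the limit after the failure would only shrink future bios; any bio built before the hot-add can still lose the race, which is why this is a design problem rather than a simple bug.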
Ok, I was under the impression that the current behaviour printed a warning, but still continued to function correctly. Is that not the case? Does this end up corrupting something?
If this is just a 'harmless' warning and the kernel works around the problem, then refetching the limits sounds like it would just work. You would get one error (well, a couple if it raced on SMP) when the device shrunk, and afterwards no more errors.
Currently you get errors occasionally forever after a raid device shrinks its max bio size.
Am I misunderstanding something?
Yes, it may corrupt a filesystem (e.g. lose writes to files or, less likely, metadata).
When the raid device shrinks its bio size you also get the error; it is a similar case.
The I/O layer just needs redesign. dm-raid or md-raid can't do anything about it.
This is still a problem with F13. I ran into it trying to
make a simple RAID copy of my laptop using another disk. I
am using an encrypted LV for a home directory, which then
resides on an MD member.
If I understand right, once I start seeing these "bio too
big" messages I had better force-umount and reboot with sysrq,
because if I let everything sync I will see filesystem corruption.
So this is a known issue that has caused *silent data corruption*
in the Linux kernel for several years (since at least FC9,
which I just upgraded from, hoping it would go away)?
Well, the kernel tells you it's corrupting your data, but still
proceeds, and the user thinks everything is OK if they're not
watching the kernel log.
This turns what would be a simple plug-in-USB-disk
auto-backup via udev/mdadm into data corruption. To back up
safely instead, I have to boot with init=/bin/bash before LVM
starts and wait for the sync.
Is it really so uncommon and unimportant a thing to be
assigned low/low priority?
Perhaps a module option for MD telling it "never make bios
larger than this" until upstream feels like fixing this?
It sounds like the required "redesign" will essentially never
happen. Possibly the bug should be closed as WONTFIX.
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '12'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 12's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 12 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The process we are following is described here:
This is still very much a problem.
I like the module option suggestion.
I know I will never put in devices that support less than 240.
Still a bug for sure.
I keep forgetting about this until I plug in a disk and it starts chewing
on my filesystems. This seems like a pretty severe bug: filesystem
corruption in the Linux kernel, silently corrupting data for years.
Once this happens I never know about the integrity of my data again.
I too would be happy with some way to pass in a never-exceed
bio size for the device, so we could at least put the max size in a file
somewhere and not risk silent data corruption whenever we forget
about this permanent limitation of the Linux kernel.
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.
Problem still present on f20 kernel 3.13.7-200.fc20.i686.
By all means keep adding instances of this to this bug, but this is an upstream design issue best pursued in a different forum (a kernel mailing list).