Red Hat Bugzilla – Bug 433932
RAID arrays not properly set up intermittently
Last modified: 2008-04-21 17:33:53 EDT
Description of problem: Occasionally, while doing the file system check, the boot process is halted with a complaint about a bad superblock on a RAID array. This does not happen consistently, and if I reboot enough times, eventually all of the arrays are set up correctly.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Actual results: Boot halts with a complaint about a bad superblock.
Expected results: Boot runs to completion.
Additional info: The RAID arrays are RAID 1 using PATA drives and the fs is ext3.
udev is doing the file system check??
Is it a specific component to the raid array that is missing each time?
I am not sure that udev is doing this. There is a pause at the "starting udev" prompt (sometimes unusually long),
and then when the error happens that gets scrolled off the screen. So it might be
something that happens shortly afterwards. Also, the error message only shows where the check
is made; the actual problem may be happening earlier. There are messages about
the various RAID arrays being started before the "starting udev" message.
Different arrays have the problem at different boots. It has happened on two
different machines, but it happens a lot more at home, since I have to reboot
more often to clear up some problems for hardware (video card and multiple cpu
related) that my old work machine doesn't have.
I haven't seen a message about the swap raid array yet, but that isn't getting fsck'd so it might still be having problems.
This is still happening, but since I have been watching more carefully, it has only been
happening on two /boot partitions on one machine (out of three using RAID).
It seems to be happening a little less often, but that might be just statistical.
I believe those two partitions have the same label as they were both initialized
as /boot partitions (for different installs) and perhaps this has some bearing
on the problem.
I am still seeing this with kernel 2.6.25-0.195.rc8.git1.fc9.i686, udev
This is the mdadm.conf file for both my system and the initrd image:
# mdadm.conf written out by anaconda
ARRAY /dev/md5 super-minor=5
ARRAY /dev/md1 super-minor=1
ARRAY /dev/md0 super-minor=0
ARRAY /dev/md2 super-minor=2
ARRAY /dev/md7 super-minor=7
ARRAY /dev/md3 super-minor=3
Just a 'me too' here. As a workaround I put a '/sbin/mdadm -A -s' into
/etc/rc.sysinit right before fsck to get at least some other things tested.
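The workaround amounts to something like the following fragment in /etc/rc.sysinit; the exact placement of the surrounding lines is a sketch, as the real file varies between releases:

```shell
# Workaround sketch: assemble all arrays listed in /etc/mdadm.conf
# before the fsck pass runs, in case the udev rule missed one.
if [ -f /etc/mdadm.conf ]; then
    /sbin/mdadm -A -s
fi
# ... existing fsck invocation follows here ...
```

With `-A -s`, mdadm scans /etc/mdadm.conf and assembles every array it finds defined there, so any array the udev-driven incremental assembly missed gets a second chance before fsck.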
I needed to do some testing this morning for another bug and I recorded what
happened with 10 reboots. The part that is relevant here is that in 5 of these
10 reboots raid failed to work properly and in the other 5 it did work properly.
Maybe that statistical information will be useful.
Is there some more information that I can collect to help track down what is
happening with this? When this fails I have a shell prompt and can do some
poking around. I'd at least be able to find out how not to trigger the condition.
(I have a remotely administered machine running FC5 that I'd like to upgrade to
F9, but having reboots fail this way would be painful.)
I think this bug is in rc.sysinit, not udev.
(In reply to comment #8)
> Is there some more information that I can collect to help track down what is
> happening with this?
Ok, let me recap what I've read so far:
* Fedora 8 runs a `/sbin/mdadm -A -s --auto=yes` inside /etc/rc.sysinit
depending on the existence of /etc/mdadm.conf
* Now notting came up with a udev rule (/etc/udev/rules.d/70-mdadm.rules) that is
carried with the mdadm package
* This rule is supposed to replace the invocation of mdadm inside
/etc/rc.sysinit, but it obviously does not work sometimes
* Some discussion that I don't (want to) fully understand happens in bug 429604
Practically, this one could be marked as a duplicate of bug 429604. If it won't
be fixed in time for F9 it should be possible to add the invocation of mdadm
back into rc.sysinit.
notting, harald, please correct me if I'm wrong.
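For readers unfamiliar with the mechanism being discussed: the udev rule in question is roughly of the following shape. This is a sketch of what an incremental-assembly rule looks like, not a verbatim copy of the rule shipped in the mdadm package:

```
# Sketch of an incremental RAID assembly udev rule (not the exact shipped file):
# when a block device carrying a Linux RAID member superblock appears,
# hand it to mdadm so the array is assembled piece by piece as disks show up.
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", \
    RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
```

The key difference from the rc.sysinit approach is that each member device triggers its own mdadm invocation as udev processes it, which is exactly what opens the door to two invocations racing on the same array.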
The directory stuff in 429604 is only if you partition your arrays. For the
non-partitioned case, I need to get a machine that reproduces it. So far, all my
attempts to reproduce it have failed.
Bruno/Frank - are you currently up to date with udev?
The machine that is showing the problems is at udev-120-2.fc9.i386. It looks
like -3 is available now. The comments don't suggest that a change affecting
this bug was made though.
Here is a list of /etc/udev/rules.d:
[root@bruno rules.d]# ls
10-console.rules 60-pcmcia.rules 80-drivers.rules
10-libifp.rules 60-persistent-input.rules 85-pcscd_ccid.rules
40-alsa.rules 60-persistent-storage.rules 85-pcscd_egate.rules
40-multipath.rules 60-persistent-storage-tape.rules 88-clock.rules
40-redhat.rules 60-wacom.rules 90-alsa.rules
50-udev-default.rules 61-persistent-storage-edd.rules 90-hal.rules
51-vdr.rules 64-device-mapper.rules 91-drm-modeset.rules
52_nut-usbups.rules 64-md-raid.rules 95-pam-console.rules
60-cdrom_id.rules 70-mdadm.rules 95-udev-late.rules
60-libmtp.rules 70-persistent-cd.rules 97-bluetooth.rules
60-libnjb.rules 70-persistent-net.rules 99-fuse.rules
60-net.rules 75-cd-aliases-generator.rules zaptel.rules
(In reply to comment #12)
> Bruno/Frank - are you currently up to date with udev?
I'm updating daily right now. Currently udev-120-3.fc9.x86_64, which fails too.
If it would be of any help I can capture the startup debug output. With
udev_log="info" I'm not able to reproduce it anymore. But with udev_log="debug"
it failed. Tried it only once though because it takes some time.
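For reference, the logging level mentioned here is switched in /etc/udev/udev.conf:

```
# /etc/udev/udev.conf
# udev_log="info" could not reproduce the failure here;
# udev_log="debug" did, but slows startup considerably.
udev_log="debug"
```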
Created attachment 302239 [details]
Console logs taken during startup
Anyway, here is what I was able to fetch during startup (.tar.bz2, 1.1M):
udev-debug-startup-failed.log 37M Log generated with udev_log=debug
udev-info-startup-succeeded.log 655K Log generated with udev_log=info
Does it persist if you move out some of the non-required rule files?
udevsettle: udevsettle: timeout waiting for queue
(from your failed log)
So, it seems to 'fail' with debug only because it causes udev startup to take so
long that it times out and continues with it running in the background.
you can set the start_udev timeout on the kernel command line.. try:
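Based on the later comments in this thread, the option referred to appears to be the udevtimeout kernel parameter; the ellipses below are placeholders for the machine's actual kernel and root settings:

```
# Appended to the kernel line in grub.conf (value taken from a later comment):
kernel /vmlinuz-... ro root=... udevtimeout=10000
```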
How do I tell which ones are non-required? Should I just look for cruft left
behind after upgrades (by checking rpm -q --whatprovides for each entry), or did
you have something else in mind? Or was your comment directed at solving the
udev timeout issue before you had a better recommendation?
Bruno: are you still seeing this in cases where you aren't passing udev debug options?
If so, I'd try just moving out of the way:
and see if that makes things better.
(In reply to comment #18)
> you can set the start_udev timeout on the kernel command line.. try:
No udevsettle anymore, but it didn't run fully through and locked up somewhere.
Increased the timeout to 10000, ran it again, and it got stuck somewhere else.
I don't think this gives us much, because it takes more than an hour to do a boot this way.
I had a glance at the udev man page. Would it be a good idea to modify the rules
somehow to do some output, so we can actually see what gets set up? Like RUN
something to output the imported ENV variables for each device that gets set up?
(In reply to comment #16)
> Does it persist if you move out some of the non-required rule files?
I excluded the following rules
But still, one out of six arrays does not get set up properly. This result is
quite reliable to reproduce.
I just tried 5 reboots this morning with udev set to debug and using the
udevtimeout kernel option and I didn't get it to fail. I'll try again tonight.
My reboots take about 15-20 minutes with the udev debugging on. I only end up
with about 9MB of logs. I am not sure why I am getting so much less.
Is it always the same array that fails? If so, is there anything about it
configured differently (either at setup, or in mdadm.conf)
mdadm -D output may help here.
The problem is I can't reproduce this; without that, it's hard to code a fix.
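For anyone collecting the requested output, a typical invocation looks like this; the array names are examples drawn from earlier comments:

```shell
# Dump detailed state (members, UUIDs, events counters) for each array.
# /dev/md0 and /dev/md1 are examples; substitute your own arrays.
for md in /dev/md0 /dev/md1; do
    mdadm -D "$md"
done
```

Comparing the Events counters and member lists across arrays can reveal whether the failing array is configured differently from the ones that assemble cleanly.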
In my case it's the two /boot arrays (one is mounted as /otherboot, but was
originally a /boot partition on a test install). I have only seen those two have
the problem recently. They are md0 and md1.
I did take a look at mdadm -D output and didn't see anything odd looking. When I
tried it after a failed boot it said there wasn't any such array for the one
that didn't work. I'll try looking more carefully to see if I can spot something.
(In reply to comment #23)
> Is it always the same array that fails?
No. Different ones each time. All arrays are similar in their setup in my case.
> The problem is I can't reproduce this; without that, it's hard to code a fix.
Yeah, I know ;) I'm currently at work, testing the other thing coming from you
that's scheduled to be released... so I can't do anything now. One thing I tried
before going to work:
I've modified the 70-mdadm.rules. I added the line
RUN+="echo something", \
before the mdadm RUN line. This caused the system to boot up cleanly twice.
Tried the same thing with a usleep 1000, and it didn't work. I'll tinker around
a bit more as soon as I'll be back home.
Created attachment 302535 [details]
Output of dmesg
(In reply to comment #25)
> I've modified the 70-mdadm.rules. I added the line
> RUN+="echo something", \
> before the mdadm RUN line. This caused the system to boot up cleanly twice.
Pure luck. Does not work each time. But I should have had the idea to look at
the output of dmesg sometime earlier...
The arrays that fail get logged there with:
md: array mdx already has disks!
Full dmesg attached (md3 failed in this case).
Created attachment 302584 [details]
More verbose output of dmesg
I've added MD_BUG(); for the condition where it fails, to see the actual state
of all arrays. In this case md6 failed and already had one of its members added.
Thanks for the debugging!
Can you try the mdadm packages at:
Some more information for interested parties, without looking at the changes to
For each array that failed to get set up md_ioctl() got called twice with the
command SET_ARRAY_INFO in a short period of time, and those calls seemed to
tread on each other's toes.
The best way would be a 'binding already in progress' flag for each device
inside the md driver to fit each and every corner case, if that's feasible.
At least the usual case does work now reliably. Rebooted more than 10 times in a
row and no double call was made.
Am I right? ... :)
Basically, the change made in that package is for the routine to open the device
with O_EXCL so that only one mdadm call has the device open at once; if it can't
open it, it waits a short time and tries again.
Frank, just confirming - the new package works for you?
(In reply to comment #31)
> Frank, just confirming - the new package works for you?
Yes, it's working.
I tried the foo version of mdadm this morning (and removed the udev debugging,
which seemed to mask the problem) and had no RAID failures in 10 reboots. At the
50% failure rate I was seeing previously, this should have happened by chance
only 1 time in about 1000. So I think it is pretty likely you fixed the problem.
Thank you, guys!
Fixed in 2.6.4-4.
This package has landed in rawhide, and the problem seems fixed for me.
I'm closing the bug to get it off the blocker list, but feel free to reopen it
if the problem manifests with mdadm-2.6.4-4 or later.