Description of problem:
Occasionally, while doing the file system check, the boot process is halted with a complaint about a bad superblock on a RAID array. This does not happen consistently, and if I reboot enough times, eventually all of the arrays are set up correctly.

Version-Release number of selected component (if applicable):
udev-118-1.fc9.i386

How reproducible:
Roughly 50% of boots.

Steps to Reproduce:
1. Reboot.

Actual results:
Boot halts with a complaint about a bad superblock.

Expected results:
Boot runs to completion.

Additional info:
The RAID arrays are RAID 1 using PATA drives and the file system is ext3.
udev is doing the file system check??
Is it a specific component of the raid array that is missing each time?
I am not sure that udev is doing this. There is a pause at the "starting udev" prompt (sometimes unusually long), and by the time the error happens that prompt has scrolled off the screen, so it might be something that happens shortly afterwards. Also, the error message appears at the point where the check is made; the actual problem may be happening earlier. There are messages about the various raid arrays being started before the "starting udev" message. Different arrays have the problem at different boots. It has happened on two different machines, but it happens a lot more at home, since I have to reboot more often there to clear up some hardware problems (video card and multiple-CPU related) that my old work machine doesn't have. I haven't seen a message about the swap raid array yet, but that isn't getting fsck'd, so it might still be having problems.
This is still happening, but since I have been watching more carefully it has only been happening on two /boot partitions on one machine (out of the three machines using raid). It seems to be happening a little less often, but that might just be statistical variation. I believe those two partitions have the same label, since they were both initialized as /boot partitions (for different installs), and perhaps that has some bearing on the problem.
I am still seeing this with kernel 2.6.25-0.195.rc8.git1.fc9.i686 and udev-120-1.fc9.i386. This is the mdadm.conf file for both my system and the initrd image:

# mdadm.conf written out by anaconda
DEVICE partitions
MAILADDR root
ARRAY /dev/md5 super-minor=5
ARRAY /dev/md1 super-minor=1
ARRAY /dev/md0 super-minor=0
ARRAY /dev/md2 super-minor=2
ARRAY /dev/md7 super-minor=7
ARRAY /dev/md3 super-minor=3
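For anyone comparing against their own setup: the ARRAY lines in such a file can be cross-checked against what the kernel and the component superblocks report. The commands below are generic mdadm/proc queries (nothing specific to this machine is assumed):

# Show which arrays the kernel currently has assembled.
cat /proc/mdstat

# Scan the component superblocks and print matching ARRAY lines, which can
# be compared with (or used to regenerate) the ARRAY lines in mdadm.conf.
/sbin/mdadm --examine --scan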
Just a 'me too' here. As a workaround I put a '/sbin/mdadm -A -s' into /etc/rc.sysinit right before the fsck, so that at least some other things get tested.
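For anyone wanting to copy the workaround, a minimal sketch of how it might sit in rc.sysinit (the surrounding test and the exact placement relative to the real fsck call are approximations, not the actual F9 script):

# Workaround sketch: assemble any remaining arrays before fsck runs.
# The --auto=yes option is taken from the F8-style invocation mentioned
# later in this bug; adjust to match the local rc.sysinit.
if [ -f /etc/mdadm.conf ]; then
        /sbin/mdadm -A -s --auto=yes
fi
# ... existing fsck invocation follows here ...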
I needed to do some testing this morning for another bug, and I recorded what happened across 10 reboots. The part that is relevant here is that in 5 of those 10 reboots the raid arrays failed to come up properly, and in the other 5 they did. Maybe that statistical information will be useful.
Is there some more information I can collect to help track down what is happening here? When this fails I have a shell prompt and can do some poking around. At the least, I'd like to find out how to avoid triggering the condition. (I have a remotely administered machine running FC5 that I'd like to upgrade to F9, but having reboots fail this way would be painful.)
I think this bug is in rc.sysinit, not udev. Reassigning.
(In reply to comment #8)
> Is there some more information that I can collect to help track down what is
> happening with this?

Ok, let me recap what I've read so far:
* Fedora 8 runs a `/sbin/mdadm -A -s --auto=yes` inside /etc/rc.sysinit, depending on the existence of /etc/mdadm.conf.
* notting came up with a udev rule (/etc/udev/rules.d/70-mdadm.rules) that is carried with the mdadm package.
* This rule is supposed to replace the invocation of mdadm inside /etc/rc.sysinit, but it obviously does not always work.
* Some discussion that I don't (want to) fully understand happens in bug 429604.

Practically, this one could be marked as a duplicate of bug 429604. If it won't be fixed in time for F9, it should be possible to add the invocation of mdadm back into rc.sysinit. notting, harald, please correct me if I'm wrong.
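For context, a rough sketch of what an incremental-assembly udev rule of that kind can look like (this is an illustration only, not the actual contents of the shipped 70-mdadm.rules; the match keys and the use of "mdadm -I" are assumptions):

# Hypothetical sketch of an incremental md assembly rule.
# Any block device that udev identifies as a Linux RAID member is handed
# to "mdadm -I", which adds it to its array and starts the array once all
# members have shown up.
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid*", \
        RUN+="/sbin/mdadm -I $env{DEVNAME}"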
The directory stuff in 429604 is only if you partition your arrays. For the non-partitioned case, I need to get a machine that reproduces it. So far, all my attempts to reproduce it have failed.
Bruno/Frank - are you currently up to date with udev?
The machine that is showing the problems is at udev-120-2.fc9.i386. It looks like -3 is available now, but the comments don't suggest that a change affecting this bug was made. Here is a listing of /etc/udev/rules.d:

[root@bruno rules.d]# ls
10-console.rules        60-pcmcia.rules                    80-drivers.rules
10-libifp.rules         60-persistent-input.rules          85-pcscd_ccid.rules
40-alsa.rules           60-persistent-storage.rules        85-pcscd_egate.rules
40-multipath.rules      60-persistent-storage-tape.rules   88-clock.rules
40-redhat.rules         60-wacom.rules                     90-alsa.rules
50-udev-default.rules   61-persistent-storage-edd.rules    90-hal.rules
51-vdr.rules            64-device-mapper.rules             91-drm-modeset.rules
52_nut-usbups.rules     64-md-raid.rules                   95-pam-console.rules
60-cdrom_id.rules       70-mdadm.rules                     95-udev-late.rules
60-libmtp.rules         70-persistent-cd.rules             97-bluetooth.rules
60-libnjb.rules         70-persistent-net.rules            99-fuse.rules
60-net.rules            75-cd-aliases-generator.rules      zaptel.rules
60-openct.rules         75-persistent-net-generator.rules
(In reply to comment #12)
> Bruno/Frank - are you currently up to date with udev?

I'm updating daily right now; currently udev-120-3.fc9.x86_64, which fails too. If it would be of any help, I can capture the startup debug output. With udev_log="info" I'm not able to reproduce it anymore, but with udev_log="debug" it failed. I tried it only once, though, because it takes some time.
Created attachment 302239 [details]
Console logs taken during startup

Anyway, here is what I was able to fetch during startup (.tar.bz2, 1.1M):

udev-debug-startup-failed.log      37M   log generated with udev_log=debug
udev-info-startup-succeeded.log    655K  log generated with udev_log=info
Does it persist if you move out some of the non-required rule files?
udevsettle[668]: udevsettle: timeout waiting for queue

(from your failed log)

So it seems to 'fail' with debug only because the debug logging causes udev startup to take so long that it times out, and the boot continues with udev still running in the background.
You can set the start_udev timeout on the kernel command line. Try: udevtimeout=1000
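For anyone following along, the parameter just gets appended to the kernel line in the boot loader configuration. An example entry (the kernel/initrd versions and the root device below are placeholders, not taken from this machine):

# /boot/grub/grub.conf -- example entry only
title Fedora (2.6.25-0.195.rc8.git1.fc9.i686)
        root (hd0,0)
        kernel /vmlinuz-2.6.25-0.195.rc8.git1.fc9.i686 ro root=/dev/mdX udevtimeout=1000
        initrd /initrd-2.6.25-0.195.rc8.git1.fc9.i686.img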
How do I tell which ones are non-required? Should I just look for cruft left behind after upgrades (by checking rpm -q --whatprovides for each entry), or did you have something else in mind? Or was your comment directed at solving the udev timeout issue, before you had a better recommendation?
Bruno: are you still seeing this in cases where you aren't passing udev debug options? If so, I'd try just moving 40-multipath.rules out of the way and seeing if that makes things better.
(In reply to comment #18)
> you can set the start_udev timeout on the kernel command line.. try:
> udevtimeout=1000

No udevsettle timeout anymore, but it didn't run fully through and locked up somewhere. I increased the timeout to 10000, ran it again, and it got stuck somewhere else. I don't think this gives us much, because it takes more than an hour to do a single run.

I had a glance at the udev man page. Would it be a good idea to modify the rules somehow to do some output, so we can actually see what gets set up? Something like a RUN that prints the imported ENV variables for each device that gets set up?

(In reply to comment #16)
> Does it persist if you move out some of the non-required rule files?

I excluded the following rules:
40-multipath
60-pcmcia
60-persistent-storage-tape
60-wacom
64-device-mapper

But still, one out of six arrays does not get set up properly. This result is quite reliable to reproduce.
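If it helps, one way such a debug rule could look (just a sketch; the filename, the match and the variables printed are my assumptions, and it simply writes to the kernel log so the output survives the early-boot console scroll):

# /etc/udev/rules.d/99-md-debug.rules (hypothetical debugging aid)
# Log the kernel name, action and detected filesystem type of every block
# device event to the kernel ring buffer.
SUBSYSTEM=="block", RUN+="/bin/sh -c 'echo md-debug: %k $env{ACTION} $env{ID_FS_TYPE} > /dev/kmsg'"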
I just tried 5 reboots this morning with udev set to debug and using the udevtimeout kernel option and I didn't get it to fail. I'll try again tonight. My reboots take about 15-20 minutes with the udev debugging on. I only end up with about 9MB of logs. I am not sure why I am getting so much less.
Is it always the same array that fails? If so, is there anything about it that is configured differently (either at setup or in mdadm.conf)? mdadm -D output may help here. The problem is that I can't reproduce this; without that, it's hard to code a fix.
In my case it's the two /boot arrays (one is mounted as /otherboot, but was originally a /boot partition on a test install). I have only seen those two have the problem recently. They are md0 and md1. I did take a look at the mdadm -D output and didn't see anything odd-looking. When I tried it after a failed boot, it said there wasn't any such array for the one that didn't work. I'll try looking more carefully to see if I can spot something.
(In reply to comment #23)
> Is it always the same array that fails?

No, different ones each time. All arrays are similar in their setup in my case.

> The problem is I can't reproduce this; without that, it's hard to code a fix.

Yeah, I know ;) I'm currently at work, testing the other thing coming from you that's scheduled to be released... so I can't do anything now.

One thing I tried before going to work: I modified 70-mdadm.rules and added the line

RUN+="echo something", \

before the mdadm RUN line. This caused the system to boot up cleanly twice. I tried the same thing with a usleep 1000, and it didn't work. I'll tinker around a bit more as soon as I'm back home.
Created attachment 302535 [details]
Output of dmesg

(In reply to comment #25)
> I've modified the 70-mdadm.rules. I added the line
> RUN+="echo something", \
> before the mdadm RUN line. This caused the system to boot up cleanly twice.

Pure luck; it does not work every time. But I should have had the idea to look at the output of dmesg sooner... The arrays that fail get logged there with:

md: array mdx already has disks!

Full dmesg attached (md3 failed in this case).
Created attachment 302584 [details]
More verbose output of dmesg

I've added an MD_BUG(); call for the condition where it fails, to see the actual state of all arrays. In this case md6 failed and already had one of its members assigned...
Thanks for the debugging! Can you try the mdadm packages at: http://koji.fedoraproject.org/koji/taskinfo?taskID=569782
Some more information for interested parties, without looking at the changes to mdadm: for each array that failed to get set up, md_ioctl() got called twice with the SET_ARRAY_INFO command within a short period of time, and those calls seemed to tread on each other's toes. The most robust fix would be a 'binding already in progress' flag for each device inside the md driver, to cover every corner case, if that's feasible. At least the usual case now works reliably: I rebooted more than 10 times in a row and no double call was made. Am I right? ... :)
Basically, the change made in that package is for the routine to open the device with O_EXCL so that only one mdadm call has the device open at once; if it can't open it, it waits a short time and tries again.
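To make that concrete, here is a small self-contained C sketch of the open-with-O_EXCL-and-retry pattern as described above. It is an illustration, not the actual mdadm patch; the function name, retry count and sleep interval are invented:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Sketch: open an md device exclusively so only one process works on it
 * at a time.  If another process already holds it open exclusively, the
 * open fails with EBUSY; wait briefly and retry a few times.
 * (Retry count and delay are made-up values, not mdadm's.) */
static int open_md_exclusive(const char *devname)
{
        int fd, tries;

        for (tries = 0; tries < 25; tries++) {
                fd = open(devname, O_RDWR | O_EXCL);
                if (fd >= 0)
                        return fd;      /* we now hold the device */
                if (errno != EBUSY)
                        break;          /* a real error, do not retry */
                usleep(200000);         /* 200 ms, then try again */
        }
        perror(devname);
        return -1;
}

int main(int argc, char **argv)
{
        int fd = open_md_exclusive(argc > 1 ? argv[1] : "/dev/md0");

        if (fd < 0)
                return 1;
        /* ... array assembly ioctls would happen here ... */
        close(fd);
        return 0;
}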
Frank, just confirming - the new package works for you?
(In reply to comment #31)
> Frank, just confirming - the new package works for you?

Yes, it's working.
I tried the foo version of mdadm this morning (and removed the udev debugging, which seemed to mask the problem) and had no raid failures in 10 reboots. At the 50% failure rate I was seeing previously, that should have happened by chance only about 1 time in 1000. So I think it is pretty likely you fixed the problem. Thank you, guys!
Fixed in 2.6.4-4.
This package has landed in rawhide, and the problem seems fixed for me. I'm closing the bug to get it off the blocker list, but feel free to reopen it if the problem manifests with mdadm-2.6.4-4 or later.