This AMD system has been working fine with Fedora 20 for months. It consists of two identical drives in a RAID 1 mirror, each with a single partition holding the root filesystem. After yesterday's yum update, the system no longer boots: it waits for a long time at startup and then dies with a dracut error, "dracut initqueue device uuid not found" (showing the UUID of the RAID array), and drops me into the dracut rescue prompt. At this prompt, mdadm reports no RAID devices found, and /proc/mdstat is likewise missing.

However, when I boot from a live F20 USB, it finds the RAID array fine and working: I see both devices in /proc/mdstat and I can mount /dev/md127 just fine. I even ran fsck -f /dev/md127 and it was clean. Likewise, if I pick the third kernel listed in the grub2 menu (3.14.5-200), the system boots right up. So only the newest two grub entries (kernels 3.14.7 and 3.14.8) fail to boot the system. (Yes, I checked what was different about the third entry versus the first two; only the kernel and initrd version numbers differ.)

Note: after some more fiddling with mkinitrd, I managed to lose the third kernel as well. Only the "rescue" kernel listed now boots the machine.
Created attachment 911292: boot error
Comment on attachment 911292: boot error

Note: the "warning: could not boot" line appears about 5 minutes after the first few lines are displayed.
Hi,

There should be nothing new mdadm-related in recent yum updates, so I wonder if this is related to the kernel. Once you hit the dracut shell, could you please run cat /proc/mdstat, and also supply your /etc/mdadm.conf and /etc/fstab? Could you also provide a copy of /proc/mdstat for when the system is booted successfully using the older kernel? We also need the output of 'rpm -q mdadm dracut'.

Thanks, Jes
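For reference, gathering the requested information amounts to something like the sketch below. The device names are placeholders, and the config files only reflect the real system when read from a working boot rather than from inside the initramfs:

# At the dracut emergency shell, after the failed boot:
cat /proc/mdstat                      # shows whether any md arrays were assembled
mdadm --examine /dev/sda1 /dev/sdb1   # superblock info; device names are illustrative

# After booting the older, working kernel (or the live USB with the root mounted):
cat /proc/mdstat
cat /etc/mdadm.conf
cat /etc/fstab
rpm -q mdadm dracut                   # package versions involved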
Right now I can only supply some of them, because while trying everything since yesterday, I inadvertently lost access to my data when I zeroed the superblocks on my RAID drives. However, I can say that /proc/mdstat was empty -- there was no mdstat file at all. /etc/mdadm.conf contained a few lines that were put there when anaconda created the RAID array. /etc/fstab contained the UUID= of the RAID array, something very simple like: UUID=........ / defaults 1 1
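For comparison, a complete fstab entry for a root filesystem on an md array normally has six fields; the line below is only an illustration of the shape of such an entry, with a made-up UUID and an assumed ext4 filesystem, not the reporter's actual line:

# /etc/fstab -- illustrative only; UUID and filesystem type are assumptions
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /  ext4  defaults  1 1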
There is a little more info here: http://forums.fedoraforum.org/showthread.php?p=1702763&posted=1#post1702763
Ouf, sorry to hear that! :( I hope you didn't lose any valuable data.

The only thing in your yum update log that would be relevant for RAID is the kernel package; there is no mention of mdadm, systemd, dracut, etc. in that list. Your message mentioned you were rebuilding the RAID; is that still going on, or did you lose the data?

Thanks, Jes
I had done an mdadm --zero-superblock on both drives, but re-creating the array with the same parameters allowed me to get my data back.
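For anyone in a similar situation: the recovery described above works by re-creating the array over the same members with exactly the original geometry, so the data area lines up again. The commands below are only a sketch; the device names, metadata version, and use of --assume-clean are assumptions (the exact invocation used here isn't given), and getting any parameter wrong when re-creating will destroy the data:

# DANGEROUS -- only works if the new array uses exactly the original parameters
mdadm --zero-superblock /dev/sda1 /dev/sdb1        # what was done by mistake
mdadm --create /dev/md127 --level=1 --raid-devices=2 \
      --metadata=1.2 --assume-clean /dev/sda1 /dev/sdb1
fsck -fn /dev/md127                                # read-only check before trusting it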
I also just managed to get the system booting again. I had to fiddle with mdadm, dracut, and grub2-mkconfig.
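The exact commands used are not given above, but on Fedora 20 that kind of repair typically combines rewriting mdadm.conf, rebuilding the initramfs, and regenerating the grub configuration. The sketch below is only an assumption about what was done; the kernel version string and the BIOS grub.cfg path are illustrative:

# Refresh the array definition so the initramfs picks up the current UUID
mdadm --detail --scan >> /etc/mdadm.conf    # then remove any stale ARRAY lines

# Rebuild the initramfs for a kernel that fails to boot
dracut --force /boot/initramfs-3.14.8-200.fc20.x86_64.img 3.14.8-200.fc20.x86_64

# Regenerate the grub2 configuration (BIOS path shown; EFI uses a different file)
grub2-mkconfig -o /boot/grub2/grub.cfg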
Any chance this could be related to this problem: https://bugzilla.redhat.com/show_bug.cgi?id=1111442 ?
I read it and it does sound similar. But I'm not enough of an expert to say whether they are identical. It felt like the "mdadm" module was not being loaded by the latest couple of kernel updates.
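One way to check that hunch (not something done in this report) would be to inspect the initramfs of a failing kernel from the working kernel or the live USB; lsinitrd ships with dracut. The kernel version below is just an example:

# Does the failing kernel's initramfs contain md/raid support?
lsinitrd /boot/initramfs-3.14.8-200.fc20.x86_64.img | grep -iE 'mdadm|raid1|mdraid'

# Is the raid1 module loaded under the running (working) kernel?
lsmod | grep raid1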
I understand. The reason I suggest it could be the kernel is that there have been no mdadm updates in a long time, and your yum update log didn't show any mdadm updates. It would be interesting to know whether the problem goes away once the new kernel propagates out.

Cheers, Jes
Turgut, Did you try this out again with a recent kernel? Thanks, Jes
Ping! Is this still an issue? Jes
Might be related: bug 1097664
I have had similar problems with Fedora 20. Booting goes fine with all the 3.16 series kernels such as 3.16.7, but none of the 3.17 series kernels work :( The RAID is a mirror on two identical 2 TB disks, and all updates have been installed.

Booting with 3.17 fails in a timeout: systemd times out trying to mount the RAID during boot. cat /proc/mdstat does not show any RAID arrays, and mdadm --assemble --scan gives "unexpected failure opening". With the 3.16 series kernels the RAID setup works just fine.
Here is cat /proc/mdstat with a 3.16 series kernel, when all is working as it should:

Personalities : [raid1]
md2013 : active raid1 sde1[0] sdd1[1]
      1953382208 blocks super 1.2 [2/2] [UU]

unused devices: <none>
Here is also the output of mdadm --detail /dev/md2013. I just masked the hostname from the output :)

/dev/md2013:
        Version : 1.2
  Creation Time : Sun Apr 14 03:55:41 2013
     Raid Level : raid1
     Array Size : 1953382208 (1862.89 GiB 2000.26 GB)
  Used Dev Size : 1953382208 (1862.89 GiB 2000.26 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sun Feb  8 22:55:15 2015
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : MASKEDAWAY.SOME:2013  (local to host MASKEDAWAY.SOME)
           UUID : 10db33a0:78e4d98e:cb278297:b54a8347
         Events : 3143

    Number   Major   Minor   RaidDevice State
       0       8       65        0      active sync   /dev/sde1
       1       8       49        1      active sync   /dev/sdd1
cat /etc/mdadm.conf

ARRAY /dev/md/2013 metadata=1.2 UUID=10db33a0:78e4d98e:cb278297:b54a8347 name=MASKEDAWAY.SOME:2013
MAILADDR root

The /etc/fstab line for this array is the following:

/dev/md2013 /var/pub ext4 defaults 1 2
(In reply to Matti Laitala from comment #18)
> cat /etc/mdadm.conf
>
> ARRAY /dev/md/2013 metadata=1.2 UUID=10db33a0:78e4d98e:cb278297:b54a8347 name=MASKEDAWAY.SOME:2013
> MAILADDR root
>
> /etc/fstab line for this array is following
>
> /dev/md2013 /var/pub ext4 defaults 1 2

Matti,

You tell mdadm to create /dev/md/2013, but at the same time you try to mount /dev/md2013 - that makes no sense.

Jes
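For clarity, a consistent pair of entries would reference the same device node in both places. The lines below are only a sketch built from Matti's values, not a tested configuration:

# /etc/mdadm.conf -- name the array once and reuse that node everywhere
ARRAY /dev/md/2013 metadata=1.2 UUID=10db33a0:78e4d98e:cb278297:b54a8347 name=MASKEDAWAY.SOME:2013
MAILADDR root

# /etc/fstab -- mount the same node named in mdadm.conf
/dev/md/2013  /var/pub  ext4  defaults  1 2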
Jes,

Good point, thanks... If I remember correctly there is a link between those two device files, but I'll fix this anyway and retest with the new kernel. From my understanding, though, this does not solve the problem that the array is not usable for mounting; I'm referring to the fact that cat /proc/mdstat does not show the array.
Jes,

The fstab change had no effect (there is a symbolic link between those two device nodes). The system works with all the 3.16 series kernels, e.g. kernel 3.16.7-200.fc20.x86_64, but with none of the 3.17 or 3.18 kernels. I just tested kernel-3.18.5-101.fc20.x86_64.

The /proc/mdstat output (saved as mdstat.txt) with a 3.18 kernel is the same as with the 3.17 kernels:

Personalities :
unused devices: <none>

This causes the boot to fail. Do you have any ideas, or am I forced to overwrite the superblocks to fix the problem (as others have done to fix the issue)?
Fixing the problem required updating the name and device file of the array. Steps for the fix:

mdadm --stop /dev/md2013
mdadm --assemble /dev/md1 --name=MASKEDAWAY.SOME:1 --update=name /dev/sde1 /dev/sdd1
mdadm --detail --scan > /etc/mdadm.conf

After that I edited /etc/fstab to use /dev/md1. Now the machine works with the 3.18.7-200.fc21.x86_64 kernel.
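One step worth noting when renaming an array like this (it is not mentioned above, so treat it as an assumption about what may also be needed): a hostonly initramfs carries its own copy of mdadm.conf, so after rewriting the file it is usually rebuilt as well. That mainly matters when the array is needed early in boot; for an array only mounted at /var/pub it is more of a precaution:

# Rebuild the initramfs images so the new mdadm.conf and array name are used at boot
dracut --force --regenerate-all

# And point the fstab entry at the new node, e.g.:
# /dev/md1  /var/pub  ext4  defaults  1 2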
This message is a reminder that Fedora 20 is nearing its end of life. Approximately four weeks from now Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.

Package Maintainer: if you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue, and we are sorry that we were not able to fix it before Fedora 20 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release.

If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.