Description of problem:
My system hangs when powering off. The log messages are roughly as follows:
  Sending SIGTERM
  Sending SIGKILL
  Unmounting file systems
  /sys/kernel/config
  /sys/kernel/debug
  /dev/mqueue
  (more mounts follow)
  Disabling swaps
  Disabling loop devices
  Detaching DM devices
  one left
  Cannot finalize, trying to kill
  Detaching DM devices
  one left
  Cannot finalize, giving up
  Successfully changed into root pivot
  Umounted /oldroot/proc
  Umounted /oldroot/dev/pts
  Umounted /oldroot/run
  (more mounts here)
  Umounted /oldroot/sys
And then it hangs.
Version-Release number of selected component (if applicable):
kernel-3.4.2-4.fc17.x86_64
How reproducible:
Always
Additional info:
The system reboots cleanly with kernel-3.3.7-1.fc17.x86_64 and earlier.
The system does not reboot with kernel-3.4.2 or kernel-3.4.0.
My system has Intel BIOS RAID. It previously suffered from bug #752593 (shutdown does not complete with Intel BIOS RAID). Might this be the same bug in disguise?
Exactly the same problem here with an IMSM Raid 10 array.
kernel-3.4.3-1.fc17.x86_64 does not work either
Kernel 3.4.4-3.fc17.x86_64 does not work
Confirming the same issue as Mr. Pascual with kernel 3.4.4-3.fc17.x86_64. This system also uses Intel BIOS RAID.
I can also confirm the issue using Intel BIOS raid 10
Confirming issue with 3.4.6-2.fc17.x86_64 and Intel BIOS raid 1
I have exactly the same problem, using Intel BIOS raid and kernel 3.4.7-1.fc16.i686.PAE. Like Sergio Pascual (comment 0), I have wondered if this might be a disguised return of bug 752593, which I also experienced.
(In reply to comment #0)
> Description of problem:
...
> May this bug be the same under disguise?
This issue happens in F17 with kernel 3.5.2-1.fc17.x86_64. My system has an ICH10R chip on the motherboard and a RAID1 volume configured.
# mdadm -D /dev/md127
/dev/md127:
        Version : imsm
     Raid Level : container
  Total Devices : 2
Working Devices : 2
  Member Arrays : /dev/md/Mirror0
I typically shut down the system with Alt-SysRq-o, since it is thoroughly stuck at that point. A fix would be appreciated.
Raising severity to High, perhaps we can get some attention from kernel maintainers
What mdadm version is installed on the systems where the reboots are failing? When did the problem occur? IMSM RAID has a close tie between kernel _and_ mdadm. Jes
I have mdadm-3.2.5-4.fc17.x86_64. This started happening with kernels >= 3.4, just a few kernel updates after Fedora 17 was released. I don't recall which version of mdadm I had when it worked.
Hmmm, it really shouldn't be this, but did you update via yum or manually? If so, did you remember to run 'dracut -f' afterwards? Is the IMSM array your root device? Any chance you can try and downgrade the kernels to a < 3.4 F17 one and see if it still shows the problem? Cheers, Jes
I update via yum and the IMSM array is my root device. I have installed kernel-3.3.4-5.fc17.x86_64; in the next comment I'll tell you whether it worked.
Ok, after upgrading via yum, please try to run 'dracut -f' before rebooting into the new system. Once the system is booted, could you try running 'ps aux | grep dmon' and 'ps aux | grep mdadm'? Thanks, Jes
Too late for dracut -f, I have rebooted already. Was it important?
The system *does* reboot with kernel-3.3.4-5.fc17.x86_64.
These are the outputs of the commands:
$ ps aux | grep dmon
root 405 0.0 0.0 15004 10904 ? SLsl 15:57 0:00 @dmon --offroot md127
root 647 0.0 0.0 14972 10872 ? SLsl 15:57 0:00 /sbin/mdmon --takeover md127
$ ps aux | grep mdadm
root 1173 0.0 0.0 4908 492 ? Ss 15:57 0:00 /sbin/mdadm --monitor --scan -f --pid-file=/var/run/mdadm/mdadm.pid
Perhaps you need this also:
$ cat /proc/mdstat
Personalities : [raid1]
md126 : active raid1 sda[1] sdb[0]
      976759808 blocks super external:/md127/0 [2/2] [UU]
md127 : inactive sdb[1](S) sda[0](S)
      5288 blocks super external:imsm
unused devices: <none>
Sergio, Thanks, you have the @dmon there, so that means the initramfs is launching mdmon correctly. Given that this changes based on the kernel version, it sounds like we need to look for the problem there or in the init scripts, rather than mdadm. Cheers, Jes
*** Bug 853467 has been marked as a duplicate of this bug. ***
Same problem here. Power-off is okay with a fresh F17 install (kernel 3.3.4). But after updating to the latest version via 'Software Update' (kernel 3.5.4), power-off or restart hangs at 'Umounted /oldroot/sys'. With the updated system, booting with the old 3.3.4 kernel solves the problem. Gigabyte Z68P-DS3 rev1.0 F9 - Intel i5 - RAID 1
This problem seems to be fixed in kernel-3.6.1-1.fc17.x86_64.
Interesting - I've been using the 3.3.4 kernel since the occurrence of this bug. When I try to switch to 3.6.1-1, I have the following error at the 'password' prompt (the RAID1 drives are encrypted): udevd [241] inotify_add_watch /dev/sda1, 10 : failed, no such file or directory This looks like a new bug, preventing me from confirming Aram's comment 19.
In my case, it works partially with 3.6.1-1. I can reboot the system, but I cannot power it off. On power-off the system hangs at the Fedora logo. It doesn't respond even to SysRq.
(In reply to comment #20)
> Interesting - I've been using the 3.3.4 kernel since the occurrence of this
> bug. When I try to switch to 3.6.1-1, I have the following error at the
> 'password' prompt (the RAID1 drives are encrypted):
>
> udevd [241] inotify_add_watch /dev/sda1, 10 : failed, no such file or
> directory
>
> This looks like a new bug, preventing me from confirming Aram's comment 19.
This is a different bug, it sounds like something goes wrong with the boot scripts. Jes
(In reply to comment #21)
> In my case, it works partially with 3.6.1-1. I can reboot the system, but I
> cannot power off the system.
>
> In power off the system hangs with the fedora logo. It doesn't respond even
> to sys req.
Please try to remove 'rhgb quiet' from the kernel boot command line and see where it hangs.
(In reply to comment #22)
> (In reply to comment #20)
> > Interesting - I've been using the 3.3.4 kernel since the occurrence of this
> > bug. When I try to switch to 3.6.1-1, I have the following error at the
> > 'password' prompt (the RAID1 drives are encrypted):
> >
> > udevd [241] inotify_add_watch /dev/sda1, 10 : failed, no such file or
> > directory
> >
> > This looks like a new bug, preventing me from confirming Aram's comment 19.
>
> This is a different bug, it sounds like something goes wrong with the boot
> scripts.
>
> Jes
Thanks Jes - FYI, I've created Bug #866395
I have tested several times and it seems to work correctly. My system now both reboots and powers off. I don't know why it didn't power off on Monday.
Issue no longer occurs under kernel 3.6.6-1.fc17.x86_64.
I'm suffering from this problem again. My system neither shuts down nor reboots with kernel-3.6.8-2.fc17.x86_64. The reboot process hangs after:
  Sending SIGTERM
  Sending SIGKILL
  Unmounting file systems
  /sys/kernel/config
  /sys/kernel/debug
  /dev/mqueue
  /dev/hugepages
When providing updated version numbers, please make sure to include: mdadm dracut kernel Thanks, Jes
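For anyone gathering the three versions requested above, the following is a minimal sketch rather than an official procedure. It assumes an RPM-based system where `rpm -q` is available; the function name `report_versions` and the `PKG_QUERY` override are hypothetical helpers introduced only for illustration.

```shell
# Print the installed versions of the packages relevant to this bug,
# followed by the kernel that is actually running.
# PKG_QUERY is a hypothetical override for dry runs; it defaults to "rpm -q".
report_versions() {
    # Intentionally unquoted so that "rpm -q" splits into command and option.
    ${PKG_QUERY:-rpm -q} mdadm dracut kernel
    # Several kernels can be installed at once, so also show the booted one.
    echo "running kernel: $(uname -r)"
}
```

Calling `report_versions` on an affected machine and pasting the output into a comment should cover what was asked for. The running kernel is printed separately because `rpm -q kernel` lists every installed kernel package, not just the one in use.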
Here it is:
mdadm-3.2.6-1.fc17.x86_64
dracut-018-105.git20120927.fc17.noarch
kernel-3.6.8-2.fc17.x86_64
I have the same problem in F18 Beta.
mdadm-3.2.6-1.fc18.x86_64
dracut-024-10.git20121121.fc18.x86_64
kernel-3.6.7-5.fc18.x86_64
I was unable to reboot after an update with fedup. In the end I did a clean install from the F18 DVD, but the problem remained, so I always have to do a hard reset because neither reboot nor poweroff gets past unmounting devices.
Chema, Any chance you can provide us the output of /proc/mdstat as well as 'dmesg' ? Thanks, Jes
This just resurfaced for me as well, with the latest kernel. Currently running:
mdadm-3.2.6-1.fc16.i686
dracut-018-60.git20120927.fc16.noarch
kernel-3.6.7-4.fc16.i686.PAE
Created attachment 656878 [details] Dmesg-txenoo
Jes, I've just added the dmesg, and here is the output of /proc/mdstat:
Personalities : [raid1]
md126 : active raid1 sdb[1] sdc[0]
      1953511424 blocks super external:/md127/0 [2/2] [UU]
md127 : inactive sdc[1](S) sdb[0](S)
      6056 blocks super external:imsm
unused devices: <none>
I have the following volumes and the swap mounted on the RAID:
/dev/md126p2 on / type ext4 (rw,relatime,data=ordered)
/dev/md126p1 on /boot type ext4 (rw,relatime,data=ordered)
/dev/md126p3 on /home type ext4 (rw,relatime,data=ordered)
Chema
Interesting, so basically the problem seems to be gone in kernels between 3.6.1 and 3.6.6 but back again with kernel 3.6.7+ Harald, any idea what to look for next?
I don't know if it's related, but a few weeks ago my md126 device changed its name to md125. In other aspects, my raid configuration is equivalent to Chema's (Intel BIOS Raid)
Sergio, I don't think it is related. In principle the names should be assigned automatically, unless you have explicitly assigned a name in your /etc/mdadm.conf Cheers, Jes
I have downgraded mdadm to mdadm-3.2.3-6.fc17.x86_64 (just to be sure) and it still hangs, but later. The reboot process continues until the word "reboot" appears on the console and then hangs.
3.2.3-6 should still have the --offroot support, as this was added in 3.2.3-4, so it shouldn't be that causing it. 3.2.3-3 and older are expected to hang, 3.2.3-4 and later shouldn't :(
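The version cutoff described above can be checked mechanically. This is a rough sketch, assuming GNU `sort` with `-V` (version sort), which only approximates RPM's own version comparison; the function names `ver_ge` and `has_offroot` are hypothetical.

```shell
# Succeed if version-release string $1 is greater than or equal to $2.
# Relies on GNU sort -V; the smaller of the two sorts first.
ver_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed if the given mdadm version-release is expected to carry
# --offroot support, which the comment above says arrived in 3.2.3-4.
has_offroot() {
    ver_ge "$1" "3.2.3-4"
}
```

For example, `has_offroot 3.2.3-6` succeeds and `has_offroot 3.2.3-3` fails, matching the expectation stated in the comment. Pass only the version-release part (such as `3.2.3-6`), not the full package name with the dist tag and architecture.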
I am also experiencing this problem again, but now I can't even boot. With kernel 3.6.7+ it freezes on "Unmounting file systems" on boot.
Last week I installed a new F17 x86_64 system with Intel BIOS RAID. I used a Live CD. After the install, the system was able to reboot (with kernel-3.3.4-5). Then I updated the system. After the update, the system could not reboot anymore, even with the 3.3.4 kernel. So the kernel is not (the only?) culprit.
Sergio, If possible, could you try and downgrade dracut to the version from the install as well? Please run 'dracut -f' after upgrading/downgrading it. Cheers, Jes
I have downgraded to dracut-018-35.git20120510.fc17.noarch and run dracut -f afterwards. The system still freezes on "Unmounting file systems". Perhaps it is a systemd bug? How safe is it to downgrade systemd?
Downgrading systemd is reasonably safe.
If you add "rd.break=pre-shutdown" to the kernel command line, dracut should give you a shell before it runs its unmounting loop. In the shell it would be helpful to explore what filesystems are still mounted and what processes are running.
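As a rough illustration of the inspection suggested above, the sketch below lists mount points that are still present under a given prefix. It assumes the old root is mounted at /oldroot at that stage (as the "Umounted /oldroot/..." log lines in this report suggest) and that `awk` is available in the shell you land in, which is not guaranteed inside an initramfs. The function name `mounts_under` is hypothetical.

```shell
# List mount points at or below a prefix (default: /oldroot).
# $2 is a mount table in /proc/mounts format, where the second
# whitespace-separated field is the mount point.
mounts_under() {
    prefix="${1:-/oldroot}"
    table="${2:-/proc/mounts}"
    awk -v p="$prefix" '$2 == p || index($2, p "/") == 1 { print $2 }' "$table"
}
```

If `awk` is missing, plain `cat /proc/mounts` gives the same information unfiltered, and `cat /proc/mdstat` shows whether the array is still active.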
It's weird, but with "rd.break=pre-shutdown" my kernel stops on boot, before printing "Fedora 17...." in blue on the text console. On shutdown nothing different happens (apart from hanging on "Unmounting file systems", of course). Kernel 3.6.9-2.fc17.x86_64
Running tests with Fedora 18 TC3, I am seeing the same problem there after fresh installs. Basically the install goes fine, but on reboot it goes:
  Stopping Software RAID monitor takeover
  Unmounting /
  <hang>
Looks like the latest systemd or dracut is doing something wrong or ignoring the --offroot argument. Jes
If stopping "mdmon" is the culprit, then it should not have been doing the takeover in the real root in the first place, but leave the mdmon from the initramfs running (which should have set "@" as the first char in argv[0]).
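The "@" marker mentioned above is easy to test for. The following is a small sketch, with the hypothetical function name `is_initramfs_mdmon`, that checks whether a process command line (for example the last column of the `ps aux` output posted earlier in this report) begins with "@".

```shell
# Succeed if the command line in $1 starts with "@", the marker that
# mdmon started from the initramfs with --offroot puts in argv[0].
is_initramfs_mdmon() {
    case "$1" in
        @*) return 0 ;;
        *)  return 1 ;;
    esac
}
```

With the two mdmon processes shown earlier, `is_initramfs_mdmon "@dmon --offroot md127"` succeeds, while `is_initramfs_mdmon "/sbin/mdmon --takeover md127"` fails, which identifies the second one as the takeover instance started from the real root.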
I do not know if it's relevant to the current subject, but... On my Fedora 17 x86_64, a reboot does complete, roughly 10 minutes after the "Unmounting file systems." message (some watchdog?), but shutdown hangs indefinitely. It is a clean installation of Fedora 17 x86_64 from DVD, plus "yum update -y".
dracut-018-105.git20120927.fc17.noarch
mdadm-3.2.6-1.fc17.x86_64
kernel-3.6.10-2.fc17.x86_64
Personalities : [raid1]
md126 : active raid1 sda[1] sdb[0]
      488383488 blocks super external:/md127/0 [2/2] [UU]
md127 : inactive sdb[1](S) sda[0](S)
      5928 blocks super external:imsm
unused devices: <none>
Diego
Hmm... after a "yum downgrade mdadm" plus "dracut -f", my system reboots and powers off normally again! Apparently the culprit was mdadm. Current versions:
mdadm-3.2.3-6.fc17.x86_64
dracut-018-105.git20120927.fc17.noarch
kernel-3.6.10-2.fc17.x86_64
"yum downgrade mdadm" does the trick! mdadm-3.2.3-6.fc17.x86_64 dracut-018-105.git20120927.fc17.noarch kernel-3.6.10-2.fc17.x86_64
Confirming the same problem in FC18 with any upgrades up to: dracut-024-16.git20121220.fc18.x86_64 kernel-3.6.11-3.fc18.x86_64 mdadm-3.2.6-7.fc18.x86_64 I think my bug report #879327 can be marked as a duplicate of this.
*** Bug 887562 has been marked as a duplicate of this bug. ***
(In reply to comment #51) > "yum downgrade mdadm" does the trick! reassigning back to mdadm
mdadm-3.2.6-8.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/mdadm-3.2.6-8.fc17
Ok I think I finally understood the problem here. Basically when we rolled out the --offroot support there was a bug in the mdadm upstream code which meant that mdmon processes launched with --offroot would not be taken over in case of a follow-on 'mdmon --takeover' launch from the mdmonitor-takeover.service. This was fixed between 3.2.5 and 3.2.6 upstream, which is why the problem is showing up now. Basically mdmonitor-takeover.service is now obsolete since we roll back to the initrd during shutdown, and we rely on the mdmon launched from there to handle the metadata writeout before rebooting. I have pushed mdadm-3.2.6-8 into updates-testing which should fix this problem for Fedora 17+ Please give it a spin and report back. Thanks, Jes
Hi Jes, I have updated but it didn't work. The system now does not boot completely. It hangs right after "started initialized storage subsystems (RAID, LVM, etc.)" and "started monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling". Help! How can I boot the system now? I must add that I also installed the latest release version of dracut (yesterday's release). After upgrading mdadm and dracut I ran dracut -f. I suspect that mdmon is not running. Thanks, Dennis
Sorry, disregard the above comment. It was meant for bug #879327
Package mdadm-3.2.6-8.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing mdadm-3.2.6-8.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-0275/mdadm-3.2.6-8.fc17
then log in and leave karma (feedback).
Any chance this fix could make it into Fedora 16?
mdadm-3.2.6-8.fc17 installed. I can now reboot successfully.
(In reply to comment #60)
> Any chance this fix could make it into Fedora 16?
Thomas, I have a build for F16 which also fixes the dangling symlink to mdmonitor-takeover.service which reappeared in 3.2.6-8. I don't have a Fedora 16 system ready for testing, so if you want to test this build and report back, that would be useful. Just be sure to install it, run dracut -f, and try (the first reboot after the dracut run will still hang).
http://alt.fedoraproject.org/pub/alt/stage/18-RC1/Fedora/x86_64/
Note this is at your own risk, but I hope it works. Jes
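To check for a leftover dangling symlink such as the mdmonitor-takeover.service one mentioned above, something along these lines may help. It is only a sketch: the default directory /etc/systemd/system is an assumption about where the link would live, and the function name `dangling_links` is hypothetical.

```shell
# Print symlinks under a directory whose targets no longer exist.
# $1 is the directory to scan; /etc/systemd/system is an assumed default.
dangling_links() {
    # "test -e" follows the link, so it fails only for dangling ones.
    find "${1:-/etc/systemd/system}" -type l ! -exec test -e {} \; -print 2>/dev/null
}
```

Any path printed points at a unit file that is no longer installed. Whether it is safe to delete a given link depends on the package that created it, so treat the output as a list to review rather than to remove blindly.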
We are near the F18 release and I wonder if this problem will appear in the F18 install media.
Sergio, I think we're ok for the installation itself, but post installation mdadm will need to be updated to 3.2.6-11 at least. Note there is still a problem if a user has two BIOS raid arrays, see BZ#879327 Jes
Thanks, Jes! I would be happy to test with Fedora 16. However, I can't find anything labeled for Fedora 16 in that link. Can you point me in the right direction? I am running Fedora 16 32-bit.
(In reply to comment #64)
> Sergio,
>
> I think we're ok for the installation itself, but post installation mdadm
> will need to be updated to 3.2.6-11 at least.
>
If I understand correctly, this means that a host installing F18 will experience a hang when rebooting after installing the OS and bootloader?
> Note there is still a problem if a user has two BIOS raid arrays, see
> BZ#879327
>
> Jes
Sergio, Yes indeed it will - F18 installs will need the fixes from here to be able to reboot correctly: https://admin.fedoraproject.org/updates/dracut-024-23.git20130118.fc18,mdadm-3.2.6-12.fc18 Jes
Any news on when this is going to land in F17?
Any update on these fixes in F17? It's been 'ON_QA' for a while now. Unforeseen troubles? Thanks.
It has been ok for me. F17x64.
mdadm-3.2.6-8.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.
oldroot now unmounts okay, but then dracut says 'waiting for mdraid devices to be clean' and hangs. I suspect this is because my RAID array is in 'verify' mode, since I had to manually reset and power off the PC many times.
Everything is now fine when the RAID array is in a normal state. Thanks.