Description of problem:
I run my rawhide system from a RAID-1 md device. Since I think the standard GRUB2 menus look nicer than the "grubby" ones, I always run grub2-mkconfig after a kernel update when my "yum update" finishes. That script calls os-prober, and the os-prober script hangs at the 'newns "$@"' command, which never returns. (/usr/libexec/newns is part of the os-prober rpm package.)

Version-Release number of selected component (if applicable):
The os-prober rpm is version 1.58=1.0.fc22.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Open a su terminal
2. Enter the os-prober command
3. See a few lines, and then wait "forever" (well, I waited an hour, but all the os-prober processes were in the "S+" state.)

Actual results:
Hung process

Expected results:
List of all installed systems

Additional info:
Here's some terminal output:

Running os-prober:
[root ~]# os-prober
/dev/sda1:Windows 7 (loader):Windows:chain
/dev/sda3:Windows Recovery Environment (loader):Windows1:chain
/dev/sdb1:Fedora release 20 (Heisenbug):Fedora:linux
=========================

(Note that the RAID 1 array components are reported in the "raided-map" file in /tmp, as shown below, in addition to the actual mount (/dev/md127) shown in the mounted-map file.)

Speculation: Could the newns program be trying to re-mount a mounted RAID component instead of using the already mounted array? In fact, why would os-prober need to know anything about a RAID device except its logical name? Perhaps it's looking for bootable OSes on unmounted RAID devices, and it hangs trying to mount the RAID array containing the working system at, e.g., /var/lib/os-prober/mount to look for the rawhide OS. (My system is all in a single ext4 file system.)
From another terminal:
[root ~]# ps -aux | egrep '(19648)|(19725)|(19726)'
root 19648 0.0 0.0 113824 3256 pts/3 S+ 20:52 0:00 /bin/sh /bin/os-prober
root 19725 0.0 0.0 113824 1540 pts/3 S+ 20:52 0:00 /bin/sh /bin/os-prober
root 19726 0.0 0.0 113824 2312 pts/3 S+ 20:52 0:00 /bin/sh /bin/os-prober

Here are the /tmp files that were generated before the hang. (They all seem correct.)
[root ~]# ls /tmp/os-prober.MxQGsY/
btrfs-vols mounted-map raided-map swaps-map
==========================
[root ~]# cd /tmp/os-prober.MxQGsY/
==========================
[root os-prober.MxQGsY]# cat btrfs-vols
da1e3dad-5460-4a45-95b6-a1c3d43f760d
=========================
[root os-prober.MxQGsY]# cat mounted-map
/dev/md127 / ext4 /dev/md127
/dev/sdb1 /Fedora ext4 /dev/sdb1
/dev/sdc1 /Backups btrfs /dev/sdc1
/dev/sda2 /Win7 fuseblk /dev/sda2
/dev/sda1 /Win7/System fuseblk /dev/sda1
/dev/sda3 /Win7/HP_Recover fuseblk /dev/sda3
========================
[root os-prober.MxQGsY]# cat raided-map
/dev/sda5
/dev/sdb2
========================
[root os-prober.MxQGsY]# cat swaps-map
/dev/sdc2 swap
=======================
[root ~]# kill -hup 19648 19725 19726
(The os-prober script traps the hangup and removes the /tmp files.)
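To check the speculation about a RAID component being mounted twice, the map files pasted above can be cross-checked mechanically. The following is only a minimal sketch: the function names are made up for illustration, and the field layout (device, mount point, file system type, device in mounted-map; whitespace-separated devices in raided-map) is inferred from the paste in this report, not taken from the os-prober source.

```python
def parse_mounted_map(text):
    """Return {device: mount point}, assuming 'device mountpoint fstype ...' per line."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts[fields[0]] = fields[1]
    return mounts


def parse_raided_map(text):
    """Return the RAID member devices, assuming they are whitespace-separated."""
    return text.split()


def mounted_raid_members(mounted_text, raided_text):
    """List RAID members that also appear as directly mounted devices."""
    mounts = parse_mounted_map(mounted_text)
    return [dev for dev in parse_raided_map(raided_text) if dev in mounts]


# Data copied from the report.
MOUNTED_MAP = """\
/dev/md127 / ext4 /dev/md127
/dev/sdb1 /Fedora ext4 /dev/sdb1
/dev/sdc1 /Backups btrfs /dev/sdc1
/dev/sda2 /Win7 fuseblk /dev/sda2
/dev/sda1 /Win7/System fuseblk /dev/sda1
/dev/sda3 /Win7/HP_Recover fuseblk /dev/sda3
"""
RAIDED_MAP = "/dev/sda5\n/dev/sdb2\n"

print(mounted_raid_members(MOUNTED_MAP, RAIDED_MAP))  # prints []
```

With the data from this report the list is empty: neither /dev/sda5 nor /dev/sdb2 appears in mounted-map, which agrees with the observation that the files seem correct.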
After additional thought (and a night's sleep), I realized that I should try the obvious: I rebooted using 3.18.0-0.rc0.git6.1.fc22.x86_64 #1 SMP instead of the newer 3.18.0-0.rc0.git9.4.fc22.x86_64 #1 SMP version, and os-prober worked with no problem. So, there is an incompatibility between os-prober and the git9.1 (and git9.4) versions of the 3.18 kernel. (By the way, google-chrome also fails using the git9 kernels.) Anyhow, I think this bug should either be closed or moved to a kernel bug. (I hadn't reported the google-chrome bug because I'm using a hacked version.)
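For anyone comparing kernel builds, waiting an hour per test is not necessary. Below is a minimal sketch of a timeout-based check that could be run as root after booting each kernel. The helper name finishes_within and the 60-second limit in the usage comment are arbitrary choices for illustration; nothing here is part of os-prober.

```python
import os
import signal
import subprocess


def finishes_within(cmd, timeout_s):
    """Run cmd and return True if it exits within timeout_s seconds.

    The command is started in its own session so that, on timeout, the
    whole process group (including any child shells) can be killed.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        try:
            # With start_new_session=True the child's pid is also its
            # process group id, so this reaches its children as well.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited just after the timeout fired
        proc.wait()
        return False


# Example (as root): finishes_within(["os-prober"], 60)
```

Note that SIGKILL does not give os-prober a chance to run its cleanup trap, so the /tmp/os-prober.* directory may be left behind after a timeout.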
Thanks for the report. However, newns does almost nothing! It just execs the command given to it after calling "unshare(CLONE_NEWNS)". So, maybe its semantics have changed somehow, or it might be a kernel bug or.... Let's see if this problem persists in future kernel snapshots.
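To make the "almost nothing" concrete, here is a rough Python sketch of the same two steps: unshare the mount namespace, then exec the given command. It is an illustration only and not the source of the newns helper shipped with os-prober. The function names and the injectable unshare/execvp parameters are assumptions added so the logic can be exercised without root.

```python
import ctypes
import os

CLONE_NEWNS = 0x00020000  # flag value from <linux/sched.h>


def unshare_mount_namespace():
    """Call unshare(CLONE_NEWNS) via libc; raise OSError on failure (e.g. EPERM)."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWNS) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def newns(argv, unshare=unshare_mount_namespace, execvp=os.execvp):
    """Detach from the parent's mount namespace, then replace this process with argv."""
    if not argv:
        raise ValueError("usage: newns COMMAND [ARGS...]")
    unshare()
    # With the real os.execvp this call does not return on success.
    execvp(argv[0], argv)


# Example (as root, Linux only): newns(["sh", "-c", "cat /proc/self/mounts"])
```

Since the wrapper does so little, a hang at this point suggests the problem lies in how the kernel handles the new mount namespace or the commands run inside it, rather than in the wrapper itself.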
Well, 3.18.0 rc1 was just posted, and the problem is still there. Can you (or I) bring this report to the kernel people's attention? (I don't know how to do that, but I would if I could.) Some change made between kernel 3.18.0-rc0.git6.1 and 3.18.0-rc0.git9.1 has affected os-prober (and google-chrome) and perhaps other applications. As an interim measure, I think I'll post a bug against the kernel referencing this thread.
Would you please try again with the 3.18.1-1.fc22.x86_64 kernel? It seems to be fixed there.
O.K., no problem with 3.18.1-2.fc22.x86_64. In fact, I'd forgotten this since it disappeared after 3.18.0 rc2 (if I recollect correctly), and I thought this had been closed then.
Thanks, closing then.