Bug 1155297

Summary: When the os-prober script is run (on my system) the newns program in /usr/libexec enters an infinite loop.
Product: [Fedora] Fedora
Reporter: Peter Trenholme <PTrenholme>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab, PTrenholme
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2014-10-29 20:23:17 UTC
Type: Bug
Attachments:
strace output (Flags: none)

Description Peter Trenholme 2014-10-21 20:51:45 UTC
Description of problem:
See bug 1154518 for more details.

This problem prevents 'grub2-mkconfig' from finishing. It appeared between 18.0.0.rc0.git6.1 and 18.0.0.rc0.git9.1, and is still present in 18.0.0.rc1.git0.

Version-Release number of selected component (if applicable):
18.0.0.rc0.git9.1 through 18.0.0.rc1.git0

How reproducible:
Every time

Steps to Reproduce:
1. Run the os-prober script

Actual results:
Never finishes (Needs a 'pkill -hup os-prober' to finish with cleanup.)
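
A reproduction that cannot tie up the terminal may be useful here. The following is a minimal sketch, assuming the coreutils 'timeout' utility is installed; the function name run_with_hup_timeout and the 60-second limit are illustrative and not part of the original report. If the command is still running when the limit expires, it is sent SIGHUP, the same signal that 'pkill -hup' delivers.

```shell
# Run a command, sending it SIGHUP if it is still running after LIMIT seconds.
# 'timeout' exits with status 124 when the limit was reached.
run_with_hup_timeout() {
    limit="$1"
    shift
    timeout -s HUP "$limit" "$@"
}

# Example (hypothetical 60-second limit):
#   run_with_hup_timeout 60 os-prober
#   [ $? -eq 124 ] && echo "os-prober hit the limit and was sent SIGHUP"
```

An exit status of 124 would indicate that os-prober was still running at the limit; any other status comes from the command itself.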

Expected results:
Listing of bootable operating systems on the computer.

Additional info:
This problem (or something else) also keeps google-chrome from starting, if you have installed it.

Comment 1 Josh Boyer 2014-10-22 13:20:40 UTC
I can't recreate this on my rawhide system with 3.18.0-0.rc1.git1.1.fc22.  When I run os-prober, it exits silently and does not hang.

Would it be possible for you to strace the run of os-prober to see where it's hanging?  Also, it might be beneficial to use sysrq-t to get a backtrace of all current processes on the machine.  Please attach the output for those as a plain text file.
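
For readers unfamiliar with it, sysrq-t is a kernel "magic SysRq" action rather than a shell command. On a physical keyboard it is Alt+SysRq+t, and the same action can be requested by writing 't' to /proc/sysrq-trigger as root. A minimal sketch, in which the function name sysrq_dump_tasks and its optional path argument are illustrative:

```shell
# Ask the kernel to log the state and stack of every task (SysRq 't').
# The trigger path defaults to the standard procfs file; writing to it
# normally requires root. The optional argument exists only so the
# function can be exercised against an ordinary file.
sysrq_dump_tasks() {
    trigger="${1:-/proc/sysrq-trigger}"
    if [ ! -w "$trigger" ]; then
        echo "cannot write to $trigger (run as root?)" >&2
        return 1
    fi
    printf 't\n' > "$trigger"
}
```

The task list goes to the kernel log, so it could be captured with something like 'sysrq_dump_tasks && dmesg > sysrq-t.txt' while os-prober is hung.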

Comment 2 Peter Trenholme 2014-10-22 23:08:22 UTC
Created attachment 949600 [details]
strace output

Attached is a compressed tar of the output of a 'strace -Dt -o os_prober -ff os-prober' command. (It's in compressed, directory format because that command created 499 files.)
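
With that many per-process files, one possible way to narrow the search is to rank them by size, on the assumption (not verified in this report) that a process looping through system calls produces the longest trace. A minimal sketch; the function name rank_traces and the default of ten results are illustrative:

```shell
# List the per-process files written by 'strace -ff -o PREFIX' by line
# count, largest first, together with the last traced line of each.
# Usage: rank_traces PREFIX [COUNT]
rank_traces() {
    prefix="$1"
    for f in "$prefix".*; do
        [ -f "$f" ] || continue
        printf '%s\t%s\t%s\n' \
            "$(wc -l < "$f" | tr -d ' ')" "$f" "$(tail -n 1 "$f")"
    done | sort -rn | head -n "${2:-10}"
}

# Example, run in the directory holding the trace files:
#   rank_traces os_prober
```

The last line of each file shows the final system call traced for that process, which may be incomplete if the process was blocked or killed at that point.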

I could find no command 'sysrq-t' (or 'sysr*') for listing running processes. In lieu thereof, I have included in the archive, files 'os-prober.{before,during,after}' of the output of a 'pstree -ap'.

Looking at that output suggests (to me, at least) that the os-prober program might be hanging trying to find an os on a btrfs drive (formatted by btrfs, not the partition manager) with which I've been playing. (There is no OS on it.)

I tried unmounting that drive, but os-prober "helpfully" mounts any unmounted drive attached to the system.

On the other hand, I can't believe that a btrfs problem would prevent google-chrome from executing. (With no error messages.)

P.S.: This is with the 18.0 rc1.2 kernel.

Comment 3 Josh Boyer 2014-10-28 18:26:52 UTC
Please try the 3.18.0-0.rc2.git1.1 kernel that is currently building.  There was an upstream RCU issue that was causing hangs for a number of people.

Comment 4 Peter Trenholme 2014-10-29 00:21:41 UTC
I just ran a 'yum update' on my "rawhide" system that installed 3.18.0-0.rc2.git0.1, which still exhibits the problem.

If your reference to git1.1 (instead of git0.1) is correct, I'll have to wait 'till that kernel hits the 'rawhide' repo.

On the other hand, if your reference to git1.1 was a typo, then there's no joy with git0.1. . .

Comment 5 Josh Boyer 2014-10-29 11:35:46 UTC
(In reply to Peter Trenholme from comment #4)
> I just ran a 'yum update' on my "rawhide" system that installed
> 3.18.0-0.rc2.git0.1, which still exhibits the problem.
> 
> If your reference to git1.1 (instead of git0.1) is correct, I'll have to wait
> 'till that kernel hits the 'rawhide' repo.
> 
> On the other hand, if your reference to git1.1 was a typo, then there's no
> joy with git0.1. . .

3.18.0-0.rc2.git1.1 was not a typo.  Please let me know when you test that.

Comment 6 Peter Trenholme 2014-10-29 20:23:17 UTC
O.K.: Today's update installed rc2.git1.1, and both os-prober and google-chrome now work.

(There are other, new, problems: reboot and shutdown both hang, and GUI login does not display. Those are, however, probably just "rawhide" stuff, not related to this, and - hopefully - common enough to not need a bug report.)

Anyhow, I marked this "closed, upstream".

Thanks.