Bug 818378 - grub2 can't install because /proc/device-tree exists
Summary: grub2 can't install because /proc/device-tree exists
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Duplicates: 819151 819971
Depends On:
Blocks: F17Blocker, F17FinalBlocker
Reported: 2012-05-02 22:10 UTC by Samuel Sieb
Modified: 2013-01-10 08:29 UTC

Fixed In Version: kernel-3.3.4-5.fc17
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-05-12 16:20:34 UTC
Type: Bug
Embargoed:


Attachments
Dmesg after grub2 installation failure (65.48 KB, text/plain)
2012-05-06 08:00 UTC, A.J. Werkman

Description Samuel Sieb 2012-05-02 22:10:01 UTC
When installing F17-beta on a server, grub failed to install because the source_dir didn't exist.  Further investigation showed that source_dir was being set wrong because /proc/device-tree existed, so grub2-install chooses a target of i386-ieee1275 instead of i386-pc.  I manually installed grub by setting the correct target on the command line.  After rebooting into the installed system, /proc/device-tree does not exist.  So this may actually be a bug with anaconda, but I'll start here.
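The target guess described above can be sketched roughly as follows. This is a simplified illustration and not the actual grub2-install source: the function name guess_i386_target and its ROOT argument are hypothetical (ROOT exists only so the check can be run against a scratch directory), and the real script considers other firmware types as well.

```shell
#!/bin/sh
# Simplified sketch of the i386 target guess described in this report.
# guess_i386_target and its ROOT prefix are hypothetical names for illustration.
guess_i386_target() {
    root="${1:-}"
    if [ -e "$root/proc/device-tree" ]; then
        # Mere existence is treated as "Open Firmware", so even an
        # empty directory selects the ieee1275 target.
        echo "i386-ieee1275"
    else
        echo "i386-pc"
    fi
}
```

Called with no argument it inspects the running system, so on an affected installer boot it would print i386-ieee1275 even though the machine is an ordinary BIOS PC.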

Comment 1 Mads Kiilerich 2012-05-02 22:20:18 UTC
What kind of hardware/vm is that?

Please attach dmesg output.

Comment 2 Samuel Sieb 2012-05-02 22:26:40 UTC
This is the same system from Bug 809111.  It's an HP ProLiant DL560.  Dmesg from the running system probably wouldn't show anything as that directory is not there.  I'll reboot it into the installer again tomorrow and see what I can find.

Comment 3 Mads Kiilerich 2012-05-02 22:39:42 UTC
So /proc/device-tree only exists when booting from the installer media, not when booting the (partly manually) installed system?

dmesg from these two situations would probably be interesting ... preferably with the same kernel version.

There is nothing "open firmware"-ish about this system ... except that the installer system thinks there is?

Grub2 did the right thing ... considering that it was told that it was an open firmware system.

Comment 4 Samuel Sieb 2012-05-03 20:47:13 UTC
The installer has kernel 3.3.0 and the installed system has 3.3.4.  I removed all the modules I could from the installer booted system and the directory didn't go away, so maybe it's just a bug in that kernel version.  I will wait for the next installer release and try again.

Comment 5 Mads Kiilerich 2012-05-05 23:25:59 UTC
Please attach 'dmesg' output from an installer system. Please also attach something like 'tar czf device-tree.tgz /proc/device-tree'.
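One way to gather the requested data in a single step is sketched below. The function name collect_diag, the output file names, and the optional second argument (an alternative path to archive, useful for a dry run) are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch: collect kernel version, dmesg, and a tarball of the device-tree
# directory into OUTDIR. collect_diag and the file names are hypothetical.
collect_diag() {
    out="$1"
    tree="${2:-/proc/device-tree}"
    mkdir -p "$out" || return 1
    uname -r > "$out/uname.txt"
    # dmesg may require root on some systems; keep going if it fails.
    dmesg > "$out/dmesg.txt" 2>&1 || true
    if [ -d "$tree" ]; then
        # -C avoids storing an absolute path in the archive.
        tar czf "$out/device-tree.tgz" -C "$(dirname "$tree")" "$(basename "$tree")"
    else
        echo "$tree does not exist" > "$out/device-tree.txt"
    fi
}
```

For example, running collect_diag /tmp/bug818378 once from the installer environment and once from the installed system would give two sets of files to compare.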

(In reply to comment #3)
> Grub2 did the right thing ... considering that it was told that it was an open
> firmware system.

Grub2 thought it did the right thing ... assuming that the existence of /proc/device-tree means that it is an open firmware system ... and that might be a wrong assumption.

Comment 6 Mads Kiilerich 2012-05-05 23:49:57 UTC
*** Bug 819151 has been marked as a duplicate of this bug. ***

Comment 7 A.J. Werkman 2012-05-06 08:00:12 UTC
Created attachment 582377
Dmesg after grub2 installation failure

Attached is my dmesg output from the installer system.

I have a /proc/device-tree directory, but it is empty.

Comment 8 Mads Kiilerich 2012-05-06 11:10:20 UTC
Weird. There should never be a /proc/device-tree on bios i686.

Comment 9 Sven Lankes 2012-05-06 11:46:28 UTC
I hit this issue too yesterday on a 32bit kvm instance that I upgraded to fc17.

Manually specifying the target can be used as a workaround for now:

grub2-install --target=i386-pc /dev/vda

This is kernel 3.3.4-4.fc17.i686 and it does also have an empty /proc/device-tree directory.

Comment 10 Mads Kiilerich 2012-05-06 11:53:25 UTC
Sven, do you _always_ have an empty /proc/device-tree - not only when booting the installer?

If so: can you try with an older kernel version and with a PAE kernel?

Comment 11 Sven Lankes 2012-05-06 12:34:09 UTC
So - the PAE-Kernel does _not_ have a /proc/device-tree dir.

kernel-3.3.1-3.fc17.i686 was the "oldest" one I managed to test and that one _does_ have the /proc/device-tree directory.

Comment 12 Andre Robatino 2012-05-06 12:37:44 UTC
Verified that specifying "--target=i386-pc" with grub2-install fixes up all my broken TC3 i386 guests with both KVM (using /dev/vda) and VirtualBox (using /dev/sda).

Comment 13 Hongqing Yang 2012-05-07 05:05:55 UTC
I hit this with the F17 Final TC3 i386 image; the F17 Final TC3 x86_64 image works well.

Comment 14 Mads Kiilerich 2012-05-07 10:03:53 UTC
My best guess is that the non-PAE kernel is built to also run on hardware such as OLPC, which might be more like OF or ARM than ordinary x86 BIOS.

It seems like this requires a fix or explanation from the kernel team - reassigning.

CC'ing mjg and jwb who showed some interest in this on IRC.

Comment 15 Josh Boyer 2012-05-07 13:33:10 UTC
(In reply to comment #14)
> My best guess is that the non-PAE kernel is compiled to work on for example
> OLPC which might be more like OF or arm than ordinary x86 bios.

OLPC support has been configured in, yes.  It has been like that for quite some time, going back well past F15.  The F15/F16 kernels should show similar things.

> It seems like this requires a fix or explanation from the kernel team -
> reassigning.

Erm... maybe.  Not sure what the "fix" would really be other than disabling OLPC support.  Grub2 would probably do better to test that the directory both exists and is non-empty.
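A stricter test along those lines might look like the sketch below. It is only an illustration of the suggestion, not code from grub2; the function name device_tree_populated is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed only if the directory exists AND contains at least one entry.
# device_tree_populated is a hypothetical name, not part of grub2.
device_tree_populated() {
    dir="${1:-/proc/device-tree}"
    [ -d "$dir" ] || return 1
    # ls -A lists every entry except . and .., so empty output means an empty directory.
    [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

With a check like this, the empty /proc/device-tree seen on the affected i686 kernels would no longer count as evidence of Open Firmware.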

I'm curious if people have this issue with non-PAE F16 installs too, as there really shouldn't be differences in how the kernel behaves in this regard.

Comment 16 Andre Robatino 2012-05-07 13:46:16 UTC
(In reply to comment #15)

> I'm curious if people have this issue with non-PAE F16 installs too, as there
> really shouldn't be differences in how the kernel behaves in this regard.

I did a F16 live install on a non-PAE i686 machine with no such problem. Its F14 smolt URL is

http://www.smolts.org/client/show/pub_d157199d-ccaa-4731-aca2-5bc53d75c516

(I'm unable to submit its F16 smolt profile due to bug 727518. Since it only has 512 MiB RAM, I had to use the workaround in bug 708966.)

Comment 17 Mads Kiilerich 2012-05-07 14:05:05 UTC
Josh,

There has been strange f16 reports where grub2 running from the installer behaved differently than when running from a regular os (for example bug 750794). But I guess the main reason we see it now in f17 is that grub2 has evolved.

If you can say with confidence that the logic in http://bzr.savannah.gnu.org/lh/grub/trunk/grub/annotate/4300/util/grub-install.in#L311 is wrong and that /proc/device-tree regularly appears on BIOS x86 systems, then it should be fixed in grub2 and upstream.

Comment 18 Dave Jones 2012-05-07 15:24:59 UTC
I think we can probably fix this by just making the kernel remove the empty proc dir when of_find_node_by_path fails.

Something like this maybe..


diff --git a/fs/proc/proc_devtree.c b/fs/proc/proc_devtree.c
index 927cbd1..c6a612b 100644
--- a/fs/proc/proc_devtree.c
+++ b/fs/proc/proc_devtree.c
@@ -233,6 +233,7 @@ void __init proc_device_tree_init(void)
                return;
        root = of_find_node_by_path("/");
        if (root == NULL) {
+               remove_proc_entry(proc_device_tree, NULL);
                pr_debug("/proc/device-tree: can't find root\n");
                return;
        }

Comment 19 Dave Jones 2012-05-07 15:28:14 UTC
remove_proc_entry("device-tree", NULL);  

probably has more chance of compiling. derp.

Comment 20 Mads Kiilerich 2012-05-07 15:45:16 UTC
/me stating the obvious: If you say that upstream Linux in some configurations does create the empty device-tree directory, then upstream grub will have to work around it anyway, and it might not be worth "fixing" it in the kernel.

Comment 21 Adam Williamson 2012-05-07 17:06:44 UTC
pjones concurs that the kernel is the place to fix this, and agrees with Dave's suggested fix in comment #18 and comment #19.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 22 Fedora Update System 2012-05-08 00:54:48 UTC
kernel-3.3.4-5.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kernel-3.3.4-5.fc17

Comment 23 Mads Kiilerich 2012-05-08 19:10:31 UTC
*** Bug 819971 has been marked as a duplicate of this bug. ***

Comment 24 Tim Flink 2012-05-08 20:34:46 UTC
If anyone wants to poke at this before TC4 is spun up, I built a test image with kernel-3.3.5-2.fc17

 - http://tflink.fedorapeople.org/iso/20120508_preTC4.i686.boot.iso
 - http://tflink.fedorapeople.org/iso/20120508_preTC4.i686.boot.iso.sha256

Other than the updated kernel package, it should be identical to TC3. The fix seems to work so far in my very limited smoke testing, though.

Comment 25 Andre Robatino 2012-05-09 00:10:22 UTC
(In reply to comment #24)

> Other than the updated kernel package, it should be identical to TC3. The fix
> seems to work so far in my very limited smoke testing, though.

Minimal install works fine for me (including the bootloader).

Comment 26 Adam Williamson 2012-05-09 05:27:13 UTC
Discussed at 2012-05-08 QA meeting, acting as a blocker review meeting. Accepted as a blocker per criterion "In most cases (see Blocker_Bug_FAQ), a system installed according to any of the above criteria (or the appropriate Beta or Final criteria, when applying this criterion to those releases) must boot to the 'firstboot' utility on the first boot after installation, without unintended user intervention, unless the user explicitly chooses to boot in non-graphical mode.", as it breaks 32-bit install.




Comment 27 Fedora Update System 2012-05-09 16:11:30 UTC
Package kernel-3.3.4-5.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.3.4-5.fc17'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-7531/kernel-3.3.4-5.fc17
then log in and leave karma (feedback).

Comment 28 Andre Robatino 2012-05-10 15:31:00 UTC
Confirmed fixed in TC4 - minimal and default non-networked i386 DVD installs work normally, including the bootloader.

Comment 29 Adam Williamson 2012-05-11 06:03:27 UTC


Comment 30 Fedora Update System 2012-05-12 16:20:34 UTC
kernel-3.3.4-5.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 31 Garrett Mitchener 2012-05-31 01:24:18 UTC
I saw this today upgrading an EEEPC from Fedora 16 to Fedora 17 using preupgrade.  No PAE.  It (partially) booted using an old kernel-3.3.6-f16 loaded with what I guess was the old grub2.  I eventually had to run

grub2-install --target=i386-pc /dev/sda

(When I rebooted, there was a kernel panic, but it only happens that once.)  The upgrade process installed kernel 3.3.7 but grubby failed with an error message about "unable to find a suitable template", possibly without trying kernel-3.3.4.  I don't know where things are going wrong, but I think closing this bug was a bit optimistic.

And for some reason, grub also didn't get installed properly on my desktop machine, which is PAE, but I never got an error message about source_dir.

I just feel obligated to say this: I really like linux, and I think fedora is a great distribution, but every time I upgrade, and this is going back many versions, no matter what method, and on every machine, something ALWAYS goes wrong with the boot loader.  No error messages or warning during the upgrade-- stuff just fails when it reboots.  The fix is always something very simple but also technical, requiring more expertise with bugzilla and the command line than I'd expect of most users.  If I were a newbie trying to upgrade fedora at home with just one computer, I'd have just given up on linux long ago.  I don't know if it's preupgrade or grub or anaconda or what, but I'm seeing a serious chronic quality control problem here.  The boot loader is the front door for linux.  It needs to work!

Comment 32 Adam Williamson 2012-05-31 01:45:24 UTC
Your issue is clearly completely different from the initial report here. It doesn't match the description, it doesn't match the circumstances, and it doesn't match the error message. The only thing in common is that it happens to involve the bootloader. The initial bug report is a failure to install a bootloader at all, with one error message, on fresh installation specifically with the i686 edition of Fedora; your bug (so far as I can make out) is failure to add a new kernel to the grub configuration file, using preupgrade, with a different error message. Please file a new bug. Thanks.

As far as bootloader QA goes: I have tried assembling the grub source code and the people responsible for the design of PC system initialization (hint: it's a deeply shit design) and yelling at them 'IT NEEDS TO WORK!', but the results were indifferent. If you would like a two hour lecture on why it is fundamentally impossible to make PC initialization 'just work' I will be happy to call you at any number you care to provide and yell down the phone for a while...

