Bug 826833

Summary: unable to install bootloader on SAS storage
Product: Fedora
Component: grub2
Version: 17
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Isao Shimizu <isaoshimizu>
Assignee: Peter Jones <pjones>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: bcl, dennis, derrien, elliott.forney, hawk, iglesias, mads, nicolas.vieville, pjones, watanabe.yu
Type: Bug
Doc Type: Bug Fix
Last Closed: 2013-08-01 08:53:33 UTC
Attachments:
  Patched Fedora 17 grub source package (flags: none)

Description Isao Shimizu 2012-05-31 05:51:30 UTC
Description of problem:

Unable to install the bootloader on SAS storage.
Anaconda stopped after installing the rpm packages from the repository and showed this error message:

"There was an error installing the bootloader. The system may not be bootable."

/tmp/program.log
19:14:37,207 INFO program: Running... grub2-install --no-floppy /dev/sda
19:14:39,134 ERR program: Installtion finished. No error reported.
19:14:39,295 INFO program: Running... grub2-set-default Fedora Linux, with Linux 3.3.4-5.fc17.x86_64
19:14:39,385 INFO program: Running... grub2-mkconfig -o /boot/grub2/grub.cfg
19:14:39,552 ERR program: Generating grub.cfg ...
(The log stops here.)

Version-Release number of selected component (if applicable):

Fedora 17 Final
grub 2.00 beta4

How reproducible:

always

Steps to Reproduce:
1.
2.
3.
  
Actual results:

grub2-mkconfig fails to generate /boot/grub2/grub.cfg.

Expected results:

grub2-mkconfig generates /boot/grub2/grub.cfg
and the system boots successfully.

Additional info:

grub2-mkconfig stops in prepare_grub_to_access_device() in /usr/share/grub/grub-mkconfig_lib.

Problem line:
hints="`"${grub_probe}" --device "${device}" --target=hints_string 2> /dev/null`"

I ran this command on the anaconda console,
and it failed with the error "No such file or directory":

grub2-probe --device /dev/sda --target=hints_string
grub2-probe: error: cannot open `/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/host4/port-4:0/end_device-4:0/sas_device:end_device-4:0/phy_identifier': No such file or directory.

The path is wrong. The correct path is
/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/host4/port-4:0/end_device-4:0/sas_device/end_device-4:0/phy_identifier
(a slash instead of a colon after "sas_device").

I found the problematic code in the grub source:

http://alpha.gnu.org/gnu/grub/grub-2.00~beta4.tar.gz
check_sas() in util/ieee1275/ofpath.c

snprintf (path, path_size, "%s/sas_device:%s/phy_identifier", p, ed);

should be changed to

snprintf (path, path_size, "%s/sas_device/%s/phy_identifier", p, ed);

This code is probably executed only on machines with SAS devices attached.

I fixed this code, built an rpm package, and included that package in the Fedora 17 anaconda.
It works: the bootloader installs successfully.
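
A minimal sketch, assuming a POSIX shell, of how one might check which of the two sysfs layouts described above exists on a given machine. The function name sas_phy_identifier_path is hypothetical and not part of grub; it only tests the two path shapes quoted in this report.

```shell
# Hypothetical helper (not part of grub): print the phy_identifier path
# that actually exists for a SAS end device.
#   $1: sysfs directory of the end device (".../end_device-4:0")
#   $2: end device name ("end_device-4:0")
sas_phy_identifier_path() {
  dev_dir=$1
  ed=$2
  # Layout that grub 2.00~beta4 builds in check_sas().
  colon_path="$dev_dir/sas_device:$ed/phy_identifier"
  # Layout found on the reporter's machine.
  slash_path="$dev_dir/sas_device/$ed/phy_identifier"

  if [ -e "$slash_path" ]; then
    echo "$slash_path"
  elif [ -e "$colon_path" ]; then
    echo "$colon_path"
  else
    echo "no phy_identifier found under $dev_dir" >&2
    return 1
  fi
}
```

With the device from this report, the call would be: sas_phy_identifier_path /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/host4/port-4:0/end_device-4:0 end_device-4:0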

Comment 1 Mads Kiilerich 2012-05-31 09:47:12 UTC
It seems like you know your way around grub. Could you please try to patch it in grub upstream?

(and btw: is /boot/grub2/device.map generated correctly in this case?)

Comment 2 Isao Shimizu 2012-06-01 01:22:11 UTC
(In reply to comment #1)
> It seems like you know your way around grub. Could you please try to patch
> it in grub upstream?

Thank you. I will try to patch it.

> (and btw: is /boot/grub2/device.map generated correctly in this case?)

device.map was generated, and it looks correct:

# this device map was generated by anaconda
(hd0)      /dev/sda

Comment 3 Isao Shimizu 2012-06-04 09:17:27 UTC
Committed in grub upstream.

http://bzr.savannah.gnu.org/lh/grub/trunk/grub/revision/4411

Comment 4 nicolas.vieville 2012-06-06 09:11:10 UTC
Hello,

While waiting for the upstream commit to land in F-17, I had to modify /usr/share/grub/grub-mkconfig_lib as shown in the patch below to be able to use grub2-mkconfig to generate /boot/grub2/grub.cfg.
This workaround cannot be considered a proper solution, because it disables sh error checking for the line that fails in the /usr/share/grub/grub-mkconfig_lib script. That lets grub2-mkconfig generate grub.cfg without failing and exiting before the end of the process.
The result: the first "if" statement that comes after the "set -e" instruction is not correct. So this is ***not a solution*** to use if your machine boots from SAS drives, because the generated grub configuration file is wrong. Use it at your own risk, or try to find another workaround to generate the right line (it may be found in an old grub.cfg file).
For my part, this modified script generated a usable grub.cfg file on a machine that uses SATA drives as boot and system drives (no SAS drives are involved in the boot process on this machine).

I hope this is useful for users hitting this problem.

Cordially,


-- 
NVieville


The patch:

--- grub-mkconfig_lib.orig	2012-05-09 23:06:27.000000000 +0200
+++ grub-mkconfig_lib	2012-06-06 10:34:51.221909383 +0200
@@ -153,7 +153,9 @@
     echo "set root='$fs_hint'"
   fi
   if fs_uuid="`"${grub_probe}" --device "${device}" --target=fs_uuid 2> /dev/null`" ; then
+    set +e
     hints="`"${grub_probe}" --device "${device}" --target=hints_string 2> /dev/null`"
+    set -e
     echo "if [ x\$feature_platform_search_hint = xy ]; then"
     echo "  search --no-floppy --fs-uuid --set=root ${hints} ${fs_uuid}"
     echo "else"
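
A minimal sketch of why the unpatched line ends the script, assuming a POSIX sh. Here `false` stands in for the failing grub2-probe call, and the echoed text is a placeholder for the rest of the script. The `|| hints=""` form is just one alternative to the "set +e" / "set -e" pair in the patch above, not the upstream fix, and like the patch it only keeps the script running with an empty hints value.

```shell
# Strict variant: mirrors the unpatched line. Under "set -e" the failed
# command substitution in the assignment ends the shell before the echo.
strict_script='
set -e
hints="`false`"
echo "reached the search line"
'

# Tolerant variant: the failure is handled explicitly, so "set -e" does
# not abort and hints is simply left empty.
tolerant_script='
set -e
hints="`false`" || hints=""
echo "reached the search line"
'

sh -c "$strict_script" || echo "strict variant aborted with status $?"
sh -c "$tolerant_script"
```

The strict variant prints nothing of its own and exits with a non-zero status, which matches the silent stop seen in grub2-mkconfig. The tolerant variant reaches the echo.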

Comment 5 Mads Kiilerich 2012-06-12 13:10:03 UTC
This fix can be tested with an unofficial scratch build with a snapshot from bzr: grub2-2.0-0.37.beta6.4462.fc17 at http://koji.fedoraproject.org/koji/taskinfo?taskID=4152623 .

Comment 6 nicolas.vieville 2012-08-29 10:58:37 UTC
(In reply to comment #5)
> This fix can be tested with an unofficial scratch build with a snapshot from
> bzr: grub2-2.0-0.37.beta6.4462.fc17 at
> http://koji.fedoraproject.org/koji/taskinfo?taskID=4152623 .

Sorry for the late response. I didn't try the package you built, but the new official grub2 package for F-17 released yesterday (grub2-2.0-0.38.beta6.fc17.x86_64) doesn't include Isao Shimizu's proposed patch, even though the proposed modification is included upstream in trunk (see http://bzr.savannah.gnu.org/lh/grub/trunk/grub/annotate/head:/util/ieee1275/ofpath.c).

The F-18 and F-19 Grub2 packages already include this patch.

For completeness, in early July I also tried the F-18 Grub2 package that includes this patch, but it was unable to boot the F-17 kernel (from memory, the reason was that the kernel and initramfs file size or end of file did not conform). I haven't tried the latest F-18 package to see whether the kernel boots correctly. If needed, I could try it, but only next week.

So I wonder whether these modifications could be pushed to the official F-17 package, so that machines with SAS hardware reboot without trouble, even after a grub2 upgrade followed by, for example, a power interruption (administrators are not always in front of servers while they are booting).

Thanks in advance.

Cordially,


-- 
NVieville

Comment 7 Mads Kiilerich 2012-08-29 11:42:49 UTC
f18 grub2 is getting closer to the point where it is usable. You could give it a try.

Comment 8 Mads Kiilerich 2012-09-10 14:09:31 UTC
*** Bug 855875 has been marked as a duplicate of this bug. ***

Comment 9 Elliott Forney 2012-10-03 21:58:26 UTC
This appears to affect a lot of machines.  Any hope of getting a patch pushed to fedora 17?

Comment 10 Mike Iglesias 2012-10-16 22:34:47 UTC
I just hit this bug today.  It's kind of disappointing to see that it's been 4 months since a fix was provided and it's still not pushed out for F17.

Comment 11 Elliott Forney 2012-10-16 22:44:15 UTC
It also appears to affect new installs (I had to use the rescue image to apply the "set -e" workaround).  So, I think a strong argument could be made that the patch should also be pushed to the Fedora 17 install media.

Comment 12 Mads Kiilerich 2012-10-17 00:05:45 UTC
Fedora does not create new official install media after the release day.

That is why it is so important to get involved in testing alpha/beta releases.

The issue can thus not be fixed in the install media, but it could be described on https://fedoraproject.org/wiki/Common_F17_bugs. F17 grub2 could also be updated ... but that would only help for some update methods.

I guess the simplest workaround is to use f18 rpms.

Comment 13 Elliott Forney 2012-10-17 00:20:36 UTC
Does Fedora NEVER repair the install media?  This seems extreme to me given that anyone who tries to install Fedora 17 on a machine with a SAS controller is going to wind up with an unbootable system until they find this bug report and figure out how to do chroot grub2-mkconfig from a rescue disk.

Shall I make a write up for the wiki?

Comment 14 Mads Kiilerich 2012-10-17 00:34:41 UTC
F18 will (hopefully) soon be released, and I guess 95% of the f17 installations that will ever be made have already been done.

Instead I would recommend focusing on testing f18 to ensure a flawless experience there once it is released.

Comment 15 Elliott Forney 2012-10-19 08:48:52 UTC
I realize that maintaining grub2 must be a ton of work (thank you!) but I agree that the lack of interest in fixing this in F17 is disappointing.  Beefy miracle is currently less than half way through its life cycle and this rather serious bug report has been around for a while.  I worry that this kind of thing can cause a new user to switch distros on day one.

I am also concerned with the fact that grub2-mkconfig does not report an error when this command fails.  It simply stops midway through generating the grub.cfg file with no warning or error message.  I had to add "set -x" to a chain of scripts before discovering the location where it was failing which finally led me here.  I do hope that the error reporting can be improved in future releases of grub2.
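
A minimal sketch of the kind of error reporting asked for above, assuming a POSIX sh. The wrapper name run_step is hypothetical and not part of grub2; it only shows how a failing command could leave a message on stderr before "set -e" ends the script.

```shell
# Hypothetical wrapper (not part of grub2): run a command and, if it
# fails, say which step failed before passing the status on.
#   $1: human-readable description of the step
#   remaining arguments: the command to run
run_step() {
  description=$1
  shift
  "$@" && return 0
  status=$?
  echo "error: $description failed with status $status" >&2
  return "$status"
}
```

A line such as hints="`run_step "probing search hints" "${grub_probe}" --device "${device}" --target=hints_string`" would still abort under "set -e", but only after the message has been printed, provided stderr is not redirected to /dev/null.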

Comment 16 Mike Iglesias 2012-10-19 15:34:07 UTC
My experience was pretty much the same as Elliott's.  I had installed F17 on a new Dell R310 server and Anaconda complained at the end about a problem writing the bootloader.  I knew I had a problem then, but it took about an hour and a half of poking around in the scripts in rescue mode (and running commands on another F17 system I had to see what the difference was) before I figured out where the problem was and removed the problematic line from the script.  I found this bug when I got back to my office and looked in bugzilla.

Even if the install media is bad, it would be nice if this was fixed so I don't have to keep an eye on this system after every update run.

Comment 17 Richard Neuboeck 2012-11-08 09:15:17 UTC
Created attachment 640652 [details]
Patched Fedora 17 grub source package

Comment 18 Richard Neuboeck 2012-11-08 09:16:00 UTC
I stumbled on this bug recently and found the same solution as Isao Shimizu before finding his bug report. I have already created a Fedora 17 compatible package for our network installs, but I invested a lot of time in doing so.

However, pointing to Fedora 18 (which, as of today, is delayed until 2013) is only partially helpful, since you need to find this bug report first to know that you should use a package from a different distribution.

Updating the package in the current release would only take a few minutes and would spare at least some people doing network installs some time poking around in the dark.

I've attached my src rpm. The patch applied is the same as already described by Isao Shimizu.

Comment 19 Mads Kiilerich 2012-11-08 10:53:05 UTC
(In reply to comment #18)

A link to a git repo would probably have a bigger chance of getting accepted in f17. For instance 'fedpkg clone -a grub2' and work on the f17 branch and post it on github.

Comment 20 Richard Neuboeck 2012-11-09 10:20:29 UTC
(In reply to comment #19)

Thanks for the info. I hope the link is what you expected (otherwise I need more details): https://github.com/tbihawk/grub2

Comment 21 Mads Kiilerich 2012-11-09 11:53:03 UTC
Yes, it mostly looks fine.

But you shouldn't increment epoch - that is the most significant part of the version. Use 'rpmdev-bumpspec grub2.spec' to bump the revision correctly.

And a minor note: a link to the corresponding upstream fix would be nice.

You might have to mail pjones directly and ask him to pull from your repo.

Comment 22 Richard Neuboeck 2012-11-09 13:17:25 UTC
(In reply to comment #21)

Ok. Epoch is back to 1. rpmdev-bumpspec increased the minor version to 0.39. GitHub is up to date.

Grub Bug Report: http://savannah.gnu.org/bugs/?36572

Comment 23 Mads Kiilerich 2012-11-09 13:26:20 UTC
A link to http://bzr.savannah.gnu.org/lh/grub/trunk/grub/revision/4411 would help explain that the fix is already in 2.0, which is revision 4542.

Folding your two changesets to one would make it easier to review and give a cleaner history. YMMV.

Comment 24 Richard Neuboeck 2012-11-09 13:45:50 UTC
Wouldn't it be easier to just grab the new source and use the one which has already most of the fixes applied?

I already wrote a mail to pjones. I hope what I've done helped. Otherwise I really need a handbook on how to handle patching on this scale :-)

Comment 25 Fedora End Of Life 2013-07-04 02:37:26 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 17 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to change the 
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 26 Elliott Forney 2013-07-04 03:07:03 UTC
I believe this was fixed in F18, is that correct?

I really wish grub2 would ditch "set -e" in shell scripts so that error conditions like this one could be handled gracefully.

Comment 27 Fedora End Of Life 2013-08-01 08:53:51 UTC
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.