Bug 742226 - /sbin/grub2-probe: error: cannot find a GRUB drive for /dev/mapper/nvidia_cjfffajep2
Summary: /sbin/grub2-probe: error: cannot find a GRUB drive for /dev/mapper/nvidia_cjf...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: grub2
Version: 16
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Duplicates: 746394
Depends On:
Blocks: F16Blocker, F16FinalBlocker
Reported: 2011-09-29 12:49 UTC by James Laska
Modified: 2013-09-02 06:57 UTC
CC List: 10 users

Fixed In Version: grub2-1.99-11.fc16
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-10-29 06:41:24 UTC
Type: ---
Embargoed:


Attachments
anaconda.log (23.24 KB, text/plain), 2011-09-29 12:51 UTC, James Laska
program.log (118.02 KB, text/plain), 2011-09-29 12:51 UTC, James Laska
storage.log (295.19 KB, text/plain), 2011-09-29 12:51 UTC, James Laska
syslog (163.03 KB, text/plain), 2011-09-29 12:51 UTC, James Laska
screenshot-0001.png (46.32 KB, image/png), 2011-09-29 12:51 UTC, James Laska
/mnt/sysimage/boot/grub2/device.map (126 bytes, text/plain), 2011-09-29 12:57 UTC, James Laska
Anaconda log (20.91 KB, text/x-log), 2011-10-09 00:26 UTC, Timothy Davis
Anaconda storage log (49.86 KB, text/plain), 2011-10-09 00:27 UTC, Timothy Davis
Output of grub2-mkconfig (2.02 KB, text/plain), 2011-10-09 14:19 UTC, Kyle
Output of grub2-probe (6.63 KB, text/plain), 2011-10-09 14:20 UTC, Kyle
output from grub2-probe (7.15 KB, text/x-log), 2011-10-21 18:18 UTC, Brian Lane
device.map from usb install (166 bytes, application/octet-stream), 2011-10-21 18:18 UTC, Brian Lane

Description James Laska 2011-09-29 12:49:17 UTC
Description of problem:

While testing F-16-Beta-RC4, I am unable to complete an install to a BIOS RAID system.  Anaconda formats the BIOS RAID disk properly, but a dialog appears at the end of the install noting that the bootloader failed to install.

  "There was an error installing the bootloader.
   The system may not be bootable."

Version-Release number of selected component (if applicable):
 * anaconda version 16.20 on x86_64 starting

How reproducible:
 * 1 of 1 attempts failed

Steps to Reproduce:
1. Install to a system configured for BIOS RAID as indicated by https://fedoraproject.org/wiki/QA:Testcase_Install_to_BIOS_RAID
  
Actual results:


Expected results:


Additional info:

$ tail /tmp/program.log

08:25:05,038 INFO program: Running... grub2-set-default Fedora Linux, with Linux 3.1.0-0.rc8.git0.0.fc16.x86_64
08:25:05,287 INFO program: Running... grub2-mkconfig -o /boot/grub2/grub.cfg
08:26:29,924 ERR program: Generating grub.cfg ...
08:26:30,065 ERR program: cat: /boot/grub2/video.lst: No such file or directory
08:27:04,342 ERR program: Found linux image: /boot/vmlinuz-3.1.0-0.rc8.git0.0.fc16.x86_64
08:27:04,473 ERR program: Found initrd image: /boot/initramfs-3.1.0-0.rc8.git0.0.fc16.x86_64.img
08:27:18,243 ERR program: /sbin/grub2-probe: error: cannot find a GRUB drive for /dev/mapper/nvidia_cjfffajep2.  Check your device.map.
08:27:32,032 ERR program: /sbin/grub2-probe: error: cannot find a GRUB drive for /dev/mapper/nvidia_cjfffajep2.  Check your device.map.
08:28:09,202 ERR program: done
08:28:09,359 INFO program: Running... grub2-install --no-floppy (hd0)
08:28:38,874 ERR program: /sbin/grub2-probe: error: cannot find a GRUB drive for /dev/mapper/nvidia_cjfffajep2.  Check your device.map.
08:28:38,877 ERR program: Auto-detection of a filesystem of /dev/mapper/nvidia_cjfffajep2 failed.
08:28:38,877 ERR program: Try with --recheck.
08:28:38,878 ERR program: If the problem persists please report this together with the output of "/sbin/grub2-probe --device-map="/boot/grub2/device.map" --target=fs -v /boot/grub2" to <bug-grub>
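
For triage, the failing device can be pulled out of a log like the excerpt above with a small filter. This is only a sketch: the function name probe_failures is made up for illustration, and it assumes the error wording is exactly as shown in the excerpt.

```shell
# Hypothetical helper (not part of anaconda or grub2): read a program.log on
# stdin and print each device that grub2-probe could not map to a GRUB drive.
probe_failures() {
    # Capture the device path between "for " and the period ending the sentence,
    # then collapse repeated reports of the same device.
    sed -n 's/.*cannot find a GRUB drive for \([^ ]*\)\..*/\1/p' | sort -u
}
```

Run as "probe_failures < /tmp/program.log"; for the lines quoted above it would report /dev/mapper/nvidia_cjfffajep2 once.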

Comment 1 James Laska 2011-09-29 12:51:41 UTC
Created attachment 525557 [details]
anaconda.log

Comment 2 James Laska 2011-09-29 12:51:43 UTC
Created attachment 525558 [details]
program.log

Comment 3 James Laska 2011-09-29 12:51:50 UTC
Created attachment 525559 [details]
storage.log

Comment 4 James Laska 2011-09-29 12:51:53 UTC
Created attachment 525560 [details]
syslog

Comment 5 James Laska 2011-09-29 12:51:58 UTC
Created attachment 525561 [details]
screenshot-0001.png

Comment 6 James Laska 2011-09-29 12:52:51 UTC
Proposing as an F16Beta blocker per the Beta criteria:
  "The installer must be able to create and install to software, hardware or BIOS RAID-0, RAID-1 or RAID-5 partitions for anything except /boot"

Comment 7 James Laska 2011-09-29 12:57:57 UTC
Created attachment 525562 [details]
/mnt/sysimage/boot/grub2/device.map

Comment 8 Adam Williamson 2011-09-29 17:49:02 UTC
So, some points to consider on this bug:

1) It is at least to some degree probably specific to some element of James' configuration: the motherboard in question and/or the use of RAID-0

2) fundamentally, grub2's BIOS RAID support is just not all there yet. This means this isn't a case of 'we broke something we really shouldn't have' but 'we switched to a new upstream tool with somewhat less good coverage of this area of functionality than the one we were using before'.

3) due to 2, there's no guarantee this could be fixed at all easily. pjones describes the situation as "the code to handle bios raid on that codepath in grub2 doesn't appear to /be there/.", which obviously means it's not exactly a one-line fix. And even if we fix it, that could easily break something else sensitive.

So, there are reasons we could decide not to block Beta for this, is all I'm sayin'. We should kick it around in the go/no-go.

I'm going to go buy another hard disk in a minute so I can test on my system. It's entirely possible that that might work.

Comment 9 Adam Williamson 2011-09-29 19:52:16 UTC
I've had a successful test with Intel BIOS RAID, so this failure is indeed to some degree hardware-specific. Right now we have two tests, one pass, one fail. Fortunately my motherboard has two SATA controllers, so I can also try with the Marvell controller and see how that goes.

Comment 10 Adam Williamson 2011-09-29 20:36:53 UTC
So far I have a pass for Intel RAID-1 and Marvell RAID-0, and an odd fail for Marvell RAID-1 which might go away if I retry.

Comment 11 Tim Flink 2011-09-30 15:22:38 UTC
Discussed in the 2011-09-29 go/no-go meeting. Rejected as a blocker bug for Fedora 16 beta because it only affects some BIOS RAID controllers.

Re-proposed as a blocker for Fedora 16 final and noted as a common bug for beta.

Comment 12 Adam Williamson 2011-09-30 19:01:13 UTC
Discussed at the 2011-09-30 blocker review meeting. We agreed we need more detail on the actual bug here and its impact to assess its blocker status: whether it's very specific to NVIDIA controllers, or affects all firmware RAID-0 installs, or what.

pjones, if you could get more info as to what's going wrong by next Friday, that'd be awesome. thanks!

Comment 13 Timothy Davis 2011-10-03 21:15:51 UTC
I have a DFI LanParty nForce4 motherboard and tried to install twice on a mirror: once with a 40/15 GB mirror that was formatted at the BIOS level, and again with two 10 GB HDs. Both times anaconda failed to install the bootloader.

Comment 14 Adam Williamson 2011-10-04 02:46:15 UTC
With this error?

Comment 15 Timothy Davis 2011-10-04 12:45:17 UTC
Yes

Comment 16 Timothy Davis 2011-10-04 12:45:45 UTC
I will grab those hard drives and pull the logs from them.

Comment 17 Adam Williamson 2011-10-07 18:05:33 UTC
Discussed at 2011-10-07 blocker review meeting. As per last week's note, we still need an evaluation from pjones to determine blocker status.

Comment 18 Timothy Davis 2011-10-09 00:26:26 UTC
Created attachment 527053 [details]
Anaconda log

DFI LanParty UT, nForce4 IDE BIOS RAID 1
Maxtor 10 GB IDE x2 in BIOS mirror

Comment 19 Timothy Davis 2011-10-09 00:27:41 UTC
Created attachment 527054 [details]
Anaconda storage log

Comment 20 Kyle 2011-10-09 14:18:22 UTC
I'm also running into this error when I install grub2, except I'm on a Promise RAID controller. 
lspci gives this line: 01:02.0 RAID bus controller: Promise Technology, Inc. PDC20277 (SBFastTrak133 Lite) (rev 01)
It's set up as a RAID set spanning 1 drive. There are no other controllers on the board as this is an old 1U.

I'm not running the testcase though - I upgraded the existing F15 system with yum.

My install of F16 boots fine with the existing GRUB 0.97 though (the kernel was added by grubby when I did the distro-sync), so my system works. (This also means that my /sbin/grub2-install failed though.)

Are there any other logs/information from me that could be helpful? This is a test system, so I can easily blow stuff away.

Comment 21 Kyle 2011-10-09 14:19:43 UTC
Created attachment 527093 [details]
Output of grub2-mkconfig

Comment 22 Kyle 2011-10-09 14:20:22 UTC
Created attachment 527094 [details]
Output of grub2-probe

Comment 23 Adam Williamson 2011-10-14 17:42:16 UTC
Discussed at the 2011-10-14 blocker review meeting. Though we have no pjones input yet, more and more people seem to be hitting this, so we shaded towards accepting it as a blocker per criterion "The installer must be able to create and install to any workable partition layout using any file system offered in a default installer configuration, LVM, software, hardware or BIOS RAID, or combination of the above".

Comment 24 Brian Lane 2011-10-21 18:18:11 UTC
Created attachment 529546 [details]
output from grub2-probe

Comment 25 Brian Lane 2011-10-21 18:18:47 UTC
Created attachment 529547 [details]
device.map from usb install

Comment 26 Brian Lane 2011-10-21 18:19:42 UTC
Attempted to install to a 2-disk 'mirrored' nvidia RAID. There is a 3rd disk that was not selected for the install, and a USB stick was the install source.

Comment 27 Adam Williamson 2011-10-24 19:26:15 UTC
There's a grub2 build here which we believe ought to fix this:

http://koji.fedoraproject.org/koji/buildinfo?buildID=270388

I will put up a public repo with that grub2 in it so people can add it as a repository when installing F16 and test that way.

Comment 28 Adam Williamson 2011-10-24 19:48:53 UTC
OK, if you want to test this fix, please test an install using F16 Final TC2:

http://dl.fedoraproject.org/pub/alt/stage/16.TC2/

and add this repo at the repo selection stage:

http://www.happyassassin.net/extras/repo_grub/x86_64

to ensure the updated grub2 is present. If anyone needs a 32-bit repo, just yell. Thanks!

Comment 29 Adam Williamson 2011-10-25 00:06:13 UTC
Can we please get some testing on this fix? thanks!

Comment 30 Kyle 2011-10-25 01:23:26 UTC
I'll yell out for a 32-bit repo to try the install. 

I'll try the rpm as soon as possible, but I'm on call for the next 2 days at work.

Comment 31 Adam Williamson 2011-10-25 01:39:18 UTC
We really need this tested for tomorrow.

Okay, added an i686 dir to the repo.

I'll also put up a live image with the updated grub included if I get time...

Comment 32 Timothy Davis 2011-10-25 12:55:01 UTC
Having problems with my PC reading my DVD+RW; will try again tomorrow.
(It's a Frankenstein machine: Athlon FX-60, nForce4 chipset, DFI LanParty UT.)

Comment 33 James Laska 2011-10-25 13:42:25 UTC
Tested and VERIFIED on the same system I initially hit this problem on using TC2 and the custom repo provided by adamw in comment#28

Comment 34 Fedora Update System 2011-10-25 17:48:17 UTC
grub2-1.99-11.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/grub2-1.99-11.fc16

Comment 35 Kyle 2011-10-28 15:56:54 UTC
Ended up grabbing grub2-1.99-12.fc16 from http://koji.fedoraproject.org/koji/buildinfo?buildID=270962.

Well, grub2-probe doesn't fail anymore:
/sbin/grub2-probe: info: changing current directory to mapper.
/sbin/grub2-probe: info: /dev/mapper/pdc_dbijaaabhp1 starts from 2048.
/sbin/grub2-probe: info: opening the device hd3.
/sbin/grub2-probe: info: the size of hd3 is 312581745.
/sbin/grub2-probe: info: Partition 0 starts from 2048.
/sbin/grub2-probe: info: opening hd3,msdos1.
/sbin/grub2-probe: info: the size of hd3 is 312581745.
ext2

grub2-install fails, but the symptoms match https://bugzilla.redhat.com/show_bug.cgi?id=737508, so I think it works fine.
===
Question though: I'm seeing times of ~20 mins for grub2-install to return:
[root@caesium ~]# date; time grub2-install --no-floppy /dev/mapper/pdc_dbijaaabh;date
Fri Oct 28 23:01:14 SGT 2011
/sbin/grub2-setup: error: out of disk.

real    20m42.660s
user    5m25.013s
sys     12m24.793s
Fri Oct 28 23:21:56 SGT 2011
[root@caesium ~]# date; time grub2-install --no-floppy /dev/mapper/pdc_dbijaaabhp1;date
Fri Oct 28 23:37:05 SGT 2011
/sbin/grub2-setup: warn: Attempting to install GRUB to a partitionless disk or to a partition.  This is a BAD idea..
/sbin/grub2-setup: warn: Embedding is not possible.  GRUB can only be installed in this setup by using blocklists.  However, blocklists are UNRELIABLE and their use is discouraged..
/sbin/grub2-setup: error: will not proceed with blocklists.

real    18m10.724s
user    4m45.101s
sys     10m54.830s
Fri Oct 28 23:55:16 SGT 2011

Is this normal?

Comment 36 Adam Williamson 2011-10-28 19:05:51 UTC
We're pretty sure this is fixed as of TC3, but can someone with affected hardware please test with TC3 and verify? Thanks!

http://dl.fedoraproject.org/pub/alt/stage/16.TC3/

Comment 37 Adam Williamson 2011-10-29 06:41:24 UTC
grub2 1.99-12 went stable as part of the glibc rebuild update, so CLOSING. We could still do with verification that this is fixed in TC3 or the pending RC1. Thanks!



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 38 Kyle 2011-10-29 12:03:17 UTC
Adam: Doing a clean install using the LXDE iso from the link you provided works. My system is booting fine with grub2.

I'm seeing grub2-mkconfig still taking a long time though: running "time grub2-mkconfig -o /boot/grub2/grub.cfg" shows it took 8m32s for 2 installed kernels and Xen.

On my other system (an i7 with the drive behind a normal ICH10 controller) the same command takes only 20 seconds or so despite having 4 kernels and Xen in /boot.

It makes a working config file, and there are no errors produced, just that it's strange it takes such a long time to finish.

Comment 39 Adam Williamson 2011-10-29 17:11:41 UTC
Dunno about that. You could run it under sh -x to see where it gets stuck, maybe?
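
A minimal sketch of the sh -x idea, with a timestamp added to each line so slow steps stand out. The function name trace_timed is invented for illustration; only sh -x itself comes from the suggestion above.

```shell
# Hypothetical helper: run a shell script under "sh -x" and prefix every line
# of its output and trace with the wall-clock time at which the line arrived.
trace_timed() {
    # Merge the trace (stderr) with normal output, then stamp each line as read.
    sh -x "$@" 2>&1 | while IFS= read -r line; do
        printf '%s %s\n' "$(date +%H:%M:%S)" "$line"
    done
}
```

For example: "trace_timed /sbin/grub2-mkconfig -o /tmp/grub.cfg.test" (the output path is an arbitrary scratch file, assumed here so the real grub.cfg is not overwritten). A long gap after a trace line suggests the command on that line is the slow one. Scripts started as child processes are not traced by this.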




Comment 40 Kyle 2011-10-30 15:42:13 UTC
Ooh. Nice. I didn't know about sh -x.

Anyway. It gets hung up on grub2-probe finding /boot (determining the grub device, then determining the UUID, then finding which partition is /boot, then determining *that* UUID and some other stuff which I've forgotten), then again while determining the fs type. 

After that it pauses in each menu item (the linux_10 and xen_20 entries) after printing "insmod gzio", then prints "insmod part_gpt", so it looks like some part of the BIOS RAID code could be streamlined a bit more since that's where it seems to be fouling up. 

One thing which might help - every time it paused, I took a look at top, and grub2-probe was running at ~90% of CPU consistently.

I tried to make a screencast of it, but istanbul refused to record past the 6:30 mark.
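
To narrow down which grub2-probe query is the slow one, each target could be timed on its own. This is a sketch under assumptions: the function name and the PROBE variable are invented for illustration. device, fs and fs_uuid are existing grub2-probe targets, but whether they are exactly the calls grub2-mkconfig makes on this system is not confirmed in this report.

```shell
# Hypothetical helper: time one grub2-probe call per target for a given path.
# PROBE may be set to another command (e.g. for a dry run); default: grub2-probe.
time_probe_targets() {
    path=$1
    shift
    for target in "$@"; do
        start=$(date +%s)
        # Discard the probe output; only the elapsed time is of interest here.
        "${PROBE:-grub2-probe}" --target="$target" "$path" >/dev/null 2>&1
        end=$(date +%s)
        printf '%s: %ss\n' "$target" "$((end - start))"
    done
}
```

For example, as root: "time_probe_targets /boot device fs fs_uuid". The resolution is whole seconds, which should be enough for pauses on the order of minutes.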

Comment 41 Adam Williamson 2011-10-30 17:44:43 UTC
"I tried to make a screencast of it, but istanbul refused to record past the 6:30 mark."

Another trick for you: in gnome-shell, Ctrl-Alt-Shift-R (the built-in screencast recorder).

Anyway, that looks like a different bug.




Comment 42 Adam Williamson 2011-11-08 04:35:16 UTC


Comment 43 Marek Goldmann 2011-11-09 18:47:17 UTC
*** Bug 746394 has been marked as a duplicate of this bug. ***

Comment 44 Fedora Update System 2011-12-10 06:09:25 UTC
grub2-1.99-13.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/grub2-1.99-13.fc16

