Bug 1125382

Summary: option rom boot fails on micron p320h & p420m pcie ssd drives in RHEL 7 due to grub2 misconfiguration
Product: Red Hat Enterprise Linux 7 Reporter: Sam Bradshaw <sbradshaw>
Component: grub2 Assignee: Peter Jones <pjones>
Status: CLOSED DUPLICATE QA Contact: Release Test Team <release-test-team-automation>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.1 CC: asamymuthupa, pjones, release-test-team-automation, sbradshaw
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1087962 Environment:
Last Closed: 2014-12-15 20:29:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Sam Bradshaw 2014-07-31 17:35:19 UTC
+++ This bug was initially created as a clone of Bug #1087962 +++

Description of problem:  Micron p320h devices have legacy option rom boot support.  The way grub2 is configured in RHEL7 causes boot to fail as the first stage is not able to read the partition table from the drive.  Boot always fails with the following error message:
.
error: no such device: <UUID-of-drive>
Entering rescue mode...
grub rescue>

Version-Release number of selected component (if applicable): RHEL 7 Public Beta

How reproducible: 100% reproducible

Steps to Reproduce:
1. Enable option rom boot in p320h device (contact sbradshaw for instructions and/or preconfigured drives)
2. Install RHEL 7 such that the p320h is used as a boot device (MBR only)
3. Reboot when prompted.  The error message listed above occurs.  

Actual results:
Installation completes without error; however, upon reboot the error message listed above occurs.

Expected results:
Boot should succeed without error or additional configuration.

Additional info:

When a single ext3 root partition is used (and a single swap partition):

grub rescue> ls
(hd0)
grub rescue> set
prefix=(hd0)/boot/grub2
root=hd0

We have deduced that the option rom is providing the same data that was written to the drive when requested by the platform bios.  We have verified this by single stepping through INT13 calls and checking data placement as well as taking pcie traces.  Beyond that, we have installed a Gentoo distribution to the p320 drive and successfully booted.

--- Additional comment from Sam Bradshaw on 2014-04-15 16:58:14 EDT ---

One other data point: changing the mtip32xx driver to register a device node named similarly to SATA drives (/dev/sd* as opposed to what it currently uses /dev/rssd*) enables successful legacy boot through the drive's option rom.  We believe, but have not independently verified, that the same phenomenon would occur with other bootable block devices.  NVMe devices, for example, register a device node named /dev/nvme*.
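
As an illustration of why the device node name might matter, here is a minimal Python sketch of name-based disk recognition. It is not grub2's code: the pattern tables and the classify_disk helper are hypothetical, and grub2's actual matching logic was not inspected for this sketch. It only shows how a tool that recognizes /dev/sd* would fail to recognize /dev/rssd* or /dev/nvme* until matching entries are added.

```python
import re

# Hypothetical table of device-name patterns a boot tool might recognize.
# Illustration only; this is NOT taken from grub2's source.
KNOWN_DISK_PATTERNS = {
    "scsi/sata": re.compile(r"^/dev/sd[a-z]+$"),
}

# The same table extended with the other naming schemes mentioned in
# this report (whole-disk nodes only, no partitions).
EXTENDED_PATTERNS = dict(KNOWN_DISK_PATTERNS)
EXTENDED_PATTERNS["rssd"] = re.compile(r"^/dev/rssd[a-z]+$")
EXTENDED_PATTERNS["nvme"] = re.compile(r"^/dev/nvme\d+n\d+$")


def classify_disk(path, patterns=KNOWN_DISK_PATTERNS):
    """Return the family whose pattern matches path, or None if unknown."""
    for family, pattern in patterns.items():
        if pattern.match(path):
            return family
    return None


if __name__ == "__main__":
    for dev in ("/dev/sda", "/dev/rssda", "/dev/nvme0n1"):
        print(dev, classify_disk(dev), classify_disk(dev, EXTENDED_PATTERNS))
```

With only the sd pattern, /dev/rssda and /dev/nvme0n1 come back as unknown. That would be consistent with the observation above that renaming the node to /dev/sd* makes boot succeed.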

--- Additional comment from Sam Bradshaw on 2014-04-15 18:29:07 EDT ---

See also: http://savannah.gnu.org/bugs/?41883
In particular, please review the patch in comment #1.  That patch is needed for nvme boot support and a similar one is needed for rssd boot support.

--- Additional comment from RHEL Product and Program Management on 2014-04-23 01:47:21 EDT ---

This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

--- Additional comment from Sam Bradshaw on 2014-04-23 13:45:11 EDT ---

Yes, we would like this issue considered for fixing in the next Red Hat Enterprise Linux release.

Comment 2 Peter Jones 2014-12-15 15:43:59 UTC
All right, so I'm confused about what this bug and #1087962 are asking for, and how they differ - Is one for nvme and the other for rssd* devices?  If so, which one is which?

Also some examples of the full device paths for rssd devices, partitions on them, etc (i.e. instead of just showing the globbed bit) would be helpful.

Comment 3 Sam Bradshaw 2014-12-15 17:16:53 UTC
(In reply to Peter Jones from comment #2)
> All right, so I'm confused about what this bug and #1087962 are asking for,
> and how they differ - Is one for nvme and the other for rssd* devices?  If
> so, which one is which?
>
> Also some examples of the full device paths for rssd devices, partitions on
> them, etc (i.e. instead of just showing the globbed bit) would be helpful.

This bug is a clone of #1087962.  I saw the message indicating that #1087962 was not resolved for 7.0, so I cloned it for 7.1, hoping it would get resolved in that timeframe.  If that's not the workflow you use, feel free to dup one to the other.

Full raw device examples include:
/dev/rssda
/dev/rssdb
/dev/rssdc
etc.
These devices only ever have a single LUN.

If we create 2 partitions on /dev/rssda, for example, we would see /dev/rssda1 and /dev/rssda2.
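
For reference, the naming scheme described above can be sketched as a small Python parser. This is only an illustration: the split_rssd helper is hypothetical, and it assumes the partition number is appended directly to the disk name, as in the /dev/rssda1 example.

```python
import re

# Assumption: rssd nodes follow the scheme shown in this comment, i.e.
# letters identify the disk and the partition number is appended directly
# (/dev/rssda -> /dev/rssda1, /dev/rssda2).
_RSSD_RE = re.compile(r"^(/dev/rssd[a-z]+)(\d+)?$")


def split_rssd(path):
    """Split an rssd device path into (whole_disk, partition_number).

    partition_number is None when path names the whole disk.
    Raises ValueError for paths that do not follow the rssd scheme.
    """
    match = _RSSD_RE.match(path)
    if match is None:
        raise ValueError("not an rssd device path: %r" % (path,))
    disk, part = match.groups()
    return disk, (int(part) if part is not None else None)


if __name__ == "__main__":
    for dev in ("/dev/rssda", "/dev/rssda1", "/dev/rssdb2"):
        print(dev, split_rssd(dev))
```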

Comment 4 David Cantrell 2014-12-15 20:29:53 UTC
That's not the bug workflow.  The Version field in the bug report does not necessarily correspond with anything consistently.  If it's a customer or partner bug, it tends to be the version they see the bug in.  For other reporters it tends to get set to the version they want to see a fix in.  There is no consistent reporting policy we enforce or even follow ourselves.

Bottom line, that field is largely meaningless in development.  We go by the flags, which you may or may not be able to see.  I'm duping this one back to the original, which is still approved and a work in progress.

*** This bug has been marked as a duplicate of bug 1087962 ***