Bug 1279599

Summary: grub2-install fails on devices with minor numbers that exceed 8 bits
Product: Red Hat Enterprise Linux 7 Reporter: jcastran
Component: grub2 Assignee: Peter Jones <pjones>
Status: CLOSED ERRATA QA Contact: Release Test Team <release-test-team-automation>
Severity: high Docs Contact: Clayton Spicer <cspicer>
Priority: high    
Version: 7.1 CC: cww, jcastran, jstodola, mbanas, pjones, pkotvan, tim.walberg
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: grub-2.02-0.35.el7 Doc Type: Bug Fix
Doc Text:
A wider variety of partitions can be used as `/boot`
Previously, the GRUB2 boot loader only supported 8-bit device node minor numbers. Consequently, boot loader installation failed on device nodes with minor numbers larger than `255`. All valid Linux device node minor numbers are now supported, and as a result a wider variety of partitions can be used as `/boot` partitions.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-04 03:59:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1203710, 1295926, 1313485    

Description jcastran 2015-11-09 20:20:06 UTC
grub2-install fails on devices with minor numbers that exceed 8 bits. On large systems used for virtualization of large numbers of virtual machines, the boot drive for a new machine can end up with disks/partitions that have minor numbers > 255, in which case "chroot grub2-install ...." will fail.

Where are you experiencing the behavior?  What environment?

In grub-core/osdep/devmapper/getroot.c, there is this function, which makes the erroneous assumption that minor numbers are still 8-bit. This assumption has not been true for quite some time:

char *
grub_util_devmapper_part_to_disk (struct stat *st,
                                  int *is_part, const char *path)
{
  int major, minor;

  if (grub_util_get_dm_node_linear_info (st->st_rdev,
                                         &major, &minor, 0))
    {
      *is_part = 1;
      return grub_find_device ("/dev",
                               (major << 8) | minor);         /* <-- ERROR: assumes an 8-bit minor number */
    }
  *is_part = 0;
  return xstrdup (path);
}
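The effect of the `(major << 8) | minor` expression can be illustrated with a short sketch. It assumes a Linux system whose C library provides `makedev()`, `major()` and `minor()` in `<sys/sysmacros.h>` (glibc does). The helper names `legacy_encode`, `portable_encode` and `demo_minor_overflow` are hypothetical and exist only for this illustration; this is not necessarily the change that shipped in the erratum.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/sysmacros.h>

/* Hypothetical helper: the encoding used in the function quoted above.
   It is only correct while the minor number fits in 8 bits. */
static dev_t legacy_encode(unsigned int maj, unsigned int min)
{
  return (dev_t) ((maj << 8) | min);
}

/* Hypothetical helper: let the C library build the dev_t, which places
   the high bits of the minor number where the kernel expects them. */
static dev_t portable_encode(unsigned int maj, unsigned int min)
{
  return makedev(maj, min);
}

/* Uses major 7, minor 265 (the /dev/loop265 node from the reproducer). */
static void demo_minor_overflow(void)
{
  dev_t bad = legacy_encode(7, 265);
  dev_t good = portable_encode(7, 265);

  /* 265 is 0x109, so bit 8 of the minor lands in the major field. */
  assert(bad == 0x709);
  /* Decoded again, the value names minor 9, a different device. */
  assert(major(bad) == 7 && minor(bad) == 9);

  /* makedev() round-trips the full minor number. */
  assert(major(good) == 7 && minor(good) == 265);
  assert(bad != good);

  /* For minors below 256 the two encodings agree, which is why the
     problem only appears on hosts with many devices. */
  assert(legacy_encode(7, 9) == portable_encode(7, 9));
}
```

Calling `demo_minor_overflow()` from a test program exercises all of the checks. With the legacy encoding, a lookup for minor 265 would search `/dev` for a node with minor 9 instead, so the intended device is not found.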

When does the behavior occur? Frequently?  Repeatedly?   At certain times?

Repeatable. Depends on how many other VMs have been deployed previously on a host.

Comment 2 jcastran 2015-11-10 18:30:55 UTC
I asked the customer what their environment was and how they are running into this issue.

Jim Bayer:
In our particular case, we are using relatively large systems (HP DL380p Gen8s, for example) as virtual machine hosts. Each virtual machine has a boot drive that corresponds to a LUN on an IBM SVC array that is connected either via iSCSI or fibre channel. So, for example, running 100 virtual machines on a host means there are >100 disks attached to the host (many VMs have more than just a single drive), and when you account for partitions on top of the raw devices, it doesn't take too many VMs before the device mapper starts hitting minor numbers beyond 255.

Each new VM is created by allocating a LUN, partitioning it, creating file systems, mounting them on the host, and then unpacking a template (which was originally built by installing from an ISO, but, rather than go through rebuilding from ISO every time, we create an archive of an install that has a certain level of our local configuration changes already made). The installation script then attempts to run the GRUB installer under chroot to set up GRUB for the VM.

Many of our older VMs used RHEL or CentOS 5.x or 6.x, which were all based off "legacy GRUB", and this issue did not occur in the older code base. When we started deploying 7.x VMs, though, the switch in those releases to GRUB2 exposed this particular bug that has apparently been in the re-implemented code base for a while (according to git logs).

Comment 5 Jan Stodola 2016-02-04 21:01:07 UTC
Reproduced in a virtual machine:
1) Create 266 disk images and associate them with /dev/loop devices (loop0-loop265).
2) Install the anaconda package.
3) Make sure the minor number of /dev/loop265 is higher than 255:
[root@localhost ~]# ls -l /dev/loop265
brw-rw----. 1 root disk 7, 265 Feb  4 21:46 /dev/loop265
[root@localhost ~]#

4) Proceed through an installation to the disk image associated with /dev/loop265:
anaconda --repo <REPO_PATH> --text --image /dev/loop265

The installation fails at the end, when the boot loader is being installed.

program.log:
...
21:35:21,881 INFO program: Running... grub2-install --no-floppy /dev/loop261
21:35:30,001 INFO program: Installing for i386-pc platform.
21:35:30,002 INFO program: grub2-install: error: cannot find a GRUB drive for /dev/mapper/loop265p1.  Check your device.map.
21:35:30,002 DEBUG program: Return code: 1
...
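The check in step 3 of the reproducer can also be done programmatically. The following is a minimal sketch, assuming Linux and a C library that provides `major()` and `minor()` in `<sys/sysmacros.h>`; the helper name `has_wide_minor` is hypothetical and does not come from GRUB or anaconda.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical helper: reports whether a device node has a minor number
   that does not fit in 8 bits.
   Returns 1 if the minor is above 255, 0 if it is not, and -1 if the
   path cannot be examined or is not a block or character device.
   On success, *maj and *min receive the decoded device numbers. */
static int has_wide_minor(const char *path,
                          unsigned int *maj, unsigned int *min)
{
  struct stat st;

  if (stat(path, &st) != 0)
    return -1;
  if (!S_ISBLK(st.st_mode) && !S_ISCHR(st.st_mode))
    return -1;

  /* st_rdev holds the device numbers of the node itself. */
  *maj = major(st.st_rdev);
  *min = minor(st.st_rdev);
  return *min > 255;
}
```

For the `/dev/loop265` node shown in step 3, this would be expected to return 1 with major 7 and minor 265, matching the `ls -l` output.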

Comment 8 Peter Kotvan 2016-08-18 13:31:33 UTC
Reproduced on RHEL-7.2 GA with anaconda-21.48.22.56-1.el7. Verified on RHEL-7.3-20160817.1 with anaconda-21.48.22.82-1.el7.

Comment 11 errata-xmlrpc 2016-11-04 03:59:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2336.html