Bug 751875

Summary: root fs UUID from /etc/fstab is used in new grub.cfg entries - also when fstab is wrong
Product: [Fedora] Fedora
Reporter: Peter Trenholme <PTrenholme>
Component: grubby
Assignee: Peter Jones <pjones>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 16
CC: a.sloman, bcl, collura, mads, pjones
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-02-13 21:11:41 UTC Type: ---
Attachments:
grub.cfg file requested by Mads Kiilerich. (Flags: none)

Description Peter Trenholme 2011-11-07 21:23:47 UTC
Description of problem:
When a kernel is updated, grubby updates the grub.cfg file in /boot/grub2/ with a UUID that, as far as I can see, it has made up from whole cloth.

NOTE: The UUID that should be used is that of a software RAID 1 (mirror), but the UUID generated is neither that of the mirror nor of either of the devices making up the mirror.

Version-Release number of selected component (if applicable):
$ grubby --version
grubby version 8.3

How reproducible:
Every kernel update

Steps to Reproduce:
1. Update kernel
2. Reboot
  
Actual results:
dracut#

Expected results:
Normal boot

Additional info:
Replacing the invalid UUID value with the correct value (found in the search set root= line) permits a smooth system boot.

Here is a little output:
$ su -
Password:
# grep -i uuid /boot/grub2/grub.cfg
        search --no-floppy --fs-uuid --set=root f52a4cdd-befe-4197-82dc-18a34eacf783
        linux   /boot/vmlinuz-3.1.0-7.fc16.x86_64 root=UUID=5dbb6342-a641-4a9c-921a-d392f8aab0de ro quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us
        search --no-floppy --fs-uuid --set=root f52a4cdd-befe-4197-82dc-18a34eacf783
        linux   /boot/vmlinuz-3.1.0-5.fc16.x86_64 root=UUID=f52a4cdd-befe-4197-82dc-18a34eacf783 ro quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us
        search --no-floppy --fs-uuid --set=root f52a4cdd-befe-4197-82dc-18a34eacf783
        linux   /boot/vmlinuz-3.1.0-1.fc16.x86_64 root=UUID=f52a4cdd-befe-4197-82dc-18a34eacf783 ro quiet
        search --no-floppy --fs-uuid --set=root f52a4cdd-befe-4197-82dc-18a34eacf783
        linux   /boot/vmlinuz-2.6.40.6-0.fc15.x86_64 root=UUID=f52a4cdd-befe-4197-82dc-18a34eacf783 ro quiet
        search --no-floppy --fs-uuid --set=root f52a4cdd-befe-4197-82dc-18a34eacf783
        linux   /boot/vmlinuz-2.6.38-8-generic root=UUID=f52a4cdd-befe-4197-82dc-18a34eacf783 ro quiet
        search --no-floppy --fs-uuid --set=root 98CAD0A9CAD08542
        search --no-floppy --fs-uuid --set=root D21A40A51A408885
        search --no-floppy --fs-uuid --set=root 98CAD0A9CAD08542
        search --no-floppy --fs-uuid --set=root 2C49A54513CE98D3
        search --no-floppy --fs-uuid --set=root f52a4cdd-befe-4197-82dc-18a34eacf783
        linux   /bootM/vmlinuz-2.6.40.6-0.fc15.x86_64 root=UUID=31ae217a-e756-4465-9e52-7f4040dcd2b8 ro quiet
        search --no-floppy --fs-uuid --set=root f52a4cdd-befe-4197-82dc-18a34eacf783
        linux   /bootM/boot2/vmlinuz-2.6.40.4-5.fc15.x86_64 root=UUID=31ae217a-e756-4465-9e52-7f4040dcd2b8 ro quiet
# findfs UUID=5dbb6342-a641-4a9c-921a-d392f8aab0de
findfs: unable to resolve 'UUID=5dbb6342-a641-4a9c-921a-d392f8aab0de'
# ls -l /dev/disk/by-uuid/
total 0
lrwxrwxrwx. 1 root root 10 Nov  7  2011 2C49A54513CE98D3 -> ../../sdb5
lrwxrwxrwx. 1 root root 11 Nov  7  2011 31ae217a-e756-4465-9e52-7f4040dcd2b8 -> ../../md127
lrwxrwxrwx. 1 root root 10 Nov  7  2011 65c92b08-9f35-4ba3-8c3e-e3c3f0efaf01 -> ../../sdc1
lrwxrwxrwx. 1 root root 10 Nov  7  2011 8ECEC31BCEC2FA8B -> ../../sdb2
lrwxrwxrwx. 1 root root 10 Nov  7  2011 98CAD0A9CAD08542 -> ../../sdb1
lrwxrwxrwx. 1 root root 10 Nov  7  2011 D21A40A51A408885 -> ../../sda3
lrwxrwxrwx. 1 root root 10 Nov  7  2011 f52a4cdd-befe-4197-82dc-18a34eacf783 -> ../../sdb6
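
The manual check above (grep for the UUIDs, then findfs on the suspicious one) can be sketched as a small script. This is only an illustration: the function names are made up and are not part of grubby or grub2, and it assumes kernel lines start with "linux" as they do in the grub.cfg shown here.

```shell
# Sketch: find root=UUID= values in a grub.cfg that do not resolve to a device.
# list_root_uuids and check_root_uuids are hypothetical helper names.

# Print the distinct root=UUID= values used on kernel ("linux") lines.
list_root_uuids() {
    grep -E '^[[:space:]]*linux[[:space:]]' "$1" |
        grep -oE 'root=UUID=[^[:space:]]+' |
        sed 's/^root=UUID=//' |
        sort -u
}

# For each value, report whether findfs can resolve it to a block device.
check_root_uuids() {
    cfg="${1:-/boot/grub2/grub.cfg}"
    list_root_uuids "$cfg" | while read -r uuid; do
        if dev=$(findfs "UUID=$uuid" 2>/dev/null); then
            echo "ok       $uuid -> $dev"
        else
            echo "MISSING  $uuid"
        fi
    done
}
```

Running check_root_uuids as root after a kernel update should flag any entry whose root= UUID does not exist on the system.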

Comment 1 Mads Kiilerich 2011-11-25 15:55:28 UTC
Please attach your grub.cfg

Comment 2 Peter Trenholme 2011-11-26 01:02:01 UTC
Created attachment 536511 [details]
grub.cfg file requested by Mads Kiilerich.

As requested, but please note that this grub.cfg file has been modified so it will boot correctly. Here's a diff from the grubby-generated file:

# diff -it grub.cfg grub.cfg~
49c49
<         linux   /boot/vmlinuz-3.1.2-1.fc16.x86_64 root=UUID=f52a4cdd-befe-4197-82dc-18a34eacf783 ro quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us
---
>         linux   /boot/vmlinuz-3.1.2-1.fc16.x86_64 root=UUID=5dbb6342-a641-4a9c-921a-d392f8aab0de ro quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us

And here's my disk layout:

# ls -l /dev/disk/by-uuid/
total 0
lrwxrwxrwx. 1 root root 10 Nov 25 16:27 2C49A54513CE98D3 -> ../../sdb5
lrwxrwxrwx. 1 root root 11 Nov 25 16:27 31ae217a-e756-4465-9e52-7f4040dcd2b8 -> ../../md127
lrwxrwxrwx. 1 root root 10 Nov 25 16:27 65c92b08-9f35-4ba3-8c3e-e3c3f0efaf01 -> ../../sdc1
lrwxrwxrwx. 1 root root 10 Nov 25 16:27 8ECEC31BCEC2FA8B -> ../../sdb2
lrwxrwxrwx. 1 root root 10 Nov 25 16:27 98CAD0A9CAD08542 -> ../../sdb1
lrwxrwxrwx. 1 root root 10 Nov 25 16:27 D21A40A51A408885 -> ../../sda3
lrwxrwxrwx. 1 root root 10 Nov 25 16:27 f52a4cdd-befe-4197-82dc-18a34eacf783 -> ../../sdb6

# fdisk -l /dev/sd?

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x945663c5

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      206847      102400    7  HPFS/NTFS/exFAT
/dev/sda2          206848   902435309   451114231    7  HPFS/NTFS/exFAT
/dev/sda3       902436864   929515519    13539328    7  HPFS/NTFS/exFAT
/dev/sda4       929515520  1953523711   512004096    5  Extended
/dev/sda5       929517568  1953523711   512003072   fd  Linux raid autodetect

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x767f9c5a

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        2048      206847      102400    7  HPFS/NTFS/exFAT
/dev/sdb2          206848   902435309   451114231    7  HPFS/NTFS/exFAT
/dev/sdb3       902436864  1926443007   512003072   fd  Linux raid autodetect
/dev/sdb4      1926443008  3907028991   990292992    5  Extended
/dev/sdb5      1926445056  1953067007    13310976    7  HPFS/NTFS/exFAT
/dev/sdb6      1953069056  3907028991   976979968   83  Linux

Disk /dev/sdc: 7969 MB, 7969177600 bytes
221 heads, 20 sectors/track, 3521 cylinders, total 15564800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0005a701

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1            2048    15564799     7781376   83  Linux

Comment 3 Mads Kiilerich 2011-11-26 02:39:50 UTC
What is the output of 'blkid' (which is what grubby uses internally)?

Comment 4 Peter Trenholme 2011-11-26 17:01:49 UTC
$ blkid | sort -k 1
/dev/loop0: LABEL="STORM" TYPE="iso9660" 
/dev/loop10: TYPE="iso9660" 
/dev/loop1: LABEL="UNIVERSE_I" TYPE="iso9660" 
/dev/loop2: LABEL="CLAWS" TYPE="iso9660" 
/dev/loop3: TYPE="iso9660" LABEL="SPIDER" 
/dev/loop4: LABEL="ISLE" TYPE="iso9660" 
/dev/loop5: LABEL="UNTOTHEBREACH" TYPE="iso9660" 
/dev/loop6: LABEL="RCN_CD" TYPE="iso9660" 
/dev/loop7: LABEL="WINDRIDER" TYPE="iso9660" 
/dev/loop8: LABEL="EASTERN_FRONT" TYPE="iso9660" 
/dev/loop9: LABEL="Cryoburn CD" TYPE="udf" 
/dev/md127: UUID="31ae217a-e756-4465-9e52-7f4040dcd2b8" TYPE="ext4" 
/dev/sda1: LABEL="SYSTEM" UUID="98CAD0A9CAD08542" TYPE="ntfs" 
/dev/sda2: LABEL="OS" UUID="8ECEC31BCEC2FA8B" TYPE="ntfs" 
/dev/sda3: LABEL="HP_RECOVERY" UUID="D21A40A51A408885" TYPE="ntfs" 
/dev/sda5: UUID="31ae217a-e756-4465-9e52-7f4040dcd2b8" UUID_SUB="5effa823-ddb6-34ea-aaeb-4767769f0a19" LABEL="HP-p6710f:mirror" TYPE="linux_raid_member" 
/dev/sdb1: LABEL="SYSTEM" UUID="98CAD0A9CAD08542" TYPE="ntfs" 
/dev/sdb2: LABEL="OS" UUID="8ECEC31BCEC2FA8B" TYPE="ntfs" 
/dev/sdb3: UUID="31ae217a-e756-4465-9e52-7f4040dcd2b8" UUID_SUB="433e3e03-b289-60e1-eba8-b87b2c380277" LABEL="HP-p6710f:mirror" TYPE="linux_raid_member" 
/dev/sdb5: LABEL="RECOVERY" UUID="2C49A54513CE98D3" TYPE="ntfs" 
/dev/sdb6: UUID="f52a4cdd-befe-4197-82dc-18a34eacf783" UUID_SUB="e49c7a4e-4133-4784-98e3-78fb36079626" TYPE="btrfs" 
/dev/sdc1: UUID="65c92b08-9f35-4ba3-8c3e-e3c3f0efaf01" TYPE="ext4"

Comment 5 Mads Kiilerich 2011-11-27 16:21:17 UTC
That added at least one new piece of information: /boot is on btrfs. That is a new thing and not commonly used, so that might explain why there is an issue and nobody else has seen it.

grubby is perhaps confused by subvolumes (as grub2 in bug 751728). Can you verify that 'grub2-mkconfig' still uses the right UUIDs?

One way to debug this could be to add a 'set -x' to /sbin/new-kernel-pkg and install a new kernel package and see how it invokes grubby and try to run that manually with strace or gdb and see where and why it gets the new wrong UUID.
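
A rough sketch of the tracing step, under some assumptions: running a script with "bash -x" traces it the same way a "set -x" line inside it would, so the script need not be edited. trace_grubby_calls is a hypothetical helper name, and the arguments that the kernel package passes to new-kernel-pkg are not shown in this report, so none are filled in here. Note that this really executes the script it is given.

```shell
# Sketch: run a bash script with tracing enabled and print only the traced
# commands that mention grubby. The helper name is illustrative.
trace_grubby_calls() {
    script=$1
    shift
    # bash writes the -x trace to stderr; send stderr into the pipe and
    # discard the script's normal output.
    bash -x "$script" "$@" 2>&1 >/dev/null | grep -F grubby
}
```

Each line printed is a command as bash expanded it, which can then be rerun by hand under strace or gdb.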

Comment 6 Peter Trenholme 2011-11-27 19:01:39 UTC
First, here's the grub2-mkconfig output (truncated after the second boot stanza) confirming that the program correctly identifies the UUID.

Second, it will take some time to follow your debug suggestion: I've just moved to new quarters and can't spend much time at my computer without my wife being even more annoyed than she now is in her current state of disgruntlement.

$ sudo grub2-mkconfig 2>/dev/null
[sudo] password for Peter: 
#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub2-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#

### BEGIN /etc/grub.d/00_header ###
if [ -s $prefix/grubenv ]; then
  load_env
fi
set default="${saved_entry}"
if [ "${prev_saved_entry}" ]; then
  set saved_entry="${prev_saved_entry}"
  save_env saved_entry
  set prev_saved_entry=
  save_env prev_saved_entry
  set boot_once=true
fi

function savedefault {
  if [ -z "${boot_once}" ]; then
    saved_entry="${chosen}"
    save_env saved_entry
  fi
}

function load_video {
  insmod vbe
  insmod vga
  insmod video_bochs
  insmod video_cirrus
}

set timeout=10
### END /etc/grub.d/00_header ###

### BEGIN /etc/grub.d/10_linux ###
menuentry 'Fedora Linux, with Linux 3.1.2-1.fc16.x86_64' --class fedora --class gnu-linux --class gnu --class os {
        savedefault
        load_video
        set gfxpayload=keep
        insmod gzio
        insmod part_msdos
        insmod btrfs
        set root='(hd1,msdos6)'
        search --no-floppy --fs-uuid --set=root f52a4cdd-befe-4197-82dc-18a34eacf783
        echo    'Loading Linux 3.1.2-1.fc16.x86_64 ...'
        linux   /boot/vmlinuz-3.1.2-1.fc16.x86_64 root=UUID=f52a4cdd-befe-4197-82dc-18a34eacf783 ro quiet 
        echo    'Loading initial ramdisk ...'
        initrd  /boot/initramfs-3.1.2-1.fc16.x86_64.img
}
menuentry 'Fedora Linux, with Linux 3.1.0-7.fc16.x86_64' --class fedora --class gnu-linux --class gnu --class os {
        savedefault
        load_video
        set gfxpayload=keep
        insmod gzio
        insmod part_msdos
        insmod btrfs
        set root='(hd1,msdos6)'
        search --no-floppy --fs-uuid --set=root f52a4cdd-befe-4197-82dc-18a34eacf783
        echo    'Loading Linux 3.1.0-7.fc16.x86_64 ...'
        linux   /boot/vmlinuz-3.1.0-7.fc16.x86_64 root=UUID=f52a4cdd-befe-4197-82dc-18a34eacf783 ro quiet 
        echo    'Loading initial ramdisk ...'
        initrd  /boot/initramfs-3.1.0-7.fc16.x86_64.img
}

Comment 7 aaronsloman 2012-03-23 00:09:22 UTC
This looks like the same problem as bug #756559 - grub2 not booting into new kernel after yum update (I've inserted a description there as comment 9, with a typo corrected in comment 10).

I've done some experiments and provided evidence that grub2 is seriously buggy and the way it is used by yum to insert a new kernel (install or update) is disastrous in the case where a different root partition was previously used. In my case I had Fedora 15 in /dev/sda12 and then installed Fedora 16 in /dev/sda13.

After that, for EVERY invocation of yum install or yum update to get a new kernel while running F16, it puts the wrong partition UUID in /boot/grub2/grub.cfg.

It uses the UUID previously used for F15. This then means that the new system is unbootable unless the grub.cfg file is first edited to specify the correct partition.

It looks as if grub2 is using some totally unreliable algorithm to find the root partition to associate with a new kernel. Surely it should default to the currently used root partition whenever yum update is invoked. Anyhow this accounts for the report at the top of this bug:

"When a kernel is updated, grubby updates the grub.cfg file in /boot/grub2/ with
a UUID that, as far as I can see, it has made up from whole cloth."

In my case it is made up from the root partition in which an earlier version of Fedora was installed.

Comment 8 Peter Trenholme 2012-04-03 21:25:53 UTC
Oops!

I finally put the "set -x" in the new-kernel-pkg script, and discovered that the "problem" was caused because the line describing the root file system in my /etc/fstab was using the incorrect UUID value. Since / is mounted before /etc/fstab is read (obviously), a scan of /etc/fstab for the value of the UUID (or label) to be used for the real boot is somewhat problematic.

Apparently the GRUB2 mkconfig program is somewhat more sophisticated . . .

Anyhow, probably not exactly a bug, just an assumption that's not always true.
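
For anyone wanting to check for the same mismatch, here is a minimal sketch that compares the "/" entry in /etc/fstab with the file system actually mounted on /. The function names are made up for illustration, and it assumes findmnt from util-linux is available.

```shell
# Sketch: detect a stale root entry in /etc/fstab.
# fstab_root_spec and check_fstab_root are hypothetical helper names.

# Print the first field of the "/" entry in an fstab file, skipping comments.
fstab_root_spec() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1; exit }' "${1:-/etc/fstab}"
}

# Compare that entry with the UUID of the file system mounted on "/".
check_fstab_root() {
    spec=$(fstab_root_spec "${1:-/etc/fstab}")
    running=$(findmnt -n -o UUID /)
    case "$spec" in
        UUID=*)
            if [ "${spec#UUID=}" = "$running" ]; then
                echo "fstab root entry matches the mounted root ($running)"
            else
                echo "WARNING: fstab says $spec but / is mounted from UUID=$running" >&2
                return 1
            fi
            ;;
        *)
            echo "fstab root entry is '$spec' (not a UUID); compare by hand"
            ;;
    esac
}
```

Entries that use LABEL= or a device path are only reported, not compared.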

Comment 9 aaronsloman 2012-04-03 23:26:56 UTC
(In reply to comment #8)

> I finally put the "set -x" in the new-kernel-pkg script, and discovered that
> the "problem" was caused because the line describing the root file system in my
> /etc/fstab was using the incorrect UUID value.

I did not think of looking there for the source of the incorrect UUID. Sure enough, it was also in my /etc/fstab.

I can't recall exactly what I did when upgrading from F15 to F16, but somehow that must have caused the old root UUID value to be copied into the new fstab file.

When grubby is used by 'yum update kernel' or similar commands, presumably it should make sure that the UUID it puts into grub.cfg for the root partition is that of the *current* root partition.

Thanks. I'll add a comment in Bug #756559.

> Anyhow, probably not exactly a bug, just an assumption that's not always true.

It's arguable that a mechanism for installing a new kernel that allows a user error in /etc/fstab to produce such a disastrous consequence (booting impossible) is a bug and should be fixed.

Comment 10 Mads Kiilerich 2012-04-03 23:57:37 UTC
(In reply to comment #9)
> When grubby is used by 'yum update kernel' or similar commands, presumably it
> should make sure that the UUID it puts into grub.cfg for the root partition is
> that of the *current* root partition.

Arguably, grubby, when modifying system configuration files, should trust what the system configuration files say, rather than looking at what is currently running.

But if there are diverging opinions on which UUID to use, then it could issue a big fat warning. (It is however not acceptable (or often not noticed) if rpm installation fails or asks, so it would have to choose one solution.) Not being able to find other boot loader configuration entries with the chosen UUID is an even stronger indication that something is seriously wrong.

It also seems wrong if we end up with a situation where the fstab entry for / never is used for anything.
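
The "no other entries use the chosen UUID" check could look roughly like the sketch below. It is an illustration of the idea only, not grubby's behaviour; the function names are hypothetical, and it only warns rather than failing the install.

```shell
# Sketch: warn when no existing grub.cfg entry uses the UUID about to be written.
# count_entries_with_uuid and warn_if_uuid_unused are hypothetical names.

# Count kernel ("linux") lines whose root= argument is exactly the given UUID.
count_entries_with_uuid() {
    grep -E '^[[:space:]]*linux[[:space:]]' "$2" |
        grep -cE "root=UUID=$1([[:space:]]|\$)"
}

# Print a warning on stderr and return non-zero if the UUID is unused so far.
warn_if_uuid_unused() {
    uuid=$1
    cfg="${2:-/boot/grub2/grub.cfg}"
    n=$(count_entries_with_uuid "$uuid" "$cfg")
    if [ "$n" -eq 0 ]; then
        echo "WARNING: no existing entry in $cfg uses root=UUID=$uuid" >&2
        return 1
    fi
    echo "$n existing entries use root=UUID=$uuid"
}
```

On a freshly installed system with a single entry this would not help, so it is a hint that something is wrong rather than proof.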

Comment 11 Peter Trenholme 2012-04-09 20:18:36 UTC
(In reply to comment #10)
> It also seems wrong if we end up with a situation where the fstab entry for /
> never is used for anything.

Since the root= parameter is specified on the kernel line, before the root file system is loaded, of what use is the specification of "/" in /etc/fstab? Clearly it is not used by the Linux kernel or, really, anything else, if I could run my Fedora system for months (from F15 through F17) without noticing that I'd neglected to update the root uuid when I upgraded to a larger drive.

Note that the GRUB2 designers did not use the fstab entry when building its cfg file. Perhaps they realized that the fstab entry for / is, in fact, never used for anything.

Hum, I wonder what would happen if I commented out that line?
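
To see which root the running system was actually booted with, the root= value can be pulled out of the kernel command line. A small sketch follows; cmdline_root is a made-up helper name, and it prints only the first root= word it finds.

```shell
# Sketch: print the value of the first root= parameter in a kernel command line.
# cmdline_root is a hypothetical helper name.
cmdline_root() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | sed -n 's/^root=//p' | head -n 1
}
```

For the running system: cmdline_root "$(cat /proc/cmdline)". Comparing that value with the "/" line in /etc/fstab shows whether the two disagree.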

Comment 12 aaronsloman 2012-04-10 01:59:38 UTC
(In reply to comment #11)
> > It also seems wrong if we end up with a situation where the fstab entry for /
> > never is used for anything.
> 
> Since the root= parameter is specified on the kernel line, before the root file
> system is loaded, of what use is the specification of "/" in /etc/fstab?
> Clearly it is not used by the Linux kernel or, really, anything else, if I
> could run my Fedora system for months (from F15 through F17) without noticing
> that I'd neglected to update the root uuid when I upgraded to a larger drive.

Something similar happened to me when I upgraded from F15 to F16, using a new root partition. The fstab file created by the installer was very sparse, so I copied information about partitions, sizes and functions from the old fstab.

In the process I must have accidentally replaced the entry for "/" with the old one. I did not notice because that did not affect booting.

> Note that the GRUB2 designers did not use the fstab entry when building its cfg file.
> Perhaps they realized that the fstab entry for / is, in fact, never used for
> anything.

Is it possible that the permissions are used? Or the field specifying frequency of backup?

> Hum, I wonder what would happen if I commented out that line?

I wonder too!

Comment 13 Fedora End Of Life 2013-01-16 16:57:56 UTC
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 14 Fedora End Of Life 2013-02-13 21:11:45 UTC
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.