Bug 596829

Summary: grubby doesn't update grub.conf after F12 to F13 upgrade
Product: [Fedora] Fedora
Reporter: Othman Madjoudj <athmanem>
Component: grubby
Assignee: Peter Jones <pjones>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Priority: low
Version: 13
CC: Colin.Simpson, jan.kratochvil, matthias_haase, nerijus, piergiorgio.sartor, pjones
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-08-25 00:13:07 UTC
Attachments:
  strace output of: new-kernel-pkg -v --depmod --install 2.6.33.4-95.fc13.i686 (flags: none)
  grubbystrace.out (flags: none)
  console=ttyS0 qemu-kvm messages on failed boot. (flags: none)

Description Othman Madjoudj 2010-05-27 15:16:04 UTC
Description of problem:

I upgraded my laptop from F12 to F13 using the DVD upgrade method.

After the installation the laptop could not boot because the Fedora installer had not updated grub.conf, so I rebooted into rescue mode and updated grub.conf manually. The system now boots with kernel 2.6.33.3-85.fc13.i686, but when I updated the kernel with yum, the package kernel-2.6.33.4-95.fc13 was installed and grub.conf was again not updated.

Version-Release number of selected component (if applicable):
kernel-2.6.33.3-85.fc13.i686
grubby-7.0.13-1.fc13.i686

How reproducible:
Every time I upgrade the kernel package.

Steps to Reproduce:
1. Log in as root (su -)
2. Update the kernel (yum update kernel)
  
Actual results:
grub.conf is not updated with the new kernel entry

Expected results:
grub.conf updated with the new kernel entry


Additional info:

# yum update kernel

[...]

 Installing     : kernel-2.6.33.4-95.fc13.i686                             1/3 
grubby recieved SIGSEGV!  Backtrace (8):
/sbin/grubby[0x804f898]
[0xcaf400]
/lib/libc.so.6[0x34bf88]
/sbin/grubby[0x804e1d4]
/sbin/grubby[0x804e32d]
/sbin/grubby[0x804f644]
/lib/libc.so.6(__libc_start_main+0xe6)[0x230cc6]
/sbin/grubby[0x80490a1]

[...]

Comment 1 Othman Madjoudj 2010-05-27 16:58:05 UTC
When I remove a kernel, e.g.:

# yum remove kernel-2.6.33.4-95.fc13.i686


and move grub.conf aside, i.e.:

# mv /boot/grub/grub.conf /boot/grub/grub.conf.old


and reinstall the kernel, i.e.:

# yum install kernel


I don't get the previous SIGSEGV:

Running Transaction
  Installing     : kernel-2.6.33.4-95.fc13.i686                             1/3 
  Installing     : kmod-wl-2.6.33.4-95.fc13.i686-5.60.48.36-1.fc13.6.i686   2/3 
  Installing     : kmod-wl-5.60.48.36-1.fc13.6.i686                         3/3 

Installed:
  kernel.i686 0:2.6.33.4-95.fc13       kmod-wl.i686 0:5.60.48.36-1.fc13.6      

Dependency Installed:
  kmod-wl-2.6.33.4-95.fc13.i686.i686 0:5.60.48.36-1.fc13.6                      

Complete!

Comment 2 Colin.Simpson 2010-05-29 22:11:49 UTC
You're in a better place than me. I also updated from F12 to F13, and when I force-install the latest kernel I get the same error:

# rpm -Uvh --force kernel-2.6.33.4-95.fc13.x86_64.rpm 
Preparing...                ########################################### [100%]
   1:kernel                 ########################################### [100%]
grubby recieved SIGSEGV!  Backtrace (8):
/sbin/grubby[0x40805f]
/lib64/libc.so.6(+0x32a40)[0x7f2e96ec2a40]
/lib64/libc.so.6(+0x1279ba)[0x7f2e96fb79ba]
/sbin/grubby[0x40695e]
/sbin/grubby[0x406ae3]
/sbin/grubby[0x407e58]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f2e96eaec5d]
/sbin/grubby[0x401709]

When I move /etc/grub.conf aside, the kernel installs without error, but the link from /etc/grub.conf to /boot/grub/grub.conf is not restored. If I manually restore the link and retry, it fails with the above backtrace again. So I wouldn't be so sure it's fixed for you!
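
The link state described above can be checked before and after a kernel install. The following is a minimal sketch, not part of grubby or new-kernel-pkg; the helper name `check_grub_link` and its output strings are hypothetical, and it assumes a `readlink` that supports `-f` (as in GNU coreutils).

```shell
# Hypothetical helper: report the state of the grub.conf symlink.
# Prints "ok", "not-a-symlink" (also when the path is missing),
# or "wrong-target".
check_grub_link() {
    link=${1:-/etc/grub.conf}
    target=${2:-/boot/grub/grub.conf}
    if [ ! -L "$link" ]; then
        echo "not-a-symlink"
    elif [ "$(readlink -f "$link")" = "$(readlink -f "$target")" ]; then
        # Both paths resolve to the same file.
        echo "ok"
    else
        echo "wrong-target"
    fi
}
```

Called with no arguments it checks the default paths named in this comment; both paths can be overridden to try it against a scratch directory.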

I have tried loads of things: reinstalling grub (grub-install) and manually setting up the boot sectors in grub. No difference.

I haven't a clue how to fix it yet. 

My / (which contains /boot) is on an md device, if that makes a difference. I wonder whether something like that matters, as I'd have thought this would be a very active bug if it happened on all F12 to F13 upgrades.

Anyone got any ideas?

Comment 3 Othman Madjoudj 2010-05-29 22:30:34 UTC
Created attachment 417930 [details]
strace output of: new-kernel-pkg -v --depmod --install 2.6.33.4-95.fc13.i686

Comment 4 Othman Madjoudj 2010-05-29 22:33:00 UTC
My grub.conf:

================================================================================
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00
#          initrd /initrd-version.img
#boot=/dev/sda
default=0
timeout=2
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Fedora (2.6.33.4-95.fc13.i686)
	root (hd0,0)
	kernel /vmlinuz-2.6.33.4-95.fc13.i686 ro root=/dev/VolGroup00/LogVol00 rhgb quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=fr
	initrd /initramfs-2.6.33.4-95.fc13.i686.img

====================================[EOF]=======================================

I added the following lines to boot the new kernel:

title Fedora (2.6.33.3-85.fc13.i686)
	root (hd0,0)
	kernel /vmlinuz-2.6.33.3-85.fc13.i686 ro root=/dev/VolGroup00/LogVol00 rhgb quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=fr
	initrd /initramfs-2.6.33.3-85.fc13.i686.img

Comment 5 Othman Madjoudj 2010-05-29 22:36:19 UTC
*Sorry*, I made a mistake in the last comment; it should be:

================================================================================
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00
#          initrd /initrd-version.img
#boot=/dev/sda
default=0
timeout=2
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Fedora (2.6.33.3-85.fc13.i686)
 root (hd0,0)
 kernel /vmlinuz-2.6.33.3-85.fc13.i686 ro root=/dev/VolGroup00/LogVol00 rhgb quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=fr
 initrd /initramfs-2.6.33.3-85.fc13.i686.img    

====================================[EOF]=======================================

I added the following lines to boot the new kernel:

title Fedora (2.6.33.4-95.fc13.i686)
 root (hd0,0)
 kernel /vmlinuz-2.6.33.4-95.fc13.i686 ro root=/dev/VolGroup00/LogVol00 rhgb quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=fr
 initrd /initramfs-2.6.33.4-95.fc13.i686.img
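
Whether a kernel update actually added an entry can be checked without reading grub.conf by eye. The sketch below is only an illustration: the helper names `grub_kernels` and `grub_has_kernel` are hypothetical, and it assumes the GRUB legacy layout shown in this comment, where each entry has a `kernel` line whose second field is the image path.

```shell
# Hypothetical helper: list the kernel images referenced by "kernel"
# lines in a GRUB legacy config, one path per line. Commented-out
# lines are skipped because their first field is "#".
grub_kernels() {
    conf=${1:-/boot/grub/grub.conf}
    awk '$1 == "kernel" { print $2 }' "$conf"
}

# Succeeds if some kernel line references vmlinuz-<version>.
# Usage: grub_has_kernel VERSION [CONFIG]
grub_has_kernel() {
    grub_kernels "$2" | grep -qF -- "vmlinuz-$1"
}
```

For example, `grub_has_kernel 2.6.33.4-95.fc13.i686 && echo present` could be run right after `yum update kernel` to see whether the new entry was written.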

Comment 6 Colin.Simpson 2010-05-30 11:45:53 UTC
Yeah I had to hand hack the grub.conf, but I don't want to do that for every new kernel.

Comment 7 Piergiorgio Sartor 2010-06-15 21:31:38 UTC
I updated 3 (different) systems from F-12 to F-13 and one of these had the same problem. The affected one is x86_64, while the others are i686.

The grub.conf files are very similar across those PCs: no "rhgb", no "quiet", no "hiddenmenu", no "splashimage".

Hope this helps,

bye,

pg

Comment 8 Colin.Simpson 2010-06-18 17:44:08 UTC
Created attachment 425201 [details]
grubbystrace.out

Comment 9 Colin.Simpson 2010-06-18 17:44:43 UTC
My grub.conf kernel line contains rhgb, quiet, etc., i.e.:

title Fedora (2.6.33.5-112.fc13)
	kernel /boot/vmlinuz-2.6.33.5-112.fc13.x86_64 root=/dev/md0 rhgb quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=uk nouveau.modeset=0
	initrd /boot/initramfs-2.6.33.5-112.fc13.x86_64.img

But to be honest I'm not sure it's related to the grub.conf contents at all.
From the strace (attached, generated by the command below), grubby does not seem to get very far into grub.conf; it appears to fail after examining the disk (presumably the boot sector). It's almost as if older grub versions have left something behind that the newer grubby doesn't like or expect. I'd guess all these problems are on F12 to F13 upgrades, though my office machine upgraded fine; it's my home machine that's failing. I tried reinstalling grub and the boot sectors, but it made no difference.

Any thoughts from someone who knows how we might debug this further? It's pretty annoying.


strace -f -o grubbystrace.out /sbin/grubby --add-kernel=/boot/vmlinuz-2.6.33.5-124.fc13.x86_64 --copy-default --make-default --title "Fedora (2.6.33.5-124.fc13.x86_64)" --args=root=/dev/md0 --remove-kernel "=TITLE=Fedora (2.6.33.5-124.fc13.x86_64)" 
grubby recieved SIGSEGV!  Backtrace (8):
/sbin/grubby[0x40805f]
/lib64/libc.so.6[0x3081432a20]
/lib64/libc.so.6[0x30815279fa]
/sbin/grubby[0x40695e]
/sbin/grubby[0x406ae3]
/sbin/grubby[0x407e58]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x308141ec5d]
/sbin/grubby[0x401709]

Comment 10 Jan Kratochvil 2010-07-09 08:53:49 UTC
/sbin/new-kernel-pkg --package kernel --install 2.6.33.6-147.fc13.x86_64

grubby recieved SIGSEGV!  Backtrace (8):
/sbin/grubby[0x40805f]
/lib64/libc.so.6[0x3decc32a20]
/lib64/libc.so.6[0x3decd20796]
/sbin/grubby[0x40695e]
/sbin/grubby[0x406ae3]
/sbin/grubby[0x407e58]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3decc1ec5d]
/sbin/grubby[0x401709]

grubby-7.0.13-1.fc13.x86_64
glibc-2.12-2.x86_64
kernel-2.6.33.5-124.fc13.x86_64

(gdb) bt
#0  __strcmp_sse42 () at ../sysdeps/x86_64/multiarch/strcmp.S:129
#1  0x000000000040695e in suitableImage (entry=<value optimized out>, bootPrefix=<value optimized out>, skipRemoved=<value optimized out>, flags=<value optimized out>) at grubby.c:1297
#2  0x0000000000406ae3 in findTemplate (cfg=0x60c5a0, prefix=0x60c730 "/boot", indexPtr=<value optimized out>, skipRemoved=0, flags=0) at grubby.c:1447
#3  0x0000000000407e58 in main (argc=0, argv=0x7fffffffd030) at grubby.c:3182
(gdb) up
#1  0x000000000040695e in suitableImage (entry=<value optimized out>, bootPrefix=<value optimized out>, skipRemoved=<value optimized out>, flags=<value optimized out>) at grubby.c:1297
1297        if (strcmp(getuuidbydev(rootdev), getuuidbydev(dev))) {
(gdb) p rootdev
$1 = 0x60c820 "/dev/mapper/luks-97047e1e-cf34-4079-a464-69dbe6b5c6ab"
(gdb) p dev
$2 = <value optimized out>
(gdb) p getuuidbydev(rootdev)
$3 = 0x0
(gdb)
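
The session above shows the UUID lookup for rootdev returning NULL and that result being passed straight into a string comparison. The following is a shell analog of the defensive pattern (treat a failed lookup as "no match" before comparing), not grubby's actual code or fix. The names `uuid_of`, `same_uuid` and the `UUID_LOOKUP` override are hypothetical; `blkid -s UUID -o value` is assumed to be available from util-linux.

```shell
# Look up a device's UUID. Defaults to blkid; UUID_LOOKUP is a
# hypothetical hook naming an alternative lookup command, so the
# logic can be exercised without real block devices.
uuid_of() {
    if [ -n "${UUID_LOOKUP:-}" ]; then
        "$UUID_LOOKUP" "$1"
    else
        blkid -s UUID -o value "$1" 2>/dev/null
    fi
}

# Succeed only if both lookups return a non-empty UUID and they match.
# A failed or empty lookup counts as "different" instead of being
# compared.
same_uuid() {
    a=$(uuid_of "$1") || a=
    b=$(uuid_of "$2") || b=
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" = "$b" ]
}
```

With this shape, a device whose node is missing simply fails the comparison rather than aborting the caller.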

# ls -l /dev/mapper
total 0
crw-rw---- 1 root root  10, 62 Jul  1 13:10 control
lrwxrwxrwx 1 root root       7 Jul  1 13:10 luks-97047e1e-cf34-4079-a464-69dbe6b5c6ab -> ../dm-1
brw-rw---- 1 root disk 253,  0 Jul  1 11:10 luks-e6cc7ba8-7dd4-42a4-a5cf-921631b3a52d
# ls -l /dev/dm*
ls: cannot access /dev/dm*: No such file or directory
# cat /proc/partitions | grep dm
 253        1  963425468 dm-1
 253        0   12289084 dm-0

On a working similar system:

# ls -l /dev/mapper
total 0
crw-rw---- 1 root root  10, 62 Jun  1 01:58 control
lrwxrwxrwx 1 root root       7 Jun  1 03:58 luks-356494e6-636a-47b8-92eb-f875786074b2 -> ../dm-1
brw-rw---- 1 root disk 253,  0 Jun  1 02:37 luks-404dfadd-33aa-4047-ae17-403361b5bdd1
# ls -l /dev/dm*
brw-rw---- 1 root disk 253, 0 Jun  1 02:37 /dev/dm-0
brw-rw---- 1 root disk 253, 1 Jun  1 01:58 /dev/dm-1

After:

# mknod -m660 /dev/dm-0 b 253 0; chgrp disk /dev/dm-0
# mknod -m660 /dev/dm-1 b 253 1; chgrp disk /dev/dm-1

grubby works (but it still does not boot).
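
The mismatch found above (devices listed in /proc/partitions with no node under /dev) can be detected with a small check. This is a minimal sketch; the helper name `missing_dev_nodes` is hypothetical, and it assumes the usual /proc/partitions layout of a header line, a blank line, then four columns with the device name last.

```shell
# Hypothetical helper: print every device name listed in a
# /proc/partitions-style file that has no block device node in the
# given /dev directory.
missing_dev_nodes() {
    partitions=${1:-/proc/partitions}
    devdir=${2:-/dev}
    # Skip the header and the blank line; the name is the 4th column.
    awk 'NR > 2 && NF == 4 { print $4 }' "$partitions" |
    while read -r name; do
        [ -b "$devdir/$name" ] || echo "$name"
    done
}
```

Run with no arguments as root, it prints nothing when every listed device has a node; any name it prints is a candidate for the mknod workaround shown above.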

Occasionally I have seen some deleted /dev entries, such as /dev/kvm .
Currently I have deleted even /dev/net (for /dev/net/tun) again.

After boot, the new kernel prints "cannot open root device mapper/luks-9704...".
(Screen photo upon request.)
After reboot (into the older kernel) /dev/dm-{0,1} exist.

Attaching here a boot serial output from:
# qemu-kvm -snapshot -hda /dev/sda -hdb /dev/sdb -net none -serial stdio -m 1024
which is not exactly the same (different HDD controller) but it seems to me to
be the same problem.  The older kernel boots in qemu-kvm fine.  New kernel entry:
title Fedora (2.6.33.6-147.fc13.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.33.6-147.fc13.x86_64 ro root=/dev/mapper/luks-97047e1e-cf34-4079-a464-69dbe6b5c6ab selinux=0 audit=0 crashkernel=auto nomodeset panic=10 SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us console=ttyS0

My system is "unusual" in that it:
 * Does not use LVM (it uses mdadm raid1).
 * It uses LUKS incl. swap (the LUKS-swap bugs are probably irrelevant here).

Comment 11 Jan Kratochvil 2010-07-09 08:54:46 UTC
Created attachment 430572 [details]
console=ttyS0 qemu-kvm messages on failed boot.

Comment 12 Colin.Simpson 2010-08-18 18:39:40 UTC
Updating to grubby-7.0.16-1.fc13 seems to have fixed it for me. Well, it does when run manually with the test grubby command I used above. Cool.

I'll wait for a new kernel to test with an automatic kernel update/install.

Comment 13 Colin.Simpson 2010-08-24 23:47:45 UTC
Installing the new kernel 2.6.33.8-149 does update my grub.conf file correctly. 
So it's fixed for me.

Comment 14 Othman Madjoudj 2010-08-25 00:13:07 UTC
It is fixed with the following versions:

grubby-7.0.16-1.fc13
kernel-2.6.33.8-149.fc13

I think it safe to close the bug.