Bug 705572

Summary: EC2 missing /etc/sysconfig/kernel
Product: Red Hat Enterprise Linux 6
Reporter: Mike McGrath <mmcgrath>
Component: releng
Assignee: Jay Greguske <jgregusk>
Status: CLOSED WONTFIX
QA Contact: wes hayutin <whayutin>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 6.0
CC: atodorov, cmorgan, dmach, drjones, jgregusk, pbonzini, rwilliam, sghosh, syeghiay
Target Milestone: rc
Keywords: EC2
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-10-18 11:05:26 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Bug Depends On: 729586
Bug Blocks:

Description Mike McGrath 2011-05-17 16:45:58 EDT
As part of a normal install, anaconda creates /etc/sysconfig/kernel with the following information:

# UPDATEDEFAULT specifies if new-kernel-pkg should make
# new kernels the default
UPDATEDEFAULT=yes

# DEFAULTKERNEL specifies the default kernel package type
DEFAULTKERNEL=kernel


EOF

Since our EC2 images don't use anaconda, this file isn't getting created.  This means that when users run "yum update", the new kernel package gets pulled down and installed, but /boot/grub/menu.lst is not updated, so users continue to run the old kernel even after rebooting.
Comment 1 Chris Morgan 2011-05-17 16:59:07 EDT
Is this something that we need to add to the new images?  I'm trying to understand the impact of this issue.  It seems like this needs to be a KB entry and any new images need to account for this, but my concern is that a customer running RHEL 5.5 and performing a yum update will not be using the new kernel.

Jay G -- what urgency do you suspect is needed here?  This seems like a rather big deal.  I thought we tested kernel updating as part of QE?
Comment 3 Mike McGrath 2011-05-17 17:05:10 EDT
One additional change is needed: /etc/grub.conf needs to be a symlink to /boot/grub/grub.conf
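
A minimal sketch of that symlink step, assuming the paths named in this comment. The helper name ensure_grub_symlink is hypothetical, and the choice to refuse to replace a regular file is an assumption rather than anything decided in this bug:

```shell
# Sketch: make sure /etc/grub.conf is a symlink to the real GRUB config.
# ensure_grub_symlink is a hypothetical helper name, not part of any package.
ensure_grub_symlink() {
    local target="${1:-/boot/grub/grub.conf}"
    local link="${2:-/etc/grub.conf}"

    # Nothing to do if the link already points at the target.
    if [ -L "$link" ] && [ "$(readlink "$link")" = "$target" ]; then
        return 0
    fi

    # Assumption: never clobber a regular file; leave that to the admin.
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "refusing to replace non-symlink $link" >&2
        return 1
    fi

    # -n keeps ln from following an existing link that points at a directory.
    ln -sfn "$target" "$link"
}
```

Run as root, calling ensure_grub_symlink with no arguments would act on the default paths.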
Comment 4 Chris Morgan 2011-05-17 17:10:59 EDT
I have instructed Jay G. to respin the images used for the public hourly on-demand offering; we will keep the existing Cloud Access images the same and provide a KB article for those customers.  I have also pinged AWS and requested assistance -- they may have a way of injecting a couple of files into an existing AMI while keeping the same ID, but I am not confident.

All future images should include this fix.  I would also recommend an automated test be added to check for this.
Comment 5 Chris Morgan 2011-05-17 17:40:10 EDT
Could we get these fixes in a patch that could be made available?  My concern
is now turning to Cloud Access customers who may already be running instances
of the images.
Comment 6 Chris Morgan 2011-05-17 17:55:29 EDT
Subhendu -- Is there a way to handle this as a kernel patch with a check along the lines of "if /etc/sysconfig/kernel does not exist, then create it"?  Or some similar method?
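
A minimal sketch of such a create-if-missing check, using the file contents quoted in the description. The helper name ensure_kernel_sysconfig is hypothetical, and how it would be delivered (kernel package scriptlet, KB article, or image build step) is left open here:

```shell
# Sketch: create the kernel sysconfig file only when it does not exist.
# ensure_kernel_sysconfig is a hypothetical helper name.
ensure_kernel_sysconfig() {
    local conf="${1:-/etc/sysconfig/kernel}"

    # Leave an existing file (and any local edits) untouched.
    if [ -e "$conf" ]; then
        return 0
    fi

    # Same contents that anaconda writes during a normal install.
    cat > "$conf" <<'EOF'
# UPDATEDEFAULT specifies if new-kernel-pkg should make
# new kernels the default
UPDATEDEFAULT=yes

# DEFAULTKERNEL specifies the default kernel package type
DEFAULTKERNEL=kernel
EOF
}
```

Because the function returns early when the file exists, running it more than once is harmless.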
Comment 7 Chris Morgan 2011-05-17 17:57:24 EDT
I assume this affects both RHEL 5 and RHEL 6.
Comment 8 wes hayutin 2011-05-18 09:58:40 EDT
Automated tests have been created for this bug.

Tests include:

1. test for /etc/sysconfig/kernel 
2. test that UPDATEDEFAULT=yes is in /etc/sysconfig/kernel
3. test that DEFAULTKERNEL=kernel is in /etc/sysconfig/kernel
4. update the kernel, reboot, verify latest installed kernel is running
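
The checks listed above could be sketched roughly as follows. This is not the actual QE test code; both function names are hypothetical. The second function assumes that the most recently installed kernel package (rpm -q --last) is also the newest version, and that uname -r includes the architecture suffix as it does on RHEL 6:

```shell
# Checks 1-3: the file exists and carries both expected settings.
# check_kernel_sysconfig is a hypothetical helper name.
check_kernel_sysconfig() {
    local conf="${1:-/etc/sysconfig/kernel}"

    [ -f "$conf" ] || { echo "FAIL: $conf is missing" >&2; return 1; }
    grep -q '^UPDATEDEFAULT=yes$' "$conf" \
        || { echo "FAIL: UPDATEDEFAULT=yes not set in $conf" >&2; return 1; }
    grep -q '^DEFAULTKERNEL=kernel$' "$conf" \
        || { echo "FAIL: DEFAULTKERNEL=kernel not set in $conf" >&2; return 1; }
    echo "PASS: $conf"
}

# Check 4, to be run after "yum update kernel" and a reboot.
# Assumption: the most recently installed kernel package is the newest one.
check_running_kernel() {
    local running newest
    running="kernel-$(uname -r)"
    newest="$(rpm -q --last kernel | head -n 1 | awk '{print $1}')"
    [ "$running" = "$newest" ]
}
```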
Comment 9 wes hayutin 2011-05-20 17:50:47 EDT
This test fails


[root@ip-10-66-95-94 ~]# rpm -q kernel
kernel-2.6.32-71.29.1.el6.i686
kernel-2.6.32-131.0.15.el6.i686
[root@ip-10-66-95-94 ~]# cat /etc/sysconfig/kernel 
# UPDATEDEFAULT specifies if new-kernel-pkg should make
# new kernels the default
UPDATEDEFAULT=yes

# DEFAULTKERNEL specifies the default kernel package type
DEFAULTKERNEL=kernel
[root@ip-10-66-95-94 ~]# cat /boot/grub/grub.conf 
default=0
timeout=0
hiddenmenu
title RHEL-6.0-Starter-EBS-i386-11.raw (2.6.32-71.29.1.el6.i686)
        root (hd0)
        kernel /boot/vmlinuz-2.6.32-71.29.1.el6.i686 ro root=LABEL=_/
        initrd /boot/initramfs-2.6.32-71.29.1.el6.i686.img
[root@ip-10-66-95-94 ~]# ll /etc/grub.conf
lrwxrwxrwx. 1 root root 20 May 18 16:23 /etc/grub.conf -> /boot/grub/grub.conf
[root@ip-10-66-95-94 ~]# uname -a
Linux ip-10-66-95-94 2.6.32-71.29.1.el6.i686 #1 SMP Thu Apr 21 15:57:30 EDT 2011 i686 i686 i386 GNU/Linux
[root@ip-10-66-95-94 ~]# 
[root@ip-10-66-95-94 ~]# 
[root@ip-10-66-95-94 ~]# 
[root@ip-10-66-95-94 ~]# 
[root@ip-10-66-95-94 ~]# 
[root@ip-10-66-95-94 ~]# rpm -q kernel
kernel-2.6.32-71.29.1.el6.i686
kernel-2.6.32-131.0.15.el6.i686
[root@ip-10-66-95-94 ~]# rpm -e kernel-2.6.32-131.0.15.el6.i686
[root@ip-10-66-95-94 ~]# yum update kernel
Loaded plugins: amazon-id, security
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package kernel.i686 0:2.6.32-131.0.15.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

==========================================================================================================================
 Package            Arch             Version                           Repository                                    Size
==========================================================================================================================
Installing:
 kernel             i686             2.6.32-131.0.15.el6               rhui-us-east-rhel-server-updates              21 M

Transaction Summary
==========================================================================================================================
Install       1 Package(s)

Total download size: 21 M
Installed size: 81 M
Is this ok [y/N]: y
Downloading Packages:
kernel-2.6.32-131.0.15.el6.i686.rpm                                                                |  21 MB     00:01     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Warning: RPMDB altered outside of yum.
  Installing : kernel-2.6.32-131.0.15.el6.i686                                                                        1/1 
grubby fatal error: unable to find a suitable template

Installed:
  kernel.i686 0:2.6.32-131.0.15.el6                                                                                       

Complete!
Comment 10 Jay Greguske 2011-05-24 17:25:16 EDT
There is a 3rd step: /etc/blkid/blkid.tab needs to exist and be complete. This can be done by running `blkid /dev/xvda` prior to the kernel update. Normally Anaconda does this indirectly during installation. Not sure why we're only seeing this for x86_64 yet.
Comment 11 wes hayutin 2011-06-21 14:01:17 EDT
[whayutin@localhost cloudEngine]$ ssh -i ~/cloude-key.pem root@ec2-50-19-20-60.compute-1.amazonaws.com
The authenticity of host 'ec2-50-19-20-60.compute-1.amazonaws.com (50.19.20.60)' can't be established.
RSA key fingerprint is 6f:ae:3b:7d:29:0d:3d:3c:be:48:d1:f4:c6:2f:3c:f9.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-50-19-20-60.compute-1.amazonaws.com,50.19.20.60' (RSA) to the list of known hosts.
[root@ip-10-245-41-76 ~]# cat /etc/sysconfig/kernel 
# UPDATEDEFAULT specifies if new-kernel-pkg should make
# new kernels the default
UPDATEDEFAULT=yes

# DEFAULTKERNEL specifies the default kernel package type
DEFAULTKERNEL=kernel

[root@ip-10-245-41-76 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.1 (Santiago)
[root@ip-10-245-41-76 ~]# 

[whayutin@localhost cloudEngine]$ ssh -i ~/cloude-key.pem root@ec2-184-73-75-75.compute-1.amazonaws.com
The authenticity of host 'ec2-184-73-75-75.compute-1.amazonaws.com (184.73.75.75)' can't be established.
RSA key fingerprint is eb:45:5e:36:74:ae:11:eb:ac:4a:3b:6c:35:3f:66:d0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-184-73-75-75.compute-1.amazonaws.com,184.73.75.75' (RSA) to the list of known hosts.
[root@domU-12-31-39-0F-21-C1 ~]# cat /etc/sysconfig/kernel 
# UPDATEDEFAULT specifies if new-kernel-pkg should make
# new kernels the default
UPDATEDEFAULT=yes

# DEFAULTKERNEL specifies the default kernel package type
DEFAULTKERNEL=kernel
[root@domU-12-31-39-0F-21-C1 ~]# uname -a
Linux domU-12-31-39-0F-21-C1 2.6.32-131.4.1.el6.x86_64 #1 SMP Fri Jun 10 10:54:26 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@domU-12-31-39-0F-21-C1 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.1 (Santiago)
[root@domU-12-31-39-0F-21-C1 ~]# 
[root@domU-12-31-39-0F-21-C1 ~]#
Comment 12 wes hayutin 2011-06-21 14:03:22 EDT
[root@ip-10-245-41-76 blkid]# ls /etc/blkid/
[root@ip-10-245-41-76 blkid]# 

[root@ip-10-245-41-76 blkid]# ls /etc/blkid/
[root@ip-10-245-41-76 blkid]# 

waiting on jgreguske to comment
Comment 14 Alexander Todorov 2011-10-05 09:24:42 EDT
(In reply to comment #12)
> [root@ip-10-245-41-76 blkid]# ls /etc/blkid/
> [root@ip-10-245-41-76 blkid]# 
> 
> [root@ip-10-245-41-76 blkid]# ls /etc/blkid/
> [root@ip-10-245-41-76 blkid]# 
> 
> waiting on jgreguske to comment

Moving back to ASSIGNED. This is still empty with the latest Beta tree. /etc/sysconfig/kernel is present and configured as expected.
Comment 15 Jay Greguske 2011-10-05 10:26:18 EDT
In rc.local we have this:

if [ ! -f /etc/blkid/blkid.tab ] ; then
    blkid /dev/xvda &>/dev/null
fi

But I bet because of bug 729586, it doesn't work.
Comment 16 Andrew Jones 2011-10-05 10:56:11 EDT
(In reply to comment #15)
> In rc.local we have this:
> 
> if [ ! -f /etc/blkid/blkid.tab ] ; then
>     blkid /dev/xvda &>/dev/null
> fi
> 
> But I bet because of bug 729586, it doesn't work.

The jury is still out on how/if we change the behavior that bug 729586 describes. This rc.local script should be generated using a label or uuid with blkid, i.e.

blkid $(blkid -U some-UUID) &>/dev/null
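
A label-based variant of the same idea might look like the sketch below. The helper name refresh_blkid_cache is hypothetical, the default label _/ is taken from the root=LABEL=_/ entry in the grub.conf shown in comment 9, and whether this actually populates /etc/blkid/blkid.tab on EC2 is untested here:

```shell
# Sketch: resolve the root device by filesystem label, then probe it.
# refresh_blkid_cache is a hypothetical helper name.
refresh_blkid_cache() {
    local label="${1:-_/}"
    local dev

    # "blkid -L" prints the device that carries the given label.
    dev="$(blkid -L "$label")" || return 1
    [ -n "$dev" ] || return 1

    # Probe the resolved device, as the rc.local snippet does for /dev/xvda.
    blkid "$dev" >/dev/null 2>&1
}
```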
Comment 17 Jay Greguske 2011-10-05 11:04:13 EDT
Block devices are generated on-demand in EC2. How would you know the UUID in advance?
Comment 18 Andrew Jones 2011-10-05 11:19:07 EDT
(In reply to comment #17)
> Block devices are generated on-demand in EC2. How would you know the UUID in
> advance?

Ok, so why is /dev/xvda there if it's supposed to be for any on-demand block device? When you dynamically attach a disk you can choose what the target name will be, and that name stays the way you choose it as long as it's xvd-something. So there shouldn't be a problem with on-demand disk naming. Maybe I'm missing something completely, and if I am, then please fill me in on how AMIs are using xvd[abcd] in their scripts.
Comment 19 Jay Greguske 2011-10-05 12:02:59 EDT
I don't have any science to say why using a disk label won't work. I just know we tried and failed in the past. We'll investigate it.

But I was asking about UUIDs in my previous comment. If the (root) block device is being created for each instance we boot, the UUID would change for each instance, would it not? I don't think it is possible to use that in fstab because we don't know what it will be.
Comment 20 Paolo Bonzini 2011-10-06 07:21:42 EDT
You are right that using the UUID is not possible.  However, we'd really like to understand why we are even using sda rather than xvda.
Comment 22 Jay Greguske 2011-10-11 17:25:54 EDT
Flipping back from assigned to share the pain with others.
Comment 24 RHEL Product and Program Management 2011-10-18 11:05:26 EDT
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.