Bug 705572
Summary: | EC2 missing /etc/sysconfig/kernel | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Mike McGrath <mmcgrath> |
Component: | releng | Assignee: | Jay Greguske <jgreguske> |
Status: | CLOSED WONTFIX | QA Contact: | wes hayutin <whayutin> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 6.0 | CC: | atodorov, cmorgan, dmach, drjones, jgreguske, pbonzini, rwilliam, sghosh, syeghiay |
Target Milestone: | rc | Keywords: | EC2 |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2011-10-18 15:05:26 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 729586 | ||
Bug Blocks: |
Description
Mike McGrath
2011-05-17 20:45:58 UTC
Is this something that we need to add to the new images? I'm trying to understand the impact of this issue. It seems like this needs to be a KB entry and any new images need to account for this, but my concern is that a customer running RHEL 5.5 and performing a yum update will not be using the new kernel. Jay G -- what urgency do you suspect is needed here?

This seems like a rather big deal. I thought we tested kernel updating as part of QE?

One additional change is needed: /etc/grub.conf needs to be a symlink to /boot/grub/grub.conf

I have instructed Jay G. to respin the images used for the public hourly on-demand offering, and we will keep the existing Cloud Access images the same and provide a KB article for those customers. I have also pinged AWS and requested assistance -- they may have a way of injecting a couple of files into an existing AMI while keeping the same ID, but I am not confident. All future images should include this fix. I would also recommend an automated test be added to check for this.

Could we get these fixes in a patch that could be made available? My concern is now turning to Cloud Access customers who may already be running instances of the images. Subhendu -- is there a way to handle this as a kernel patch with an "if /etc/sysconfig/kernel does not exist, then create it" check? Or some similar method? I assume this affects both RHEL 5 and RHEL 6.

Automated tests have been created for this bug. The tests include:

1. test for /etc/sysconfig/kernel
2. test that UPDATEDEFAULT=yes is in /etc/sysconfig/kernel
3. test that DEFAULTKERNEL=kernel is in /etc/sysconfig/kernel
4. update the kernel, reboot, verify latest installed kernel is running

This test fails:

    [root@ip-10-66-95-94 ~]# rpm -q kernel
    kernel-2.6.32-71.29.1.el6.i686
    kernel-2.6.32-131.0.15.el6.i686
    [root@ip-10-66-95-94 ~]# cat /etc/sysconfig/kernel
    # UPDATEDEFAULT specifies if new-kernel-pkg should make
    # new kernels the default
    UPDATEDEFAULT=yes
    # DEFAULTKERNEL specifies the default kernel package type
    DEFAULTKERNEL=kernel
    [root@ip-10-66-95-94 ~]# cat /boot/grub/grub.conf
    default=0
    timeout=0
    hiddenmenu
    title RHEL-6.0-Starter-EBS-i386-11.raw (2.6.32-71.29.1.el6.i686)
    root (hd0)
    kernel /boot/vmlinuz-2.6.32-71.29.1.el6.i686 ro root=LABEL=_/
    initrd /boot/initramfs-2.6.32-71.29.1.el6.i686.img
    [root@ip-10-66-95-94 ~]# ll /etc/grub.conf
    lrwxrwxrwx. 1 root root 20 May 18 16:23 /etc/grub.conf -> /boot/grub/grub.conf
    [root@ip-10-66-95-94 ~]# uname -a
    Linux ip-10-66-95-94 2.6.32-71.29.1.el6.i686 #1 SMP Thu Apr 21 15:57:30 EDT 2011 i686 i686 i386 GNU/Linux
    [root@ip-10-66-95-94 ~]#

    [root@ip-10-66-95-94 ~]# rpm -q kernel
    kernel-2.6.32-71.29.1.el6.i686
    kernel-2.6.32-131.0.15.el6.i686
    [root@ip-10-66-95-94 ~]# rpm -e kernel-2.6.32-131.0.15.el6.i686
    [root@ip-10-66-95-94 ~]# yum update kernel
    Loaded plugins: amazon-id, security
    Setting up Update Process
    Resolving Dependencies
    --> Running transaction check
    ---> Package kernel.i686 0:2.6.32-131.0.15.el6 will be installed
    --> Finished Dependency Resolution

    Dependencies Resolved

    ================================================================================
     Package   Arch   Version               Repository                         Size
    ================================================================================
    Installing:
     kernel    i686   2.6.32-131.0.15.el6   rhui-us-east-rhel-server-updates   21 M

    Transaction Summary
    ================================================================================
    Install       1 Package(s)

    Total download size: 21 M
    Installed size: 81 M
    Is this ok [y/N]: y
    Downloading Packages:
    kernel-2.6.32-131.0.15.el6.i686.rpm | 21 MB 00:01
    Running rpm_check_debug
    Running Transaction Test
    Transaction Test Succeeded
    Running Transaction
    Warning: RPMDB altered outside of yum.
    Installing : kernel-2.6.32-131.0.15.el6.i686 1/1
    grubby fatal error: unable to find a suitable template

    Installed:
    kernel.i686 0:2.6.32-131.0.15.el6

    Complete!

There is a 3rd step: /etc/blkid/blkid.tab needs to exist and be complete. This can be done by running `blkid /dev/xvda` prior to the kernel update. Normally Anaconda does this indirectly during installation. Not sure why we're only seeing this for x86_64 yet.

    [whayutin@localhost cloudEngine]$ ssh -i ~/cloude-key.pem root.amazonaws.com
    The authenticity of host 'ec2-50-19-20-60.compute-1.amazonaws.com (50.19.20.60)' can't be established.
    RSA key fingerprint is 6f:ae:3b:7d:29:0d:3d:3c:be:48:d1:f4:c6:2f:3c:f9.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'ec2-50-19-20-60.compute-1.amazonaws.com,50.19.20.60' (RSA) to the list of known hosts.
    [root@ip-10-245-41-76 ~]# cat /etc/sysconfig/kernel
    # UPDATEDEFAULT specifies if new-kernel-pkg should make
    # new kernels the default
    UPDATEDEFAULT=yes
    # DEFAULTKERNEL specifies the default kernel package type
    DEFAULTKERNEL=kernel
    [root@ip-10-245-41-76 ~]# cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 6.1 (Santiago)
    [root@ip-10-245-41-76 ~]#

    [whayutin@localhost cloudEngine]$ ssh -i ~/cloude-key.pem root.amazonaws.com
    The authenticity of host 'ec2-184-73-75-75.compute-1.amazonaws.com (184.73.75.75)' can't be established.
    RSA key fingerprint is eb:45:5e:36:74:ae:11:eb:ac:4a:3b:6c:35:3f:66:d0.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'ec2-184-73-75-75.compute-1.amazonaws.com,184.73.75.75' (RSA) to the list of known hosts.
    [root@domU-12-31-39-0F-21-C1 ~]# cat /etc/sysconfig/kernel
    # UPDATEDEFAULT specifies if new-kernel-pkg should make
    # new kernels the default
    UPDATEDEFAULT=yes
    # DEFAULTKERNEL specifies the default kernel package type
    DEFAULTKERNEL=kernel
    [root@domU-12-31-39-0F-21-C1 ~]# uname -a
    Linux domU-12-31-39-0F-21-C1 2.6.32-131.4.1.el6.x86_64 #1 SMP Fri Jun 10 10:54:26 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
    [root@domU-12-31-39-0F-21-C1 ~]# cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 6.1 (Santiago)
    [root@domU-12-31-39-0F-21-C1 ~]#

    [root@ip-10-245-41-76 blkid]# ls /etc/blkid/
    [root@ip-10-245-41-76 blkid]#

    [root@ip-10-245-41-76 blkid]# ls /etc/blkid/
    [root@ip-10-245-41-76 blkid]#

waiting on jgreguske to comment

(In reply to comment #12)
> [root@ip-10-245-41-76 blkid]# ls /etc/blkid/
> [root@ip-10-245-41-76 blkid]#
>
> [root@ip-10-245-41-76 blkid]# ls /etc/blkid/
> [root@ip-10-245-41-76 blkid]#
>
> waiting on jgreguske to comment

Moving back to ASSIGNED. This is still empty with the latest Beta tree. /etc/sysconfig/kernel is present and configured as expected.

In rc.local we have this:

    if [ ! -f /etc/blkid/blkid.tab ] ; then
        blkid /dev/xvda &>/dev/null
    fi

But I bet because of bug 729586, it doesn't work.

(In reply to comment #15)
> In rc.local we have this:
>
> if [ ! -f /etc/blkid/blkid.tab ] ; then
> blkid /dev/xvda &>/dev/null
> fi
>
> But I bet because of bug 729586, it doesn't work.

The jury is still out on how/if we change the behavior that bug 729586 describes. This rc.local script should be generated using a label or uuid with blkid, i.e.

    blkid $(blkid -U some-UUID) &>/dev/null

Block devices are generated on-demand in EC2. How would you know the UUID in advance?

(In reply to comment #17)
> Block devices are generated on-demand in EC2. How would you know the UUID in
> advance?
Ok, so why is /dev/xvda there if it's supposed to be for any on-demand block device? When you dynamically attach a disk you can choose what the target name will be, and that name stays the way you choose it as long as it's xvd-something. So there shouldn't be a problem with on-demand disk naming. Maybe I'm missing something completely, and if I am, then please fill me in on how AMIs are using xvd[abcd] in their scripts.

I don't have any science to say why using a disk label won't work. I just know we tried and failed in the past. We'll investigate it. But I was asking about UUIDs in my previous comment. If the (root) block device is being created for each instance we boot, the UUID would change for each instance, would it not? I don't think it is possible to use that in fstab because we don't know what it will be.

You are right that using the UUID is not possible. However, we'd really like to understand why we are even using sda rather than xvda.

Flipping back from assigned to share the pain with others.

Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.
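
For reference, the static checks discussed in this bug (the /etc/sysconfig/kernel contents, the /etc/grub.conf symlink, and a populated /etc/blkid/blkid.tab) can be sketched as a small shell function. This is only an illustration under stated assumptions, not the QE test that was actually written: the name `check_image` and its optional image-root argument are hypothetical, the symlink target is assumed to be the absolute path shown in the `ll` output above, and "complete" for blkid.tab is approximated as "exists and is non-empty". The update-and-reboot step (test 4) is not covered.

```shell
#!/bin/bash
# Hypothetical helper, not the actual QE test from this bug.
# Usage: check_image [IMAGE_ROOT]   (no argument checks the running system)
check_image() {
    local root="${1%/}"                      # strip a trailing slash; "" means /
    local conf="$root/etc/sysconfig/kernel"
    local failures=0

    # Tests 1-3 from the comment above: file exists and carries both settings.
    if [ -f "$conf" ]; then
        if ! grep -q '^UPDATEDEFAULT=yes$' "$conf"; then
            echo "FAIL: UPDATEDEFAULT=yes not found in $conf"
            failures=$((failures + 1))
        fi
        if ! grep -q '^DEFAULTKERNEL=kernel$' "$conf"; then
            echo "FAIL: DEFAULTKERNEL=kernel not found in $conf"
            failures=$((failures + 1))
        fi
    else
        echo "FAIL: $conf is missing"
        failures=$((failures + 1))
    fi

    # /etc/grub.conf should be a symlink to /boot/grub/grub.conf
    # (assumption: absolute target, as in the ll output in this bug).
    if [ "$(readlink "$root/etc/grub.conf")" != "/boot/grub/grub.conf" ]; then
        echo "FAIL: $root/etc/grub.conf is not a symlink to /boot/grub/grub.conf"
        failures=$((failures + 1))
    fi

    # Assumption: "exists and is complete" is approximated as non-empty.
    if [ ! -s "$root/etc/blkid/blkid.tab" ]; then
        echo "FAIL: $root/etc/blkid/blkid.tab is missing or empty"
        failures=$((failures + 1))
    fi

    # Return success only if every check passed.
    [ "$failures" -eq 0 ]
}
```

Running `check_image` with no argument inspects the booted instance, while `check_image /mnt/image` (an example path) inspects an image tree mounted elsewhere, which would let the check run before an AMI is published.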