Bug 1498169
| Summary: | grubby fatal error: unable to find a suitable template -- with a reproducer | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | R P Herrold <herrold> |
| Component: | grubby | Assignee: | Peter Jones <pjones> |
| Status: | CLOSED WONTFIX | QA Contact: | Release Test Team <release-test-team-automation> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.4 | CC: | adm.fkt.physik, chayang, herrold, juzhang, pjones, qzhang, release-test-team-automation, rharwood, sluo, xfu |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1034591 | Environment: | |
| Last Closed: | 2021-01-15 07:43:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1034591 | ||
| Bug Blocks: | |||
| Attachments: | |||
|
Description
R P Herrold
2017-10-03 15:45:21 UTC
Created attachment 1333789 [details]
from the noise producing update
Created attachment 1333790 [details]
from the noise producing update
Created attachment 1333791 [details]
from the noise producing update
Created attachment 1333792 [details]
from the noise producing update pre update
the SOS report is rather fat and may contain credentials so I do not add it to a public bug

oops did not attach the size indication

[root@billing log]# ls -alh /var/tmp/sosreport-rherrold.new-bug-20171003114040.tar.xz
-rw-------. 1 root root 11M Oct 3 11:42 /var/tmp/sosreport-rherrold.new-bug-20171003114040.tar.xz

I have retained a local copy so it does not get aged away at:

[herrold@centos-7 1498169]$ pwd
/home/herrold/grubby/1498169
[herrold@centos-7 1498169]$ ls -al
total 10532
drwxrwxr-x. 2 herrold herrold 4096 Oct 3 11:53 .
drwxrwxr-x. 3 herrold herrold 4096 Oct 3 11:53 ..
-rw-r--r--. 1 herrold herrold 32376 Oct 3 11:48 grubby
-rw-r--r--. 1 herrold herrold 296 Oct 3 11:37 grubby_prune_debug
-rw-r--r--. 1 herrold herrold 127502 Oct 3 11:44 messages
-rw-------. 1 herrold herrold 10551828 Oct 3 11:53 sosreport-rherrold.new-bug-20171003114040.tar.xz
-rw-r--r--. 1 herrold herrold 25947 Oct 3 11:38 yum.log
-rw-r--r--. 1 herrold herrold 12843 Oct 3 11:37 yum.log-20151117
[herrold@centos-7 1498169]$

the underlying VM is:

CentOS 7 x86_64
Deployed: 2015-11-16
768 MB RAM
12 GB HD
128 kbps BW

internal identifying information

VM Owner: herrold@col
VM Name: vm_36023
Dom0: kvm-n026
VM Arch: x86_64
Virt. Type: kvm
Monthly VM cost: 5200

Please ask if you need any further information, and I can provide it as well

When the next kernel update comes out, I will 'watch' it as well and report new stderr matter as well

snapping a post update backup

Oct 3 12:07:09 secure pmmanLog[8896]: pmmanLog ( | _event_id: 10 | _owner_id: 14 | _vm_id: 772 | _message: [A] VM system quiescent backup (unnamed) has been ordered | _admin: -NULL- | _thread_id:

the problem occurred on another unit, also without a separate /boot (by design in the VM model we are using)

our identifier is: vm_19276

and I have saved the items in the README I will attach in a moment so it does not get aged away

I updated grub*, and kernel-tools* before letting the kernel upgrade run this time so PRE and POST in the file snapshots from /var/log

Created attachment 1350024 [details]
second instance README-20171109 manifest of saved matter for this report part
Created attachment 1350025 [details]
second instance yum.log-PRE
Created attachment 1350026 [details]
second instance yum.log-POST
Created attachment 1350028 [details]
second instance grubby-PRE
Created attachment 1350029 [details]
second instance grubby-POST
Created attachment 1350030 [details]
second instance grubby_prune_debug-PRE
Created attachment 1350031 [details]
second instance grubby_prune_debug-POST
if there is any way to 'dial up' logging, please let me know and I will use it.

the problem is of course that the newly installed kernel is not being pointed to -- this after a SELinux relabel and reboot

[root@nagios log]# rpm -qa kernel\* ; uname -a
kernel-tools-libs-3.10.0-693.5.2.el7.x86_64
kernel-3.10.0-693.2.2.el7.x86_64
kernel-3.10.0-514.26.2.el7.x86_64
kernel-tools-3.10.0-693.5.2.el7.x86_64
kernel-3.10.0-693.5.2.el7.x86_64
kernel-3.10.0-123.el7.x86_64
kernel-3.10.0-514.10.2.el7.x86_64
Linux nagios.pmman.net 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@nagios log]# date
Thu Nov 9 12:00:53 EST 2017
[root@nagios log]# w
12:01:37 up 57 min, 1 user, load average: 0.17, 0.07, 0.10
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
root pts/0 router.owlriver. 11:31 1.00s 0.10s 0.02s w
[root@nagios log]#

two more 'aid to memory' links

https://bugzilla.redhat.com/show_bug.cgi?id=864198
https://github.com/rhboot/grubby/issues/22

The "'there is just one partition' and so looking for a /boot mount" approach is recurring more in new approaches to partition layout. Is there anything I may do to help here?

With the kernel side channel cache leakage exploits, there are now kernel packages in my 'to be installed' queue. Is there anything I can do to provide more information to help get this fixed, when running these changes?

unlike bug #1177843 this '7 unit is already at the offered grubby-8.28-23 level

[root@nagios ~]# rpm -q grubby
grubby-8.28-23.el7.x86_64

so, proceeding through the tests at comment #27 in that prior bug

a. - no kernel line

MAYBE -- see the bottom of this update:

[herrold@centos-7 sysconfig]$ grep kernel grub
GRUB_CMDLINE_LINUX="vconsole.keymap=us crashkernel=auto vconsole.font=latarcyrheb-sun16 rhgb quiet rootflags=nouuid"
[herrold@centos-7 sysconfig]$ cd ../../boot/grub2/
[herrold@centos-7 grub2]$ grep kernel *
grub.cfg: linux16 /boot/vmlinuz-3.10.0-327.10.1.el7.x86_64 root=/dev/vda1 ro vconsole.keymap=us crashkernel=auto vconsole.font=latarcyrheb-sun16 rhgb quiet rootflags=nouuid
grub.cfg: linux16 /boot/vmlinuz-3.10.0-123.13.2.el7.x86_64 root=/dev/vda1 ro vconsole.keymap=us crashkernel=auto vconsole.font=latarcyrheb-sun16 rhgb quiet rootflags=nouuid
grub.cfg: linux16 /boot/vmlinuz-0-rescue-16290318d1874648e91e93d3be661d78 root=/dev/vda1 ro vconsole.keymap=us crashkernel=auto vconsole.font=latarcyrheb-sun16 rhgb quiet rootflags=nouuid
[herrold@centos-7 grub2]$

b. - access(kernel_path, R_OK) fails

no -- the permissions are 'stock'

c. - no root= line and root= isn't on the kernel arguments line

nope

[herrold@centos-7 grub2]$ grep "root=" *
grub.cfg: set root='hd0,msdos1'
grub.cfg: linux16 /boot/vmlinuz-3.10.0-327.10.1.el7.x86_64 root=/dev/vda1 ro vconsole.keymap=us crashkernel=auto vconsole.font=latarcyrheb-sun16 rhgb quiet rootflags=nouuid

d. - root= can't be resolved to a device

no, as above

perhaps as an aside, 'sosreport' did not save a copy of /etc/mtab. I will file this separately

also, when unpacking the tarball as a non-root user from sosreport, it failed on the mknode for ./dev.null ... out of scope here, however

-------------

e. - we couldn't parse /etc/mtab

I know there was a normally permissioned /etc/mtab there, but as noted, 'sosreport' did not save a copy, or the copy is later in the sosreport tarball, so it was not reached ... will test by unpacking as root, in a moment

actually I see the mtab detail as to the 'root' device in another part of the sosreport

sos_commands/filesys/df_-al

Filesystem 1K-blocks Used Available Use% Mounted on
rootfs - - - - /
sysfs 0 0 0 - /sys
proc 0 0 0 - /proc
devtmpfs 369956 0 369956 0% /dev
securityfs 0 0 0 - /sys/kernel/security
tmpfs 379424 0 379424 0% /dev/shm
devpts 0 0 0 - /dev/pts
tmpfs 379424 38228 341196 11% /run
tmpfs 379424 0 379424 0% /sys/fs/cgroup
... snip a bunch of cgroup stuff
configfs 0 0 0 - /sys/kernel/config
/dev/vda1 12571648 3076084 9495564 25% /
selinuxfs 0 0 0 - /sys/fs/selinux

f. -- UUID matters

no UUID on this unit

As always, if there is other information I can provide, please let me know

I have amended my checklist to manually make copies, pre and post of:

/etc/mtab
/etc/sysconfig/grub*

show the accessibility of the kernel

and show the pre and post (here, post a reboot) of:

grubby --info=` grubby --default-kernel `

which should show all this more concisely

INTERESTINGLY, on the unit initially evidencing this, running that command, I get this:

[root@nagios ~]# grubby --info=` grubby --default-kernel `
grubby: kernel not found
[root@nagios ~]# grubby --default-kernel
[root@nagios ~]# rpm -qa kernel
kernel-3.10.0-693.2.2.el7.x86_64
kernel-3.10.0-514.26.2.el7.x86_64
kernel-3.10.0-693.5.2.el7.x86_64
kernel-3.10.0-123.el7.x86_64
kernel-3.10.0-514.10.2.el7.x86_64
[root@nagios ~]#
[root@nagios ~]# rpm -V kernel
.......T. /lib/modules/3.10.0-123.el7.x86_64/modules.devname
.......T. /lib/modules/3.10.0-123.el7.x86_64/modules.softdep
[root@nagios ~]#

so somewhere along the way, your bullet item 1 seems to have become unresolvable to 'grubby', and the updates stopped being applied ... see that the running kernel is the oldest installed kernel

[root@nagios proc]# cat cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10.0-123.el7.x86_64 root=/dev/vda1 ro vconsole.keymap=us crashkernel=auto vconsole.font=latarcyrheb-sun16 rhgb quiet rootflags=nouuid
[root@nagios proc]# uname -a
Linux nagios.pmman.net 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@nagios proc]#

I pulled the 7.5 beta SRPMs -- do you wish me to build, and then update grubby, and possibly more, before re-testing?

issue observed again today

is this patch suitable for RHEL 7 space?

https://src.fedoraproject.org/rpms/grub2/c/db7cf3a089075af0f4a8b955af508aea3893465a

(this from the mailing list thread:
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/UGUSZIBOLV4XUZPKZ3ZTYZ2HJO36KPES/ )

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened. |
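
For anyone repeating the manual walk through tests a. to c. above, the following is a minimal sketch that runs the same three checks against a grub.cfg in one pass. It is not grubby's own logic, only an approximation of the conditions as quoted in this report. The function names `list_kernel_lines` and `check_template_candidates` are hypothetical, and the sketch assumes that kernel entries begin with `linux`, `linux16` or `linuxefi`.

```shell
#!/bin/sh
# Hypothetical helpers; an approximation of checks a. to c., not grubby code.

# list_kernel_lines CFG
# Print every line of CFG that starts a kernel entry (check a.).
# Assumes entries use the linux, linux16 or linuxefi keyword.
list_kernel_lines() {
    grep -E '^[[:space:]]*linux(16|efi)?[[:space:]]' "$1"
}

# check_template_candidates CFG
# For each kernel line, report whether the kernel path is readable by the
# current user (check b.) and whether root= is among its arguments (check c.).
check_template_candidates() {
    list_kernel_lines "$1" | while read -r _keyword path args; do
        case " $args " in
            *" root="*) root_arg=yes ;;
            *)          root_arg=no ;;
        esac
        if [ -r "$path" ]; then
            readable=yes
        else
            readable=no
        fi
        printf '%s readable=%s root_arg=%s\n' "$path" "$readable" "$root_arg"
    done
}
```

Running `check_template_candidates /boot/grub2/grub.cfg` as root prints one line per kernel entry. No output at all corresponds to the "no kernel line" case in a. The sketch does not attempt checks d. to f. (resolving root= to a device, parsing /etc/mtab, UUID handling).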