Bug 1411576
| Summary: | [LLNL 7.4 FEAT] Don't exit if /boot is not writable | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Ben Woodard <woodard> |
| Component: | kexec-tools | Assignee: | Bhupesh Sharma <bhsharma> |
| Status: | CLOSED ERRATA | QA Contact: | Qiao Zhao <qzhao> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.4 | CC: | cye, panand, piliu, ruyang, tgummels, xiawu |
| Target Milestone: | rc | Keywords: | FutureFeature |
| Target Release: | 7.4 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | kexec-tools-2.0.14-7.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-08-01 09:31:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1384121, 1384257, 1446211 | ||
As a hack to work around this problem we are currently running with the following patch.
--- kexec-tools_a/kdumpctl 2016-11-23 10:30:14.196149000 -0800
+++ kexec-tools_b/kdumpctl 2016-11-23 11:06:24.771011000 -0800
@@ -590,13 +595,13 @@
fi
if [[ ! -w "$KDUMP_BOOTDIR" ]];then
- echo "$KDUMP_BOOTDIR does not have write permission. Can not rebuild $TARGET_INITRD"
- return 1
+ echo "$KDUMP_BOOTDIR does not have write permission. Will not rebuild $TARGET_INITRD"
+ return 0
+ else
+ echo "Rebuilding $TARGET_INITRD"
+ rebuild_initrd
+ return $?
fi
-
- echo "Rebuilding $TARGET_INITRD"
- rebuild_initrd
- return $?
}
(In reply to Ben Woodard from comment #0) > Description of problem: > The logic in systemd's kdumpctl file aborts and exits if /boot is not > writable. We believe that this is not necessarily the best approach. > > Not being able to rebuild the kdump kernel doesn't mean that the kdump > service cannot run, it simply means that whatever kdump kernel is there is > the one that must be used. > Following is the current implementation: -------------------------------------------------------------------------------- if [ $image_time -eq 0 ]; then echo -n "No kdump initial ramdisk found."; echo elif [ $DEFAULT_DUMP_MODE == "fadump" ] && [ "$initramfs_has_fadump" -eq "0" ]; then echo "$TARGET_INITRD has no fadump support" elif [ "$force_rebuild" != "0" ]; then echo -n "Force rebuild $TARGET_INITRD"; echo elif [ "$system_modified" != "0" ]; then : else return 0 fi if [[ ! -w "$KDUMP_BOOTDIR" ]];then echo "$KDUMP_BOOTDIR does not have write permission. Can not rebuild $TARGET_INITRD" return 1 fi -------------------------------------------------------------------------------- Code flow reaches to check for writeable /boot directory only when it seems that a rebuild is needed, ie system was modified. So, kdumpctrl will still be successful if there was no rebuild condition true. Not sure, if it will be a good idea to return success without attempting rebuild when there was a newer kernel available even for RO /boot. Last time [1] we had discussed to fail the service in such scenario. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1322621#c9 You are missing the point of what I was saying! You are deciding that a rebuild is necessary BEFORE looking if you can make a rebuild and store it. Look at it this way: The system administrator (i.e. God for that system) has decided that the kdump initrd that he has placed in the read-only /boot directory is one that will work for that system and is appropriate for the kernel. End of story. There is ABSOLUTELY NO REASON!!!! that some dumb kdump service script should EVER think that it knows better than the system administrator and through some primitive logic and come to believe a rebuild is appropriate. So first check that /boot is writable and if it is, THEN you can go through all of your logic to decide if you should rebuild. If /boot is not writable load the kdump initrd for the booted kernel. If it doesn't work, it is not your problem. It is the system administrator's problem. He put it there and made /boot read-only. (In reply to Ben Woodard from comment #4) > > Look at it this way: > The system administrator (i.e. God for that system) has decided that the > kdump initrd that he has placed in the read-only /boot directory is one that > will work for that system and is appropriate for the kernel. End of story. > So it would be fine to have a kdump.conf option like force_not_rebuild. When this option is enabled we load the already available kdump initrd. force_not_rebuild is one way to look at but that would add more fiddly options which adds complexity to the configuration, requires documentation...The approach that I've been advocating seems much more elegant. What is wrong with this logic?
if /boot is writable
if you need to rebuild
rebuild kdump initrd
else # you can't rebuild even if you wanted to so don't bother testing
if force_rebuild
exit with an error
# at this point we either have
# a) an initrd which didn't need to be rebuilt
# b) an initrd which was just rebuilt
# c) an initrd which was provided by the sysadmin in a RO /boot directory
load the kdump initrd
if load_failed
exit with an error
There could be risk to silently go ahead for a readonly mounted /boot, sometimes a corrupt disk will be mounted as ro automaticlly or for some other reason one do not intend to do this. For those cases failout should be better. Add a force_rebuild would benefit people who want a perticular initrd and no need any change. For example for me I do not need kernel modules and I always built in most of functionalities in a bzImage, so when I test a new kernel, rebuild kdump initrd is useless. So as per comment 7,it would be good to go with a kdump.conf option like force_not_rebuild. If there is no more conflict then, I will cook a patch for it. That should be OK. ...As long as you do the logic correct so that it doesn't exit if /boot is RO and force_not_rebuild is set. This unequivocally must be tested to work on a diskless machine with a read-only root and boot. That is the fundamental requirement. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2300 |
Description of problem: The logic in systemd's kdumpctl file aborts and exits if /boot is not writable. We believe that this is not necessarily the best approach. Not being able to rebuild the kdump kernel doesn't mean that the kdump service cannot run, it simply means that whatever kdump kernel is there is the one that must be used. In pseudocode: if /boot is writable if kdump.conf includes force_rebuild rebuild_kdump_initrd=1 else if kdump.conf is newer than kdump.initrd rebuild_kdump_initrd=1 else if <whatever other tests> rebuild_kdump_initrd=1 fi else # /boot is not writable if kdump.conf includes force_rebuild exit 1 #this is a legitimate error case rebuild_kdump_initrd=0 fi if rebuild_kdump_initrd == 1 rebuild kdump.initrd fi if kernel-version-kdump.initrd does not exist # something either went wrong building the kdump initrd # or /boot was not writable and a kdump initrd did not exist for this kernel version exit 1 fi load_kdump_kernel <...> The point is just skip all the logic testing if you need to rebuild the kdump initrd if /boot is not writable. This allows a system administrator to place a suitable kdump initrd in a boot directory and not have to worry about whatever logic you are using to decide if you need to rebuild the kdump initrd. It is there call, it is there choice for what they put there, it is their responsibility. If it doesn't work, it is their fault and they will fix it. You do not need to worry about that. As it is, they must go through and reverse engineer all the tests that you make and carefully make sure to disarm them so that the kdump logic ends up deciding that it doesn't need to make a new kdump initrd. You may ask, why /boot is not writable? These are diskless systems that netboot. This is a continuation of previous case. The changes in the logic that you made from 7.2 to 7.3 did not sufficiently address the issue.