Bug 981877
Summary: | RFE: Pressing C-A-D three times within 1s should dump systemd's state to console, pressing it 10 times within 1s should reboot the machine hard | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Adam Pribyl <covex> |
Component: | systemd | Assignee: | systemd-maint |
Status: | CLOSED NEXTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | | |
Version: | rawhide | CC: | johannbg, lnykryn, mschmidt, msekleta, plautrba, rvokal, systemd-maint, vpavlin, zbyszek |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | systemd-219-1.fc23 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2015-03-13 05:01:54 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Adam Pribyl
2013-07-06 15:19:53 UTC
The kernel already provides something like that: SysRq + s, u, b, or so: https://www.kernel.org/doc/Documentation/sysrq.txt

Well, we don't enable SysRq by default, but yes, that is one way to solve it. You can find how to enable it and other useful information about SysRq here: https://fedoraproject.org/wiki/QA/Sysrq

Gents, thanks for the valuable advice, but it is too late to enable sysrq when the system is already almost dead. As it is not enabled by default, it's out of the question. I know about sysrq, but my complaint is about systemd and the default reboot shortcut ctrl-alt-del, which does not work in such an emergency case but did work in the "init" age. Maybe systemd could then invoke those kernel sysrq calls on ctrl-alt-del if the file is not found or something?

Systemd needs a lot of files from the rootfs to execute shutdown. The later step during shutdown is to replace/re-exec the running PID1 with another binary from the rootfs called "shutdown". There is really no way to do anything sensible from systemd without a working rootfs. Expecting everything but the config files from the rootfs to work does not make much sense; it's just betting on luck. The kernel can still do things regardless of the broken rootfs; if you need sysrq to cover cases like that, please just enable it on the machine.

After discussion on the test list, I'd like to reopen this bug, as I think systemd's reaction to ctrl-alt-del when there is something wrong with the rootfs is bad overall. Sysrq is not a solution, as pointed out by the initscripts maintainers. It's for debugging, not for the general use case, and under normal conditions nobody expects fs corruption. I think we all agree that pressing the ctrl-alt-del is a clear attempt from user/admin to reboot the computer. It is an intermediate step between a software-initiated reboot and a power reset.
If systemd is missing the .target file, it completely ignores this attempt: it does not print any indication that it cannot complete the requested action (only to the log, which is useless), and it does not attempt to do anything, leaving the system running. In my opinion, this event should have a backup action in case something is wrong (like a missing .target), as init does: it sends TERM and then KILL to all processes in almost all cases on ctrl-alt-del. Whatever I did to init, it at least printed something to tell me what was going on (e.g. on removal of the shutdown binary): 'INIT: cannot execute "/sbin/shutdown"'. If systemd could do TERM->KILL, remount all file systems read-only and halt, it would be superior. A very similar story is when systemctl is corrupted: as systemd links all the reboot/halt/shutdown/poweroff files to systemctl, removing systemctl and asking for a reboot puts systemd in limbo. I would be glad if systemd could do better than init. That's all.

(In reply to Adam Pribyl from comment #5)
> I think we all agree that pressing the ctrl-alt-del is a clear attempt from
> user/admin to reboot the computer.

In most cases this is true, but it does not have to be, and it is not documented as such.

From "man systemd":

SIGINT
    Upon receiving this signal the systemd system manager will start the ctrl-alt-del.target unit. This is mostly equivalent to systemctl start ctrl-alt-del.target.

(Notice that here it never mentions reboot.)

From "man systemd.special":

ctrl-alt-del.target
    systemd starts this target whenever Control+Alt+Del is pressed on the console. Usually this should be aliased (symlinked) to reboot.target.

Notice "Usually". It can be aliased to another target, or it can be intentionally masked (linked to /dev/null) or deleted in order to disable CTRL+ALT+DEL.

So I am not convinced we can say that a failure to load ctrl-alt-del.target implies systemd should go ahead and reboot.
You are right, and I already thought about it - ctrl-alt-del has to be configurable. But the missing file is an invalid condition. So far I understood there is no valid state when ctrl-alt-del is not present or is it? This could trigger emergency action.

(In reply to Adam Pribyl from comment #7)
> You are right, and I already thought about it - ctrl-alt-del has to be
> configurable. But the missing file is an invalid condition. So far I
> understood there is no valid state when ctrl-alt-del is not present or is
> it? This could trigger emergency action.

I think we could treat missing or corrupted ctrl-alt-del.target as a special case. If you can come up with a *small* patch, then I'd consider it worth applying. This will give a bit of extra robustness to systemd, which is always good.

I never thought I would be able to submit a patch to anything like systemd, especially one that touches the core. Here is what I came up with:

diff --git a/src/core/manager.c b/src/core/manager.c
index 2e98181..3420dd3 100644
--- a/src/core/manager.c
+++ b/src/core/manager.c
@@ -1395,7 +1395,12 @@ static int manager_process_signal_fd(Manager *m) {
                 case SIGINT:
                         if (m->running_as == SYSTEMD_SYSTEM) {
-                                manager_start_target(m, SPECIAL_CTRL_ALT_DEL_TARGET, JOB_REPLACE_IRREVERSIBLY);
+                                if(manager_start_target(m, SPECIAL_CTRL_ALT_DEL_TARGET, JOB_REPLACE_IRREVERSIBLY) < 0) {
+                                        /* Emergency case - can not run SPECIAL_CTRL_ALT_DEL_TARGET, start reboot immediately */
+                                        log_error("Failed to find ctrl-alt-del target, forcing reboot");
+                                        m->exit_code = MANAGER_REBOOT;
+                                }
+
                                 break;
                         }

Untested ATM. But to be honest, the amount of source code in systemd is huge. I got lost in all the shutdown/halt/reboot possibilities there. I hope this one will not be blocked by any missing unit; only a missing shutdown binary will cause the system to freeze (and it does sync at least).
Hmm, so there has been a feature request that if C-A-d is pressed three times within 1s (or something similar), then we'd dump systemd's state to the console, which could be useful to find a hang. Similarly, we could say that if you hit it 10 times within 3s or so, we would instantly reboot. Of course, this would need to be something that can be disabled.

Well, unicode is even available on the console just fine. It's 2013, after all! We do disable unicode output actually if you explicitly enable a non-UTF8 locale. However, if you have no locale set up, or have a UTF-8 locale, then we will output things with UTF8.

I understand a debug feature could be triggered by multiple C-A-d presses, and I also understand that there may be a config option in systemd.conf for any kind of emergency or debug dump after a C-A-d press, but to initiate a reboot, only one C-A-d press was ever needed. I may try to implement something like CtrlAltDelEmergency=Yes|No; everything else is beyond my abilities. I really do not see how to tell the user that it takes 10 presses in 3s to trigger that. Or are we going to print such a message to the console on every C-A-d press? This was only meant for a situation when something is terribly wrong (e.g. a missing .target) and it's better to reboot than to sit there without any reaction. It is not any kind of "cheat".

I am not sure how the comment about unicode is related to this bug. Was it really meant for this one?

Pretty sure the unicode comment belongs to bug 971834.

I tried to implement a systemd config option for the possibility to force the reboot in case the ctrl-alt-del.target does not exist, but it is somehow touching too many places in the code and I did not find a way to get the config value in the manager (I do not say it is not there, but I am missing something here). If the option is going somewhere through D-Bus, it creates another weak point to activate what should be an emergency case solution. I am not sure the troubles with config are worth it.
In my opinion, the ctrl-alt-del.target should be there under normal circumstances; if it is not there, we should trigger the most common action for it.

(In reply to Adam Pribyl from comment #14)
> I tried to implement a systemd config option for the possibility to force the
> reboot in case the ctrl-alt-del.target does not exist, but it is somehow
> touching too many places in the code and I did not find a way to get the
> config value in the manager (I do not say it is not there, but I am missing
> something here). If the option is going somewhere through D-Bus, it creates
> another weak point to activate what should be an emergency case solution. I am
> not sure the troubles with config are worth it.

This doesn't sound useful. Normally configuration should be through ctrl-alt-del.target, and here we're only talking about the case where something bad happened to the root fs.

> In my opinion, the ctrl-alt-del.target should be there under normal
> circumstances; if it is not there, we should trigger the most common action for
> it.

+1

This message is a reminder that Fedora 18 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 18. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 18 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 18's end of life.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

So, we do ship ctrl-alt-del.target linked up in /usr/lib/systemd/system. /usr of course is package manager territory, not user territory, which means it's OK if things stop working if you remove stuff from there... I mean, that's a bit like deleting libc.so and expecting things to still work.

Also, there's the valid admin choice of disabling c-a-d, and he can do that by masking the target or symlinking it to whatever target he likes doing whatever he likes. Just overriding the masking if something doesn't work sounds really wrong, as the admin might have deliberately masked it.

That said, I still think it's a good idea to provide an emergency reboot thing for cases where everything is fucked, as suggested in comment #10. People make mistakes, and there should be a way out, a second level of safety net (that however also needs to be possible to disable).

Anyway, renaming this bug appropriately.

Well, pressing c-a-d 10 times within 1s will be a pretty hard job... I am not sure athletics is what we need. Even 3 times within 1s will be a challenge. And are you sure the operating system under heavy load will be able to process that?

Once it is implemented, the exact number and timing requirements can be adjusted.

Mostly implemented in http://cgit.freedesktop.org/systemd/systemd/commit/?id=2e5c94b9aa. Please open a new bug for further changes, this one is already long enough.
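
For illustration only, here is a minimal sketch in C of the "N presses within an interval" check discussed in this bug. It is not the code from the commit linked above: the names CadCounter, cad_counter_init and cad_counter_press are hypothetical, the window is a simple fixed window that starts at the first press, and the caller is assumed to supply a monotonic timestamp in microseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not systemd's actual code: counts Ctrl-Alt-Del
 * presses that fall inside one fixed time window. */
typedef struct {
        uint64_t interval_usec; /* window length in microseconds */
        unsigned burst;         /* presses needed to trigger the action */
        uint64_t begin_usec;    /* timestamp of the first press in the window */
        unsigned count;         /* presses seen in the current window */
} CadCounter;

static void cad_counter_init(CadCounter *c, uint64_t interval_usec, unsigned burst) {
        assert(c);
        assert(interval_usec > 0);
        assert(burst > 0);

        c->interval_usec = interval_usec;
        c->burst = burst;
        c->begin_usec = 0;
        c->count = 0;
}

/* Record one press at time now_usec (assumed to come from a monotonic clock).
 * Returns true once at least `burst` presses have landed within the window. */
static bool cad_counter_press(CadCounter *c, uint64_t now_usec) {
        assert(c);

        if (c->count == 0 || now_usec - c->begin_usec > c->interval_usec) {
                /* First press, or the previous window expired: start a new one. */
                c->begin_usec = now_usec;
                c->count = 1;
        } else
                c->count++;

        return c->count >= c->burst;
}
```

With something like this, the two thresholds from the bug summary would be two independently configured counters, e.g. cad_counter_init(&dump, 1000000, 3) for the state dump and cad_counter_init(&reboot, 1000000, 10) for the hard reboot, so the exact numbers stay adjustable as noted above.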