Red Hat Bugzilla – Bug 1472062
apcupsd-initiated shutdown hangs on systemd stop job and does not kill UPS power
Last modified: 2018-05-19 00:20:47 EDT
Description of problem: When a UPS power failure occurs, apcupsd properly executes 'shutdown -h -H now' via /etc/apcupsd/apccontrol. However, the system hangs for 1 min 30 seconds with message "A stop job is running for APC UPS Power Control Daemon for Linux". After this 90 second hang, the shutdown process appears to resume and the machine is halted (without a poweroff), like it should be, but the UPS power is not killed, indicating that /etc/apcupsd/apccontrol was not properly called with the "killpower" argument. It looks like it is supposed to be called in this manner via /usr/lib/systemd/system-shutdown/apcupsd_shutdown. Since UPS power is not killed, if AC power is restored before the battery runs out, then the machine remains halted and does not power back on.
If I manually run:
* touch /etc/apcupsd/powerfail
* shutdown -h -H now
and wait, then the system halts and the UPS power is killed as it should be.
Version-Release number of selected component (if applicable):
# cat /etc/fedora-release
Fedora release 26 (Twenty Six)
# dnf info apcupsd
Last metadata expiration check: 0:29:36 ago on Mon 17 Jul 2017 09:40:57 PM EDT.
Name : apcupsd
Version : 3.14.14
Release : 5.fc26
Arch : x86_64
Size : 1.2 M
Source : apcupsd-3.14.14-5.fc26.src.rpm
Repo : @System
From repo : fedora
Summary : APC UPS Power Control Daemon
URL : http://www.apcupsd.com
License : GPLv2
Description : Apcupsd can be used for controlling most APC UPSes. During a
: power failure, apcupsd will inform the users about the power
: failure and that a shutdown may occur. If power is not restored,
: a system shutdown will follow when the battery is exausted, a
: timeout (seconds) expires, or the battery runtime expires based
: on internal APC calculations determined by power consumption
: rates. If the power is restored before one of the above shutdown
: conditions is met, apcupsd will inform users about this fact.
: Some features depend on what UPS model you have (simple or smart).
How reproducible: Always
Steps to Reproduce:
1. Unplug APC UPS.
2. Wait for apcupsd to initiate shutdown.
Actual results: 90-second hang during shutdown with console message "A stop job is running for APC UPS Power Control Daemon for Linux", and UPS power is not killed.
Expected results: No hang during shutdown, and UPS power to be killed during the shutdown process via /etc/apcupsd/apccontrol killpower.
I have a workaround:
While attempting to troubleshoot why this is not working, I inserted my own commands in the /etc/apcupsd/apccontrol script to write files and observed (on subsequent reboot with 'journalctl -b -1') that apccontrol did not have permission to write files. I disabled SELinux by setting SELINUX=disabled in /etc/selinux/config, and my main problem went away - UPS power is now killed on shutdown. Perhaps selinux was disallowing the /etc/apcupsd/powerfail file from being created by apcupsd.
The apcupsd service does still timeout on stop, but I have shortened the timeout to make this less annoying by adding 'TimeoutStopSec=10' to the [Service] section of /usr/lib/systemd/system/apcupsd.service.
If there are selinux denials, it would be good for you to include them. "ausearch -m avc" will show them all but it would probably be better to limit to those from the relevant period of time.
Sadly selinux problems have to be filed separately against the selinux policy, but I can do that once I have the relevant AVCs to inspect.
After a apcupsd-initiated shutdown with selinux enabled (again failing to cut power to the UPS), on a subsequent reboot "ausearch -m avc" just outputs "<no matches>".
Created attachment 1335814 [details]
SELinux Type Enforcement file for apsupsd
SElinux denied access (getattr) /lib/systemd/system-shutdown/apcupsd_shutdown to file /etc/apcupsd/powerfail
save attachment and run these commands:
checkmodule -m myapcupsd.te -o myapcupsd.mod
semodule_package -m myapcupsd.mod -o myapcupsd.pp
semodule -i myapcupsd.pp
semodule -r myapcupsd
I have similar issue that apcupsd sits for 1min 30secs timeout on shutdown for
"A stop job is running for APC UPS Power Control Daemon for Linux".
(SELinux is disabled for debugging)
When shutdown via UPS unplugging is invoked, the process list will be like:
4 0 1302 1 20 0 132984 1988 poll_s Ssl ? 0:00 /sbin/apcupsd -b -f /etc/apcupsd/apcupsd.conf
0 0 1754 1302 20 0 115252 1480 do_wai S ? 0:00 /usr/bin/sh /etc/apcupsd/apccontrol doshutdown APCUPS 1 1
0 0 1757 1754 20 0 115252 1428 do_wai S ? 0:00 /bin/sh /etc/apcupsd/doshutdown APCUPS 1 1
4 0 1762 1757 20 0 128820 1624 poll_s S ? 0:00 /sbin/shutdown -h -P now apcupsd UPS 1 initiated shutdown
That is, /sbin/shutdown is in same process group of apcupsd.
systemd tries to kill the whole process group since
systemd.kill(5) KillMode=control-group is the default, but
/sbin/shutdown seems to be immune from SIGTERM,
so shutdown process just hangs there and waits for 1min 30sec timeout.
Workaround: Adding custom setup in /etc/systemd/system/apcupsd.service.d/timeout.conf
to the unit description makes it shutdown smoothly.
I propose to add KillMode=process to the systemd unit file.
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged change the 'version' to a later Fedora
version prior this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Hi, I've just bumped into this in Fedora 26 & Fedora 27 as well (same issue described by Josh in initial statement). I see there are 3 separate issues:
## Issue #1 (apcupsd shutdown problem via systemd): This is where you get "A stop job is running for APC UPS Power..." and you have to wait 1.5 minutes before systemd kills it.
Workaround: Create /etc/systemd/system/apcupsd.service.d/timeout.conf as suggested by Taketo (thanks!). This should be included in the apcupsd package as well (or fixed in the included systemd file)
## Issue #2 (SELinux prevents the creation of /etc/apcupsd/powerfail): This is a major one : "/etc/apcupsd/apccontrol killpower" never gets executed because powerfail is missing.
Workaround: The custom policy created by Paweł (thanks!) does the job and I can confirm powerfail gets created. Jason: Can we include this in the standard policy?
## Issue #3 (SELinux won't run /etc/apcupsd/apccontrol killpower): Ok, this is a new one here (I don't see anyone mentioning this particular issue). After solving both previous issues I still can't get my UPS to shutdown (apccontrol is never executed). If I disable SELinux it runs perfectly fine. And, as previously mentioned, "ausearch -m avc" shows nothign as well. I even disabled the "dontaudit" rules but still don't see anything. If someone knows how to make SELinux spit out more details on this I'll be happy to test it.
Workaround: set SELinux to permissive mode (which we don't want to do of course)
Jason: Can we get some traction here? apcupsd is really crippled out of the box (totally confirmed with F26 & F27).
(In reply to Jorge Fábregas from comment #7)
> ## Issue #3 (SELinux won't run /etc/apcupsd/apccontrol killpower)
> If someone knows how to make SELinux spit out more details on this I'll be
> happy to test it.
Currently I am on F28 and issue #3 for me does not exist. But if You want more details on SELinux try to remove from grub boot menu two word rhgb and quiet.
Note that the selinux policy is part of the central distro-wide selinux policy. Modifying it is not something I can do. Filing bugs is fine but my experience is that there isn't much chance of them being fixed within any reasonable time frame. Applying patches to the main policy in a package is a pretty bad idea, and actually overriding the main policy module is possible but a bit beyond my ability.
If someone wants to write a complete policy module for this package (not just a little patch as in comment 4 but the full policy) then I could incorporate that and ask that the policy be dropped from the central policy module. That's the direction selinux policy seems to be moving and I'm happy to work with that, but I don't have the level of understanding needed to actually come up with the policy. Given the policy I have no trouble packaging it properly.
So right now, for selinux issues I can only recommend that you file them against the central selinux policy until I'm able to create or obtain a standalone policy module. I do have another package with similar selinux issues and so I will eventually learn to write my own policy.
As for the unit file, that change is in rawhide and I will push updates to F28 and F27.
apcupsd-3.14.14-8.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-7d3056343f
apcupsd-3.14.14-11.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-da8bd6a8c2
(In reply to Jason Tibbitts from comment #9)
> Note that the selinux policy is part of the central distro-wide selinux
> policy. Modifying it is not something I can do. Filing bugs is fine but my
> experience is that there isn't much chance of them being fixed within any
> reasonable time frame.
Long time ago I report the bug https://bugzilla.redhat.com/show_bug.cgi?id=1505050 but as you notice "no one cares"
(In reply to Paweł from comment #8)
> Currently I am on F28 and issue #3 for me does not exist. But if You want
> more details on SELinux try to remove from grub boot menu two word rhgb and
Thanks Paweł for confirming it works in F28 and for the rhgb & quiet boot parameters! It turns out that it was a problem on my side indeed! While troubleshooting the other issues I redirected STDOUT & STDERR in the call to apccontrol to the root filesysetm:
mount -oremount,rw /
/etc/apcupsd/apccontrol killpower > /apccontrol.out 2>&1
mount -oremount,ro /
...and of course SELinux wouldn't allow to write in there.
I confirm that that after changing the KillMode & your SELinux changes apcupsd works like it should during a full power outtage. Thanks.
Jason: Got it. I see the only remaining issue is the SELinux stuff for which Paweł already filled a bug many months ago. I'm going to add my voice over there as well.
Thanks for the updated packages & for your work maintaining this package.
apcupsd-3.14.14-8.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-7d3056343f
apcupsd-3.14.14-11.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-da8bd6a8c2