Bug 1472062 - apcupsd-initiated shutdown hangs on systemd stop job and does not kill UPS power
apcupsd-initiated shutdown hangs on systemd stop job and does not kill UPS power
Status: ON_QA
Product: Fedora
Classification: Fedora
Component: apcupsd (Show other bugs)
26
x86_64 Linux
unspecified Severity unspecified
: ---
: ---
Assigned To: Jason Tibbitts
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-07-17 22:13 EDT by Josh Holtrop
Modified: 2018-05-19 00:20 EDT (History)
6 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
SELinux Type Enforcement file for apsupsd (145 bytes, text/plain)
2017-10-08 04:08 EDT, Paweł
no flags Details

  None (edit)
Description Josh Holtrop 2017-07-17 22:13:58 EDT
Description of problem: When a UPS power failure occurs, apcupsd properly executes 'shutdown -h -H now' via /etc/apcupsd/apccontrol. However, the system hangs for 1 min 30 seconds with message "A stop job is running for APC UPS Power Control Daemon for Linux". After this 90 second hang, the shutdown process appears to resume and the machine is halted (without a poweroff), like it should be, but the UPS power is not killed, indicating that /etc/apcupsd/apccontrol was not properly called with the "killpower" argument. It looks like it is supposed to be called in this manner via /usr/lib/systemd/system-shutdown/apcupsd_shutdown. Since UPS power is not killed, if AC power is restored before the battery runs out, then the machine remains halted and does not power back on.

If I manually run:
  * touch /etc/apcupsd/powerfail
  * /usr/lib/systemd/system-shutdown/apcupsd_shutdown
  * shutdown -h -H now

and wait, then the system halts and the UPS power is killed as it should be.


Version-Release number of selected component (if applicable):
# cat /etc/fedora-release 
Fedora release 26 (Twenty Six)
# dnf info apcupsd
Last metadata expiration check: 0:29:36 ago on Mon 17 Jul 2017 09:40:57 PM EDT.
Installed Packages
Name         : apcupsd
Version      : 3.14.14
Release      : 5.fc26
Arch         : x86_64
Size         : 1.2 M
Source       : apcupsd-3.14.14-5.fc26.src.rpm
Repo         : @System
From repo    : fedora
Summary      : APC UPS Power Control Daemon
URL          : http://www.apcupsd.com
License      : GPLv2
Description  : Apcupsd can be used for controlling most APC UPSes. During a
             : power failure, apcupsd will inform the users about the power
             : failure and that a shutdown may occur.  If power is not restored,
             : a system shutdown will follow when the battery is exausted, a
             : timeout (seconds) expires, or the battery runtime expires based
             : on internal APC calculations determined by power consumption
             : rates.  If the power is restored before one of the above shutdown
             : conditions is met, apcupsd will inform users about this fact.
             : Some features depend on what UPS model you have (simple or smart).


How reproducible: Always


Steps to Reproduce:
1. Unplug APC UPS.
2. Wait for apcupsd to initiate shutdown.

Actual results: 90-second hang during shutdown with console message "A stop job is running for APC UPS Power Control Daemon for Linux", and UPS power is not killed.

Expected results: No hang during shutdown, and UPS power to be killed during the shutdown process via /etc/apcupsd/apccontrol killpower.
Comment 1 Josh Holtrop 2017-07-18 21:10:52 EDT
I have a workaround:

While attempting to troubleshoot why this is not working, I inserted my own commands in the /etc/apcupsd/apccontrol script to write files and observed (on subsequent reboot with 'journalctl -b -1') that apccontrol did not have permission to write files. I disabled SELinux by setting SELINUX=disabled in /etc/selinux/config, and my main problem went away - UPS power is now killed on shutdown. Perhaps selinux was disallowing the /etc/apcupsd/powerfail file from being created by apcupsd.

The apcupsd service does still timeout on stop, but I have shortened the timeout to make this less annoying by adding 'TimeoutStopSec=10' to the [Service] section of /usr/lib/systemd/system/apcupsd.service.
Comment 2 Jason Tibbitts 2017-07-21 17:20:47 EDT
If there are selinux denials, it would be good for you to include them.  "ausearch -m avc" will show them all but it would probably be better to limit to those from the relevant period of time.

Sadly selinux problems have to be filed separately against the selinux policy, but I can do that once I have the relevant AVCs to inspect.
Comment 3 Josh Holtrop 2017-07-22 08:07:17 EDT
After a apcupsd-initiated shutdown with selinux enabled (again failing to cut power to the UPS), on a subsequent reboot "ausearch -m avc" just outputs "<no matches>".
Comment 4 Paweł 2017-10-08 04:08 EDT
Created attachment 1335814 [details]
SELinux Type Enforcement file for apsupsd

SElinux denied access (getattr) /lib/systemd/system-shutdown/apcupsd_shutdown to file /etc/apcupsd/powerfail

Install:
save attachment and run these commands:

checkmodule -m myapcupsd.te -o myapcupsd.mod
semodule_package -m myapcupsd.mod -o myapcupsd.pp
semodule -i myapcupsd.pp

Remove:
semodule -r myapcupsd
Comment 5 Taketo Kabe 2017-11-13 20:52:01 EST
I have similar issue that apcupsd sits for 1min 30secs timeout on shutdown for
"A stop job is running for APC UPS Power Control Daemon for Linux".
(SELinux is disabled for debugging)

When shutdown via UPS unplugging is invoked, the process list will be like:
4     0  1302     1  20   0 132984  1988 poll_s Ssl  ?          0:00 /sbin/apcupsd -b -f /etc/apcupsd/apcupsd.conf
0     0  1754  1302  20   0 115252  1480 do_wai S    ?          0:00 /usr/bin/sh /etc/apcupsd/apccontrol doshutdown APCUPS 1 1
0     0  1757  1754  20   0 115252  1428 do_wai S    ?          0:00 /bin/sh /etc/apcupsd/doshutdown APCUPS 1 1
4     0  1762  1757  20   0 128820  1624 poll_s S    ?          0:00 /sbin/shutdown -h -P now apcupsd UPS 1 initiated shutdown

That is, /sbin/shutdown is in same process group of apcupsd.
systemd tries to kill the whole process group since
systemd.kill(5) KillMode=control-group is the default, but
/sbin/shutdown seems to be immune from SIGTERM,
so shutdown process just hangs there and waits for 1min 30sec timeout.

Workaround: Adding custom setup in /etc/systemd/system/apcupsd.service.d/timeout.conf

[Service]
#TimeoutStopSec=20
KillMode=process

to the unit description makes it shutdown smoothly.
I propose to add KillMode=process to the systemd unit file.
Comment 6 Fedora End Of Life 2018-05-03 04:45:05 EDT
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged  change the 'version' to a later Fedora
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Comment 7 Jorge Fábregas 2018-05-17 17:30:40 EDT
Hi, I've just bumped into this in Fedora 26 & Fedora 27 as well (same issue described by Josh in initial statement).  I see there are 3 separate issues:

## Issue #1 (apcupsd shutdown problem via systemd):  This is where you get "A stop job is running for APC UPS Power..." and you have to wait 1.5 minutes before systemd kills it.

Workaround: Create /etc/systemd/system/apcupsd.service.d/timeout.conf as suggested by Taketo (thanks!).   This should be included in the apcupsd package as well (or fixed in the included systemd file)


## Issue #2 (SELinux prevents the creation of /etc/apcupsd/powerfail):  This is a major one :  "/etc/apcupsd/apccontrol killpower" never gets executed because powerfail is missing.

Workaround: The custom policy created by Paweł (thanks!) does the job and I can confirm powerfail gets created.  Jason: Can we include this in the standard policy?


## Issue #3 (SELinux won't run /etc/apcupsd/apccontrol killpower): Ok, this is a new one here (I don't see anyone mentioning this particular issue).  After solving both previous issues I still can't get my UPS to shutdown (apccontrol is never executed).  If I disable SELinux it runs perfectly fine.  And, as previously mentioned, "ausearch -m avc" shows nothign as well.  I even disabled the "dontaudit" rules but still don't see anything.  If someone knows how to make SELinux spit out more details on this I'll be happy to test it.  

Workaround: set SELinux to permissive mode (which we don't want to do of course)



Jason: Can we get some traction here?  apcupsd is really crippled out of the box (totally confirmed with F26 & F27).
Comment 8 Paweł 2018-05-17 18:38:43 EDT
(In reply to Jorge Fábregas from comment #7)
> ## Issue #3 (SELinux won't run /etc/apcupsd/apccontrol killpower) 
> If someone knows how to make SELinux spit out more details on this I'll be
> happy to test it.  

Currently I am on F28 and issue #3 for me does not exist. But if You want more details on SELinux try to remove from grub boot menu two word rhgb and quiet.
Comment 9 Jason Tibbitts 2018-05-17 19:32:17 EDT
Note that the selinux policy is part of the central distro-wide selinux policy.  Modifying it is not something I can do.  Filing bugs is fine but my experience is that there isn't much chance of them being fixed within any reasonable time frame.  Applying patches to the main policy in a package is a pretty bad idea, and actually overriding the main policy module is possible but a bit beyond my ability.

If someone wants to write a complete policy module for this package (not just a little patch as in comment 4 but the full policy) then I could incorporate that and ask that the policy be dropped from the central policy module.  That's the direction selinux policy seems to be moving and I'm happy to work with that, but I don't have the level of understanding needed to actually come up with the policy.  Given the policy I have no trouble packaging it properly.

So right now, for selinux issues I can only recommend that you file them against the central selinux policy until I'm able to create or obtain a standalone policy module.  I do have another package with similar selinux issues and so I will eventually learn to write my own policy.

As for the unit file, that change is in rawhide and I will push updates to F28 and F27.
Comment 10 Fedora Update System 2018-05-17 19:43:47 EDT
apcupsd-3.14.14-8.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-7d3056343f
Comment 11 Fedora Update System 2018-05-17 19:43:55 EDT
apcupsd-3.14.14-11.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-da8bd6a8c2
Comment 12 Paweł 2018-05-18 01:47:10 EDT
(In reply to Jason Tibbitts from comment #9)
> Note that the selinux policy is part of the central distro-wide selinux
> policy.  Modifying it is not something I can do.  Filing bugs is fine but my
> experience is that there isn't much chance of them being fixed within any
> reasonable time frame. 

Long time ago I report the bug https://bugzilla.redhat.com/show_bug.cgi?id=1505050 but as you notice "no one cares"
Comment 13 Jorge Fábregas 2018-05-18 06:32:32 EDT
(In reply to Paweł from comment #8)
> Currently I am on F28 and issue #3 for me does not exist. But if You want
> more details on SELinux try to remove from grub boot menu two word rhgb and
> quiet.

Thanks Paweł for confirming it works in F28 and for the rhgb & quiet boot parameters!   It turns out that it was a problem on my side indeed!  While troubleshooting the other issues I redirected STDOUT & STDERR in the call to apccontrol to the root filesysetm:

mount -oremount,rw /
  /etc/apcupsd/apccontrol killpower > /apccontrol.out 2>&1
mount -oremount,ro /

...and of course SELinux wouldn't allow to write in there.  

I confirm that that after changing the KillMode & your SELinux changes apcupsd works like it should during a full power outtage.  Thanks.
Comment 14 Jorge Fábregas 2018-05-18 06:36:55 EDT
Jason:  Got it. I see the only remaining issue is the SELinux stuff for which Paweł already filled a bug many months ago.  I'm going to add my voice over there as well.  

Thanks for the updated packages & for your work maintaining this package.
Comment 15 Fedora Update System 2018-05-18 21:48:04 EDT
apcupsd-3.14.14-8.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-7d3056343f
Comment 16 Fedora Update System 2018-05-19 00:20:47 EDT
apcupsd-3.14.14-11.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-da8bd6a8c2

Note You need to log in before you can comment on or make changes to this bug.