Red Hat Bugzilla – Bug 1031158
swapoff during reboot/shutdown results in oom
Last modified: 2017-12-12 05:23:39 EST
Doing a swapoff at reboot time can be fatal.
If there's insufficient memory to page everything back in, the swapoff process can get killed, and then systemd craps itself and decides hanging the box is a better option than rebooting.
[61035.317344] Out of memory: Kill process 27030 (swapoff) score -1369716241 or sacrifice child
[61035.318256] Killed process 27030 (swapoff) total-vm:123380kB, anon-rss:164kB, file-rss:592kB
[61035.406573] systemd: Unit dev-sda2.swap entered failed state.
Why exactly are we doing a swapoff anyway? It's not like a filesystem where we have to maintain a coherent state across reboots.
There are cases where swapoff is necessary: when the backing device must be destroyed. For example, if the swap device is an LVM LV, and the LV sits on top of a RAID array... I doubt that we can come up with logic to distinguish the special cases where that's not needed without hardcoding stuff.
Anyway, swaps are ordered before sysinit.target, and should be destroyed only after all services are gone. What is using the memory in your case?
CentOS 7 KVM
reboot process hangs when these processes are running:
It appears systemd is not shutting down all processes before the swap is destroyed.
On this reboot I manually stopped spamd last, before shutdown -r was issued:
Apr 12 08:44:46 18-98-60-69 systemd: Stopped Spamassassin daemon.
** shutdown -r issued
Apr 12 08:45:15 18-98-60-69 systemd: Deactivating swap /dev/sdb1...
The log shows that the swap is the FIRST thing systemd deactivates.
This reboot shows systemd not shutting down all processes:
the only process stopped is MariaDB, and only after the swap is destroyed. Nothing is logged for clamd, exim, or spamd; maybe systemd stops them during "Stopping Multi-User System", but again that is after the swap is destroyed.
All of my CentOS 7 VPSes have low physical memory (512MB) and 2-3GB of swap. Everything runs fine with the virtual memory, but this reboot issue is affecting all of them.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping Session 16 of user root.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopped Dump dmesg to /var/log/dmesg.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping Dump dmesg to /var/log/dmesg...
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopped target Timers.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping Timers.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Deactivating swap /dev/sdb1...
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopped Daily Cleanup of Temporary Directories.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping Daily Cleanup of Temporary Directories.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping Authorization Manager...
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopped target Multi-User System.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping Multi-User System.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping MariaDB database server...
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopped target Login Prompts.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping Login Prompts.
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping Command Scheduler...
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping D-Bus System Message Bus...
Apr 12 07:58:03 MIA-VPS-VM01013 systemd: Stopping Avahi mDNS/DNS-SD Stack...
Apr 12 07:58:04 MIA-VPS-VM01013 rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="444" x-info="http://www.rsyslog.com"] exiting on signal 15.
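The pattern in the log above can be checked mechanically. Here is a minimal sketch (the sample log is condensed from the excerpt above; the actual log source and message format on a given system are assumptions) that flags a shutdown log where "Deactivating swap" appears before the last service "Stopping" line:

```shell
# Sketch: detect the symptom discussed in this bug in a saved shutdown log,
# i.e. "Deactivating swap" logged before services have finished stopping.
# The sample log below is condensed from the excerpt in this report.
log='systemd: Deactivating swap /dev/sdb1...
systemd: Stopping MariaDB database server...
systemd: Stopping Command Scheduler...'

# Line number of the swap deactivation, and of the last "Stopping" line.
swap_line=$(printf '%s\n' "$log" | grep -n 'Deactivating swap' | head -n1 | cut -d: -f1)
last_stop=$(printf '%s\n' "$log" | grep -n 'Stopping' | tail -n1 | cut -d: -f1)

if [ "$swap_line" -lt "$last_stop" ]; then
    echo "symptom present: swap deactivated before services finished stopping"
fi
```

On a healthy shutdown the swap deactivation would come after every service stop, and the script would print nothing.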
This seems to be a systemd-wide issue.
Here is an Ubuntu 15.10 systemd VPS with the same problem. The last comment there is from another CentOS user with the same issue, who linked this bug to that post on Feb 21, 2016 after it was closed as "CLOSED INSUFFICIENT_DATA". His logs also show swap deactivating very early in the shutdown sequence.
*** Bug 1327177 has been marked as a duplicate of this bug. ***
> Description of problem:
> systemd disables swap before stopping all processes that are using swap
> memory during shutdown resulting in a 30 min suspended state after rsyslogd
> is stopped.
> Version-Release number of selected component (if applicable):
> CentOS 7.2
> How reproducible:
> CentOS 7.2 KVM VPS with 512MB memory and 3GB swap
> running mysql
> Steps to Reproduce:
> 1. reboot server
> Actual results:
> almost exactly 30 minutes elapse after rsyslogd, the last process logged to
> the message bus, is stopped before the system restarts
> Expected results:
> system should reboot/restart immediately after rsyslogd stops
Yeah, we should look into this. See also https://github.com/systemd/systemd/issues/2930.
systemd-219-19.el7.x86_64 on RHEL7.
In my case, total RAM is 7.6GB (fully used), with 2GB of swap in use.
The OS was shut down while applications were running.
systemd started stopping all the applications and services simultaneously,
and ran "swapoff" at the same moment.
As a result, it failed to shut down the system.
My understanding is that swapoff should happen _after_ all the
applications and OS-related processes have terminated.
But according to the log, it happened right after the shutdown command:
May 11 11:00:25 daemon.info: systemd: Started Delayed Shutdown Service.
May 11 11:00:25 daemon.info: systemd: Starting Delayed Shutdown Service...
May 11 11:00:25 daemon.info: systemd-shutdownd: Shutting down at Wed 2016-05-11 11:00:25 JST (poweroff)...
May 11 11:00:25 daemon.info: systemd-shutdownd: Creating /run/nologin, blocking further logins...
May 11 11:00:25 daemon.info: systemd: Stopping Session 1 of user osemerg2.
May 11 11:00:25 daemon.info: systemd: Stopping Authorization Manager...
May 11 11:00:25 daemon.info: systemd: Deactivating swap /dev/mapper/vg_system-lv_swap...
May 11 11:00:25 daemon.info: systemd: Stopped target Timers.
May 11 11:00:25 daemon.info: systemd: Stopped Authorization Manager.
May 11 11:00:25 daemon.notice: systemd: dev-mapper-vg_system\x2dlv_swap.swap swap process exited, code=exited status=255
May 11 11:00:25 daemon.info: systemd: Deactivated swap /dev/mapper/vg_system-lv_swap.
May 11 11:00:25 daemon.notice: systemd: Unit dev-mapper-vg_system\x2dlv_swap.swap entered failed state.
As you can see from the log, all events happened at the same time (11:00:25).
I checked systemd's upstream site and found this symptom has been
fixed by the following changes:
make sure all swap units are ordered before the swap target
Swap units are deactivated too early
As for Fedora 24 (which uses systemd-229-8),
I have confirmed that the systemd-229.tar.gz in the src.rpm includes the fix.
I didn't check the minimum version required to fix this symptom,
as I plan to move to F24 soon anyway.
FYI, the fix is not in systemd-219-19.el7_2.11 (as of RHEL 7.2).
I hope RH will backport the patch, or rebase systemd to the latest version.
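As a rough sketch, one could gate on the version confirmed above to contain the fix, assuming the first line of `systemctl --version` looks like `systemd 219`. This is only a heuristic: distros that backport patches (like RHEL) can make a plain version check wrong in either direction.

```shell
# Heuristic sketch: treat systemd >= 229 as containing the ordering fix,
# since 229 is the version confirmed above to include it. Backported
# patches are invisible to this check.
has_fix() {
    # $1 is the first line of `systemctl --version`, e.g. "systemd 219"
    v=$(printf '%s\n' "$1" | awk '{print $2}')
    [ "$v" -ge 229 ]
}

# Example use on a live system:
#   has_fix "$(systemctl --version | head -n1)" && echo "likely fixed"
```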
This bug appears to have been reported against 'rawhide' during the Fedora 25 development cycle.
Changing version to '25'.
This bug is still present in the version of systemd deployed with CentOS 7.2.
Would it be possible to get the fix backported?
A workaround would be something like this:
grep swap /var/log/dmesg |grep "dead -> active"
[ 1.995413] systemd: dev-dm\x2d1.swap changed dead -> active
[ 1.995495] systemd: dev-cl-swap.swap changed dead -> active
[ 1.995550] systemd: dev-disk-by\x2did-dm\x2dname\x2dcl\x2dswap.swap changed dead -> active
[ 1.995616] systemd: dev-disk-by\x2did-dm\x2duuid\x2dLVM\x2dXOAK7DHxMdmQCrNdwWE3Pt836Q9pHYSGyrO9ycCGeIYavzbamVWNKMaVUMLf1NWZ.swap changed dead -> active
[ 1.995678] systemd: dev-disk-by\x2duuid-6509e6e1\x2daf2d\x2d4d23\x2d9ebd\x2da9aa8801e658.swap changed dead -> active
For each problematic system, one would need to create /etc/systemd/system/swap.target with the following content, built from the preceding output:
After=dev-disk-by\x2duuid-6509e6e1\x2daf2d\x2d4d23\x2d9ebd\x2da9aa8801e658.swap dev-disk-by\x2did-dm\x2duuid\x2dLVM\x2dXOAK7DHxMdmQCrNdwWE3Pt836Q9pHYSGyrO9ycCGeIYavzbamVWNKMaVUMLf1NWZ.swap dev-disk-by\x2did-dm\x2dname\x2dcl\x2dswap.swap dev-cl-swap.swap dev-dm\x2d1.swap
Otherwise, the system attempts to swapoff each alias before stopping the rest of the system.
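A minimal sketch of that workaround as a script. Assumptions beyond the comment above: the target directory is parameterized (normally /etc/systemd/system) so it can be tried elsewhere first, a `[Unit]` section header and Description are added because a valid unit file needs them, and the stray `dev-dm1.swap` token is dropped since it does not match any unit in the dmesg output:

```shell
# Sketch of the workaround described above: write an override swap.target
# whose After= line (taken from the dmesg output earlier in this bug)
# orders the swap units before swap.target, deferring swapoff at shutdown.
write_swap_target() {
    dir="$1"    # normally /etc/systemd/system
    mkdir -p "$dir"
    cat > "$dir/swap.target" <<'EOF'
[Unit]
Description=Swap
After=dev-disk-by\x2duuid-6509e6e1\x2daf2d\x2d4d23\x2d9ebd\x2da9aa8801e658.swap dev-disk-by\x2did-dm\x2duuid\x2dLVM\x2dXOAK7DHxMdmQCrNdwWE3Pt836Q9pHYSGyrO9ycCGeIYavzbamVWNKMaVUMLf1NWZ.swap dev-disk-by\x2did-dm\x2dname\x2dcl\x2dswap.swap dev-cl-swap.swap dev-dm\x2d1.swap
EOF
}

# Usage on a real system: write_swap_target /etc/systemd/system
```

The `\x2d` sequences are systemd's escaping for `-` in device unit names and must be written literally, which is why the heredoc is quoted.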
It seems RH has been working on bz#1379268 and bz#1298355 to backport
the upstream patch that fixes the issue.
Bug 1379268 - backport upstream commit 681c8d8 to make sure all swap units are ordered before the swap target
Bug 1298355 - kickstart stuck at "Reached Target Shutdown" stage when removing media before shutdown completes
https://bugzilla.redhat.com/show_bug.cgi?id=1379268 is reported as being a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1298355, which it is not.
https://bugzilla.redhat.com/show_bug.cgi?id=1379268 should be reopened.
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version prior to this bug being closed, as described in the policy above.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this bug.
Thank you for reporting this bug and we are sorry it could not be fixed.