Bug 1437114 - systemd stops servicing with "Freezing execution" message upon memory exhaustion
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: systemd
Version: 7.3
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 7.5
Assigned To: Lukáš Nykrýn
QA Contact: Frantisek Sumsal
Keywords: ZStream
Duplicates: 1496263
Depends On:
Blocks: 1399177 1420851 1465901 1466365 1476742 1496263 1546658 1624756
Reported: 2017-03-29 10:11 EDT by Renaud Métrich
Modified: 2018-10-23 03:51 EDT

See Also:
Fixed In Version: systemd-219-44.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1546658 1624756 (view as bug list)
Environment:
Last Closed: 2018-04-10 07:19:30 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker                            ID              Priority  Status  Summary  Last Updated
IBM Linux Technology Center        159344          None      None    None     2017-09-27 04:32 EDT
Red Hat Knowledge Base (Solution)  3096191         None      None    None     2017-06-28 14:10 EDT
Red Hat Product Errata             RHBA-2018:0711  None      None    None     2018-04-10 07:20 EDT

Description Renaud Métrich 2017-03-29 10:11:03 EDT
Description of problem:

When memory is exhausted on the system and systemd tries to fork to start or stop a service, systemd fails and enters "freeze" mode:

    systemd: Failed to fork: Cannot allocate memory
    systemd: Assertion 'pid >= 1' failed at src/core/unit.c:1997, function unit_watch_pid(). Aborting.
    systemd: Caught <ABRT>, cannot fork for core dump: Cannot allocate memory
    systemd: Freezing execution.


Once in "freeze" mode, systemd no longer provides any services, and the system can only be rebooted using "systemctl reboot -ff":

    # reboot
    Error getting authority: Error initializing authority: Error calling StartServiceByName for org.freedesktop.PolicyKit1: Timeout was reached (g-io error-quark, 24)
    Could not watch jobs: Connection timed out
    Failed to open /dev/initctl: No such device or address
    Failed to talk to init daemon.

systemd doesn't listen on any socket anymore:

    # ls -l /proc/1/fd
    total 0
    lrwx------. 1 root root 64 Mar 29 16:00 0 -> /dev/null
    lrwx------. 1 root root 64 Mar 29 16:00 1 -> /dev/null
    lrwx------. 1 root root 64 Mar 29 16:00 2 -> /dev/null
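
The listing above suggests a rough heuristic for spotting the frozen state from a script: in this report, the frozen PID 1 held only file descriptors 0-2. The following is a minimal sketch only. The function name pid1_state and the threshold of three descriptors are assumptions drawn from the listing above, not a documented systemd interface, and a low fd count may have other causes.

```shell
# Heuristic sketch: guess whether PID 1 is frozen from its open fd count.
# pid1_state and the "<= 3 fds" threshold are assumptions based on the
# listing in this report, not a systemd API. Reading /proc/1/fd needs root.
pid1_state() {
    fd_dir="${1:-/proc/1/fd}"
    if [ ! -r "$fd_dir" ]; then
        # Directory missing or not readable by the current user.
        echo "unknown"
        return 2
    fi
    # Count the entries in the fd directory.
    count=$(ls -1A "$fd_dir" | wc -l | tr -d ' ')
    if [ "$count" -le 3 ]; then
        echo "possibly-frozen"
        return 1
    fi
    echo "ok"
    return 0
}

# Check the real PID 1; ignore the non-zero status so callers can decide.
pid1_state /proc/1/fd || true
```

A healthy systemd normally holds many more descriptors (sockets, timers, inotify handles), which is why a count of three or fewer is treated as suspicious here.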

And of course, services cannot be restarted.


Version-Release number of selected component (if applicable):

219-30.el7_3.7

How reproducible:

Always

Steps to Reproduce:
1. Start a VM with swap in it
2. Log onto the console
3. From a ssh terminal, create a tmpfs filesystem and a file in it to exhaust memory

    # mount -t tmpfs -o size=20G tmpfs /mnt
    # dd if=/dev/zero of=/mnt/file bs=1M
4. Wait a few seconds

If it doesn't reproduce immediately, restarting a service in a loop may help:

    # while :; do systemctl restart iptables.service; sleep 5; done

Logging onto the console helps because it is likely that oom-killer will kill getty and systemd will try to restart it, causing systemd to enter freeze mode.
Comment 2 Jan Synacek 2017-04-03 04:41:41 EDT
I'm not sure how systemd should behave when the system is continuously memory thrashed by a misbehaving process. If you run the "dd" command from the reproducer, it does exactly that and causes the kernel to kill everything, continuously.

I suggest running the misbehaving process in its own service file which will make use of the MemoryLimit= directive. See systemd.resource-control(5) for more information.
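
As an illustration of that suggestion, the sketch below writes a unit file that runs the reproducer's dd under a MemoryLimit= setting. The unit name fill-tmpfs.service, the helper name write_limited_unit and the 1G value are hypothetical, and whether the limit actually contains tmpfs-backed pages depends on how the kernel's memory cgroup accounts for them.

```shell
# Sketch: write a unit file that caps the memory of the reproducer's dd.
# The unit name fill-tmpfs.service, the helper write_limited_unit and the
# 1G limit are hypothetical values chosen for illustration.
write_limited_unit() {
    unit_dir="$1"
    cat > "$unit_dir/fill-tmpfs.service" <<'EOF'
[Unit]
Description=Reproducer dd confined by a memory limit (example)

[Service]
Type=oneshot
ExecStart=/usr/bin/dd if=/dev/zero of=/mnt/file bs=1M
MemoryLimit=1G
EOF
}

# Write to a scratch directory first so the result can be reviewed.
scratch=$(mktemp -d)
write_limited_unit "$scratch"
cat "$scratch/fill-tmpfs.service"
# To use it: copy the file to /etc/systemd/system, then run
#   systemctl daemon-reload && systemctl start fill-tmpfs.service
```

This contains the reproducer, but it does not address the case where an unrelated component exhausts memory.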
Comment 3 Marko Myllynen 2017-04-04 10:23:57 EDT
(In reply to Jan Synacek from comment #2)
> I'm not sure how systemd should behave when the system is continuously memory
> thrashed by a misbehaving process. If you run the "dd" command from the
> reproducer, it does exactly that and causes the kernel to kill everything,
> continuously.
> 
> I suggest running the misbehaving process in its own service file which will
> make use of the MemoryLimit= directive. See systemd.resource-control(5) for
> more information.

This is a fair approach wrt the reproducer or with a hopelessly out-of-control application.

However, bugs happen, and in theory any component running on a system may cause an OOM situation. If, for whatever reason, the system is momentarily out of memory and someone or something happens to restart a service at that moment, systemd is rendered unusable even if the offending process is terminated right afterwards. Requiring a reboot to recover from such a temporary hiccup seems unreasonable for an enterprise operating system.

In fact, this could even be seen as a variant of a DoS attack.

Thanks.
Comment 4 Marko Myllynen 2017-04-04 10:26:05 EDT
(In reply to Marko Myllynen from comment #3)
> 
> In fact, this could even be seen as a variant of a DoS attack.

This is, of course, iff this happens with services that regular users control.
Comment 11 Lukáš Nykrýn 2017-05-05 07:45:13 EDT
https://github.com/lnykryn/systemd-rhel/pull/119
Comment 16 Lukáš Nykrýn 2017-09-07 04:22:36 EDT
fix merged to upstream staging branch -> https://github.com/lnykryn/systemd-rhel/pull/119 -> post
Comment 20 Jan Synacek 2018-02-14 04:42:19 EST
*** Bug 1496263 has been marked as a duplicate of this bug. ***
Comment 21 IBM Bug Proxy 2018-02-14 04:45:39 EST
------- Comment From ruddk@us.ibm.com 2017-09-27 13:43 EDT-------
(In reply to comment #7)
> Well even if we fix that particular bug, I think this is a design problem
> that basically can't be fixed. In the old world of sysvinit the service
> managers were generally stateless. So they were able to recover from this
> state because there was nothing that needed a recovery.
> But with stateful managers like systemd the situation is different. If a
> state of a service changes and the service manager can't allocate memory to
> record that change, then we can't guarantee that the system will work as
> expected even in the case that memory is freed again.

I think that Comment 3 in RH Bug 1437114 (and LTC Bug 159344 Comment 6) does a pretty good job of covering the main concern.

The goal of the fix is to make sure that the primary systemd process gets accurate information back in the event of a subsequent fork failing.

------- Comment From harihare@in.ibm.com 2018-02-14 03:40 EDT-------
The issue still reproduces on Pegas 1.1 Snapshot 2 too.

[root@localhost ~]# dmesg
Segmentation fault
[root@localhost ~]# dmesg
-bash: fork: Cannot allocate memory
[root@localhost ~]# dmesg
-bash: fork: Cannot allocate memory
[root@localhost ~]# dmesg
-bash: fork: Cannot allocate memory

------- Comment From harihare@in.ibm.com 2018-02-14 03:40 EDT-------
Steps to Reproduce.

If it does not reproduce immediately, restart a service in a loop:
Comment 32 errata-xmlrpc 2018-04-10 07:19:30 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0711
Comment 35 IBM Bug Proxy 2018-10-22 17:40:33 EDT
------- Comment From chavez@us.ibm.com 2018-10-22 17:36 EDT-------
*** Bug 159283 has been marked as a duplicate of this bug. ***
Comment 36 Jan Synacek 2018-10-23 03:51:17 EDT
(In reply to IBM Bug Proxy from comment #35)
> ------- Comment From chavez@us.ibm.com 2018-10-22 17:36 EDT-------
> *** Bug 159283 has been marked as a duplicate of this bug. ***

I'm not sure where this came from, but that's certainly a mistake.
