740058 – systemd retains socket-activated service failure records forever leading to memory exhaustion

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	systemd
Sub Component:
Version:	7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	systemd-maint
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:	739538
Blocks:	802465 816135 952670

Reported:	2011-09-20 19:32 UTC by Denise Dumas
Modified:	2013-05-06 18:17 UTC (History)
CC List:	5 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:	739538
Environment:
Last Closed:	2013-05-06 18:17:14 UTC
Target Upstream Version:
Embargoed:

Description Denise Dumas 2011-09-20 19:32:39 UTC

+++ This bug was initially created as a clone of Bug #739538 +++

The new init system `systemd' in the Fedora distribution provides inetd-like functionality, i.e. systemd (PID 1) listens on a socket:

# systemctl enable cvs.socket
# systemctl --all --full |grep cvs
cvs             loaded inactive dead          CVS Server
cvs.socket                loaded active   listening     CVS Server Activation Socket

and when a client connects, it spawns a network service (server) instance connected to the socket through standard input and output:

# systemctl --all --full |grep cvs
cvs@::1:2401-::1:60544.service loaded active   running       CVS Server
cvs.socket                loaded active   listening     CVS Server Activation Socket

If the server process exits with a non-zero code (e.g. because the client violated the server protocol), systemd keeps details about this failure (available through the `systemctl --all --full' command):

# systemctl --all --full |grep cvs
cvs@::1:2401-::1:60544.service loaded failed   failed        CVS Server
cvs.socket                loaded active   listening     CVS Server Activation Socket

The problem is that the failure records are stored indefinitely:

# systemctl --all --full |grep cvs
cvs@::1:2401-::1:60543.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60544.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60545.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60546.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60547.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60548.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60549.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60550.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60551.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60552.service loaded failed   failed        CVS Server
cvs.socket                loaded active   listening     CVS Server Activation Socket

# pidof cvs; echo $?
1

and each record costs memory:

# ps u -p1
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.9  2.4  57408 24352 ?        Ss   14:26   0:08 /sbin/init

# for I in $(seq 1 $((2**10))); do echo "foo" >/dev/tcp/localhost/2401; done
# systemctl --all --full |grep -c cvs
1036
# ps u -p1
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  4.2  3.4  68000 34480 ?        Ss   14:26   0:41 /sbin/init

This increases memory usage by about 10 KB per record.
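
As a rough check of that figure, the per-record cost can be recomputed from the two RSS values in the `ps u -p1' output above. This is only a sketch: the variable names are made up for illustration, and it assumes the whole RSS growth is due to the 1024 new failure records.

```shell
# RSS of PID 1 in KB, copied from the two `ps u -p1` outputs above
rss_before_kb=24352
rss_after_kb=34480
# Number of connections made by the for loop (2**10)
connections=$((2**10))

# Shell arithmetic is integer-only, so use awk for one decimal place
per_record_kb=$(LC_ALL=C awk -v a="$rss_before_kb" -v b="$rss_after_kb" -v n="$connections" \
    'BEGIN { printf "%.1f", (b - a) / n }')
echo "RSS growth per failed unit record: ${per_record_kb} KB"
```

With the numbers above this prints 9.9 KB, consistent with the estimate of about 10 KB per record.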

Provided a network service is designed to be remotely accessible, this constitutes a remote DoS vulnerability.

Tested on Fedora 17 with systemd-35-1.fc16.x86_64 and cvs-1.11.23-22.fc17.x86_64. As a lot of services are being migrated to systemd in Fedora 16, whose stable release is close, I expect this issue to become widespread soon.

--- Additional comment from lpoetter on 2011-09-19 13:19:59 EDT ---

To make systemd forget about the failure state of a service, use ExecStart=-/foo/bar, i.e. add the "-" in there.
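
A minimal sketch of what that could look like for the per-connection template unit. The cvs binary path, its arguments and the scratch directory are assumptions for illustration, not taken from the actual Fedora package; only the leading "-" on ExecStart= is the point here.

```shell
# Write an example template unit into a scratch directory (not into /etc),
# so nothing on the running system is changed.
unit_dir=$(mktemp -d)
cat > "$unit_dir/cvs@.service" <<'EOF'
[Unit]
Description=CVS Server

[Service]
# The leading "-" makes systemd treat a non-zero exit status as success,
# so the instance does not stay behind in the "failed" state.
# Path and arguments below are hypothetical.
ExecStart=-/usr/bin/cvs -f --allow-root=/var/cvs pserver
StandardInput=socket
EOF

# Show the resulting ExecStart= line
grep '^ExecStart=' "$unit_dir/cvs@.service"
```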

--- Additional comment from ppisar on 2011-09-20 03:52:50 EDT ---

This is a workaround specific to the service configuration. However, this is a generic problem.

How can I forget job statuses for services that have already exited?

How can I limit the size of the job status log (systemctl --all) if I have socket services without the "-"?

This is a vulnerability in systemd as such. If you want to close this bug, fix it at the systemd level first (e.g. by making the "-" implicit for socket services).

Comment 4 Lennart Poettering 2013-05-06 18:17:14 UTC

There isn't anything to fix here... People should use ExecStart=-/bin/false rather than ExecStart=/bin/false if they want the units to be cleaned up automatically. And if they don't want to have them cleaned up automatically they should clean them up manually via "systemctl reset-failed".
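
For the manual route, a small sketch of how one might check how many failed per-connection instances have piled up before and after a reset. The helper name count_failed_cvs is hypothetical, and it assumes the column layout of the `systemctl --all --full' listings shown in the description; the live commands are left commented out because they need a running systemd and root.

```shell
# Count cvs@ instances in the "failed" state from a
# `systemctl --all --full` listing read on stdin.
count_failed_cvs() {
    awk '$1 ~ /^cvs@/ && $3 == "failed" { n++ } END { print n + 0 }'
}

# On a live system (not run here):
#   systemctl --all --full | count_failed_cvs
#   systemctl reset-failed
#   systemctl --all --full | count_failed_cvs

# Demonstration on lines copied from the listing in the description
sample='cvs@::1:2401-::1:60543.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60544.service loaded failed   failed        CVS Server
cvs.socket                loaded active   listening     CVS Server Activation Socket'
failed_count=$(printf '%s\n' "$sample" | count_failed_cvs)
echo "$failed_count"
```

On the sample lines this prints 2; the cvs.socket line is not counted because it is neither a cvs@ instance nor failed.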

This has been this way since about always, hence closing, since there's no bug to fix.
