Bug 739538

Summary: systemd retains socket-activated service failure records forever leading to memory exhaustion
Product: [Other] Security Response
Reporter: Petr Pisar <ppisar>
Component: vulnerability
Assignee: Nobody <nobody>
Status: ASSIGNED
Severity: unspecified
Priority: unspecified
Version: unspecified
CC: azelinka, carnil, herrold, mschmidt, ovasik, shigorin
Keywords: Reopened, Security
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2011-09-19 17:21:09 UTC
Bug Blocks: 740058

Description Petr Pisar 2011-09-19 12:55:04 UTC
The new init system `systemd' in the Fedora distribution provides inetd-like functionality, i.e. systemd (PID 1) listens on a socket:

# systemctl enable cvs.socket
# systemctl --all --full |grep cvs
cvs             loaded inactive dead          CVS Server
cvs.socket                loaded active   listening     CVS Server Activation Socket

and when a client connects, it will spawn a network service (server) connected to the socket through standard input and output.

# systemctl --all --full |grep cvs
cvs@::1:2401-::1:60544.service loaded active   running       CVS Server
cvs.socket                loaded active   listening     CVS Server Activation Socket

If the server process exits with a non-zero code (e.g. the client violated the server protocol), systemd keeps details about this failure (available through the `systemctl --all --full' command):

# systemctl --all --full |grep cvs
cvs@::1:2401-::1:60544.service loaded failed   failed        CVS Server
cvs.socket                loaded active   listening     CVS Server Activation Socket

The problem is that the failure records are stored indefinitely:

# systemctl --all --full |grep cvs
cvs@::1:2401-::1:60543.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60544.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60545.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60546.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60547.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60548.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60549.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60550.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60551.service loaded failed   failed        CVS Server
cvs@::1:2401-::1:60552.service loaded failed   failed        CVS Server
cvs.socket                loaded active   listening     CVS Server Activation Socket

# pidof cvs; echo $?
1

and each record costs memory:

# ps u -p1
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.9  2.4  57408 24352 ?        Ss   14:26   0:08 /sbin/init

# for I in $(seq 1 $((2**10))); do echo "foo" >/dev/tcp/localhost/2401; done
# systemctl --all --full |grep -c cvs
1036
# ps u -p1
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  4.2  3.4  68000 34480 ?        Ss   14:26   0:41 /sbin/init

The RSS of PID 1 grew from 24352 KB to 34480 KB over 1024 connections, i.e. memory usage increases by about 10 KB per record.
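
The per-record figure can be re-derived from the two `ps u -p1` samples. The following is a minimal sketch, not part of the original report; the function name `per_record_kb` is made up for illustration, and it assumes that all RSS growth is caused by the failure records.

```shell
#!/bin/sh
# RSS of PID 1 in KB, copied from the two `ps u -p1` samples in the report.
rss_before=24352
rss_after=34480
# Number of connections made by the test loop: 2**10.
connections=1024

# Print the average RSS growth per connection in KB (hypothetical helper).
# $1: RSS before, $2: RSS after, $3: number of connections
per_record_kb() {
    awk -v a="$1" -v b="$2" -v n="$3" 'BEGIN { printf "%.1f\n", (b - a) / n }'
}

per_record_kb "$rss_before" "$rss_after" "$connections"   # prints 9.9
```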

Since a network service is designed to be remotely accessible, this constitutes a remote DoS vulnerability.

Tested on Fedora 17 with systemd-35-1.fc16.x86_64 and cvs-1.11.23-22.fc17.x86_64. As a lot of services are being migrated to systemd in Fedora 16, whose stable release is close, I expect this issue to become widely exposed soon.

Comment 1 Lennart Poettering 2011-09-19 17:19:59 UTC
To make systemd forget about the failure state of a service, use ExecStart=-/foo/bar, i.e. add the "-" in there.
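
As a rough illustration of the suggestion above, the sketch below writes a per-connection template unit whose ExecStart= line carries the "-" prefix. It is only a sketch: the cvs path, its arguments, the repository root and the helper name `write_unit` are assumptions made for illustration and are not taken from the Fedora cvs package. The unit is written to a scratch directory rather than installed.

```shell
#!/bin/sh
# Write a template unit with the "-" prefix on ExecStart=.
# The cvs command line below is a placeholder, not the packaged one.
# $1: directory to write cvs@.service into
write_unit() {
    cat > "$1/cvs@.service" <<'EOF'
[Unit]
Description=CVS Server

[Service]
ExecStart=-/usr/bin/cvs -f --allow-root=/var/cvs pserver
StandardInput=socket
EOF
}

outdir=$(mktemp -d)
write_unit "$outdir"
# Show the line that matters for this bug.
grep '^ExecStart=' "$outdir/cvs@.service"
```

With the prefix in place, a non-zero exit of the server is not recorded as a failure, so diagnostics have to come from the service's own logging.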

Comment 2 Petr Pisar 2011-09-20 07:52:50 UTC
This is a work-around specific to the service configuration. However, this is a generic problem.

How can I forget job statuses for already exited services?

How can I limit the size of the job status log (systemctl --all) if I have socket services without the "-"?

This is a vulnerability in systemd as such. If you want to close this bug, fix it at the systemd level first (e.g. by making "-" in socket services implicit).

Comment 3 Lennart Poettering 2011-09-21 21:05:35 UTC
(In reply to comment #2)
> This is a work-around specific to the service configuration. However, this is
> a generic problem.

It's not a work-around. It's the suggested way to solve this.
> 
> How can I forget job statuses for already exited services?

systemctl reset-failed

> How can I limit the size of the job status log (systemctl --all) if I have
> socket services without the "-"?

There's a global limit on loaded units. Not sure what I set it to. Something quite large. We do autopaging so a long output should be unproblematic, and for everything else there is head/tail.

> This is a vulnerability in systemd as such. If you want to close this bug, fix
> it at the systemd level first (e.g. by making "-" in socket services implicit).

Nah. People might want this. This is a matter of documentation, not more. It is documented in the man pages, people just don't look there...

I will post a blog story about this very soon explaining how to convert inetd services to systemd services. It will include a notice about this.

Comment 4 Petr Pisar 2011-09-27 08:57:59 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > This is a work-around specific to the service configuration. However, this is
> > a generic problem.
> 
> It's not a work-around. It's the suggested way to solve this.

What if I want to track failures but still be protected from DoS?

> > 
> > How can I forget job statuses for already exited services?
> 
> systemctl reset-failed
> 
Great. That works.


> > How can I limit the size of the job status log (systemctl --all) if I have
> > socket services without the "-"?
> 
> There's a global limit on loaded units. Not sure what I set it to. Something
> quite large. We do autopaging so a long output should be unproblematic, and
> for everything else there is head/tail.
> 
I'm not talking about printed output. I'm talking about a limit on internal systemd data structures. (Even getting the list from systemctl --all with thousands of records takes a long time.)

In other words, I'd like to see a rate limit on failed services, something like printk rate limiting in Linux, syslog's `last message repeated N times', or netfilter LOG --limit.
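
For illustration only, a cap of this kind could be approximated from outside systemd by a periodic job that counts failed instances and calls `systemctl reset-failed` once a threshold is exceeded. This is a sketch, not a systemd feature: the helper names and the threshold of 100 are made up, and the parsing assumes the column layout of the `systemctl --all --full` listings shown in the description.

```shell
#!/bin/sh
# Count failed instances of a unit template in `systemctl --all --full`
# output read from stdin. $1 is the instance name prefix, e.g. "cvs@".
count_failed_instances() {
    awk -v prefix="$1" '
        index($1, prefix) == 1 && $3 == "failed" { n++ }
        END { print n + 0 }
    '
}

# Succeed when the failed records should be purged.
# $1: current count, $2: threshold (hypothetical, tune per system)
should_purge() {
    [ "$1" -gt "$2" ]
}

# Intended use from a periodic job (not executed here):
#   n=$(systemctl --all --full | count_failed_instances cvs@)
#   should_purge "$n" 100 && systemctl reset-failed
```

Note that `systemctl reset-failed` without arguments clears the failed state of every unit, so this trades diagnostics for bounded memory; it does not limit the rate at which records accumulate between runs.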

> > This is a vulnerability in systemd as such. If you want to close this bug,
> > fix it at the systemd level first (e.g. by making "-" in socket services
> > implicit).
> 
> Nah. People might want this. This is a matter of documentation, not more.

OK. I will ask FPC to encourage packagers to ignore socket-activated service failures; otherwise their systems would become susceptible to DoS.

Comment 5 Petr Pisar 2011-09-27 09:15:48 UTC
(In reply to comment #4)
> OK. I will ask FPC to encourage packagers to ignore socket-activated service
> failures; otherwise their systems would become susceptible to DoS.
https://fedorahosted.org/fpc/ticket/103#comment:3

Comment 6 Jaromír Cápík 2012-02-02 14:32:39 UTC
> > This is a vulnerability in systemd as such. If you want to close this bug, fix
> > it at the systemd level first (e.g. by making "-" in socket services implicit).
> 
> Nah. People might want this. This is a matter of documentation, not more. It is
> documented in the man pages, people just don't look there...
> 

Hello Lennart.

You're right. Some people might want that.
But I'm quite sure that the majority of people don't want that, and there's quite a good chance that somebody could simply forget to disable the logging. The same could happen when people aren't well informed.

So why don't you introduce something like ExecStart=+/foo/bar for enabling the logging for socket-activated services? The logging should be disabled by default since it's safer. Btw, disabling all security holes by default is quite a common policy for any vulnerable software.

Please reconsider your attitude.

Regards,
Jaromir.

Comment 7 Lennart Poettering 2013-05-06 18:14:46 UTC
(In reply to comment #6)
> > > This is a vulnerability in systemd as such. If you want to close this bug, fix
> > > it at the systemd level first (e.g. by making "-" in socket services implicit).
> > 
> > Nah. People might want this. This is a matter of documentation, not more. It is
> > documented in the man pages, people just don't look there...
> > 
> 
> Hello Lennart.
> 
> You're right. Some people might want that.
> But I'm quite sure that the majority of people don't want that, and there's
> quite a good chance that somebody could simply forget to disable the logging.
> The same could happen when people aren't well informed.
> 
> So why don't you introduce something like ExecStart=+/foo/bar for enabling
> the logging for socket-activated services? The logging should be disabled by
> default since it's safer. Btw, disabling all security holes by default is
> quite a common policy for any vulnerable software.

We log about all units we start, actually. It's the right thing to do. There's nothing to change really.

Comment 8 Petr Pisar 2013-05-07 07:04:09 UTC
In the good old days, a failure was sent to syslog, syslog appended it to /var/log/messages, and logrotate trimmed the log and removed old log files to conserve disk space.

All that's still true; however, there is a new step where the failure is stored in systemd's internal log and kept there forever (or until systemctl reset-failed is called).

With the current implementation (ExecStart) you get everything or nothing. In the first case the system is vulnerable; in the other case no diagnostics are possible.

Comment 9 Michael Shigorin 2014-03-13 15:38:56 UTC
New and shiny mindset: insecure by default, someone might need that.
Blog about that and move on, lusers don't read docs anyways!