Bug 1202979 - fail2ban with systemd backend may fill FS / hang system after reload
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: selinux-policy
Version: 25
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Lukas Vrabec
QA Contact: Fedora Extras Quality Assurance
URL: https://github.com/fail2ban/fail2ban/...
 
Reported: 2015-03-17 20:10 UTC by Rolf Fokkens
Modified: 2018-07-25 12:24 UTC (History)
Last Closed: 2017-12-12 10:10:10 UTC
Type: Bug

Links:
Red Hat Bugzilla 1513100 (private: 0, priority: unspecified, status: CLOSED, last updated: 2021-02-22 00:41:40 UTC): SELInux prevents fail2ban from setting resource limits

Internal Links: 1513100

Description Rolf Fokkens 2015-03-17 20:10:53 UTC
Description of problem:
After an automatic "systemctl reload fail2ban", the fail2ban log file sometimes grows until it fills the FS. The log file contains only:

2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error
2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error
2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error
2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error
2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error

etc. etc.

Version-Release number of selected component (if applicable):
fail2ban-0.9-2.fc20.noarch

How reproducible:
It seems to take time, or to affect only specific systems. The cause is clear though; see below.

Steps to Reproduce:
1. Enable and start fail2ban.
2. Run a nightly (cron?) "systemctl reload fail2ban".
3. Wait...

Actual results:
The FS fills up; the system probably needs a reboot to become responsive again.

Expected results:
Happy system, no impact at all.

Additional info:
The probable cause is this:
- fail2ban opens all journal files in /var/log/journal for each jail.
- the number of log files in /var/log/journal grows over time (depending on settings)
- fail2ban has a (soft) ulimit of 1024 open files
- at a certain point (during reload?) fail2ban hits the ulimit
- and starts flooding its log file
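
The "hits the ulimit" step in the list above can be reproduced in isolation. The following is a minimal Python sketch, not fail2ban code: it assumes a Linux or other POSIX system, the function name open_until_limit and the limit of 64 are made up for illustration, and it opens /dev/null rather than journal files.

```python
import errno
import os
import resource


def open_until_limit(soft_limit=64):
    """Lower the soft fd limit, then open /dev/null until the kernel refuses.

    Returns (number of descriptors opened, errno of the failing open).
    The original limits are restored before returning.
    """
    old = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Only the soft limit is changed; the hard limit stays as it was.
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old[1]))
    fds = []
    try:
        while True:
            fds.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return len(fds), exc.errno
    finally:
        for fd in fds:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, old)


if __name__ == "__main__":
    opened, err = open_until_limit()
    print(f"opened {opened} descriptors, then failed with {errno.errorcode[err]}")
```

Once the soft limit is reached, every further open() or accept() fails in the same way, so a loop that simply logs the error and retries would produce exactly the kind of repeated message shown above.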

Comment 1 Orion Poplawski 2015-03-18 04:27:36 UTC
Filed upstream.

Comment 2 Orion Poplawski 2015-04-07 22:24:05 UTC
Let's see what the systemd folks think.

Comment 3 Zbigniew Jędrzejewski-Szmek 2015-04-10 13:40:28 UTC
I think it would be best if fail2ban did not create its own log file but used the journal instead. At least this failure would be handled more gracefully (the disk wouldn't be filled, and journal rate limiting would possibly reduce the number of messages too).

fail2ban has to handle failure better. It is possible for the number of journal files to grow until it exhausts the available file descriptors, especially when watching multiple jails. fail2ban has to be prepared to fail gracefully in that case.
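
One way to read "fail gracefully" is to recognise descriptor exhaustion and to throttle the warning that is repeated in the log. The sketch below is an illustration only, not fail2ban's or the journal's actual mechanism; the names ThrottledWarning and is_fd_exhaustion and the 60-second interval are assumptions made up for this example.

```python
import errno
import time


class ThrottledWarning:
    """Emit a repeated warning at most once per `interval` seconds."""

    def __init__(self, emit, interval=60.0, clock=time.monotonic):
        self._emit = emit          # callable that writes one log line
        self._interval = interval
        self._clock = clock
        self._last = None          # time of the last emitted message
        self._suppressed = 0       # messages dropped since then

    def warn(self, message):
        """Emit `message` or count it as suppressed; return True if emitted."""
        now = self._clock()
        if self._last is not None and now - self._last < self._interval:
            self._suppressed += 1
            return False
        if self._suppressed:
            message = f"{message} ({self._suppressed} similar messages suppressed)"
        self._emit(message)
        self._last = now
        self._suppressed = 0
        return True


def is_fd_exhaustion(exc):
    """True if an OSError means the process or the system is out of descriptors."""
    return isinstance(exc, OSError) and exc.errno in (errno.EMFILE, errno.ENFILE)
```

A server loop could call is_fd_exhaustion() on the caught exception, sleep briefly instead of retrying immediately, and route the "Socket error" message through ThrottledWarning.warn() so that the log grows by one line per interval rather than without bound.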

Comment 4 Zbigniew Jędrzejewski-Szmek 2015-04-10 13:51:48 UTC
I looked at the upstream bug report. If you think that python-systemd leaks fd's, I'll write some test code to see if that is true.

Comment 5 Zbigniew Jędrzejewski-Szmek 2015-04-12 21:01:07 UTC
Sorry for the back-and-forth. I made some tests and thought that the problem is indeed with python-systemd, but it was a false positive. It seems that python-systemd is working properly. It seems that the problem is with fail2ban.

Comment 6 Orion Poplawski 2015-04-13 18:13:06 UTC
Zbigniew - From your exploration do you have any sense of where the issue may lie?  Thanks for looking into it.

Comment 7 Zbigniew Jędrzejewski-Szmek 2015-04-13 18:16:52 UTC
No, sorry.

Comment 8 Fedora End Of Life 2015-05-29 13:45:09 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Jasper Siero 2015-06-11 11:56:56 UTC
The problem still exists in Fedora 22:
-bash-4.3# systemctl restart fail2ban
-bash-4.3# uname -a
Linux f22builder.priv.tgho.nl 3.19.5-100.fc20.x86_64 #1 SMP Mon Apr 20 19:51:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
-bash-4.3# ps auwx | grep fail2ban
root       875  1.6  3.3 1287160 34736 ?       Sl   13:51   0:00 /usr/bin/python -Es /usr/bin/fail2ban-server -s /var/run/fail2ban/fail2ban.sock -p /var/run/fail2ban/fail2ban.pid -x -b
-bash-4.3# cd /proc/875/fd
-bash-4.3# ls | wc
     42      42     116
-bash-4.3# systemctl reload fail2ban
-bash-4.3# ls | wc
     78      78     224
-bash-4.3# systemctl reload fail2ban
-bash-4.3# ls | wc
     86      86     248
-bash-4.3# systemctl reload fail2ban
-bash-4.3# ls | wc
    108     108     334
-bash-4.3# systemctl reload fail2ban
-bash-4.3# ls | wc
    144     144     468

Please change the version to Fedora 22
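
The "ls | wc" counts above show that descriptors accumulate across reloads but not which files they refer to. A small Python sketch that groups a process's descriptors by target may help narrow that down. It assumes Linux with /proc mounted; the helper name fd_targets is made up for illustration, and reading another user's /proc/<pid>/fd generally requires root.

```python
import os
from collections import Counter


def fd_targets(pid="self"):
    """Count the open descriptors of a process, grouped by what they point at."""
    fd_dir = f"/proc/{pid}/fd"
    targets = Counter()
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to the open file, socket, or pipe.
            targets[os.readlink(os.path.join(fd_dir, name))] += 1
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


if __name__ == "__main__":
    for target, count in fd_targets().most_common(10):
        print(f"{count:5d}  {target}")
```

Running it against the fail2ban-server PID before and after each reload and comparing the results would show whether the extra descriptors are journal files, sockets, or something else.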

Comment 10 Göran Uddeborg 2015-12-29 10:28:53 UTC
As those of you who follow the upstream bug may have noticed, I've been trying to investigate this a bit. I tried to increase the number of file descriptors by adding LimitNOFILE=8192 to the service, but found that it only affected the hard limit. The soft limit remained at 1024.

My current understanding is that this is blocked by SELinux. If I change the label on fail2ban-client to bin_t, the fail2ban-server process winds up running as unconfined_service_t, and it has all the file descriptors declared in the .service file. But if fail2ban-client has its normal fail2ban_client_exec_t type, the server runs as fail2ban_t, and the soft file limit is lowered to 1024.

By default, I do not see any AVCs. But if I disable dontaudit rules, there is an interesting AVC:

type=AVC msg=audit(1451382614.620:28974): avc:  denied  { rlimitinh } for  pid=8792 comm="fail2ban-server" scontext=system_u:system_r:fail2ban_client_t:s0 tcontext=system_u:system_r:fail2ban_t:s0 tclass=process permissive=0

So I made myself a local module with the single rule

allow fail2ban_client_t fail2ban_t:process rlimitinh;

And indeed, now fail2ban-server runs with 8192 file descriptors as both the soft and hard limit! :-)

Maybe this should become a bug, or change request, against the SELinux policy rather than fail2ban?
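
For reference, the state described in this comment (hard limit raised, soft limit still at 1024) is one that a process can normally correct for itself, because an unprivileged process may raise its soft limit up to its hard limit. The Python sketch below shows that call. It is not taken from fail2ban, the function name raise_nofile_soft_limit is made up, and whether SELinux policy permits setrlimit() for fail2ban_t is a separate question (the title of linked bug 1513100 suggests it may not).

```python
import resource


def raise_nofile_soft_limit(wanted):
    """Raise the soft RLIMIT_NOFILE towards `wanted`, capped by the hard limit.

    Returns the (soft, hard) pair in effect afterwards. The soft limit is
    never lowered.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print(raise_nofile_soft_limit(8192))
```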

Comment 11 Göran Uddeborg 2015-12-29 10:51:47 UTC
By the way, the documentation says that when rlimitinh is denied, the resource limits are "reset". Reset to what? Are there some global default values? I thought every process always inherited these limits and that they were only reset with setrlimit(). (The initial process 1 of course gets some kind of default from the kernel, but I thought those were maximums on everything.)

Where do the defaults come from? Does anyone know?

Comment 12 Alick Zhao 2016-01-01 18:39:04 UTC
Let's get some input from selinux folks.

Comment 13 Daniel Walsh 2016-01-02 12:45:52 UTC
Yes we should just allow it for this domain.

Comment 14 Edgar Hoch 2016-05-31 06:05:23 UTC
This error occurs in Fedora 23 too.

Comment 15 Edgar Hoch 2016-07-10 21:10:11 UTC
Any news?

I have created a file 
# cat /etc/systemd/system/fail2ban.service.d/limit.conf
[Service]
LimitNOFILE=16384

but fail2ban still has a limit of 1024 open files:
# cat /proc/$(cat /var/run/fail2ban/fail2ban.pid)/limits | grep "Max open files"
Max open files            1024                 16384                files     

That is, I cannot change the limit.
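
In the /proc output above, the first number is the soft limit and the second is the hard limit, so the drop-in did raise the hard limit to 16384 while the soft limit stayed at 1024. A small Python sketch of the same check, for use in monitoring scripts, follows. It assumes the Linux /proc/<pid>/limits layout; the function names are made up for illustration.

```python
def parse_max_open_files(limits_text):
    """Extract (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    A value of 'unlimited' is returned as None.
    """
    def to_limit(value):
        return None if value == "unlimited" else int(value)

    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return to_limit(soft), to_limit(hard)
    raise ValueError("no 'Max open files' row found")


def nofile_limits(pid="self"):
    """Read and parse the open-files limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_max_open_files(handle.read())
```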

Several hosts a week end up with fail2ban using 100% CPU and filling the /var partition, because the log file /var/log/fail2ban.log grows without bound until the file system is full. This makes the system partly unusable, because some services, login and desktop processes, etc. need to write data to files in /var.

It would be great if this can be fixed soon.
Thanks in advance!

Comment 16 Fedora End Of Life 2016-07-19 19:05:58 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 17 Edgar Hoch 2016-07-19 21:06:09 UTC
I think this bug should be reopened, because it exists at least in Fedora 23 (see comment #15).

I cannot say whether it exists in Fedora 24, because we have only a few machines running Fedora 24 at the moment, due to other problems with Fedora 24 (e.g. boot problems, microcode bug).

Comment 18 Fedora Admin XMLRPC Client 2016-09-27 14:59:52 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 19 Fedora End Of Life 2016-11-24 11:34:23 UTC
This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '23'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 23 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 20 Fedora End Of Life 2016-12-20 13:22:09 UTC
Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 21 Edgar Hoch 2016-12-20 14:08:07 UTC
This bug still exists in Fedora 25. It should be reopened.

The test case of comment #15 shows the same result in Fedora 25.

Comment 22 Fedora End Of Life 2017-11-16 19:51:48 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 23 Fedora End Of Life 2017-12-12 10:10:10 UTC
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 24 Orion Poplawski 2017-12-12 16:50:14 UTC
SELinux may also be preventing the number of file descriptors from being increased; see bug 1513100.

