Description of problem:
After an automatic "systemctl reload fail2ban", the fail2ban log file sometimes grows until it fills the file system. The log file contains only:

    2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error
    2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error
    2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error
    2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error
    2015-03-16 23:01:25,159 fail2ban.server.asyncserver[879]: WARNING Socket error
    etc. etc.

Version-Release number of selected component (if applicable):
fail2ban-0.9-2.fc20.noarch

How reproducible:
It seems to take time, or to depend on the specific system. The cause is clear though, see below.

Steps to Reproduce:
1. Enable and start fail2ban
2. Do a nightly (cron?) "systemctl reload fail2ban"
3. And wait...

Actual results:
FS full; the system probably needs a reboot to become responsive again.

Expected results:
Happy system, no impact at all.

Additional info:
The probable cause is this:
- fail2ban opens all journal files in /var/log/journal for each jail
- the number of log files in /var/log/journal grows over time (depending on settings)
- fail2ban has a (soft) ulimit of 1024 open files
- at a certain point (during reload?) fail2ban hits the ulimit
- and starts flooding its log file
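The failure mode suspected in this report, a process running into its soft open-file limit, can be reproduced in isolation. The following is only a minimal sketch, not fail2ban code: the function name `open_until_exhausted` and the limit of 64 are made up for illustration, and it assumes a Linux system where the soft limit may be lowered and restored by the process itself.

```python
import errno
import os
import resource


def open_until_exhausted(soft_limit):
    """Open /dev/null repeatedly under a lowered soft RLIMIT_NOFILE.

    Returns (descriptors_opened, errno_of_first_failure).
    Hypothetical helper for illustration only.
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Lower only the soft limit; the hard limit stays untouched so the
    # original soft value can be restored afterwards.
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
    fds = []
    try:
        while True:
            fds.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return len(fds), exc.errno
    finally:
        # Release everything and put the original limits back.
        for fd in fds:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    opened, err = open_until_exhausted(64)
    print(opened, errno.errorcode[err])
```

Once the limit is reached, every further open() fails with EMFILE ("Too many open files"). A daemon that retries and logs each failure without backing off would produce the kind of flood described above.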
Filed upstream.
Let's see what the systemd folks think.
I think it would be best if fail2ban did not create its own log file but used the journal instead; at least this failure would be handled more gracefully (the disk wouldn't be filled, and journal rate limiting would possibly reduce the number of messages too). fail2ban has to handle failure better. It is possible for the number of journal files to grow until it exhausts the available file descriptors, especially when watching multiple jails. fail2ban has to be prepared to fail gracefully in that case.
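To illustrate what rate limiting of repeated messages could look like on the application side, here is a rough sketch using Python's standard `logging` module. It is not how fail2ban or the journal implement it; the class name `BurstLimitFilter` and its `burst`, `interval` and `clock` parameters are assumptions made for this example.

```python
import logging
import time


class BurstLimitFilter(logging.Filter):
    """Pass at most `burst` records with the same text per `interval` seconds.

    Hypothetical filter for illustration; not part of fail2ban.
    """

    def __init__(self, burst=5, interval=30.0, clock=time.monotonic):
        super().__init__()
        self.burst = burst
        self.interval = interval
        self.clock = clock
        self._windows = {}  # message text -> (window start, records seen)

    def filter(self, record):
        now = self.clock()
        key = record.getMessage()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.interval:
            start, count = now, 0  # window expired, start a new one
        count += 1
        self._windows[key] = (start, count)
        # Records beyond the burst size are dropped until the window expires.
        return count <= self.burst
```

Attached to the file handler with `handler.addFilter(BurstLimitFilter())`, a filter like this would let a handful of "Socket error" warnings through per interval and drop the rest, so a tight error loop could no longer fill the disk on its own. It does not address the underlying descriptor exhaustion.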
I looked at the upstream bug report. If you think that python-systemd leaks fd's, I'll write some test code to see if that is true.
Sorry for the back-and-forth. I made some tests and thought that the problem is indeed with python-systemd, but it was a false positive. It seems that python-systemd is working properly. It seems that the problem is with fail2ban.
Zbigniew - From your exploration do you have any sense of where the issue may lie? Thanks for looking into it.
No, sorry.
This message is a reminder that Fedora 20 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 20 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
The problem still exists in Fedora 22:

-bash-4.3# systemctl restart fail2ban
-bash-4.3# uname -a
Linux f22builder.priv.tgho.nl 3.19.5-100.fc20.x86_64 #1 SMP Mon Apr 20 19:51:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
-bash-4.3# ps auwx | grep fail2ban
root 875 1.6 3.3 1287160 34736 ? Sl 13:51 0:00 /usr/bin/python -Es /usr/bin/fail2ban-server -s /var/run/fail2ban/fail2ban.sock -p /var/run/fail2ban/fail2ban.pid -x -b
-bash-4.3# cd /proc/875/fd
-bash-4.3# ls | wc
42 42 116
-bash-4.3# systemctl reload fail2ban
-bash-4.3# ls | wc
78 78 224
-bash-4.3# systemctl reload fail2ban
-bash-4.3# ls | wc
86 86 248
-bash-4.3# systemctl reload fail2ban
-bash-4.3# ls | wc
108 108 334
-bash-4.3# systemctl reload fail2ban
-bash-4.3# ls | wc
144 144 468

Please change the version to Fedora 22.
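The same measurement can be scripted. The sketch below assumes Linux with /proc mounted; the helper names `count_open_fds` and `fd_growth` are made up here. It counts the entries in /proc/<pid>/fd, which is what "ls | wc" does in the transcript above, and computes the growth between successive samples.

```python
import os


def count_open_fds(pid="self"):
    """Return the number of entries in /proc/<pid>/fd (Linux only).

    For the calling process the count includes the descriptor that
    os.listdir() itself holds while reading the directory.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_growth(samples):
    """Return the differences between consecutive descriptor counts."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


if __name__ == "__main__":
    # Counts taken from the transcript above: one after the restart,
    # then one after each of the four reloads.
    print(fd_growth([42, 78, 86, 108, 144]))  # [36, 8, 22, 36]
```

In the transcript every reload adds descriptors and none are released, so at this rate the 1024 soft limit would be reached after a few dozen reloads.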
As those of you who follow the upstream bug may have noticed, I've been trying to investigate this a bit. I tried to increase the number of file descriptors by adding LimitNOFILE=8192 to the service, but found that it only affected the hard limit. The soft limit remained at 1024.

My current understanding is that this is blocked by SELinux. If I change the label on fail2ban-client to bin_t, the fail2ban-server process winds up running as unconfined_service_t, and it gets all the file descriptors declared in the .service file. But if fail2ban-client has its normal fail2ban_client_exec_t type, the server runs as fail2ban_t, and the soft file limit is lowered to 1024.

By default, I do not see any AVCs. But if I disable the dontaudit rules, there is an interesting AVC:

type=AVC msg=audit(1451382614.620:28974): avc: denied { rlimitinh } for pid=8792 comm="fail2ban-server" scontext=system_u:system_r:fail2ban_client_t:s0 tcontext=system_u:system_r:fail2ban_t:s0 tclass=process permissive=0

So I made myself a local module with the single rule

allow fail2ban_client_t fail2ban_t:process rlimitinh;

And indeed, now fail2ban-server runs with 8192 file descriptors as both the soft and hard limit! :-)

Maybe this should become a bug, or change request, against the SELinux policy rather than fail2ban?
By the way, the documentation says that when rlimitinh is denied, the resource limits are "reset". Reset to what? Are there some global default values? I thought every process always inherited these limits and they were only changed with setrlimit(). (The initial process 1 of course gets some kind of default from the kernel, but I thought those were maximums for everything.) Where do the defaults come from? Does anyone know?
Let's get some input from selinux folks.
Yes we should just allow it for this domain.
This error occurs in Fedora 23 too.
Any news? I have created a file

# cat /etc/systemd/system/fail2ban.service.d/limit.conf
[Service]
LimitNOFILE=16384

but fail2ban still has a soft limit of 1024 open files:

# cat /proc/$(cat /var/run/fail2ban/fail2ban.pid)/limits | grep "Max open files"
Max open files 1024 16384 files

That is, I cannot change the limit. Several times a week we have hosts where fail2ban uses 100% CPU and fills the /var partition, because the log file /var/log/fail2ban.log grows without bound until the file system is full. This makes the system partly unusable, because some services, login and desktop processes, etc. need to write data to files in /var.

It would be great if this could be fixed soon. Thanks in advance!
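For checking this across many hosts, the "Max open files" line can be parsed programmatically. This is only a sketch, assuming the /proc/<pid>/limits layout shown above (soft limit first, then hard limit); the function names are hypothetical.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits.

    A value of None stands for "unlimited".
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # The columns after the label are: soft limit, hard limit, units.
            soft, hard = line[len(prefix):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def read_max_open_files(pid="self"):
    """Read and parse /proc/<pid>/limits (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_max_open_files(handle.read())


if __name__ == "__main__":
    # The line reported above: soft limit 1024, hard limit 16384.
    print(parse_max_open_files("Max open files 1024 16384 files"))  # (1024, 16384)
```

A soft value that stays at 1024 while the hard value follows LimitNOFILE matches the symptom described in this comment.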
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
I think this bug should be reopened, because it exists at least in Fedora 23 (see comment #15). I cannot say whether it exists in Fedora 24, because we have only a few machines running Fedora 24 at the moment, due to other problems with Fedora 24 (e.g. boot problems, microcode bug).
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
This bug still exists in Fedora 25. It should be reopened. The test case from comment #15 shows the same result in Fedora 25.
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
SELinux may also be preventing the number of file descriptors from being increased; see bug 1513100.