Bug 848852

Summary: rsyslogd-2163 - epoll_ctl failed
Product: Red Hat Enterprise Linux 6 Reporter: seth vidal <svidal>
Component: rsyslog Assignee: Tomas Heinrich <theinric>
Status: CLOSED ERRATA QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.3 CC: App_KBS_Servers, blakec, brian.macleod, dapospis, garrett.demarco, jcmcken, kevin, ksrot, pvrabec, ssahani, tcallawa, theinric, tlavigne, vgaikwad
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-03-17 14:59:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1056252    

Description seth vidal 2012-08-16 15:18:43 UTC
Description of problem:
our central loghost is routinely dumping out
Aug 15 09:45:11 log02 rsyslogd-2163: last message repeated 1662 times
Aug 15 09:45:11 log02 rsyslogd-2163:epoll_ctl failed on fd 136, id 0/0x7ff1d401fc10, op 1 with File exists
: File exists [try http://www.rsyslog.com/e/2163 ]



Version-Release number of selected component (if applicable):
rsyslog-5.8.10-2.el6.x86_64

How reproducible:
that's a good question - we don't quite know how to reproduce it
It just happens pretty often recently. It only started following the rhel 6.3 upgrade.


 
Actual results:
it emits this message - specifically the last line which is not in syslog format and causes our log parser/analyzer to complain

Expected results:
it should not emit the message or rather should not have the problem causing the msg.

Comment 2 RHEL Program Management 2012-12-14 08:02:39 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 5 Dalibor Pospíšil 2013-04-28 10:45:13 UTC
Hi Seth, do you have any new information about this issue? Do we have anything that could lead us to a reproducer?

Comment 7 seth vidal 2013-06-19 17:31:39 UTC
Sorry for the lag in reply - I must have missed the needinfo setting back in April.

I don't have any additional info. It continues to happen to us.

Comment 8 Tomas Heinrich 2013-06-20 08:58:26 UTC
If it's still a problem, I'll see what I can do, but no promises. Would you be able to test a potential fix?

Comment 9 seth vidal 2013-06-20 15:20:25 UTC
Yes - we can test out fixes. It sometimes takes a few days for the error to crop back up - but we can test out fixes.

Thank you

Comment 11 Blake Caldwell 2013-08-23 17:55:14 UTC
We are seeing similar messages appear when there is an unusually high load on a rsyslog server that aggregates from several nodes.  It's a RHEL 6.4 system with kernel version 2.6.32-279.19.1.el6.x86_64 and rsyslog-5.8.10-6.el6.x86_64.

rsyslogd-2163: epoll_ctl failed on fd 984, id 0/0x7f26e8377c40, op 1 with File exists: File exists [try http://www.rsyslog.com/e/2163 ]

Have any fixes for this bug proved successful?

Comment 12 RHEL Program Management 2013-10-14 04:50:53 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 18 Garrett DeMarco 2013-10-23 12:20:57 UTC
I've also experienced this same issue with rsyslog-5.8.10-7.el6_4.x86_64.

Comment 19 Susant Sahani 2013-10-24 03:43:04 UTC
This is fixed in rsyslog-5.8.10-8.el6_4.x86_64. Please upgrade to that version or later.

Also increase the per-user limit on open file descriptors, which is set to 1024 by default in RHEL.
/etc/security/limits.conf
~~~
root soft nofile 4096
root hard nofile 10240
~~~
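
To check which open-file limits a process actually ends up with, something like the following Python sketch may help. It is only an illustration: the helper names `nofile_limits` and `describe` are made up for this example, and it reports the limits of the process running the script, not those of rsyslogd. For the running daemon, the "Max open files" row in /proc/<pid>/limits shows the same values.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit value, mapping RLIM_INFINITY to a readable word."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft nofile: {describe(soft)}")
    print(f"hard nofile: {describe(hard)}")
```

Run from a fresh login shell of the affected user, the two numbers should match the soft and hard nofile values configured above.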

Comment 21 Tomas Heinrich 2013-10-24 09:33:39 UTC
As Susant has written, increasing the maximum number of file descriptors can be a viable workaround.
An updated package should be available in el6.5.

The upstream patch that should fix the cause for the "epoll_ctl" message is this:
http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=bea499dcb2747d1f5b42eae4978cfe86a37dc957

But there's some other problem:
The side-effect seems to be that the daemon takes CPU utilization to 100% until a file descriptor frees up.
I need to investigate this further.

Comment 22 Brian MacLeod 2014-01-16 21:51:50 UTC
I can confirm your findings, Tomas.  I have this same core issue (epoll_ctl), and updating to rsyslog-5.8.10-8.el6_4.x86_64 does seem to "fix" the original problem, but it does cause the CPU usage to spike.

Because our instance is an HPC environment, we already had the max open files limit set higher than the default in /etc/security/limits.conf:

---
*                soft    nofile     32000
*                hard    nofile     100000
---

Comment 25 Tomas Heinrich 2014-03-17 14:59:31 UTC
The original issue, as reported, has already been fixed in rsyslog-5.8.10-8, as noted in an earlier comment. This is the relevant patch:
rsyslog-5.8.10-bz862517.patch

Thus, I'm closing this bz.

There is a related issue, uncovered by the above-mentioned patch, with spikes of CPU usage. I've filed it separately as bug #1077238.

Comment 26 Jon McKenzie 2015-08-31 19:04:09 UTC
Are there plans to incorporate this patch into the rsyslog5 package included with EL5 (currently at 5.8.12)?