Bug 182559 - auditd takes all available memory within 20 minutes of boot
Summary: auditd takes all available memory within 20 minutes of boot
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: audit
Version: 5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Steve Grubb
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-02-23 10:53 UTC by Bevis King
Modified: 2007-11-30 22:11 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-05-02 12:46:55 UTC
Type: ---
Embargoed:



Description Bevis King 2006-02-23 10:53:10 UTC
Description of problem:
Shortly after booting a Sun w2100z dual-processor Opteron workstation with
Fedora Core 5 test 3, the system became unusable due to system load.  Looking
with top, this was attributable to auditd having reached 18.9G of virtual address
space, a 1.8G resident set size and 438M shared.

There were a large number of do_ypcall: clnt_call: RPC timed out errors being
reported on the system despite all usual YP tools working normally: ypwhich,
ypcat, etc.  I'm still investigating that.  Even so, auditd shouldn't have
exhausted all system resources.

Version-Release number of selected component (if applicable):
audit-1.1.4-5.1

How reproducible:
Unknown.

Steps to Reproduce:
1.  Boot system, log in, start working
2.  System load becomes high, auditd seems to be culprit
  
Actual results:
System becomes very slow (load average > 20) until auditd process is killed.

Expected results:
System running normally.

Additional info:
Killing auditd seemed to resolve the problem and the machine returned to normal
operation.

Comment 1 Steve Grubb 2006-02-23 11:27:13 UTC
Do you have any changes in /etc/auditd.conf from the default config? I just
checked it with valgrind on an x86_64 machine and I don't see a memory leak.

Comment 2 Bevis King 2006-02-23 13:46:53 UTC
Nope, absolutely default config.  Machine was installed yesterday evening from
the standard CD-ROM distribution.  There is definitely something amiss with YP
on the box though: dbus-daemon is sitting permanently at >80% CPU usage now
that auditd is gone.  Still investigating that....

Comment 3 Bevis King 2006-02-23 14:21:02 UTC
The problem with YP appears to have been related to SELinux denying socket
connects to it.  Running with SELinux disabled seems much smoother.

Comment 4 Steve Grubb 2006-03-17 13:43:30 UTC
Does this bug still exist?

Comment 5 Bevis King 2006-03-17 14:05:00 UTC
I think it probably does.  The do_ypcall issue turned out to be that NIS host
lookups hang, so if host resolution in nsswitch.conf is set to "files nis
dns", things go badly wrong.  This has been reported separately.


Comment 6 Bevis King 2006-03-17 14:08:04 UTC
The NIS bug is 183188.

Comment 7 Steve Grubb 2006-03-17 15:19:39 UTC
So, if we exclude the comments regarding the YP system, does the audit daemon still
consume all memory? Current release is 1.1.5. I've spent time with valgrind and
cannot find any leaks. I do the audit daemon development work on an x86_64
machine and test it all the time and never see anything consuming all memory.
I'm at a loss as to what it could be.

Comment 8 Bevis King 2006-03-17 16:00:58 UTC
OK, there are two scenarios I can think of.  One is that SELinux barring access
to its port was causing all requests to block and sit on the queue.  The other
is that it was forking over the name resolution issues, so again the queue
grows and exhausts memory.  I'm not sure that there is a flaw per se, except
that it doesn't detect that an upstream blockage is causing its queue to build
continuously.  I think I may have seen an SELinux update go through that fixed
problems dbus was having with communication to auditd.  Did that address this
issue too?
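
To make the second scenario concrete, here is a minimal sketch of a queue that
refuses to grow past a fixed limit and counts what it drops, so a stalled
consumer cannot exhaust memory.  This is purely illustrative and is not auditd's
code; the class name BoundedEventQueue and the max_len parameter are
hypothetical.

```python
from collections import deque


class BoundedEventQueue:
    """FIFO that refuses to grow past max_len and counts what it drops."""

    def __init__(self, max_len):
        if max_len <= 0:
            raise ValueError("max_len must be positive")
        self._items = deque()
        self._max_len = max_len
        self.dropped = 0  # number of events rejected because the queue was full

    def put(self, event):
        """Queue an event; return False (and count a drop) when full."""
        if len(self._items) >= self._max_len:
            self.dropped += 1
            return False
        self._items.append(event)
        return True

    def get(self):
        """Return the oldest event, or None when the queue is empty."""
        return self._items.popleft() if self._items else None

    def __len__(self):
        return len(self._items)
```

A non-zero dropped counter would be the signal that something upstream is
blocked, which is the detection described as missing above.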

Regards, Bevis.

Comment 9 Steve Grubb 2006-04-26 13:44:05 UTC
I have used valgrind on auditd and have found no leaks. The standard SE Linux
policy does not block access from the audit daemon to the kernel. The audit
daemon uses select or poll to detect data being available. It re-uses the same
buffer for each event to avoid memory allocation problems.
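
The pattern described above can be sketched roughly as follows.  This is not
the auditd source: a Unix datagram socket stands in for the kernel audit
socket, and drain_events, handle and BUF_SIZE are hypothetical names chosen for
illustration.

```python
import select
import socket

BUF_SIZE = 8192  # assumed upper bound on one event, for illustration only


def drain_events(sock, handle, timeout=1.0):
    """Read datagrams from sock until it stays idle for `timeout` seconds.

    A single buffer is allocated up front and reused for every event, so the
    loop itself does not allocate per event. Returns the number of events seen.
    """
    buf = bytearray(BUF_SIZE)
    view = memoryview(buf)
    count = 0
    while True:
        # Block until data is available or the timeout expires.
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            return count
        n = sock.recv_into(view)
        if n == 0:
            return count
        # The handler gets a slice of the shared buffer; it must copy the
        # bytes if it wants to keep them past this call.
        handle(view[:n])
        count += 1
```

Because the buffer is shared, memory use stays flat regardless of how many
events arrive, provided the handler does not retain them.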

If this problem exists, I need some data showing a memory leak. I'd like to fix
it if it's there. Otherwise, I would like to close this bug since all my testing
shows no leak.

Thanks...

Comment 10 Steve Grubb 2006-05-02 12:46:55 UTC
Closing this bug. If this bug recurs, please attach strace, mtrace, or valgrind
data showing a leak. Thanks.
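
As a lighter-weight complement to those tools, growth over time can also be
recorded by sampling the VmSize and VmRSS lines of /proc/<pid>/status on Linux.
The sketch below assumes that file layout; the function names parse_vm_fields
and sample_memory are hypothetical, and the output does not replace strace,
mtrace, or valgrind data.

```python
import time


def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Lines look like "VmRSS:\t  1234 kB"; keep only the number.
            fields[key] = int(rest.split()[0])
    return fields


def sample_memory(pid, count, interval):
    """Take `count` samples of a process's memory, `interval` seconds apart."""
    samples = []
    for i in range(count):
        with open(f"/proc/{pid}/status") as fh:
            samples.append(parse_vm_fields(fh.read()))
        if i + 1 < count:
            time.sleep(interval)
    return samples
```

A steadily increasing VmRSS across samples, taken with the pid of the running
auditd, would be the kind of evidence worth attaching alongside a trace.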

