Bug 193643 - Audit system blocks, preventing associated services to work
Summary: Audit system blocks, preventing associated services to work
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: laus
Version: 3.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jason Vas Dias
QA Contact: Jay Turner
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2006-05-31 11:13 UTC by Frode Nordahl
Modified: 2015-01-08 00:12 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-05-31 15:23:07 UTC
Target Upstream Version:
Embargoed:



Description Frode Nordahl 2006-05-31 11:13:36 UTC
Description of problem:
Recently one of our servers has started to show erratic behaviour. After 24-48 hrs, services connected 
to the audit subsystem stop working. We cannot log in via ssh, crond stops performing its tasks, etc.

I was lucky to have an already logged-in root shell on the console and checked what sshd and crond were up 
to: they hang waiting for I/O on /dev/audit.

If I stop auditd everything starts working again.

I have run auditd under strace, and it just hangs forever on a read from /dev/audit.

I have turned on audit debugging (dev.audit.debug=1) and it says the following when I try to log in via 
SSH (there are probably some other processes involved in this output):
Audit daemon registered (process 18620)
auditf_ioctl: done, result=0
auditf_read: called.
auditf_open: opened by pid 18627
auditf_ioctl: ctx=e2298bc0, cmd=0x801c406f
auditf_ioctl: ctx=c4808480, cmd=0x4065
auditf_ioctl: ctx=c4808480, cmd=0x801c406f
auditf_release: called.
auditf_release: Audit daemon closed audit file; auditing disabled
audit_resume: process 18620 resumes auditing
auditf_ioctl: done, result=-19
auditf_release: called.
auditf_ioctl: done, result=-19
auditf_ioctl: done, result=-19
auditf_ioctl: ctx=c4808480, cmd=0x801c406f
auditf_ioctl: done, result=-19
auditf_ioctl: ctx=c4808480, cmd=0x801c406f
auditf_ioctl: done, result=-19
auditf_ioctl: ctx=c4808480, cmd=0x4066
audit_detach: detaching process 18719
auditf_ioctl: done, result=-49
auditf_ioctl: ctx=c4808480, cmd=0x4065
auditf_ioctl: done, result=-19



It seems to me that stopping auditd also stops the audit system in the kernel, so I think the bug is in 
the kernel part of the audit system.

Version-Release number of selected component (if applicable):
kernel-2.4.21-40.EL
laus-0.1-70RHEL3

How reproducible:
Unknown

Comment 1 Jason Vas Dias 2006-05-31 15:23:07 UTC
The problem could be occurring because auditd is finding that the amount of
free space on the filesystem containing /var/log/audit.d/ is falling below
the threshold specified in /etc/audit/audit.conf:
   notify          = "/usr/sbin/audbin -S /var/log/audit.d/save.%u -C -T 20%";
and it is hence unable to rotate the /var/log/audit.d/bin* audit log files.
When auditd finds that free space has fallen below the -T threshold, it puts the 
system into 'suspend mode' until the free space is equal to or greater than
the threshold. Entering suspend mode is the default action when free disk
space is below the -T threshold, as configured in /etc/audit/audit.conf:
         error {
                action {
                        type = suspend;
                };
         };
See the man-pages for auditd(8), audbin(1), and audit.conf(5).
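
The free-space condition described above can also be checked by hand. The following is a minimal sketch, assuming the -T value is compared against the percentage of free space on the filesystem holding the log directory. The function names are illustrative and not part of laus, audbin's exact calculation may differ, and a Python 3 interpreter is assumed (a stock RHEL 3 system does not ship one).

```python
import shutil


def free_percent(path):
    """Return the free space on the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def below_threshold(path, threshold_percent=20.0):
    """Return True if free space is below the threshold.

    The default of 20.0 mirrors the '-T 20%' setting quoted above.
    """
    return free_percent(path) < threshold_percent
```

Calling below_threshold("/var/log/audit.d") on the affected server would indicate whether the suspend condition currently applies, under the assumption stated above.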

Do you see messages in /var/log/messages saying that audit is entering suspend mode? Check with:
# egrep 'audbin|suspend' /var/log/messages
If so, then free space on the filesystem containing /var/log/audit.d/ falling
below the threshold is the problem.

If you do not require auditing, turn it off:
# chkconfig --level=123456 audit off ; reboot
Nothing else depends on audit being enabled, and having it off is the default for 
a clean RHEL-3 install post-U5.

If you want to retain auditing, then you need to set up a mechanism to purge
old rotated log files - see the '-T' and '-N' options in man audbin(1) - 
e.g. to remove the oldest log files, set this in /etc/audit/audit.conf:
   notify = "/usr/sbin/audbin -S /var/log/audit.d/save.%u -C -T 20% -N 'rm -f %f'";
or to move them to a different partition:
   notify = "/usr/sbin/audbin -S /var/log/audit.d/save.%u -C -T 20% -N 'mv -f %f /another_partition/'";
or to process them with a script that then removes them:
   notify = "/usr/sbin/audbin -S /var/log/audit.d/save.%u -C -T 20% -N '/bin/my_audit_log_rotation_script %f'";
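
The last variant hands each rotated file to a script. A minimal sketch of what such a script might do follows, assuming audbin substitutes %f with the path of the saved log file, as the examples above suggest. The archive directory and the function names are hypothetical, and a Python 3 interpreter is assumed.

```python
import gzip
import os
import shutil

ARCHIVE_DIR = "/another_partition/audit-archive"  # hypothetical destination


def archive_and_remove(log_path, archive_dir=ARCHIVE_DIR):
    """Compress log_path into archive_dir, then delete the original.

    The original is removed only after the compressed copy has been written,
    so a failure while copying leaves the audit data in place.
    """
    os.makedirs(archive_dir, exist_ok=True)
    target = os.path.join(archive_dir, os.path.basename(log_path) + ".gz")
    with open(log_path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(log_path)
    return target


def main(argv):
    """Entry point: expects the log file path as the single argument."""
    if len(argv) != 2:
        return 2  # usage error
    archive_and_remove(argv[1])
    return 0
```

To use it as the -N command, the script would call main(sys.argv) under the usual __main__ guard. The archive directory should live on a different partition than /var/log/audit.d/, otherwise archiving does not raise the free space that the -T threshold checks.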
 
If you do not see any 'audbin|suspend' messages in /var/log/messages, and
the machine is still suspending, or if putting a log rotation mechanism in
place does not fix the problem, then please re-open this bug and I'll 
investigate further - thanks.

Comment 2 Frode Nordahl 2006-06-02 09:20:27 UTC
Thank you for your thorough response!

I am a bit surprised though that the default configuration of RedHat Linux is to make sure the Operator 
cannot operate the system as soon as it needs Operator attention.

