Bug 1296189

Summary: RFE: audit default backlog_wait_time setting
Product: [Fedora] Fedora
Reporter: Steve Grubb <sgrubb>
Component: kernel
Assignee: Paul Moore <pmoore>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: rawhide
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab, rbriggs
Doc Type: Bug Fix
Last Closed: 2016-12-02 19:26:05 UTC
Type: Bug

Description Steve Grubb 2016-01-06 14:47:16 UTC
Description of problem:
The default value for backlog_wait_time makes the system unusable.

Version-Release number of selected component (if applicable):
4.2.6

How reproducible:
always

Steps to Reproduce:
Run the following as a script:
auditctl -D
auditctl -b 16440
auditctl -f 0
auditctl --backlog_wait_time 0
auditctl -a always,exit -F arch=x86_64 -S all
sleep 3
service auditd stop

Actual results:
System is unusable for a long time after auditd halts.

Comment 1 Richard Guy Briggs 2016-01-18 17:43:36 UTC
I would be tempted to close this as a duplicate of bz1129013.  If it is stuck, there is a reason why, and shortening the timeout won't fix that.

Comment 2 Paul Moore 2016-01-18 18:27:06 UTC
Okay, I'll buy that reasoning, but since BZ #1129013 is opened for RHEL6, and currently unresolved, let's leave this BZ open.  Once we have resolved BZ #1129013 and merged the fix upstream we can mark this as UPSTREAM.

Comment 3 Steve Grubb 2016-01-18 18:47:20 UTC
One question about bz1129013, none of the loops use audit_pid to see if it's even valid to wait on auditd. Shouldn't they only wait on auditd if there is an auditd? IOW

-        while (audit_backlog_limit
+        while (audit_backlog_limit && audit_pid
               && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserve) {
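
To make the effect of the extra audit_pid test easier to see, here is a minimal userspace sketch of the two loop conditions. It is only a model under stated assumptions, not the kernel code: struct audit_state and both should_wait_* helpers are hypothetical names invented for illustration, and queue_len stands in for skb_queue_len(&audit_skb_queue).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the state the kernel loop looks at.
 * Field names mirror the identifiers in the diff above. */
struct audit_state {
    unsigned int backlog_limit; /* 0 means no backlog limit is enforced */
    int audit_pid;              /* 0 means no auditd is registered */
    unsigned int queue_len;     /* stand-in for skb_queue_len(&audit_skb_queue) */
};

/* Condition before the proposed change: wait whenever the queue is over
 * the limit, whether or not an auditd exists to drain it. */
static bool should_wait_original(const struct audit_state *s, unsigned int reserve)
{
    assert(s != NULL);
    return s->backlog_limit
           && s->queue_len > s->backlog_limit + reserve;
}

/* Condition with the proposed change: only wait if there is an auditd
 * that could actually empty the queue. */
static bool should_wait_proposed(const struct audit_state *s, unsigned int reserve)
{
    assert(s != NULL);
    return s->backlog_limit && s->audit_pid
           && s->queue_len > s->backlog_limit + reserve;
}
```

With backlog_limit = 16440, queue_len above that, and audit_pid = 0 (auditd stopped, as in the reproducer), the original condition still says "wait" while the proposed one does not.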

Comment 4 Richard Guy Briggs 2016-01-21 16:16:21 UTC
(In reply to Steve Grubb from comment #3)
> One question about bz1129013, none of the loops use audit_pid to see if it's
> even valid to wait on auditd. Shouldn't they only wait on auditd if there is
> an auditd? IOW
> 
> -        while (audit_backlog_limit
> +        while (audit_backlog_limit && audit_pid
>                && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserve) {

I attempted to address some of this in the case where it goes away while waiting with:
    [RFC PATCH 6/7] audit: wake up audit_backlog_wait queue when auditd goes away.
    https://www.redhat.com/archives/linux-audit/2015-October/msg00074.html
and
    [RFC PATCH 7/7] audit: wake up kauditd_thread after auditd registers
    https://www.redhat.com/archives/linux-audit/2015-October/msg00074.html
pmoore's reply:
    https://www.redhat.com/archives/linux-audit/2015-November/msg00028.html

I'll think about this one, but that might address another case I hadn't considered...

Comment 5 Richard Guy Briggs 2016-01-21 16:17:42 UTC
(In reply to Richard Guy Briggs from comment #4)
> (In reply to Steve Grubb from comment #3)
> > One question about bz1129013, none of the loops use audit_pid to see if it's
> > even valid to wait on auditd. Shouldn't they only wait on auditd if there is
> > an auditd? IOW
> > 
> > -        while (audit_backlog_limit
> > +        while (audit_backlog_limit && audit_pid
> >                && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserve) {
> 
> I attempted to address some of this in the case where it goes away while
> waiting with:
>     [RFC PATCH 6/7] audit: wake up audit_backlog_wait queue when auditd goes
> away.
>     https://www.redhat.com/archives/linux-audit/2015-October/msg00074.html
> and
>     [RFC PATCH 7/7] audit: wake up kauditd_thread after auditd registers
>     https://www.redhat.com/archives/linux-audit/2015-October/msg00074.html

oops, wrong link, here's the right one:
    https://www.redhat.com/archives/linux-audit/2015-October/msg00075.html

> pmoore's reply:
>     https://www.redhat.com/archives/linux-audit/2015-November/msg00028.html
> 
> I'll think about this one, but that might address another case I hadn't
> considered...

Comment 6 Paul Moore 2016-04-06 23:42:58 UTC
So where do we stand on this at present?

Comment 7 Richard Guy Briggs 2016-06-13 18:12:28 UTC
Still waiting on upstream resolution of bz 1129013.

Comment 8 Paul Moore 2016-12-02 19:26:05 UTC
Closing this out as we have made some substantial changes upstream for Linux v4.10 which should improve the behavior of the backlog under load.

* https://github.com/linux-audit/audit-kernel/issues/23