Bug 189626

Summary: LTC23352-Audit netlink deadlock
Product: [Fedora] Fedora
Reporter: IBM Bug Proxy <bugproxy>
Component: kernel
Assignee: Steve Grubb <sgrubb>
Status: CLOSED NEXTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 5
CC: sgrubb, wtogami
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: powerpc
OS: Linux
Doc Type: Bug Fix
Last Closed: 2006-05-02 13:01:27 UTC
Attachments:

Description                                                          |Flags
----------------------------------------------------------------------------
Test driver shell script                                             |none
awk script to grab syscalls from ppc_table.h and insert audit rules  |none
Straces of auditctl -l and auditctl -D                               |none
Task list from Ctrl-O t                                              |none

Description IBM Bug Proxy 2006-04-21 18:45:15 UTC
LTC Owner is: bugrobot.com
LTC Originator is: gcwilson.com


Problem description:

Auditctl deadlocks on large buffers.  It appears that when the netlink buffer
fills, the sender is holding a mutex and expects the receiver to receive.  The
auditctl receive itself needs to be audited, but it can't be because the mutex
is still held.  Hence it deadlocks.

If this is a customer issue, please indicate the impact to the customer:


If this is not an installation problem,
       Describe any custom patches installed.

       Using FC5 + latest updates + 2.6.17-rc1-mm3 + audit 1.2.1

       Provide output from "uname -a", if possible:

       Linux jawbreaker.ltc.austin.ibm.com 2.6.17-rc1-mm3 #1 SMP Tue Apr 18 13:23:40 CDT 2006 ppc64 ppc64 ppc64 GNU/Linux

Hardware Environment
    Machine type (p650, x235, SF2, etc.): HV4 LPAR
    Cpu type (Power4, Power5, IA-64, etc.): Power5
    Describe any special hardware you think might be relevant to this problem:


Is this reproducible?
    If so, how long does it (did it) take to reproduce it?

    1 min.

    Describe the steps:

    Please see attached scripts.  Basically, add entry and exit "never" rules
for all syscalls, then run auditctl -l and auditctl -D.  Now you're deadlocked.
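
    The steps above can be sketched roughly as follows.  This is not the
attached test driver, only a minimal illustration.  It assumes the
"auditctl -a list,action -S syscall" rule syntax, and the helper names (run,
add_never_rules, reproduce) and the RUN variable are made up for this sketch.
By default it only prints the commands; nothing is executed unless RUN=1.

```shell
# Dry-run helper: print the command unless RUN=1 is set.
# RUN is a hypothetical switch for this sketch, not an auditctl feature.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Add one entry,never and one exit,never rule per syscall name given.
add_never_rules() {
    for sc in "$@"; do
        run auditctl -a entry,never -S "$sc"
        run auditctl -a exit,never -S "$sc"
    done
}

# Sequence from the report: load the rules, list them, then delete them.
reproduce() {
    add_never_rules "$@"
    run auditctl -l
    run auditctl -D
}
```

    For example, "reproduce open close" prints four rule additions followed
by "auditctl -l" and "auditctl -D".  The report used every syscall in the
table, so a real run would pass the full list of names, as root, on a machine
that can be rebooted.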

    If not, describe how the bug was encountered:


Is the system (not just the application) hung?

    Yes, for some value of hung.  Kernel is still alive.  However, trusted
programs, such as login, are hung.

    If so, describe how you determined this:

    Can still Ctrl-o (Alt-SysRq) t, u, and b on console.  Existing open ssh
sessions still allow some commands to be executed.  Auditctl -l hangs.


Did the system produce an OOPS message on the console?

    No.  However, please see attached Ctrl-o t output.


Is the system sitting in a debugger right now?

    Can be at any time.

Comment 1 IBM Bug Proxy 2006-04-21 18:47:42 UTC
Created attachment 128097 [details]
Test driver shell script

Comment 2 IBM Bug Proxy 2006-04-21 18:49:03 UTC
Created attachment 128098 [details]
awk script to grab syscalls from ppc_table.h and insert audit rules

Sorry this is a single line ugly thing.  It's just a saved shell command line.

Comment 3 IBM Bug Proxy 2006-04-21 18:50:41 UTC
Created attachment 128099 [details]
Straces of auditctl -l and auditctl -D

Comment 4 IBM Bug Proxy 2006-04-21 18:51:51 UTC
Created attachment 128100 [details]
Task list from Ctrl-O t

Comment 5 Dave Jones 2006-04-25 03:38:34 UTC
Why are you filing -mm bugs here? They belong in http://bugme.osdl.org/


Comment 6 IBM Bug Proxy 2006-04-25 18:59:18 UTC
----- Additional Comments From gcwilson.com  2006-04-25 15:01 EDT -------
Please do not redisposition this bug.  Steve Grubb explicitly requested that it
be opened against Fedora Development.  I asked him beforehand.  I also cc'd him
on it and have it in our project status.  Irina Boverman is also aware of it.
Contact Steve or Irina if you have questions.

Quoting from the redhat-lspp list:

> On Friday 21 April 2006 10:35, George Wilson wrote:
> > Should it just be against rawhide?
>
> Yes. Fedora Core/devel. Let me know the bugzilla number.
>
> -Steve 

Comment 7 IBM Bug Proxy 2006-04-25 19:13:24 UTC
----- Additional Comments From gcwilson.com  2006-04-25 15:16 EDT -------
Steve Grubb is reopening this on the Red Hat side.  Thanks, Steve. 

Comment 8 Steve Grubb 2006-05-02 13:01:27 UTC
The kernel piece has been released in lspp.20 kernel. Closing this bug.

Comment 9 IBM Bug Proxy 2006-05-03 22:23:52 UTC
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|FIXEDAWAITINGTEST           |TESTED




------- Additional Comments From gcwilson.com  2006-05-03 18:26 EDT -------
Deadlock appears to be fixed in the lspp.21 kernel.