Bug 189626 - LTC23352-Audit netlink deadlock
Summary: LTC23352-Audit netlink deadlock
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: powerpc
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Steve Grubb
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-04-21 18:45 UTC by IBM Bug Proxy
Modified: 2007-11-30 22:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-05-02 13:01:27 UTC
Type: ---
Embargoed:


Attachments
Test driver shell script (95 bytes, text/plain)
2006-04-21 18:47 UTC, IBM Bug Proxy
awk script to grab syscalls from ppc_table.h and insert audit rules (283 bytes, application/octet-stream)
2006-04-21 18:49 UTC, IBM Bug Proxy
Straces of auditctl -l and auditctl -D (165.45 KB, text/plain)
2006-04-21 18:50 UTC, IBM Bug Proxy
Task list from Ctrl-O t (62.87 KB, text/plain)
2006-04-21 18:51 UTC, IBM Bug Proxy

Description IBM Bug Proxy 2006-04-21 18:45:15 UTC
LTC Owner is: bugrobot.com
LTC Originator is: gcwilson.com


Problem description:

auditctl deadlocks on large buffers.  It appears that when the netlink buffer
fills, the sending side is holding a mutex and expects the receiver to receive.
The auditctl receive needs to be audited, but it can't be because the mutex is
held.  Hence the deadlock.
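
The circular wait described above can be illustrated with a small user-space analogy.  This is only a sketch, not kernel code: the names audit_mutex, netlink_buf and simulate are hypothetical, a bounded queue stands in for the netlink buffer, and timeouts are used so the example reports the stall instead of hanging.

```python
import queue
import threading


def simulate(buffer_size=2, messages=5, timeout=0.2):
    """Model the suspected stall; return (sender_stalled, items_received)."""
    audit_mutex = threading.Lock()                  # stand-in for the mutex held while sending
    netlink_buf = queue.Queue(maxsize=buffer_size)  # stand-in for the netlink buffer
    holding = threading.Event()
    sender_done = threading.Event()
    stalled = threading.Event()
    drained = []

    def sender():
        # The sender keeps the mutex for the whole transfer.
        with audit_mutex:
            holding.set()
            for i in range(messages):
                try:
                    netlink_buf.put(i, timeout=timeout)
                except queue.Full:
                    # Buffer is full and nobody can drain it: the modelled deadlock.
                    stalled.set()
                    break
        sender_done.set()

    def receiver():
        # Every receive must take the same mutex first (the "receive needs
        # to be audited" step in the report).
        while not sender_done.is_set() or not netlink_buf.empty():
            if audit_mutex.acquire(timeout=timeout):
                try:
                    drained.append(netlink_buf.get_nowait())
                except queue.Empty:
                    pass
                finally:
                    audit_mutex.release()

    t_send = threading.Thread(target=sender)
    t_recv = threading.Thread(target=receiver)
    t_send.start()
    holding.wait()   # make sure the sender owns the mutex before receiving starts
    t_recv.start()
    t_send.join()
    t_recv.join()
    return stalled.is_set(), len(drained)


if __name__ == "__main__":
    print("large transfer:", simulate(buffer_size=2, messages=5))
    print("small transfer:", simulate(buffer_size=8, messages=5))
```

In this model the stall only shows up when the transfer is larger than the buffer, which is consistent with the report that the hang needs a large rule list.  A real kernel has no timeout on the blocked side, so the wait would never end.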

If this is a customer issue, please indicate the impact to the customer:


If this is not an installation problem,
       Describe any custom patches installed.

       Using FC5 + latest updates + 2.6.17-rc1-mm3 + audit 1.2.1

       Provide output from "uname -a", if possible:

       Linux jawbreaker.ltc.austin.ibm.com 2.6.17-rc1-mm3 #1 SMP Tue Apr 18 13:23:40 CDT 2006 ppc64 ppc64 ppc64 GNU/Linux

Hardware Environment
    Machine type (p650, x235, SF2, etc.): HV4 LPAR
    Cpu type (Power4, Power5, IA-64, etc.): Power5
    Describe any special hardware you think might be relevant to this problem:


Is this reproducible?
    If so, how long does it (did it) take to reproduce it?

    1 min.

    Describe the steps:

    Please see attached scripts.  Basically, add entry and exit "never" rules
for all syscalls, then run auditctl -l followed by auditctl -D.  Now you're deadlocked.
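
The steps above can be sketched as a command generator.  This is not the attached awk or shell script (their contents are not shown here); the function name and the sample syscall list are hypothetical, and the "-a entry,never -S <syscall>" rule form is an assumption about how the rules were added.  The sketch only builds the command lines and does not execute them.

```python
def reproduction_commands(syscalls):
    """Build the auditctl command lines for the reproduction steps."""
    commands = []
    for name in syscalls:
        # One "never" rule on the entry list and one on the exit list per syscall.
        commands.append(["auditctl", "-a", "entry,never", "-S", name])
        commands.append(["auditctl", "-a", "exit,never", "-S", name])
    commands.append(["auditctl", "-l"])  # list the rules (large reply over netlink)
    commands.append(["auditctl", "-D"])  # delete all rules
    return commands


if __name__ == "__main__":
    # Hypothetical sample; the report uses every syscall from ppc_table.h.
    for cmd in reproduction_commands(["open", "close", "read"]):
        print(" ".join(cmd))
```

Running the printed commands as root on an affected kernel would be expected to reproduce the hang, so they should only be tried on a test machine.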

    If not, describe how the bug was encountered:


Is the system (not just the application) hung?

    Yes, for some value of hung.  Kernel is still alive.  However, trusted
programs, such as login, are hung.

    If so, describe how you determined this:

    Can still use Ctrl-o (Alt-SysRq) t, u, and b on the console.  Existing open ssh
sessions still allow some commands to be executed.  auditctl -l hangs.


Did the system produce an OOPS message on the console?

    No.  However, please see attached Ctrl-o t output.


Is the system sitting in a debugger right now?

    Can be at any time.

Comment 1 IBM Bug Proxy 2006-04-21 18:47:42 UTC
Created attachment 128097 [details]
Test driver shell script

Comment 2 IBM Bug Proxy 2006-04-21 18:49:03 UTC
Created attachment 128098 [details]
awk script to grab syscalls from ppc_table.h and insert audit rules

Sorry, this is an ugly single-line thing.  It's just a saved shell command line.

Comment 3 IBM Bug Proxy 2006-04-21 18:50:41 UTC
Created attachment 128099 [details]
Straces of auditctl -l and auditctl -D

Comment 4 IBM Bug Proxy 2006-04-21 18:51:51 UTC
Created attachment 128100 [details]
Task list from Ctrl-O t

Comment 5 Dave Jones 2006-04-25 03:38:34 UTC
Why are you filing -mm bugs here? They belong in http://bugme.osdl.org/


Comment 6 IBM Bug Proxy 2006-04-25 18:59:18 UTC
----- Additional Comments From gcwilson.com  2006-04-25 15:01 EDT -------
Please do not redisposition this bug.  Steve Grubb explicitly requested that it
be opened against Fedora Development.  I asked him beforehand.  I also cc'd him
on it and have it in our project status.  Irina Boverman is also aware of it.
Contact Steve or Irina if you have questions.

Quoting from the redhat-lspp list:

> On Friday 21 April 2006 10:35, George Wilson wrote:
> > Should it just be against rawhide?
>
> Yes. Fedora Core/devel. Let me know the bugzilla number.
>
> -Steve 

Comment 7 IBM Bug Proxy 2006-04-25 19:13:24 UTC
----- Additional Comments From gcwilson.com  2006-04-25 15:16 EDT -------
Steve Grubb is reopening this on the Red Hat side.  Thanks, Steve. 

Comment 8 Steve Grubb 2006-05-02 13:01:27 UTC
The kernel piece has been released in the lspp.20 kernel. Closing this bug.

Comment 9 IBM Bug Proxy 2006-05-03 22:23:52 UTC
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|FIXEDAWAITINGTEST           |TESTED




------- Additional Comments From gcwilson.com  2006-05-03 18:26 EDT -------
Deadlock appears to be fixed in the lspp.21 kernel. 

