121731 – (SELINUX)oops in selinux_socket_sock

Status:	CLOSED WORKSFORME
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Version:	2
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Dave Jones

Reported:	2004-04-26 20:46 UTC by Dan Christian
Modified:	2015-01-04 22:05 UTC
CC List:	3 users
Last Closed:	2004-12-07 06:24:30 UTC
Type:	---

Attachments
text of stack trace (from serial console) (2.79 KB, text/plain) 2004-04-26 20:49 UTC, Dan Christian
Stack trace from oops in 2.6.3 (2.83 KB, text/plain) 2004-04-26 21:15 UTC, Dan Christian
Another oops (same top level, but different call trace) (2.94 KB, text/plain) 2004-04-26 22:23 UTC, Dan Christian
Another stack trace (2.87 KB, text/plain) 2004-04-27 16:50 UTC, Dan Christian
yet another stack trace (2.79 KB, text/plain) 2004-04-27 16:52 UTC, Dan Christian
Call trace from 2.6.5-1.327smp (3.05 KB, text/plain) 2004-04-28 23:13 UTC, Dan Christian

Description Dan Christian 2004-04-26 20:46:44 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.2) Gecko/20040301

Description of problem:
The kernel will oops in selinux_socket_sock after a few hours of load.

I'll attach a stack trace in a minute.

I've also reproduced this under 2.6.3-1.110smp (it took over 24 hours).

The hardware is a dual Xeon.  Multiple (identical) boxes act the same way.


Version-Release number of selected component (if applicable):
2.6.5-1.315smp

How reproducible:
Always

Steps to Reproduce:
1.  Boot system
2.  Run lots of mail through sendmail (with a milter filter)
3.  Wait for the oops (2-30 hours)

Additional info:

[support@proofpoint support]$ /sbin/lsmod
Module                  Size  Used by
ipv6                  224128  14
tg3                    71044  0
ipt_REJECT              8704  1
ipt_state               5376  1
ip_conntrack           29612  1 ipt_state
iptable_filter          6016  1
ip_tables              18176  3 ipt_REJECT,ipt_state,iptable_filter
p4_clockmod             8068  0
microcode              10528  0
button                  8472  0
battery                10892  0
asus_acpi              12440  0
ac                      7436  0
ext3                   95912  3
jbd                    58008  1 ext3
megaraid               37576  4
sd_mod                 20224  5
scsi_mod               99912  2 megaraid,sd_mod

Comment 1 Dan Christian 2004-04-26 20:49:01 UTC

Created attachment 99697
text of stack trace (from serial console)

I've reproduced this multiple times under 2.6.3.  The stack traces are very
similar.

Comment 2 Dan Christian 2004-04-26 21:15:12 UTC

Created attachment 99702
Stack trace from oops in 2.6.3

Comment 3 Dan Christian 2004-04-26 22:23:27 UTC

Created attachment 99707
Another oops (same top level, but different call trace)

Comment 4 Dan Christian 2004-04-27 16:50:24 UTC

Created attachment 99717
Another stack trace

Comment 5 Dan Christian 2004-04-27 16:52:53 UTC

Created attachment 99718
yet another stack trace

I'll stop posting stack traces unless someone asks for one.
It certainly seems to be repeatable (given a few hours).

Comment 6 James Morris 2004-04-28 13:50:44 UTC

Could you please post a trace using kernel-smp-2.6.5-1.327.i686.rpm ?

Comment 7 Stephen Smalley 2004-04-28 13:55:23 UTC

Looks similar to the problem reported earlier by akpm.
Unless I misread the code, it looks like isec is null at the point of
the dereferencing of isec->sclass after the sel_netif_lookup, which
implies that the socket inode was freed in the midst of sock_rcv_skb.
Suggestions:
- Add an explicit null test for isec and return with a warning.
- Take the sk_callback_lock around the use of the socket inode?
- Eliminate the need for accessing the socket inode by applying the
patch I sent a while back to allow use of sk security field for INET
sockets.
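
The first suggestion (an explicit null test with a warning) could be sketched roughly as follows. This is a user-space illustration only, not the actual kernel patch: the struct layouts, the PERM_* constants, and the function name recv_perm_for are hypothetical stand-ins, and only the i_security and sclass field names are taken from the code under discussion. A check like this would avoid the null dereference but would presumably not close the underlying race if the inode can be freed concurrently.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins for the kernel structures. Only the
 * field names i_security and sclass come from the code under discussion. */
enum sec_class { SECCLASS_UDP_SOCKET = 1, SECCLASS_TCP_SOCKET = 2 };

struct inode_security {
    enum sec_class sclass;
};

struct inode {
    struct inode_security *i_security;
};

/* Hypothetical permission values, for illustration only. */
enum { PERM_UDP_RECV = 1, PERM_TCP_RECV = 2, PERM_RAW_RECV = 3 };

/* Return the receive permission to check for this socket inode,
 * or -1 (after printing a warning) if the security data is missing. */
int recv_perm_for(const struct inode *inode)
{
    const struct inode_security *isec;

    if (inode == NULL)
        return -1;

    isec = inode->i_security;
    if (isec == NULL) {
        /* Explicit null test: bail out with a warning instead of oopsing. */
        fprintf(stderr, "warning: socket inode has no security data\n");
        return -1;
    }

    switch (isec->sclass) {
    case SECCLASS_UDP_SOCKET:
        return PERM_UDP_RECV;
    case SECCLASS_TCP_SOCKET:
        return PERM_TCP_RECV;
    default:
        return PERM_RAW_RECV;
    }
}
```

In this sketch the caller treats -1 as "skip the permission check and drop or reject the packet"; what the real hook should return in that case is a policy decision not covered here.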

Comment 8 Dan Christian 2004-04-28 16:00:54 UTC

I have yet to see an oops on 326 or 327.  All of them have the vdso=0
boot argument.

I have 4 machines running them now.  It can take days for this to show;
I'll post an oops when I get one.

Comment 9 Dan Christian 2004-04-28 23:13:50 UTC

Created attachment 99760
Call trace from 2.6.5-1.327smp

Comment 10 James Morris 2004-04-29 01:14:34 UTC

Thanks for that, it confirms that the oops is happening at the first
dereference of the inode security field.

        isec = inode->i_security;  <--- here

        switch (isec->sclass) {
        case SECCLASS_UDP_SOCKET:
                netif_perm = NETIF__UDP_RECV;

Comment 11 James Morris 2004-05-20 15:56:50 UTC

A fix has been included in the latest Fedora kernel.  Please let us
know if this works.

Comment 13 Dave Jones 2004-12-07 06:24:30 UTC

6 months with no comment - closing.
