465876 – NMI Watchdog detected LOCKUP in :sctp:sctp_copy_local_addr_list

Bug 465876 - NMI Watchdog detected LOCKUP in :sctp:sctp_copy_local_addr_list

Summary: NMI Watchdog detected LOCKUP in :sctp:sctp_copy_local_addr_list

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.3
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Neil Horman
QA Contact:	Dayong Tian
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2008-10-06 21:49 UTC by Nate Straz
Modified:	2011-07-21 09:44 UTC (History)
CC List:	12 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2011-07-21 09:44:47 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
patch to balance locking (707 bytes, patch) 2008-10-07 14:32 UTC, Neil Horman	no flags	Details \| Diff
patch to balance locking (751 bytes, patch) 2010-07-08 19:39 UTC, Neil Horman	no flags	Details \| Diff
new patch to balance locking and make sure sctp_local_addr_list is initalized (1.69 KB, patch) 2010-07-15 18:09 UTC, Neil Horman	no flags	Details \| Diff
Show Obsolete (1) View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2011:1065	0	normal	SHIPPED_LIVE	Important: Red Hat Enterprise Linux 5.7 kernel security and bug fix update	2011-07-21 09:21:37 UTC

Description Nate Straz 2008-10-06 21:49:12 UTC

Description of problem:

While running a test which uses SCTP for communications I randomly hit this panic.

NMI Watchdog detected LOCKUP on CPU 0
CPU 0 
Modules linked in: sctp ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack 
nfnetlink ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge autofs4 
hidp rfcomm l2cap bluetooth lock_dlm gfs2 dlm configfs sunrpc ipv6 xfrm_nalgo cr
ypto_api dm_multipath scsi_dh video backlight sbs i2c_ec button battery asus_acp
i acpi_memhotplug ac parport_pc lp parport sg i2c_i801 i3000_edac edac_mc i2c_co
re ide_cd pcspkr shpchp qla2xxx cdrom scsi_transport_fc tg3 libphy serio_raw dm_
snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jb
d uhci_hcd ohci_hcd ehci_hcd
Pid: 9241, comm: d_doio Not tainted 2.6.18-116.gfs2abhi.001 #1
RIP: 0010:[<ffffffff80064cc3>]  [<ffffffff80064cc3>] .text.lock.spinlock+0x11/0x
30
RSP: 0018:ffff81002b657d70  EFLAGS: 00000086
RAX: 0000000000000296 RBX: ffff81002b3ea880 RCX: 0000000000000006
RDX: 00000000000000d0 RSI: 0000000000000001 RDI: ffffffff886ccba8
RBP: ffff81002db984a8 R08: 0000000000000006 R09: ffffffff886b3720
R10: ffff81002b657ce8 R11: ffffffff886942b9 R12: 0000000000000000
R13: 0000000000000006 R14: ffff81002db984b0 R15: 0000000000000001
FS:  00002aff64992020(0000) GS:ffffffff803b8000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aff64782000 CR3: 000000002ac64000 CR4: 00000000000006e0
Process d_doio (pid: 9241, threadinfo ffff81002b656000, task ffff81002b247100)
Stack:  ffffffff88694804 ffff81002af820a8 ffff81002b3ea880 ffff81002db984a8
 ffff81002af820a8 0000000000000000 ffff81002db984b0 0000000000000001
 ffffffff8869ccde 000000d000000006 ffff81002b657ed8 ffff81002af81b80
Call Trace:
 [<ffffffff88694804>] :sctp:sctp_copy_local_addr_list+0x27/0xb6
 [<ffffffff8869ccde>] :sctp:sctp_bind_addr_copy+0x44/0xbe
 [<ffffffff8869d5e0>] :sctp:__sctp_connect+0x1af/0x3b1
 [<ffffffff886a0938>] :sctp:sctp_get_port+0x35/0x47
 [<ffffffff8869d817>] :sctp:sctp_connect+0x35/0x4d
 [<ffffffff8021a674>] sys_connect+0x7e/0xae
 [<ffffffff800b54ca>] audit_syscall_entry+0x16e/0x1a1
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 7e f9 e9 f9 fe ff ff f3 90 83 3f 00 7e f9 e9 f8 fe ff ff f3 
Kernel panic - not syncing: nmi watchdog

Version-Release number of selected component (if applicable):
kernel-2.6.18-116.gfs2abhi.001 (-116 w/ a few GFS2 related patches)

How reproducible:
Not easily.  Once a week or so.
  
Actual results:
See panic above

Expected results:
The CPU shouldn't lock up.

Additional info:

Comment 1 Steve Whitehouse 2008-10-07 09:59:12 UTC

Interesting.

Looks like a lot has happened since RHEL forked from upstream. There seems to be two patches relating to this area of code which might be important....

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=29c7cf96186ac14ce7380633f690fc39732ff03a#patch3
(which, btw looks as if it introduces a new bug, and)
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=293035479942400a7fe8e4f72465d4e4e466b91a
which seems to be the fix for the initial patch.

The code looks rather crufty to me, so what the exact cause of the crash is I'm not sure, but certainly the use of irqsave locks around that bit if code is wrong. Also I notice that the GFP parameter of sctp_copy_local_addr_list is unused.

The fact that sctp_add_bind_addr() is called with GFP_ATOMIC means that its quite likely to fail. That seems odd since its being called from process context, but is no doubt due to the erroneous irqsave lock.

In short the whole thing is a mess.

Comment 2 Steve Whitehouse 2008-10-07 10:06:41 UTC

It gets worse...... (from socket.c)

        sctp_spin_lock_irqsave(&sctp_local_addr_lock, flags);
        list_for_each(pos, &sctp_local_addr_list) {
                addr = list_entry(pos, struct sctp_sockaddr_entry, list);
                if ((PF_INET == sk->sk_family) && 
                    (AF_INET6 == addr->a.sa.sa_family))
                        continue;
                memcpy(&temp, &addr->a, sizeof(temp));
                sctp_get_pf_specific(sk->sk_family)->addr_v4map(sctp_sk(sk),
                                                                &temp);
                addrlen = sctp_get_af_specific(temp.sa.sa_family)->sockaddr_len;
                if(space_left<addrlen)
                        return -ENOMEM;

The above seems to return still holding the lock.

Comment 3 Neil Horman 2008-10-07 14:31:07 UTC

yeah, its this patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=293035479942400a7fe8e4f72465d4e4e466b91a

Like you've noted, theres been a good deal of divergence between upstream and RHEL.  As it stands we actually have part of this patch in place, but not all of it (since we don't currently receive netdev notifications in RHEL5 for sctp at the moment).

Easiest solution is to apply the portion of the patch we need to avoid  throwing the lock out of balance.  I'll attach a patch in a sec.

Comment 4 Neil Horman 2008-10-07 14:32:45 UTC

Created attachment 319644 [details]
patch to balance locking

here you go.  Please test this out and confirm that it solves the problem if you would.  I'll post it as soon as I get confirmation

Comment 5 RHEL Program Management 2008-10-07 14:59:38 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 Nate Straz 2008-11-02 13:07:02 UTC

I started hitting this again so I had Abhi build a kernel with the patch from comment #4 in it.  I was able to hit the lockup again with that kernel.

morph-04 login: BUG: soft lockup - CPU#0 stuck for 10s! [d_doio:3561]                                                                                           Pid: 3561, comm:               d_doio
EIP: 0060:[<c0613c26>] CPU: 0
EIP is at _spin_lock_irqsave+0x16/0x27
 EFLAGS: 00000282    Not tainted  (2.6.18-120.gfs2abhi.003 #1)
EAX: f8ee99e8 EBX: f55a6880 ECX: 000000d0 EDX: 00000296
ESI: f42cb5dc EDI: 00000000 EBP: 00000000 DS: 007b ES: 007b
CR0: 8005003b CR2: b7f2a000 CR3: 33ee8000 CR4: 000006d0
 [<f8ecace4>] sctp_copy_local_addr_list+0x1a/0x9a [sctp]
 [<f8ed236a>] sctp_bind_addr_copy+0x30/0x9f [sctp]
 [<f8ecb5f7>] sctp_assoc_set_bind_addr_from_ep+0x53/0x58 [sctp]
 [<f8ed2b96>] __sctp_connect+0x1ad/0x39d [sctp]
 [<c0613c4f>] _spin_lock_bh+0x8/0x18
 [<c05b19ce>] lock_sock+0x8e/0x96
 [<f8ed5cd7>] sctp_get_port+0x2c/0x3b [sctp]
 [<f8ed2db0>] sctp_connect+0x2a/0x40 [sctp]
 [<c05f3bd1>] inet_dgram_connect+0x45/0x4e
 [<c05b04e8>] sys_connect+0x7d/0xa9
 [<c0495751>] inotify_d_instantiate+0x3c/0x5f
 [<c0613c4f>] _spin_lock_bh+0x8/0x18
 [<c05b19ce>] lock_sock+0x8e/0x96
 [<c05b19ce>] lock_sock+0x8e/0x96
 [<c0613c4f>] _spin_lock_bh+0x8/0x18
 [<c04880f3>] destroy_inode+0x36/0x45
 [<c05b0cca>] sys_socketcall+0x8c/0x19e
 [<c0407f46>] do_syscall_trace+0xab/0xb1
 [<c0404f17>] syscall_call+0x7/0xb
 =======================

It does take quite a while to reproduce this, but it is possible.

Comment 7 Neil Horman 2008-11-06 01:43:20 UTC

Can you provide me with a link to the kernel binary that you built that produced this hang?  A visual inspection of the call sites on this lock (there are only about 15 of them) are all balanced, so I'm concerned that perhaps the patch inadvertently didn't get built in to the kernel in question.  Also, if this happens again, if you could forcibly crash the system and collect a vmcore, that would be helpful.  Thanks!

Comment 8 Neil Horman 2008-11-06 01:49:46 UTC

Nate, this is marked as a 5.3 exception, so if I can get my hands on that kernel asap, that would be great.  Please get me a pointer to the kernel source and/or binary, I'll prioritize this and get it straightened out.  Thanks!

Comment 10 Nate Straz 2008-11-06 13:55:02 UTC

I've only been able to hit this once since I started testing this kernel and that was 6 days ago.  I confirmed in the console logs that it was this kernel which was running before the soft lockup hit.

Comment 12 Neil Horman 2008-11-06 15:24:51 UTC

Ok, so I've been looking over this all morning, and I can't find for the life
of me where theres an imbalance in this locking (and I've verified that the
above patch is in place in your test kernel).   I'm starting to wonder if we're
not scribbling over this spinlock during one of the many memcopies in this code
path.  Is it possible for you to reproduce this and capture a vmcore?

Comment 15 RHEL Program Management 2009-02-16 15:40:38 UTC

Updating PM score.

Comment 20 Neil Horman 2010-07-08 18:41:33 UTC

Note to self: using hp-xw9300-03.lab.bos.redhat.com too look at these cores

Comment 21 Neil Horman 2010-07-08 19:39:43 UTC

Created attachment 430451 [details]
patch to balance locking

Hey, this is a slight modification on the patch that I did for this bug before.  I've looked at it again, and it defiantely fixes a locking imbalance.  The previous test results from the last person to hit this bug were inconclusive in my opinion.  Would you please try this patch and let me know the result.  Thanks!

Comment 22 Jaroslav Kortus 2010-07-13 13:18:27 UTC

The patch still does not fully solve the problem. I was able to reproduce the crash again with the patch included.

Comment 24 Neil Horman 2010-07-13 15:18:33 UTC

Please remember to clear the needinfo field when you update a bz with relevant data

Comment 25 Neil Horman 2010-07-13 19:09:26 UTC

so, this latest core doesn't appear to be the result of a deadlock, its an actual crash.  But thats just based on the txt output you have.  Do you have nn ia64 system somewhere where I can take a closer look at this core?

Comment 26 Neil Horman 2010-07-15 18:09:09 UTC

So, I'm looking at the core, and the local_addr_list is all 0's, but nothing else in the sctp_globals struct is.  that suggests to me that someone has corrupted an entry in the list outside of our lock protection.  It also suggests that perhaps this problem might be occuring before that part of the globals struct is initalized.  The log in the crash seems to suggest the same.  It appears this crash happens during or near boot, and the local_addr_list is one of the last things to get initalized in sctp_init.  Its possible we start doing stuff in user space before teh sctp module is fully initalized and we trip over an uniitalized list.  I've got another patch for you to try.  This one moves the list init very early in sctp_init, prior to making the protocol available to user space.  Please let me know if this fixes the issue.

Comment 27 Neil Horman 2010-07-15 18:09:44 UTC

Created attachment 432179 [details]
new patch to balance locking and make sure sctp_local_addr_list is initalized

Comment 28 Neil Horman 2010-10-25 10:57:37 UTC

ping any results here?

Comment 29 Jaroslav Kortus 2010-11-08 23:43:18 UTC

I was not able to reproduce it with this patch so far so it seems to work.

Comment 30 Neil Horman 2010-11-29 12:00:42 UTC

Note to self, this is a backport of upstream commits:
270637abff0cdf848b910b9f96ad342e1da61c66
the unlock code in copy_laddrs was done upstream by some rcu conversion and isn't really suitable for backport

Comment 31 Neil Horman 2010-11-29 21:34:31 UTC

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2921310

Test build with full backport

Comment 32 Neil Horman 2010-11-30 17:07:47 UTC

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2923578

Fixed test build

Comment 33 RHEL Program Management 2011-02-01 16:54:22 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 38 Jarod Wilson 2011-03-10 16:41:42 UTC

in kernel-2.6.18-247.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 42 Chao Ye 2011-06-16 05:55:41 UTC

Confirm patch in kernel git tree.

Comment 46 errata-xmlrpc 2011-07-21 09:44:47 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html

Note You need to log in before you can comment on or make changes to this bug.