Bug 438629

Summary: multiple concurrent brctl addif cause kernel panic
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.1
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: low
Reporter: Dan Kenigsberg <danken>
Assignee: Neil Horman <nhorman>
QA Contact: Martin Jenner <mjenner>
CC: davem, herbert.xu, tgraf
Target Milestone: rc
Doc Type: Bug Fix
Last Closed: 2008-05-14 18:14:30 UTC
Attachments:
Description                                                    Flags
reproduce kernel panic with multiple concurrent brctl/tunctl   none
tunctl source code                                             none
nasty script makes network unworkable for an hour              none

Description Dan Kenigsberg 2008-03-23 15:37:39 UTC
Description of problem: running multiple tunctl/brctl commands in parallel
causes a kernel panic


Version-Release number of selected component (if applicable):
kernel-2.6.18-53.1.14

How reproducible:
always

Steps to Reproduce:
1. compile tunctl (originally from user-mode-linux utils, source attached)
2. create a bridge called sw0
3. run the attached if-create-add-del script (as root)
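
The attached script is not reproduced in this report. As a rough, hypothetical sketch of the kind of workload described (many workers concurrently creating a tap device, adding it to sw0, removing it, and deleting it), something like the following Python could be used. The worker and iteration counts and the exact command sequence are assumptions, not taken from the attachment; the `<worker>x<iteration>` device naming is inferred from the kernel log in comment 4. It needs root, tunctl, and brctl, and may panic an unpatched kernel.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import List

BRIDGE = "sw0"  # bridge name from step 2


def commands_for(worker: int, iteration: int, bridge: str = BRIDGE) -> List[List[str]]:
    """Return the create/add/remove/delete sequence for one tap device."""
    dev = f"{worker}x{iteration}"  # naming inferred from the log, e.g. "16x0"
    return [
        ["tunctl", "-t", dev],            # create a persistent tap device
        ["brctl", "addif", bridge, dev],  # attach it to the bridge
        ["brctl", "delif", bridge, dev],  # detach it again
        ["tunctl", "-d", dev],            # delete the tap device
    ]


def run_worker(worker: int, iterations: int) -> None:
    """Run the full sequence `iterations` times for one worker."""
    for i in range(iterations):
        for cmd in commands_for(worker, i):
            subprocess.run(cmd, check=False)  # keep going even if a step fails


def stress(workers: int = 30, iterations: int = 2) -> None:
    """Run all workers at once. Requires root; counts are assumed values."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for w in range(workers):
            pool.submit(run_worker, w, iterations)
```

Calling `stress()` starts the workload; nothing runs on import.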
  
Actual results:
After a few seconds of running, the kernel panics (not with the same panic on
each attempt).

Expected results:
The script should finish, and all the interfaces it generated should be down.

Additional info:
A modern Fedora kernel (kernel-2.6.24.3-12) does not have this problem.

Comment 1 Dan Kenigsberg 2008-03-23 15:37:39 UTC
Created attachment 298862
reproduce kernel panic with multiple concurrent brctl/tunctl

Comment 2 Dan Kenigsberg 2008-03-23 15:39:15 UTC
Created attachment 298863
tunctl source code

Comment 3 Prarit Bhargava 2008-05-07 11:32:37 UTC
Dan, can you cut-and-paste the output of the panic here?

Thanks,

P.

Comment 4 Dan Kenigsberg 2008-05-07 12:18:20 UTC
Sure. Note that this specific run is on an unpatched 2.6.18-8 kernel, but the
same thing happens with the 5.1 kernel.

device 0x0 entered promiscuous mode
device 0x0 left promiscuous mode
sw0: port 2(0x0) entering disabled state
device 1x0 entered promiscuous mode
device 16x0 entered promiscuous mode
device 16x0 left promiscuous mode
sw0: port 3(16x0) entering disabled state
device 9x0 entered promiscuous mode
device 9x0 left promiscuous mode
sw0: port 3(9x0) entering disabled state
device 6x0 entered promiscuous mode
device 6x0 left promiscuous mode
sw0: port 3(6x0) entering disabled state
device 11x0 entered promiscuous mode
device 2x0 entered promiscuous mode
device 3x0 entered promiscuous mode
device 4x0 entered promiscuous mode
sw0: port 2(1x0) entering learning state
device 5x0 entered promiscuous mode
device 5x0 left promiscuous mode
sw0: port 7(5x0) entering disabled state
device 1x0 left promiscuous mode
sw0: port 2(1x0) entering disabled state
device 18x0 entered promiscuous mode
device 18x0 left promiscuous mode
sw0: port 2(18x0) entering disabled state
device 25x0 entered promiscuous mode
device 2x0 left promiscuous mode
sw0: port 4(2x0) entering disabled state
device 2x1 entered promiscuous mode
device 4x0 left promiscuous mode
sw0: port 6(4x0) entering disabled state
device 4x1 entered promiscuous mode
device 28x0 entered promiscuous mode
device 4x1 left promiscuous mode
sw0: port 6(4x1) entering disabled state
sw0: port 3(11x0) entering learning state
device 7x0 entered promiscuous mode
device 26x0 entered promiscuous mode
device 27x0 entered promiscuous mode
device 10x0 entered promiscuous mode
sw0: port 7(28x0) entering learning state
device 0x1 entered promiscuous mode
device 17x0 entered promiscuous mode
Unable to handle kernel NULL pointer dereference at 0000000000000009 RIP: 
 [<ffffffff8004b284>] run_workqueue+0x64/0xe5
PGD 0 
Oops: 0002 [1] SMP 
last sysfs file: /class/net/lo/type
CPU 0 
Modules linked in: netconsole tun ksm_mem(U) kvm_intel(U) kvm(U) autofs4 hidp
nfs lockd fscache nfs_acl rfcomm l2cap bluetooth sunrpc bridge ipv6
cpufreq_ondemand video sbs i2c_ec button battery asus_acpi acpi_memhotplug ac
parport_pc lp parport sg i2c_i801 ide_cd i2c_core cdrom serio_raw shpchp e1000
pcspkr dm_snapshot dm_zero dm_mirror dm_mod ata_piix libata sd_mod scsi_mod ext3
jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 8, comm: events/0 Tainted: GF     2.6.18-8.el5 #1
RIP: 0010:[<ffffffff8004b284>]  [<ffffffff8004b284>] run_workqueue+0x64/0xe5
RSP: 0018:ffff810037dabe40  EFLAGS: 00010006
RAX: 000000046474e550 RBX: ffff810071071740 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000296 RDI: ffff810037d09740
RBP: ffff810071071748 R08: ffff810037d09788 R09: ffffffff800617b6
R10: ffff81005784f588 R11: ffff810058780e48 R12: ffff810037d09740
device 17x0 left promiscuous mode
sw0: port 12(17x0) entering disabled state
R13: 0000000000000296 R14: 00000000004056f0 R15: 00000000000056f0
FS:  0000000000000000(0000) GS:ffffffff8038a000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000009 CR3: 0000000059463000 CR4: 00000000000026e0
Process events/0 (pid: 8, threadinfo ffff810037daa000, task ffff810037fef7a0)
Stack:  ffff810037dabe80 ffff810037d09740 ffffffff80047c13 ffff81007fe31d10
 ffff81007fee9720 ffffffff80280001 0000000000000000 ffffffff80047d03
 0000000000000000 ffff810037fef7a0 ffffffff80086c5f 0000000000100100
Call Trace:
 [<ffffffff80047c13>] worker_thread+0x0/0x122
 [<ffffffff80047d03>] worker_thread+0xf0/0x122
 [<ffffffff80086c5f>] default_wake_function+0x0/0xe
 [<ffffffff8003216e>] kthread+0xfe/0x132
 [<ffffffff8005bfe5>] child_rip+0xa/0x11
 [<ffffffff80032070>] kthread+0x0/0x132
 [<ffffffff8005bfdb>] child_rip+0x0/0x11


Code: 48 89 42 08 48 89 10 48 89 6d 08 48 89 6d 00 e8 c0 73 01 00 
RIP  [<ffffffff8004b284>] run_workqueue+0x64/0xe5
 RSP <ffff810037dabe40>
CR2: 0000000000000009
 <0>Kernel panic - not syncing: Fatal exception
 
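As a reading aid for the trace above: on x86, the number after `Oops:` is the page-fault error code, and the first instruction bytes on the `Code:` line (`48 89 42 08`) decode to `mov %rax,0x8(%rdx)`. The Python sketch below decodes the error code and checks that RDX + 8 matches the faulting address in CR2. The function name is hypothetical, and the sketch assumes the `Code:` bytes start at the faulting instruction.

```python
def decode_page_fault_error(code: int) -> dict:
    """Decode the three low bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


info = decode_page_fault_error(0x0002)  # from "Oops: 0002"
rdx = 0x0000000000000001                # RDX in the register dump
fault_addr = rdx + 0x8                  # target of mov %rax,0x8(%rdx)
print(info, hex(fault_addr))
```

Read this way, the oops is a kernel-mode write to a non-present page at address 0x9, which appears consistent with a list unlink in run_workqueue writing through an invalid pointer.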



Comment 5 Neil Horman 2008-05-12 20:21:23 UTC
Could you please test with the latest RHEL 5.2 kernel (-92.el5 is the latest, I
think)? This looks like a duplicate of bz 408791. Thanks!

Comment 6 Dan Kenigsberg 2008-05-12 21:22:10 UTC
Pardon my ignorance, but where can I get one of those latest RHEL 5.2 kernels?

Please note that bug 408791 is not viewable by me, so I cannot judge whether it
is the same. (If it is, maybe this bug, too, has to be embargoed.)

Comment 7 Neil Horman 2008-05-13 11:06:37 UTC
You can get them from the RHN beta channel for RHEL5 server. It's release
84.el5, rather than 92, but it should still have the fix in place. I've cc'd
you on the other bug. It's not sensitive, just inaccessible with the group set
that you're in. You should be able to see it now.

Comment 8 Dan Kenigsberg 2008-05-14 15:36:44 UTC
Created attachment 305372
nasty script makes network unworkable for an hour

With -92.el5 the panic is gone. However, my nasty script makes the server
unresponsive for at least an hour (/var/log/messages attached). This does not
happen on my Fedora 2.6.24.3-12.fc8 kernel.

Comment 9 Dan Kenigsberg 2008-05-14 15:38:02 UTC
toggle needinfo

Comment 10 Neil Horman 2008-05-14 18:14:30 UTC
yeah, fork bombs do that too ;)

Your script effectively creates 1000 processes all trying to manipulate some of
the same data structures.

The patch for the panic you reported works by serializing the removal of your
tun/tap interfaces behind the completion of the port_carrier_check work that the
bridge has to do on the tun/tap interface after it's added to the bridge. The
result is that every one of those 1000 processes has to wait in line for the
bridge interface to process its corresponding carrier check work.
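
The ordering described above can be illustrated with a small userspace analogy in Python. This is not the kernel patch: the queue, the worker thread, and the function names are all hypothetical stand-ins. It only shows how making removal wait for pending deferred work forces callers to queue up behind that work.

```python
import queue
import threading

events = []           # ordered record of what happened
work = queue.Queue()  # stands in for the kernel work queue


def worker() -> None:
    """Single worker: processes deferred carrier checks one at a time."""
    while True:
        port = work.get()
        events.append(f"carrier_check {port}")
        work.task_done()


threading.Thread(target=worker, daemon=True).start()


def add_port(port: str) -> None:
    """Add a port and defer its carrier check to the worker."""
    events.append(f"added {port}")
    work.put(port)


def remove_port(port: str) -> None:
    """Remove a port only after all pending carrier checks have finished."""
    work.join()
    events.append(f"removed {port}")


add_port("tap0")
remove_port("tap0")
print(events)
```

Because `remove_port` blocks on the whole queue, many concurrent callers end up waiting on each other's work, which matches the slowdown reported in comment 8.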

As for F-8 not having this problem, it looks like between the time this patch
went upstream and the time 2.6.24 was released, there were significant changes
made to the bridge code, which among other things took this carrier check
operation out of the blocking path for the operations you are trying to
perform. We could look into moving that code back to RHEL5 if you like, but I
suspect it would be an ABI breaker if we did, and given that the script you
provide here is more of an academic exercise than a practical function, I'd say
it's probably best to just leave this fixed as it is.


*** This bug has been marked as a duplicate of 408791 ***

Comment 11 Dan Kenigsberg 2008-05-14 20:53:43 UTC
Thanks for your help.

FYI, I wrote that script in order to reproduce a real-world error (brctl
aborting without a core dump while other tunctl/brctl commands were running).
Instead, I produced this kernel panic.

This was enough to prove that we cannot trust the bridge code, and must
protect ourselves with a userspace semaphore.
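
For kernels without the fix, userspace serialization of this kind could look like the minimal Python sketch below. It uses an advisory file lock (`fcntl.flock`) in place of a semaphore; the lock path and function names are hypothetical, and every caller that touches the bridge would have to go through the same lock for it to help.

```python
import fcntl
import subprocess
from contextlib import contextmanager

LOCK_PATH = "/var/lock/bridge-config.lock"  # hypothetical path


@contextmanager
def bridge_lock(path: str = LOCK_PATH):
    """Hold an exclusive advisory lock for the duration of the block."""
    with open(path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def add_interface(bridge, dev, run=subprocess.run, lock_path=LOCK_PATH):
    """Run `brctl addif` while holding the lock."""
    with bridge_lock(lock_path):
        return run(["brctl", "addif", bridge, dev], check=False)
```

The `run` parameter exists only so the command can be replaced in a dry run; by default the real `brctl` is invoked.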

Comment 12 Neil Horman 2008-05-15 00:43:16 UTC
no worries.  

I figured that you were trying to reproduce a real-world error; I just didn't
figure that the real-world problem involved actually trying to create and delete
1000 tap interfaces on a bridge at once.

As you can see from the fix, though, that serialization now happens in the
kernel, so you shouldn't need any additional synchronization in userspace.

Regards
Neil