Bug 476897 - kernel panics when attempting to rmmod the bnx2 module while it is in use.
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 5.4
Assigned To: Andy Gospodarek
QA Contact: Red Hat Kernel QE team
Depends On: 475567
Blocks: RHEL5u3_relnotes 458757 483701 483784 485920 502021
Reported: 2008-12-17 15:17 EST by Mike Gahagan
Modified: 2014-06-29 19:00 EDT
Doc Type: Bug Fix
Doc Text:
If jumbo frames are enabled on your system, a kernel panic will occur if you attempt to unload the bnx2 module.
Last Closed: 2009-09-02 04:14:22 EDT


Attachments:
bnx2 patch (598 bytes, patch), 2008-12-17 21:17 EST, Michael Chan

Description Mike Gahagan 2008-12-17 15:17:37 EST
Description of problem:
From problem reported during verification of bz 470625
https://bugzilla.redhat.com/show_bug.cgi?id=470625#c18


We found an issue while under Xen. A kernel panic will occur if the following
conditions are met:

1) Jumbo frames are enabled on both ports of a 5709C
2) Both interfaces are brought up.
3) Driver is unloaded from memory (rmmod)
   Kernel panic will occur
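
The conditions above can be written down as a command sequence. What follows is a minimal dry-run sketch in Python, assuming the tester supplies the interface names (eth4 is the only name that appears in this report; any second port name is hypothetical). It only builds the argument lists and never executes them, because on an affected kernel the final rmmod is expected to panic the machine. The `ifconfig ... mtu 9000 up` and `rmmod bnx2` forms are taken from the console session in Comment 3; the function names are illustrative only.

```python
from typing import List

JUMBO_MTU = 9000  # MTU used in the Comment 3 console session
DRIVER = "bnx2"


def repro_commands(interfaces: List[str], mtu: int = JUMBO_MTU) -> List[List[str]]:
    """Build (but do not run) the commands matching the reported conditions."""
    if not interfaces:
        raise ValueError("at least one bnx2 interface name is required")
    commands = []
    for iface in interfaces:
        # Conditions 1 and 2: enable jumbo frames and bring the port up.
        commands.append(["ifconfig", iface, "mtu", str(mtu), "up"])
    # Condition 3: unload the driver while the interfaces are still up.
    commands.append(["rmmod", DRIVER])
    return commands


def format_commands(commands: List[List[str]]) -> str:
    """Render the argument lists as one shell-style line per command."""
    return "\n".join(" ".join(argv) for argv in commands)
```

Keeping this as a dry run is deliberate: the output can be reviewed, or pasted into a serial console session that is being logged, before anything is run on a test machine.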

Version-Release number of selected component (if applicable):
RHEL 5.3 snapshot 6 (likely others as well)

How reproducible:
Not certain, but likely to be readily reproducible.

Steps to Reproduce:
1. See the description above.

Actual results:
Attempting to unload the module while it is in use panics the system.

Expected results:
An attempt to unload a module that is in use should fail, and the system should remain up and running.

Additional info:
Very likely a module refcounting bug in this driver, according to Neil.
Comment 1 Neil Horman 2008-12-17 15:23:25 EST
Joe, as per your last comment in bz 470625, do you have a backtrace of the panic that you saw there?
Comment 2 Andy Gospodarek 2008-12-17 15:45:48 EST
I'd like to see that backtrace, too.  I was thinking this was related to the freeing of dummy_netdevs, but don't see any immediate problems.
Comment 3 Joe T 2008-12-17 17:34:05 EST
We didn't have the trace captured, so it had to be reproduced:

Red Hat Enterprise Linux Server release 5.3 Beta (Tikanga)
Kernel 2.6.18-126.el5xen on an x86_64


login: root
Password: 
Last login: Wed Dec 17 21:29:49 on tty1
[root@RHEL53b ~]# ethtool -i eth4
driver: bnx2
version: 1.7.9-1
firmware-version: 1.9.6
bus-info: 0000:04:00.0
[root@RHEL53b ~]# ifconfig eth4 mtu 9000 up
[root@RHEL53b ~]# dhclient eth4
Internet Systems Consortium DHCP Client V3.0.5-RedHat
Copyright 2004-2006 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/

Listening on LPF/eth4/00:1f:29:e6:d8:56
Sending on   LPF/eth4/00:1f:29:e6:d8:56
Sending on   Socket/fallback
DHCPREQUEST on eth4 to 255.255.255.255 port 67
DHCPREQUEST on eth4 to 255.255.255.255 port 67
DHCPACK from 172.16.10.100
bound to 172.16.99.158 -- renewal in 40276 seconds.
[root@RHEL53b ~]# lspci | grep 04:00
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
[root@RHEL53b ~]# rmmod bnx2
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: 
 [<ffffffff8027cc53>] xen_destroy_contiguous_region+0x83/0x3d6
PGD 557e4067 PUD 56859067 PMD 0 
Oops: 0002 [1] SMP 
last sysfs file: /devices/pci0000:00/0000:00:03.0/class
CPU 0 
Modules linked in: ipv6 xfrm_nalgo crypto_api autofs4 hidp rfcomm l2cap bluetooth sunrpc cpufreq_ondemand powernow_k8 freq_table dm_mirror dm_log dm_multipath scsi_dh dm_mod video hwmon backlight sbs i2c_ec button battery asus_acpi ac parport_pc lp parport pcspkr serio_raw hpilo i2c_piix4 serial_core i2c_core bnx2 ide_cd cdrom shpchp cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 3414, comm: rmmod Not tainted 2.6.18-126.el5xen #1
RIP: e030:[<ffffffff8027cc53>]  [<ffffffff8027cc53>] xen_destroy_contiguous_region+0x83/0x3d6
RSP: e02b:ffff880057849cf8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000001000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff8068fa40 R09: 0000000000000000
R10: ffff880057849cf8 R11: 0000000000000048 R12: 0000000000000001
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  00002af1a13aa6e0(0000) GS:ffffffff805ba000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process rmmod (pid: 3414, threadinfo ffff880057848000, task ffff880069b54100)
Stack:  ffff880057849d40  0000000000000001  0000000000000000  0000000000007ff0 
 ffffffff8068ea40  0000000000000001  0000000000000000  0000000000007ff0 
 0000000000000000  ffffffff804eac00 
Call Trace:
 [<ffffffff80271292>] dma_free_coherent+0x69/0x77
 [<ffffffff8810b054>] :bnx2:bnx2_free_mem+0x12d/0x228
 [<ffffffff8810cc70>] :bnx2:bnx2_close+0x59/0x7a
 [<ffffffff80410d11>] dev_close+0x53/0x72
 [<ffffffff80410db9>] unregister_netdevice+0x89/0x21b
 [<ffffffff80410f5c>] unregister_netdev+0x11/0x17
 [<ffffffff8810e072>] :bnx2:bnx2_remove_one+0x30/0x8e
 [<ffffffff80346fc8>] pci_device_remove+0x24/0x3a
 [<ffffffff803a08a4>] __device_release_driver+0x9f/0xc3
 [<ffffffff803a0c44>] driver_detach+0xad/0x101
 [<ffffffff8039fe62>] bus_remove_driver+0x6d/0x90
 [<ffffffff803a0ccb>] driver_unregister+0xd/0x16
 [<ffffffff80347157>] pci_unregister_driver+0x10/0x5f
 [<ffffffff8029f9c2>] sys_delete_module+0x196/0x1c5
 [<ffffffff8025f2f9>] tracesys+0xab/0xb6


Code: f3 aa 48 c7 c7 00 31 53 80 e8 8f 6d fe ff 49 89 c3 48 b8 ff 
RIP  [<ffffffff8027cc53>] xen_destroy_contiguous_region+0x83/0x3d6
 RSP <ffff880057849cf8>
CR2: 0000000000000000
 <0>Kernel panic - not syncing: Fatal exception
 (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
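
Read bottom-up, the call trace shows the delete_module syscall from rmmod going through PCI driver unregistration into bnx2_remove_one, bnx2_close and bnx2_free_mem, then dma_free_coherent, before faulting in xen_destroy_contiguous_region. Below is a small Python sketch for pulling that chain out of an oops in this format. It assumes frames look like the RHEL 5 lines above (`[<address>] :module:symbol+0xoff/0xlen`); the regular expression and function name are illustrative and not part of any existing tool.

```python
import re
from typing import List, Tuple

# Matches frames such as " [<ffffffff8810b054>] :bnx2:bnx2_free_mem+0x12d/0x228"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?::(?P<module>\w+):)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_call_trace(oops_text: str) -> List[Tuple[str, str]]:
    """Return (module, symbol) pairs from the 'Call Trace:' section, innermost first."""
    frames = []
    in_trace = False
    for line in oops_text.splitlines():
        if line.strip() == "Call Trace:":
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            if frames:
                break  # first non-frame line after the frames ends the trace
            continue
        # Frames without a ":module:" prefix belong to the core kernel image.
        frames.append((match.group("module") or "kernel", match.group("symbol")))
    return frames


def module_frames(frames: List[Tuple[str, str]], module: str) -> List[str]:
    """List the symbols in the trace that belong to one module."""
    return [symbol for mod, symbol in frames if mod == module]
```

Applied to the trace above, `module_frames(frames, "bnx2")` would narrow the chain to the driver's own functions, which is where Comment 4 goes on to look for the fault.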
Comment 4 Michael Chan 2008-12-17 21:17:49 EST
Created attachment 327295: bnx2 patch

This problem is likely caused by a bug in bnx2's bnx2_free_rx_mem() and the attached patch should fix it.  I'll do more testing and will post the patch upstream.  Thanks.
Comment 6 Linda Wang 2008-12-17 22:57:34 EST
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
On RHEL5.3, when removing the bnx2 driver module,a kernel panic will occur.
Comment 7 Andy Gospodarek 2008-12-18 10:38:06 EST
That release note language seems a bit scary.  Can we narrow this down to something that only happens with jumbo frames?
Comment 8 Michael Chan 2008-12-18 13:08:05 EST
Agreed with Andy.  The patch has been verified to fix the issue and upstream patch has also been accepted.  Thanks.
Comment 9 Andy Gospodarek 2008-12-18 13:51:23 EST
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-On RHEL5.3, when removing the bnx2 driver module,a kernel panic will occur.
+On RHEL5.3, when removing the bnx2 driver module while using jumbo frames, a kernel panic will occur.
Comment 10 Linda Wang 2008-12-18 16:24:56 EST
Thanks Andy, Mike. I was just trying to get the release notes going.
FWIW, since the patch in comment #4 hasn't been posted for review and integrated into
the kernel, we can't really move this bug to VERIFIED.
So I'm moving this bug back to ASSIGNED for 5.4 processing.

Thanks again.
Comment 11 Don Domingo 2009-01-13 22:18:33 EST
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-On RHEL5.3, when removing the bnx2 driver module while using jumbo frames, a kernel panic will occur.
+If jumbo frames are enabled on your system, a kernel panic will occur if you attempt to unload the bnx2 module.
Comment 13 RHEL Product and Program Management 2009-02-16 10:19:06 EST
Updating PM score.
Comment 15 Don Zickus 2009-04-27 11:58:45 EDT
The fix is in kernel-2.6.18-141.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.
Comment 17 Joe T 2009-04-29 15:42:52 EDT
Issue no longer seen in kernel-xen-2.6.18-141.el5.x86_64.rpm (bnx2 v1.9.3)
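
For anyone checking whether a given RHEL 5 host already carries the fix, here is a rough heuristic in Python. It assumes kernel release strings of the form `2.6.18-<build>.el5[xen]`, as seen in this report, and treats build 141 (from Comment 15) as the first fixed build. That is an assumption: errata kernels on earlier update streams may or may not carry a backport, so a result of False is not proof that a kernel is affected. The function names are illustrative.

```python
import re

FIXED_BUILD = 141  # from kernel-2.6.18-141.el5 in Comment 15


def build_number(release: str) -> int:
    """Extract the build number from a RHEL 5 kernel release string."""
    match = re.match(r"2\.6\.18-(\d+)", release)
    if match is None:
        raise ValueError("not a RHEL 5 2.6.18 kernel release: %r" % release)
    return int(match.group(1))


def likely_has_fix(release: str) -> bool:
    """Heuristic only: True when the build number is at or above the fixed build."""
    return build_number(release) >= FIXED_BUILD
```

On a live system the argument could come from `platform.release()` or the output of `uname -r`. The kernel from Comment 3 (2.6.18-126.el5xen) is below the threshold, while the 2.6.18-141.el5 build verified in Comment 17 meets it.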
Comment 20 errata-xmlrpc 2009-09-02 04:14:22 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html
