Bug 599476 - bnx2x driver dumps logs. Network unusable. [NEEDINFO]
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: x86_64 Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Michal Schmidt
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks:
Reported: 2010-06-03 06:20 EDT by Linux engineering teams - Veritas
Modified: 2013-12-11 08:55 EST (History)
CC List: 6 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-12-11 08:55:04 EST
Flags: sgruszka: needinfo? (udas)


Attachments
Console messages from the bnx2x driver (412.89 KB, text/plain)
2010-06-03 06:20 EDT, Linux engineering teams - Veritas
debug tool (4.69 MB, application/x-gzip)
2010-06-30 09:07 EDT, Dmitry Kravkov
example of running the debug tool (4.82 KB, text/plain)
2010-06-30 09:10 EDT, Dmitry Kravkov

Description Linux engineering teams - Veritas 2010-06-03 06:20:14 EDT
Created attachment 419311
Console messages from the bnx2x driver

Description of problem:
The bnx2x driver dumps verbose messages to the console, prefixed with "bnx2x_panic_dump". After that, the network becomes unusable. This eventually causes the Symantec clusterware to panic nodes.

Version-Release number of selected component (if applicable):
(Linux)(c1062-hpblade1) ~{1} uname -a
Linux c1062-hpblade1.engba.symantec.com 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
(Linux)(c1062-hpblade1) ~{2} modinfo bnx2x
filename:       /lib/modules/2.6.18-164.el5/kernel/drivers/net/bnx2x.ko
version:        1.48.105
license:        GPL
description:    Broadcom NetXtreme II BCM57710/57711/57711E Driver
author:         Eliezer Tamir
srcversion:     6D030D52DFD981356EEC2BE
alias:          pci:v000014E4d00001650sv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Fsv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Esv*sd*bc*sc*i*
depends:        
vermagic:       2.6.18-164.el5 SMP mod_unload gcc-4.1
parm:           multi_mode: Use per-CPU queues (int)
parm:           disable_tpa: Disable the TPA (LRO) feature (int)
parm:           int_mode: Force interrupt mode (1 INT#x; 2 MSI) (int)
parm:           poll: Use polling (for debug) (int)
parm:           mrrs: Force Max Read Req Size (0..3) (for debug) (int)
parm:           debug: Default debug msglevel (int)
module_sig:     883f3504a8b7cb4bd273d74512bb112f59309f6c2877f55cb62445a8c8af7cdaf1e559983bdb09f5b1b3fa9aca58a8f88267866b046e346ebe3bbaa
(Linux)(c1062-hpblade1) ~{3} 


How reproducible:
Occurs under relatively heavy network load. Easily reproducible with Symantec clusterware in large configurations.
  
Actual results:
The machine shows bnx2x messages on the console. The network becomes unusable.

Additional info:
Attaching the messages seen.
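
To check whether a saved log contains the same dump, a quick scan for the "bnx2x_panic_dump" prefix may be enough. The following is only a minimal sketch: the function name is hypothetical, and it assumes the console messages have been captured to a plain-text file.

```shell
#!/bin/sh
# Count the lines in a log file that mention the driver's panic dump prefix.
# $1 = path to a plain-text log file (a saved console log, for example).
# grep -c prints the number of matching lines; it prints 0 (and returns a
# non-zero exit status) when nothing matches.
count_panic_dump_lines() {
    grep -c 'bnx2x_panic_dump' "$1"
}
```

For example, `count_panic_dump_lines /var/log/messages` would report how many such lines reached the system log, assuming syslog writes kernel messages to that file on the affected machine.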

Symantec contact: udas@veritas.com
Comment 1 Stanislaw Gruszka 2010-06-07 08:24:37 EDT
This bug is most likely the same issue as bug 516090, which is already fixed by a driver update. Try an up-to-date kernel such as 2.6.18-194.3.1.el5, or the kernels from http://people.redhat.com/jwilson/el5/
Comment 2 Stanislaw Gruszka 2010-06-09 08:57:17 EDT
Any comments on the above?
Comment 3 Linux engineering teams - Veritas 2010-06-24 06:29:37 EDT
The latest kernel from the above link did not work for us. Similar log entries were observed.

Symantec contact: udas@veritas.com
Comment 4 Stanislaw Gruszka 2010-06-24 07:27:31 EDT
We have another bnx2x panic; it happens on RHEL 5.4 and on up-to-date RHEL 5 kernels.
Comment 5 Stanislaw Gruszka 2010-06-24 07:28:23 EDT
@Veritas, did blacklisting cnic and bnx2i modules help?
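
The thread does not spell out how to blacklist the modules. One way this could be done is with a modprobe blacklist fragment, sketched below. The function name and the target file name are hypothetical, and the sketch assumes modprobe on the system reads fragments from /etc/modprobe.d/.

```shell
#!/bin/sh
# Write a modprobe blacklist fragment.
# $1 = target file (on a real system, a file under /etc/modprobe.d/;
#      the exact file name is an assumption), remaining args = module names.
write_blacklist() {
    target=$1
    shift
    # printf reuses the format string for each remaining argument,
    # producing one "blacklist <module>" line per module.
    printf 'blacklist %s\n' "$@" > "$target"
}

# Example, as root (file name is hypothetical):
#   write_blacklist /etc/modprobe.d/blacklist-cnic cnic bnx2i
```

Note that a blacklist entry only stops modprobe from loading a module automatically through its aliases. Modules that are already loaded stay loaded until they are removed or the machine is rebooted, and an explicit `modprobe` of the module name still works.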
Comment 6 Stanislaw Gruszka 2010-06-30 08:13:15 EDT
Please try kernel 204.el5; it includes a cnic fix for a bug that can cause a bnx2x panic.

If it does not help, we will probably need more info. I'm not sure whether this bnx2x crash dump contains all the information Broadcom needs to fix the issue.

@Broadcom, do you want any more info to track down this bnx2x panic?
Comment 7 Dmitry Kravkov 2010-06-30 09:07:37 EDT
Created attachment 427988
debug tool

debugging tool
Comment 8 Dmitry Kravkov 2010-06-30 09:09:15 EDT
Please provide us with the output of the "grcDump" command from the attached debug tool.
Building it:
> tar xf edebug_linux_ver_0.1.4.tar.gz
> cd edebug_0.1.4/
> make
> ./load.sh

Using it:
The tool shows a list of the BCM5771x devices on the system.
Select the one that caused the crash with "device X" (you can recognize it by its PCI bus address or MAC address). Then run the command "grcDump regs.dump". Exit the application with the "exit" command and upload the generated regs.dump file.

Thanks
Comment 9 Dmitry Kravkov 2010-06-30 09:10:43 EDT
Created attachment 427990
example of running the debug tool
Comment 10 Linux engineering teams - Veritas 2010-07-07 15:45:50 EDT
Our repro setup is no longer available for this. We will get back to you when it becomes available again. However, the issue is easily reproduced on HP blades.
Comment 11 Michal Schmidt 2013-12-11 08:55:04 EST
This BZ has had the needinfo? flag set for more than 3 years. Closing.
