Bug 595862

Summary: [Broadcom 5.6 bug] cnic: Panic in cnic_iscsi_nl_msg_recv()
Product: Red Hat Enterprise Linux 5
Reporter: Michael Chan <mchan>
Component: kernel
Assignee: Stanislaw Gruszka <sgruszka>
Status: CLOSED ERRATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: urgent
Priority: urgent
Version: 5.6
CC: aaswath, agospoda, andriusb, apevec, benlu, coughlan, cward, dhoward, dyasny, edwardn, enarvaez, gideonn, jbroman, jpirko, mchristi, niran, vbian
Target Milestone: rc
Keywords: OtherQA, ZStream
Target Release: 5.6
Hardware: All
OS: Linux
Doc Type: Bug Fix
Clone Of: 578005
Last Closed: 2011-01-13 21:34:01 UTC
Bug Depends On: 568606, 578005
Bug Blocks: 615260

Description Michael Chan 2010-05-25 19:29:33 UTC
This happens occasionally during heavy iSCSI login/logout testing.

Sometimes we receive a netlink message for a device that has been closed.  This upstream patch fixes the problem:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d02a5e6c2fba8b114c44cf05085fca07180f37f1
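
For readers without the patch at hand, the failure mode can be illustrated with a minimal user-space sketch. This is not the upstream code: every name below (sketch_dev, SKETCH_F_UP, sketch_nl_msg_recv and the open/close helpers) is hypothetical, and the sketch only assumes that the handler must check whether the device is still up before touching per-device state that close has released.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real cnic structures. */
#define SKETCH_F_UP 0x1u

struct sketch_ring {
    unsigned int msgs_handled;
};

struct sketch_dev {
    unsigned int flags;        /* SKETCH_F_UP is set only while the device is open */
    struct sketch_ring *ring;  /* valid only while the device is open */
};

static void sketch_dev_open(struct sketch_dev *dev, struct sketch_ring *ring)
{
    dev->ring = ring;
    dev->flags |= SKETCH_F_UP;
}

static void sketch_dev_close(struct sketch_dev *dev)
{
    dev->flags &= ~SKETCH_F_UP;
    dev->ring = NULL;          /* per-device state is gone after close */
}

/*
 * Message handler. A message can still arrive after close, so the
 * handler rejects it instead of dereferencing dev->ring.
 */
static int sketch_nl_msg_recv(struct sketch_dev *dev, unsigned int msg_type)
{
    assert(dev != NULL);
    (void)msg_type;            /* message contents are irrelevant to the guard */

    if (!(dev->flags & SKETCH_F_UP))
        return -ENODEV;

    dev->ring->msgs_handled++;
    return 0;
}
```

Without the flag check, a message delivered after sketch_dev_close() would dereference a NULL ring pointer, which is the same class of fault as the page fault reported in this bug.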

Requesting 5.5.z inclusion.

Comment 1 Stanislaw Gruszka 2010-06-02 14:39:56 UTC
Brew build:
https://brewweb.devel.redhat.com/taskinfo?taskID=2484169
Publicly available packages:
http://people.redhat.com/sgruszka/rhel5/bz596862/

Comment 2 RHEL Program Management 2010-06-02 14:52:00 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 Jan Tluka 2010-06-08 13:07:56 UTC
Could you provide a panic message you are hitting?

Comment 4 Michael Chan 2010-06-08 16:15:54 UTC
crash> bt
PID: 5571 TASK: ffff8101af9f2100 CPU: 2 COMMAND: "brcm_iscsiuio"
#0 [ffff8101af22d7f0] crash_kexec at ffffffff800aeb6b
#1 [ffff8101af22d8b0] __die at ffffffff80066157
#2 [ffff8101af22d8f0] do_page_fault at ffffffff80067dd7
#3 [ffff8101af22d9e0] error_exit at ffffffff8005ede9
[exception RIP: cnic_iscsi_nl_msg_recv+65]
RIP: ffffffff8849914c RSP: ffff8101af22da98 RFLAGS: 00010206
RAX: 000000000000003f RBX: 0000000000003918 RCX: 0000000000000040
RDX: ffff81032fe7c8a8 RSI: 000000000000001e RDI: ffff81032fe7c800
RBP: ffff81032e196048 R8: ffff81032a006000 R9: 0000000000000088
R10: ffff81032ccc0c00 R11: 0000000000000000 R12: ffff81032e196010
R13: ffff8101af22ddd8 R14: ffffffff884b0e00 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#4 [ffff8101af22dad0] bnx2i_nl_set_path at ffffffff884a978f
#5 [ffff8101af22dae0] iscsi_if_rx at ffffffff883f6093
#6 [ffff8101af22db60] netlink_data_ready at ffffffff80245de1
#7 [ffff8101af22db70] netlink_sendskb at ffffffff80244f98
#8 [ffff8101af22db90] netlink_sendmsg at ffffffff80245dbc
#9 [ffff8101af22dc20] sock_sendmsg at ffffffff8005567d
#10 [ffff8101af22ddc0] sys_sendmsg at ffffffff8022704e
#11 [ffff8101af22df80] tracesys at ffffffff8005e28d (via system_call)
RIP: 00000038dcc0df2b RSP: 00007fff09f985b0 RFLAGS: 00000202
RAX: ffffffffffffffda RBX: ffffffff8005e28d RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 00007fff09f985e0 RDI: 0000000000000009
RBP: 000000000000001e R8: 0000000005dc0000 R9: 000000005e4110ac
R10: 0000000000000000 R11: 0000000000000202 R12: 00000000194bc2a0
R13: 0000000000000078 R14: 0000000000000088 R15: 0000000000000009
ORIG_RAX: 000000000000002e CS: 0033 SS: 002b

Comment 6 Jarod Wilson 2010-06-14 18:23:44 UTC
Included in kernel-2.6.18-203.el5.
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 7 Andrius Benokraitis 2010-06-14 18:29:28 UTC
Requesting/proposing for RHEL 5.5.z based on severity of the defect, patch size, and user impact.

Comment 12 edwardn 2010-11-23 00:05:48 UTC
Update for RH5.6 inbox:

With kernel-2.6.18-232.el5 and iscsi-initiator-utils-6.2.0.872-6 (uIP
0.6.2.2.1), this issue is no longer seen.  It was also verified previously with kernel-2.6.18-203.el5 back in June 2010.

Comment 13 Chris Ward 2010-11-23 09:52:05 UTC
Thanks, Broadcom. In the future, when informing us of successful test verification, it would help me out if you'd also add 'Broadcom' to the Verified field above. Thanks!

Comment 15 errata-xmlrpc 2011-01-13 21:34:01 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html