Bug 517378 - [Broadcom 5.5 FEAT] Update bnx2i and cnic drivers
Summary: [Broadcom 5.5 FEAT] Update bnx2i and cnic drivers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: alpha
Target Release: 5.5
Assignee: Mike Christie
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Duplicates: 512193
Depends On: 515716
Blocks: 533192 557291 481160 517380 525215 533941
Reported: 2009-08-13 17:00 UTC by Michael Chan
Modified: 2011-12-06 15:42 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 07:21:28 UTC


Attachments
0001-update-cnic-driver.patch (138.12 KB, patch), 2009-11-19 16:01 UTC, Stanislaw Gruszka
0006-update-cnic-driver__v2.patch (137.77 KB, patch), 2009-11-23 09:09 UTC, Stanislaw Gruszka
0001-cnic-fixes-for-RHEL5.5.patch (11.35 KB, patch), 2009-12-15 15:29 UTC, Stanislaw Gruszka
0002-bnx2i-fixes-for-RHEL5.5.patch (7.80 KB, patch), 2009-12-15 15:30 UTC, Stanislaw Gruszka
bnx2x panic dump from comment 78 (253.48 KB, text/plain), 2010-03-03 02:44 UTC, Thomas Chenault
bnx2x panic dump with kernel -190.el5 (1.36 MB, text/plain), 2010-03-03 21:10 UTC, Thomas Chenault
GRC dump from 57710 (4.72 MB, text/plain), 2010-03-04 23:23 UTC, Thomas Chenault
log from brcm_iscsiuio -f -d 100 (30.05 KB, text/plain), 2010-03-11 20:09 UTC, IBM Bug Proxy
log from iscsid -f -d 100 (30.21 KB, application/octet-stream), 2010-03-11 20:09 UTC, IBM Bug Proxy
output from /var/log/messages (1.02 KB, text/plain), 2010-03-11 20:09 UTC, IBM Bug Proxy
dmesg log with debugging information (46.97 KB, text/plain), 2010-03-13 00:24 UTC, IBM Bug Proxy
shell session (3.34 KB, text/plain), 2010-03-15 23:25 UTC, IBM Bug Proxy
/var/log/messages output (24.11 KB, text/plain), 2010-03-15 23:36 UTC, IBM Bug Proxy
debugging output from brcm_iscsiuio (469.55 KB, text/plain), 2010-03-15 23:36 UTC, IBM Bug Proxy
debugging output from iscsid (180.35 KB, text/plain), 2010-03-15 23:36 UTC, IBM Bug Proxy
output from /var/log/messages (15.87 KB, text/plain), 2010-03-16 23:46 UTC, IBM Bug Proxy
output from brcm_iscsiuio (510.27 KB, text/plain), 2010-03-16 23:46 UTC, IBM Bug Proxy
output from iscsid (55.62 KB, text/plain), 2010-03-16 23:46 UTC, IBM Bug Proxy


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2010:0178 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 5.5 kernel security and bug fix update 2010-03-29 12:18:21 UTC

Description Michael Chan 2009-08-13 17:00:36 UTC
1.  Feature Overview:
     a. Update bnx2i and cnic drivers

     b. Feature Description
	General bug fixes and added support for 10G devices.  Bug fixes for ia64 and ppc.

2.  Feature Details:
     a. Architectures:
         32-bit x86
         64-bit Intel EM64T/AMD64
         64-bit Itanium2
         ppc64

     b. Upstream acceptance information:
        Initial versions are upstream.  Bug fixes and enhancement patch submission will be ongoing.

3. Business Justification:
     a. Why is this feature needed?
	Support 10G devices and bug fixes.

4. Primary contact at Broadcom, email, phone
    mchan@broadcom.com
    (949)926-6170

Comment 1 Andrius Benokraitis 2009-08-24 19:20:30 UTC
*** Bug 512193 has been marked as a duplicate of this bug. ***

Comment 2 Michael Chan 2009-10-12 21:24:51 UTC
bnx2x and cnic patches for 10G iSCSI support have been merged into net-next-2.6.  Thanks.

Comment 3 Mike Christie 2009-10-12 22:05:37 UTC
Are there any brcm/uip changes?

No bnx2i changes right?

Do you have a bugzilla to update bnx2x for general networking issues? Maybe we could just piggy back the 10 gig iscsi bnx2x changes with it, so it will be easier to coordinate that.

Comment 4 Michael Chan 2009-10-12 22:14:50 UTC
(In reply to comment #3)
> Are there any brcm/uip changes?

Yes, we need to add the uio userspace driver for 10G.  Still working on that.

> No bnx2i changes right?

Right.  May be some bug fixes during testing.

> Do you have a bugzilla to update bnx2x for general networking issues? Maybe we
> could just piggy back the 10 gig iscsi bnx2x changes with it, so it will be
> easier to coordinate that.  

Bug 515716

Comment 5 Chris Ward 2009-10-13 15:49:04 UTC
@Broadcom,

We need to confirm that there is commitment to test 
for the resolution of this request during the RHEL 5.5 test
phase, if it is accepted into the release. 

Please post a confirmation before Oct 16th, 2009, 
including the contact information for testing engineers.

Comment 6 Michael Chan 2009-10-13 16:38:44 UTC
Yes, adding Nasser to CC to assign test engineers.

Comment 7 IBM Bug Proxy 2009-10-13 18:51:45 UTC
------- Comment From lcm@us.ibm.com 2009-10-13 14:42 EDT-------
IBM will also provide test feedback. Please coordinate through Peter Bogdanovic, pbogdano@redhat.com.

Comment 8 Ed Narvaez 2009-10-13 22:02:30 UTC
PQA Test Engineers are as follows:

bnx2x - Tung Nguyen (tungn@broadcom.com)
bnx2 - Joe Torricelli (jtorrice@broadcom.com)
tg3 - Jeff Leu (jleu@broadcom.com)
bnx2i - Emory Bestenlehner (emoryb@broadcom.com)

BRCM will also provide periodic test results.

Any questions/comments, please let me know.  Thanks

Ed Narvaez, enarvaez@broadcom.com, 949-926-6456

Comment 9 Mike Christie 2009-10-13 22:16:55 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Are there any brcm/uip changes?
> 
> Yes, we need to add the uio userspace driver for 10G.  Still working on that.
>

Ok. Let me use this bugzilla for the userspace changes needed then.
 
> > No bnx2i changes right?
> 
> Right.  May be some bug fixes during testing.


Ok.

> 
> > Do you have a bugzilla to update bnx2x for general networking issues? Maybe we
> > could just piggy back the 10 gig iscsi bnx2x changes with it, so it will be
> > easier to coordinate that.  
> 
> Bug 515716  


And then just send the bnx2x iscsi stuff with your normal update, so that way we do not have any dependency issues with the kernel stuff.

Comment 10 John Jarvis 2009-10-27 14:25:44 UTC
This enhancement request was evaluated by the full Red Hat Enterprise Linux 
team for inclusion in a Red Hat Enterprise Linux minor release.   As a 
result of this evaluation, Red Hat has tentatively approved inclusion of 
this feature in the next Red Hat Enterprise Linux Update minor release.   
While it is a goal to include this enhancement in the next minor release 
of Red Hat Enterprise Linux, the enhancement is not yet committed for 
inclusion in the next minor release pending the next phase of actual 
code integration and successful Red Hat and partner testing.

Comment 11 Stanislaw Gruszka 2009-11-19 16:01:30 UTC
Created attachment 370624 [details]
0001-update-cnic-driver.patch

Comment 14 Stanislaw Gruszka 2009-11-23 09:09:23 UTC
Created attachment 373042 [details]
0006-update-cnic-driver__v2.patch

This patch is the same as 0001-update-cnic-driver.patch, but it's rebased
atop Mike Christie's patch for bug 516233, which also modifies cnic code.

Comment 15 Stanislaw Gruszka 2009-11-23 12:02:08 UTC
Brew build:
https://brewweb.devel.redhat.com/taskinfo?taskID=2099220

It contains patches from the following bugzillas:
 Bug 516233 -  Panic on boot when loading iscsid with broadcom NIC
 Bug 515716 -  [Broadcom 5.5 FEAT] Update bnx2x to 1.52.1-5
 Bug 517378 -  [Broadcom 5.5 FEAT] Update bnx2i and cnic drivers

Public download (x86_64,i686,src):
http://people.redhat.com/sgruszka/rhel5.5-broadcom/

Please test.

Comment 16 Stanislaw Gruszka 2009-11-26 08:00:05 UTC
Do the packages work? I would like to have this confirmation before posting the patch to RKML. Thanks.

Comment 17 Stanislaw Gruszka 2009-11-30 06:38:05 UTC
(In reply to comment #15)
> Brew build:
> https://brewweb.devel.redhat.com/taskinfo?taskID=2099220
> 
> It contains patches from the following bugzillas:
>  Bug 516233 -  Panic on boot when loading iscsid with broadcom NIC
>  Bug 515716 -  [Broadcom 5.5 FEAT] Update bnx2x to 1.52.1-5
>  Bug 517378 -  [Broadcom 5.5 FEAT] Update bnx2i and cnic drivers
> 
> Public download (x86_64,i686,src):
> http://people.redhat.com/sgruszka/rhel5.5-broadcom/
> 
> Please test.  

(In reply to comment #16)
> Do the packages work? I would like to have this confirmation before posting
> the patch to RKML. Thanks.

Hi Michael

Any news?

Comment 18 Michael Chan 2009-11-30 08:56:14 UTC
Hi Stanislaw,  I just got back from vacation.  I'll need to check with our QA tomorrow to see if they've done any testing.  Thanks.

Comment 19 Michael Chan 2009-12-01 18:12:25 UTC
Emory will be providing test results soon.

Comment 20 Emory Bestenlehner 2009-12-02 03:38:04 UTC
I was able to run some basic target compatibility and read/write tests. More to follow.

Thanks

Comment 21 Emory Bestenlehner 2009-12-04 02:35:11 UTC
We are unable to make any connections with bnx2i after initially getting it to work. I had restarted the iSCSI service while running disk I/O, and since then am only able to connect via L2. I am reloading 5.4 to check for procedural differences in connection establishment, etc.

Comment 22 Stanislaw Gruszka 2009-12-04 08:34:35 UTC
We have an ongoing update of bnx2 (bug 517377); perhaps it should be included in the test build as well. We also have not applied this commit:

commit d0549382da9997834ce65e489d9dbdc4b4693a2b
Author: Michael Chan <mchan@broadcom.com>
Date:   Wed Oct 28 03:41:59 2009 -0700

    cnic: Fix L2CTX_STATUSB_NUM offset in context memory.

Michael, any thoughts ?

Comment 23 Michael Chan 2009-12-04 20:25:26 UTC
Yes, this patch is needed if bnx2 is using 5.0.0.j3 firmware.  Without it, bnx2 will crash when iSCSI is started.  Not sure if that's what Emory was seeing.  Thanks.

Comment 24 Emory Bestenlehner 2009-12-05 02:50:08 UTC
Seems there was some sort of corruption that occurred while testing prior. I removed the bnx2i and cnic modules and had to manually delete the iSCSI iface files previously created (was unable to via iscsiadm). Modprobed bnx2i and re-created ifaces and nodes via a separate interface and it worked. Will resume moderate testing.

Comment 25 Stanislaw Gruszka 2009-12-08 07:02:00 UTC
Emory, 

Could you please provide some more info on how you hit the issue (hardware, steps to reproduce, are you using cnic with bnx2 or bnx2x?)? Perhaps we will be able to reproduce the problem at RH.

Comment 26 Stanislaw Gruszka 2009-12-08 07:04:52 UTC
(In reply to comment #23)
> Yes, this patch is needed if bnx2 is using 5.0.0.j3 firmware.  Without it, bnx2
> will crash when iSCSI is started.  Not sure if that's what Emory was seeing. 

In RHEL5 we use the older bnx2 firmware versions 4.6.16 and 4.6.15, so this must be a different problem.

Comment 27 Michael Chan 2009-12-08 07:50:44 UTC
I'll have a software engineer look into this tomorrow.  We are testing 1G iSCSI using bnx2 because the final version of the userspace UIO driver for 10G bnx2x has not been sent to Mike Christie yet.

Comment 28 Emory Bestenlehner 2009-12-08 17:54:43 UTC
  1. Load RedHat5.4, upgrade to 5.5 kernel (kernel-2.6.18-174.el5.bz515716.x86_64.rpm)
  2. Create iSCSI interface and bind it to eth2, bnx2i and ip address
  3. Connect to some targets and run disktest
  4. Restart iSCSI service

I'm using the 5709x (4 port), but only 1 port is active (ifdown on other 3). 5.5 inbox bnx2i version 2.0.1e

Comment 29 Michael Chan 2009-12-08 22:11:08 UTC
Eddie Wai is now debugging the problem with Emory.

Comment 30 Stanislaw Gruszka 2009-12-09 10:02:13 UTC
Great, thanks. Please remember that the kernel-2.6.18-174.el5.broadcom_test build does not include the requested bnx2 fixes from bug 517377. If you want a new rebuild with the bnx2 2.0.2 update, please let me know.

Comment 31 Eddie Wai 2009-12-09 18:45:17 UTC
The problem is related to the iSCSI service being scripted to restart while running disktest on the iSCSI disks.  The iSCSI sessions should get logged out upon the service going down and get re-established upon the service going back up.  Apparently, some connections didn't get cleaned up correctly.  This is still under investigation.  

Barring this cycling of the iSCSI service test, the code appears to run okay so far.  Emory is in the process of executing more of our normal test plan to see if other failures shall occur.

Comment 32 Stanislaw Gruszka 2009-12-09 19:44:40 UTC
(In reply to comment #28)
>   1. Load RedHat5.4, upgrade to 5.5 kernel
> (kernel-2.6.18-174.el5.bz515716.x86_64.rpm)

Oh dear. I took a look at this again. This kernel does not contain the cnic fixes. It only contains the bnx2x fixes from bug 515716. Please try the one from:
http://people.redhat.com/sgruszka/rhel5.5-broadcom/
Package should be named kernel-2.6.18-174.el5.broadcom_test.x86_64.rpm

Comment 33 Michael Chan 2009-12-09 20:00:43 UTC
Oops, good catch.  Emory, please use test kernels specified in this BZ and check the versions of the cnic and bnx2i drivers.  The cnic driver version should be 2.1.0.  Thanks.
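
A minimal sketch of one way to check the loaded cnic driver version, assuming the module is loaded and exports a `version` attribute at /sys/module/<name>/version (which is only the case for modules that declare MODULE_VERSION). The helper names and the `sysfs_root` parameter are illustrative and not part of any Broadcom or Red Hat tool; the expected value 2.1.0 is the one quoted in this comment.

```python
from pathlib import Path

# Expected version taken from this bug report; add other modules as needed.
EXPECTED = {"cnic": "2.1.0"}


def read_module_version(name, sysfs_root="/sys/module"):
    """Return the version string of a loaded module, or None if unavailable."""
    version_file = Path(sysfs_root) / name / "version"
    try:
        return version_file.read_text().strip()
    except OSError:
        # Module not loaded, or it does not export a version attribute.
        return None


def check_versions(expected=EXPECTED, sysfs_root="/sys/module"):
    """Map each module name to (found_version, matches_expected)."""
    result = {}
    for name, wanted in expected.items():
        found = read_module_version(name, sysfs_root)
        result[name] = (found, found == wanted)
    return result


if __name__ == "__main__":
    for module, (found, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{module}: {found or 'not available'} ({status})")
```

The `sysfs_root` parameter exists only so the check can be pointed at a test directory; on a real system the default should be left alone.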

Comment 34 Michael Chan 2009-12-12 07:08:19 UTC
There are some small bug fix patches for cnic and bnx2i to support the bnx2x 10G devices.  They are in the net-2.6 and scsi-misc-2.6 trees.  Can those be added here or should we file a new BZ?  Thanks.

Comment 35 Stanislaw Gruszka 2009-12-14 07:58:30 UTC
This one is ok.

Comment 36 Michael Chan 2009-12-14 21:07:51 UTC
Please include these cnic patches:

commit 4e9c4fd3e7e022c7a5b8bb7cd06bf914b202cfea
    cnic: Zero out status block and Event Queue indices.

commit 1bcdc32cf4d94442eba79599ce8438ea0b8f78b5
    cnic: Send delete command when shutting down iSCSI ring.

commit 3248e1682035eef6774c280cd7be19984feb78bb
    cnic: Use dma_alloc_coherent().

commit 15971c3ce3caf9a92b603a61b07e0be8c9b9d276
    cnic: Fix rq_page_table DMA address.

commit dd2e4dbce32a2802088f6d0132046afec9bfb2ad
    cnic: Fix bogus iSCSI MAC address

commit 8b065b671d3096bfe0dbc9a833cb592f84642436
    cnic: Fix bnx2x ring shutdown.

commit c7596b79feb3d15bea64007254f77233bda811f4
    cnic: Fix ring I/O address for bnx2x devices.

commit 164165dad7e607ec359e64b6fae72abbf3640ea6
    drivers/net: tasklet_init - Remove unnecessary leading & from second arg

commit 0d37f36ff9bc41067c71635d14b6a5834853a779
    cnic: ensure ulp_type is not negative

commit d0549382da9997834ce65e489d9dbdc4b4693a2b
    cnic: Fix L2CTX_STATUSB_NUM offset in context memory.
    (This one requires bnx2 update to use newer firmware)

Comment 37 Michael Chan 2009-12-14 21:22:43 UTC
Please also include these bnx2i patches (in scsi-misc-2.6):

commit 45ca38e753016432a266a18679268a4c4674fb52
    [SCSI] bnx2i: minor code cleanup and update driver version
   
commit 85fef20222bda1ee41f97ff94a927180ef0b97e6
    [SCSI] bnx2i: Task management ABORT TASK fixes
   
commit 8776193bc308553ac0011b3bb2dd1837e0c6ab28
    [SCSI] bnx2i: update CQ arming algorith for 5771x chipsets
   
commit f8c9abe797c54e798b4025b54d71e5d2054c929a
    [SCSI] bnx2i: Adjust sq_size module parametr to power of 2 only if a non-zero value is specified
   
commit 5d9e1fa99c2a9a5977f5757f4e0fd02697c995c2
    [SCSI] bnx2i: Add 5771E device support to bnx2i driver

Comment 38 Mike Christie 2009-12-15 09:24:25 UTC
For the bnx2i patches I think we need to make another bugzilla. I will do this in the morning if I cannot find one.

Stanislaw, I can send the patches in comment #37 since they are all scsi related.

Comment 39 Stanislaw Gruszka 2009-12-15 09:59:36 UTC
(In reply to comment #38)
> Stanislaw, I can send the patches in comment #37 since they are all scsi
> related.  

I'll do this, bnx2i patches may have dependency with cnic ones. Thanks.

Comment 40 Anil Veerabhadrappa 2009-12-15 14:50:06 UTC
bnx2i patches in #37 are independent patches and not linked to any cnic driver changes.

Comment 41 Stanislaw Gruszka 2009-12-15 15:28:14 UTC
Right, but I have already done it :)

Comment 42 Stanislaw Gruszka 2009-12-15 15:29:14 UTC
Created attachment 378535 [details]
0001-cnic-fixes-for-RHEL5.5.patch

Comment 43 Stanislaw Gruszka 2009-12-15 15:30:29 UTC
Created attachment 378536 [details]
0002-bnx2i-fixes-for-RHEL5.5.patch

Comment 44 Don Zickus 2009-12-15 20:18:51 UTC
in kernel-2.6.18-181.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please update the appropriate value in the Verified field
(cf_verified) to indicate this fix has been successfully
verified. Include a comment with verification details.

Comment 46 Stanislaw Gruszka 2009-12-15 20:56:15 UTC
(In reply to comment #44)
> in kernel-2.6.18-181.el5

Ahh, this bug needs additional fixes. Moving back to ASSIGNED.

Comment 48 Stanislaw Gruszka 2009-12-15 21:10:43 UTC
Hello Broadcom

I updated kernel-2.6.18-180.el5.broadcom_test at
http://people.redhat.com/sgruszka/rhel5.5-broadcom/

It includes all the cnic, bnx2i and bnx2 fixes you requested. Please test and
report back ASAP. Note there is a known issue with bnx2 with MTU=9000 (see bug
517377).

Comment 49 Michael Chan 2009-12-15 21:21:17 UTC
Thanks.  We can continue 1G bnx2 iSCSI testing using the new kernel and this initiator:

http://people.redhat.com/mchristi/iscsi/rhel5.5/iscsi-initiator-utils/

For 10G iSCSI testing on bnx2x devices, we need to wait for an updated initiator (Please see bug 517380).

Comment 53 Stanislaw Gruszka 2009-12-18 21:20:14 UTC
(In reply to comment #49)
> Thanks.  We can continue 1G bnx2 iSCSI testing using the new kernel and this
> initiator:

Any news? How is testing going?

Comment 54 Ed Narvaez 2009-12-19 18:26:45 UTC
Here's what I show in our testdbase re: progress.  I will ask Emory to further update.

Software/Driver	Driver: iSCSI HBA - Linux
iSCSI HBA - Linux 2.01e
Linux x64	

Passed	Failed	N/A	Blocking   Done
7 (14%)	0 (0%)	0 (0%)	0 (0%)	   7 of 47 (14%)

Joe verified that the MTU issue no longer occurs (bug 517377).

Comment 55 Michael Chan 2009-12-20 21:45:41 UTC
With the latest kernel, the bnx2i driver should be 2.1.0

Comment 56 Stanislaw Gruszka 2009-12-21 10:35:47 UTC
kernel-2.6.18-180.el5.broadcom_test contains bnx2i version 2.1.0. Please test that kernel. Don Zickus's 2.6.18-181 kernel is later, but it does not contain the new Broadcom driver patches. Thank you.

Comment 57 Emory Bestenlehner 2009-12-21 18:55:22 UTC
We're not getting iSCSI init interrupts. Currently being investigated.
/var/log/messages:
Dec 21 10:35:16 localhost kernel: iscsi: registered transport (bnx2i)
Dec 21 10:35:16 localhost kernel: scsi21 : Broadcom Offload iSCSI Initiator
Dec 21 10:35:16 localhost kernel: bnx2i: send ISCSI_INIT KWQE
Dec 21 10:35:18 localhost kernel: scsi22 : Broadcom Offload iSCSI Initiator
Dec 21 10:35:18 localhost kernel: scsi23 : Broadcom Offload iSCSI Initiator
Dec 21 10:35:18 localhost kernel: bnx2i: send ISCSI_INIT KWQE
Dec 21 10:35:20 localhost kernel: scsi24 : Broadcom Offload iSCSI Initiator
Dec 21 10:35:20 localhost kernel: bnx2i: send ISCSI_INIT KWQE
Dec 21 10:35:22 localhost kernel: scsi25 : Broadcom Offload iSCSI Initiator
Dec 21 10:35:22 localhost kernel: scsi26 : Broadcom Offload iSCSI Initiator
Dec 21 10:35:22 localhost kernel: bnx2i: ep connect - start...
Dec 21 10:35:22 localhost kernel: bnx2i: ep connect shost...
Dec 21 10:35:22 localhost kernel: bnx2i: ep connect - hba not ready ...
Dec 21 10:35:23 localhost iscsid: Received iferror -1

Thanks,

Emory

Comment 58 Mike Christie 2010-01-04 21:35:31 UTC
Hi Emory,

Were you using bnx2i with bnx2x or bnx2?

And if you are using bnx2x are you using the latest iscsi initiator utils? It is here http://people.redhat.com/mchristi/iscsi/rhel5.5/iscsi-initiator-utils.

Comment 59 Emory Bestenlehner 2010-01-05 05:13:28 UTC
Hello Mike,

I'm using the latest initiator with bnx2i.

Thanks,
Emory

Comment 60 Mike Christie 2010-01-06 22:03:04 UTC
Adding Michael Chan from Broadcom.

Michael, see comment #57. It looks like bnx2i_adapter_ready is failing. I cannot remember the common case for this. Was it if the ethX device was not also set up, or something like that?

Comment 61 Michael Chan 2010-01-06 22:22:56 UTC
It could be the link was down or we did not get ISCSI_KCQE_OPCODE_INIT completion from the firmware.

We'll retest with the -180 kernel from sgruszka and debug it if it doesn't work.

Comment 62 Gideon Naim 2010-01-08 18:18:50 UTC

Update - we have an internal automated test setup for iSCSI offload. For the first time we’ve managed to pass all the iSCSI offload IPv4 tests on both bnx2 and bnx2x with Kernel: 2.6.18-180.el5.broadcom_test and Initiator: 6.2.0.871-0.14.el5.

We are still investigating the open issues, but significant progress has been achieved.

Thanks,
Gidi

Comment 67 Gideon Naim 2010-02-12 21:05:16 UTC
I will update tonight with the latest test progress.

It will help the PQA efforts to have bnx2i 2.1.0 as part of the Beta CD.

Will bnx2i 2.1.0 be part of the next snapshot CD?

Thanks,
Gidi

Comment 68 Andrius Benokraitis 2010-02-12 21:16:42 UTC
Correct - Snapshot 1 is what you'll need once released.

Comment 70 Andrius Benokraitis 2010-02-15 19:01:28 UTC
Broadcom - looks like only a partial set of the patches made Snapshot 1 (bnx2x only). The cnic bits will land in Snapshot 2.

Comment 72 Gideon Naim 2010-02-16 05:11:21 UTC
Testing update regarding BRCM PQA additional iSCSI offload testing (doesn't include the automated protocol testing, which is fully passing):

10G 1st pass:	
iSCSI offload testing	42% 

1G 1st pass:
iSCSI offload testing	25%

This testing is done using test Kernels and not snapshot CDs since bnx2i is not part of the CDs yet.

Thanks,
Gidi

Comment 78 Thomas Chenault 2010-03-03 02:31:56 UTC
I am experiencing a failure while testing bnx2i over bnx2x from 2.6.18-189.el5 x86_64. The failure results in a bnx2x panic dump. The steps to reproduce are roughly:

1. Configure network interface on Broadcom 57711 device to use DHCP assigned address.
2. Configure iSCSI iface on same Broadcom 57711 device to use DHCP assigned address.
3. Discover Equallogic iSCSI target and connect to same.
4. Allow system to sit idle for several hours.

Which, if any, of the preceding details are important is currently unknown.

I will attach the text of the dump.

Comment 79 Thomas Chenault 2010-03-03 02:44:23 UTC
Created attachment 397464 [details]
bnx2x panic dump from comment 78

Comment 80 Anil Veerabhadrappa 2010-03-03 04:57:10 UTC
Please post complete driver logs leading into bnx2x driver assert.

Comment 81 Andrius Benokraitis 2010-03-03 05:05:31 UTC
NOTE: there are still two fixes slated to be included in Snapshot 3 in regard to bnx2x driver, and bnx2x firmware.

Comment 82 Michael Chan 2010-03-03 05:36:09 UTC
That's right.  Hopefully the new firmware fixes this issue.  I'm adding Eilon to CC so he can have the Israel team look at the MC assert message.

Comment 83 Andrius Benokraitis 2010-03-03 06:13:30 UTC
Still pending inclusion in Snapshot 3:

https://bugzilla.redhat.com/show_bug.cgi?id=561578

https://bugzilla.redhat.com/show_bug.cgi?id=567979

Comment 84 Michael Chan 2010-03-03 06:51:07 UTC
The -190.el5 kernel should have the updated bnx2x firmware (see Bug 560556).  I think the newer bnx2x firmware may have a fix for this issue.  Thomas, can you run the same test using this kernel?

Comment 85 Thomas Chenault 2010-03-03 16:33:40 UTC
I cannot access Bug 560556. I need a more direct link to the -190.el5 kernel.

Comment 86 Andrius Benokraitis 2010-03-03 16:45:07 UTC
Bug permissions fixed.

You can find all test kernels at:
http://people.redhat.com/jwilson/el5/

Comment 87 Thomas Chenault 2010-03-03 21:10:32 UTC
Created attachment 397661 [details]
bnx2x panic dump with kernel -190.el5

The bnx2x panic dump is still occurring with the -190.el5 kernel. See attachment. I will attempt the test on different hardware.

Comment 88 Andrius Benokraitis 2010-03-03 21:38:22 UTC
Right, please try -191 since that's where the fixes landed.

Comment 89 IBM Bug Proxy 2010-03-04 01:31:53 UTC
------- Comment From linuxram@us.ibm.com 2010-03-03 20:24 EDT-------
-191 does not work.  It fails to log in to the target using the bnx2i transport.

However login to the target using tcp transport works.

BTW, the NIC is 5709 with iscsi key enabled.

Comment 90 Thomas Chenault 2010-03-04 02:36:04 UTC
(In reply to comment #88)
> Right, please try -191 since that's where the fixes landed.    

The failure persisted with the -191.el5 kernel. I will run the test on a 57710 and a different 57711 overnight.

Comment 91 Andrius Benokraitis 2010-03-04 15:54:07 UTC
Michael Chan, Mike Christie - see Comment 89 and Comment 90. Any ideas? Have these been tested on your side?

Comment 92 Thomas Chenault 2010-03-04 16:08:28 UTC
The bnx2x panic dump failure occurred on both the 57710 and 57711 NIC overnight.

Comment 93 Michael Chan 2010-03-04 16:10:55 UTC
Our Israel team is looking into the 10G firmware panic reported by Thomas.  The problem reported by IBM may be a configuration issue, but we need more information on their setup.

Comment 94 Michael Chan 2010-03-04 16:16:56 UTC
Thomas, the Israel firmware team is asking for a GRC dump.  Can you provide that for us?

Comment 95 Thomas Chenault 2010-03-04 18:29:47 UTC
(In reply to comment #94)
> Thomas, the Israel firmware team is asking for a GRC dump.  Can you provide
> that for us?    

Yes, I think that I can get the dump. Feel free to contact me off-list if there are any special instructions I need to follow.

Comment 96 Michael Chan 2010-03-04 22:31:56 UTC
I think no special instructions.  Just use lediag to get the dump.  Thanks.

Comment 97 Thomas Chenault 2010-03-04 23:23:35 UTC
Created attachment 397951 [details]
GRC dump from 57710

Michael, I have attached the GRC dump.

Comment 98 Michael Chan 2010-03-05 17:41:20 UTC
Thanks Thomas.  We've looked at the GRC dump and it is very likely a uIP userspace driver issue.  We should have a fix later today.  Do we need a new BZ for that?  Can we still fix it for RHEL5.5?

Comment 99 Andrius Benokraitis 2010-03-05 18:44:19 UTC
Michael, there's no "uIP userspace driver" in RHEL 5.5 currently, correct? I thought this was a 5.6 item?

Comment 100 Michael Chan 2010-03-05 18:56:59 UTC
Yes Andrius, it is the brcm_iscsiuio package that we updated in Bug 517380.  The uIP driver is part of that package.  We need to fix one line in that userspace driver to fix the issue Thomas reported.

Comment 101 Andrius Benokraitis 2010-03-05 19:29:09 UTC
Go ahead and file a new bugzilla ASAP if you have the requisite business justification and I'll send it up ASAP.

Comment 102 IBM Bug Proxy 2010-03-11 20:09:06 UTC
------- Comment From coschult@us.ibm.com 2010-03-11 14:52 EDT-------
I have not been able to log into the target while using bnx2i transport. If I used the default (no iface argument to iscsiadm) I did successfully log in.  Attached are my most recent logs, using kernel 2.6.18-191.el5.

Comment 103 IBM Bug Proxy 2010-03-11 20:09:21 UTC
Created attachment 399437 [details]
log from brcm_iscsiuio -f -d 100


------- Comment (attachment only) From coschult@us.ibm.com 2010-03-11 14:53 EDT-------

Comment 104 IBM Bug Proxy 2010-03-11 20:09:35 UTC
Created attachment 399439 [details]
log from iscsid -f -d 100


------- Comment on attachment From coschult@us.ibm.com 2010-03-11 14:54 EDT-------


One error message caught my eye, but I don't know if it's significant:

iscsid: Recieved iferror -38: Unknown error 18446744073709551578

Comment 105 IBM Bug Proxy 2010-03-11 20:09:49 UTC
Created attachment 399440 [details]
output from /var/log/messages


------- Comment on attachment From coschult@us.ibm.com 2010-03-11 14:55 EDT-------


Nothing interesting in here, but I'm attaching it for completeness.

Comment 106 Mike Christie 2010-03-12 21:23:26 UTC
(In reply to comment #104)
> Created an attachment (id=399439) [details]
> log from iscsid -f -d 100
> 
> 
> ------- Comment on attachment From coschult@us.ibm.com 2010-03-11 14:54
> EDT-------
> 
> 
> One error message caught my eye, but I don't know if it's significant:
> 
> iscsid: Recieved iferror -38: Unknown error 18446744073709551578    

-38 is ENOSYS which just means userspace tried to set some feature the kernel did not support. The 18446744073709551578 is a bug due to me using strerror and passing it the negative value, so instead of a nice string you get junk.
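
As a rough illustration of the arithmetic: the large number in the log is consistent with -38 being reinterpreted as an unsigned 64-bit integer. The sketch below hard-codes the Linux value of ENOSYS (other platforms number it differently), and the helper names are made up for this example rather than taken from iscsid.

```python
import errno

LINUX_ENOSYS = 38  # value of ENOSYS on Linux; other platforms may differ


def as_unsigned64(value):
    """Reinterpret a signed integer as an unsigned 64-bit value."""
    return value & 0xFFFFFFFFFFFFFFFF


def describe_iferror(iferror):
    """Return the symbolic errno name for a negative kernel-style error."""
    code = -iferror  # the kernel reports -errno; lookups want the positive code
    return errno.errorcode.get(code, f"unknown errno {code}")


print(as_unsigned64(-LINUX_ENOSYS))  # 18446744073709551578
print(describe_iferror(-1))          # EPERM
```

Negating the value before the lookup is the step that was missing; with it, `-1` resolves to EPERM and, on Linux, `-38` resolves to ENOSYS.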

What is interesting is

iscsid: Received iferror -1
iscsid: cannot make a connection to 9.47.81.22:3260 (-1,11)


The bnx2i driver is returning -EPERM. Benjamin@broadcom, what was the reason for this again?

Comment 107 Benjamin Li 2010-03-12 21:47:09 UTC
Hi Mike,

One place in ep_connect() where -EPERM is returned is when the offloaded adapter is not in the ready state.  bnx2i catches the netevents for the adapter coming up/down and caches the state.
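
The behaviour described above can be modelled roughly as follows. This is a simplified sketch in Python, not the driver's C code: the state bit names and values, the event names, and the `Hba` class are assumptions made for illustration, and the real bnx2i flags and checks may differ.

```python
import errno

# Hypothetical state bits for illustration; the real driver's flags may differ.
STATE_UP = 0x1
STATE_GOING_DOWN = 0x2
STATE_LINK_DOWN = 0x4


class Hba:
    """Toy stand-in for the per-adapter structure that caches state."""

    def __init__(self):
        self.adapter_state = 0  # nothing set until an "up" event arrives

    def on_netevent(self, event):
        """Cache the adapter state as up/down events arrive."""
        if event == "up":
            self.adapter_state |= STATE_UP
            self.adapter_state &= ~STATE_GOING_DOWN
        elif event == "going_down":
            self.adapter_state |= STATE_GOING_DOWN
        elif event == "link_down":
            self.adapter_state |= STATE_LINK_DOWN
        elif event == "link_up":
            self.adapter_state &= ~STATE_LINK_DOWN


def adapter_ready(hba):
    """Ready means: up, not going down, and link not down."""
    state = hba.adapter_state
    blocked = state & (STATE_GOING_DOWN | STATE_LINK_DOWN)
    return bool(state & STATE_UP) and not blocked


def ep_connect(hba):
    """Refuse the connection with -EPERM unless the adapter is ready."""
    if not adapter_ready(hba):
        return -errno.EPERM
    return 0  # the real function would go on to set up the endpoint
```

In this model an adapter whose cached state is still 0 has never seen an "up" event, so every connect attempt is rejected with -EPERM until one arrives.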

Corinna, did you get a chance to run the debug driver set that Michael sent on Thu, 11 Mar 2010 16:13:50 -0800?  That driver set has some additional printk's to help us understand the flow of the code.

Thanks again.

-Ben

Comment 108 Michael Chan 2010-03-12 22:01:59 UTC
Yes, I just got debug output from Corinna.  It failed at exactly the place where we check for bnx2i_adapter_ready().  I'm sending her another patch to print the hba->adapter_state.

Comment 109 IBM Bug Proxy 2010-03-12 22:45:17 UTC
------- Comment From coschult@us.ibm.com 2010-03-12 17:39 EDT-------
adapter_state is 0.  Here's the kernel log output when I restart iscsi and it tries to log in to the target:

Mar 12 14:26:20 elm3b102 iscsid: iSCSI logger with pid=7864 started!
Mar 12 14:26:20 elm3b102 kernel: bnx2i_ep_connect, shost ffff81107b549000, hba ffff81107b5495a0
Mar 12 14:26:20 elm3b102 kernel: bnx2i_ep_connect, alloc_ep succeeded
Mar 12 14:26:20 elm3b102 kernel: bnx2i_ep_connect, hba adapter_state 0
Mar 12 14:26:21 elm3b102 iscsid: transport class version 2.0-871. iscsid version 2.0-871
Mar 12 14:26:21 elm3b102 iscsid: iSCSI daemon with pid=7865 started!
Mar 12 14:26:21 elm3b102 iscsid: Received iferror -1
Mar 12 14:26:21 elm3b102 iscsid: cannot make a connection to 9.47.69.22:3260 (-1,11)
Mar 12 14:26:22 elm3b102 kernel: bnx2i_ep_connect, shost ffff81107b549000, hba ffff81107b5495a0
Mar 12 14:26:22 elm3b102 kernel: bnx2i_ep_connect, alloc_ep succeeded
Mar 12 14:26:22 elm3b102 kernel: bnx2i_ep_connect, hba adapter_state 0
Mar 12 14:26:23 elm3b102 iscsid: Received iferror -1
Mar 12 14:26:23 elm3b102 iscsid: cannot make a connection to 9.47.81.22:3260 (-1,11)
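
For longer logs, a small script can pull out the adapter_state values and the failed portals. This is only a sketch: it assumes the message formats shown above (the adapter_state line comes from the debug patch, not the stock driver), the function name is illustrative, and the two numbers in parentheses are kept as-is without interpretation.

```python
import re

STATE_RE = re.compile(r"bnx2i_ep_connect, hba adapter_state (\d+)")
FAIL_RE = re.compile(r"cannot make a connection to (\S+) \((-?\d+),(-?\d+)\)")


def summarize(lines):
    """Collect adapter_state values and (portal, n1, n2) connection failures."""
    states, failures = [], []
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            states.append(int(match.group(1)))
            continue
        match = FAIL_RE.search(line)
        if match:
            portal, first, second = match.groups()
            failures.append((portal, int(first), int(second)))
    return states, failures
```

Feeding it the lines of /var/log/messages returns two lists, one of state values and one of failed portals, in the order they appear in the log.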

Comment 110 Michael Chan 2010-03-12 23:00:55 UTC
Thanks Corinna.  Can you do iscsiadm login one more time immediately after the above failure?

Comment 111 IBM Bug Proxy 2010-03-12 23:34:45 UTC
------- Comment From coschult@us.ibm.com 2010-03-12 18:21 EDT-------
Same result (with tail -f /var/log/messages & running in the same shell):

[root@elm3b102 ~]# iscsiadm -m discovery -t sendtargets -p 9.47.69.22 -I bnx2i.eth3
9.47.69.22:3260,1000 iqn.1992-08.com.netapp:sn.84183797
9.47.81.22:3260,1001 iqn.1992-08.com.netapp:sn.84183797

[root@elm3b102 ~]# iscsiadm -m node -l -I bnx2i.eth3
Logging in to [iface: bnx2i.eth3, target: iqn.1992-08.com.netapp:sn.84183797, portal: 9.47.69.22,3260]
Logging in to [iface: bnx2i.eth3, target: iqn.1992-08.com.netapp:sn.84183797, portal: 9.47.81.22,3260]
iscsiadm: Could not login to [iface: bnx2i.eth3, target: iqn.1992-08.com.netapp:sn.84183797, portal: 9.47.69.22,3260]:
iscsiadm: initiator reported error (4 - encountered connection failure)
Mar 12 15:03:39 elm3b102 kernel: bnx2i_ep_connect, shost ffff81107b549000, hba ffff81107b5495a0
Mar 12 15:03:39 elm3b102 kernel: bnx2i_ep_connect, alloc_ep succeeded
Mar 12 15:03:39 elm3b102 kernel: bnx2i_ep_connect, hba adapter_state 0
Mar 12 15:03:39 elm3b102 iscsid: Received iferror -1
Mar 12 15:03:39 elm3b102 iscsid: cannot make a connection to 9.47.69.22:3260 (-1,11)
iscsiadm: Could not login to [iface: bnx2i.eth3, target: iqn.1992-08.com.netapp:sn.84183797, portal: 9.47.81.22,3260]:
iscsiadm: initiator reported error (4 - encountered connection failure)
[root@elm3b102 ~]# Mar 12 15:03:41 elm3b102 kernel: bnx2i_ep_connect, shost ffff81107b549000, hba ffff81107b5495a0
Mar 12 15:03:41 elm3b102 kernel: bnx2i_ep_connect, alloc_ep succeeded
Mar 12 15:03:41 elm3b102 kernel: bnx2i_ep_connect, hba adapter_state 0
Mar 12 15:03:41 elm3b102 iscsid: Received iferror -1
Mar 12 15:03:41 elm3b102 iscsid: cannot make a connection to 9.47.81.22:3260 (-1,11)

[root@elm3b102 ~]# iscsiadm -m node -l -I bnx2i.eth3
Logging in to [iface: bnx2i.eth3, target: iqn.1992-08.com.netapp:sn.84183797, portal: 9.47.69.22,3260]
Logging in to [iface: bnx2i.eth3, target: iqn.1992-08.com.netapp:sn.84183797, portal: 9.47.81.22,3260]
iscsiadm: Could not login to [iface: bnx2i.eth3, target: iqn.1992-08.com.netapp:sn.84183797, portal: 9.47.69.22,3260]:
iscsiadm: initiator reported error (4 - encountered connection failure)
Mar 12 15:03:55 elm3b102 kernel: bnx2i_ep_connect, shost ffff81107b549000, hba ffff81107b5495a0
Mar 12 15:03:55 elm3b102 kernel: bnx2i_ep_connect, alloc_ep succeeded
Mar 12 15:03:55 elm3b102 kernel: bnx2i_ep_connect, hba adapter_state 0
Mar 12 15:03:55 elm3b102 iscsid: Received iferror -1
Mar 12 15:03:55 elm3b102 iscsid: cannot make a connection to 9.47.69.22:3260 (-1,11)
iscsiadm: Could not login to [iface: bnx2i.eth3, target: iqn.1992-08.com.netapp:sn.84183797, portal: 9.47.81.22,3260]:
iscsiadm: initiator reported error (4 - encountered connection failure)
[root@elm3b102 ~]# Mar 12 15:03:57 elm3b102 kernel: bnx2i_ep_connect, shost ffff81107b549000, hba ffff81107b5495a0
Mar 12 15:03:57 elm3b102 kernel: bnx2i_ep_connect, alloc_ep succeeded
Mar 12 15:03:57 elm3b102 kernel: bnx2i_ep_connect, hba adapter_state 0
Mar 12 15:03:57 elm3b102 iscsid: Received iferror -1
Mar 12 15:03:57 elm3b102 iscsid: cannot make a connection to 9.47.81.22:3260 (-1,11)

Comment 112 Anil Veerabhadrappa 2010-03-12 23:56:52 UTC
Is this device licensed for iSCSI offload? Is it possible to upload the complete logs, including the messages from driver load?

Comment 113 IBM Bug Proxy 2010-03-13 00:24:43 UTC
------- Comment From coschult@us.ibm.com 2010-03-12 19:19 EDT-------
Yes, there are messages in the kernel log indicating that offload is successful:

Broadcom NetXtreme II iSCSI Driver bnx2i v2.1.0 (Dec 06, 2009)
iscsi: registered transport (bnx2i)
scsi6 : Broadcom Offload iSCSI Initiator
scsi7 : Broadcom Offload iSCSI Initiator
scsi8 : Broadcom Offload iSCSI Initiator
scsi9 : Broadcom Offload iSCSI Initiator

I will attach the latest dmesg log. It shows several attempts to log into the target.

Comment 114 IBM Bug Proxy 2010-03-13 00:24:51 UTC
Created attachment 399791 [details]
dmesg log with debugging information


------- Comment (attachment only) From coschult@us.ibm.com 2010-03-12 19:21 EDT-------

Comment 115 Michael Chan 2010-03-13 22:23:10 UTC
I think I know what's happening.  bnx2i_init_one() calls cnic->register_device() and bnx2i_start() will be called immediately.  bnx2i_start() will try to send the iscsi_init message to firmware, but hba->cnic has not been set up yet, so it fails and adapter_state stays at 0.

I think this can be easily fixed, but for now we can work around this by loading the bnx2i driver first before bringing up the eth? devices.

IBM, please try this:

modprobe bnx2
modprobe cnic
modprobe bnx2i
ifup eth0

Or:

modprobe cnic
modprobe bnx2i
modprobe bnx2
ifup eth0
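
The ordering problem described in this comment can be illustrated with a toy model. This is only a sketch, not driver code: the Cnic and Hba classes, the ifup() method, and both init_one_* functions are hypothetical stand-ins, and the assumption that the start callback fires at registration time only when the interface is already up is inferred from the workaround above. The "reordered" variant is one possible fix, not necessarily the one that was shipped.

```python
class Cnic:
    """Toy stand-in for the cnic device; models callback order only."""

    def __init__(self, netdev_up):
        self.netdev_up = netdev_up
        self.registered = []

    def register_device(self, hba):
        self.registered.append(hba)
        # Assumption: the start callback fires at once if the NIC is already up.
        if self.netdev_up:
            hba.start()

    def ifup(self):
        # Bringing the interface up later triggers start() on registered HBAs.
        self.netdev_up = True
        for hba in self.registered:
            hba.start()


class Hba:
    def __init__(self):
        self.cnic = None          # back-pointer needed to talk to firmware
        self.adapter_state = 0

    def start(self):
        # Stand-in for sending iscsi_init: it can only work once cnic is set.
        if self.cnic is not None:
            self.adapter_state = 1


def init_one_as_described(cnic):
    hba = Hba()
    cnic.register_device(hba)     # start() may run here, too early
    hba.cnic = cnic
    return hba


def init_one_reordered(cnic):
    hba = Hba()
    hba.cnic = cnic               # hypothetical fix: set the pointer first
    cnic.register_device(hba)
    return hba


print(init_one_as_described(Cnic(netdev_up=True)).adapter_state)
print(init_one_reordered(Cnic(netdev_up=True)).adapter_state)

late = Cnic(netdev_up=False)      # workaround: bnx2i loaded before ifup
hba = init_one_as_described(late)
late.ifup()
print(hba.adapter_state)
```

In this model the first case leaves adapter_state at 0, while both the reordered init and the load-before-ifup workaround reach 1.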

Comment 116 IBM Bug Proxy 2010-03-15 23:25:07 UTC
------- Comment From coschult@us.ibm.com 2010-03-15 19:18 EDT-------
I rmmod'd the drivers and modprobe'd them in both sequences, and got a different error each time. The first time, it was a connection timeout, and the second time it immediately returned with a connection failure. I retried the first sequence, and it didn't time out but gave me a connection failure. Looking at the log, adapter_state is 1, so that much succeeded, at least.

I'm attaching the various logs.
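
For anyone checking adapter_state in their own logs, a small sketch that pulls the value out of the bnx2i_ep_connect debug lines. The message format is copied from the log excerpts earlier in this bug; the function name adapter_states is made up for illustration, and the debug lines only exist with the debugging build used here.

```python
import re

# Matches the debug line format seen earlier in this bug report.
PATTERN = re.compile(r"bnx2i_ep_connect, hba adapter_state (\d+)")


def adapter_states(lines):
    """Return every adapter_state value found, in log order."""
    states = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            states.append(int(match.group(1)))
    return states


sample = [
    "Mar 12 14:26:22 elm3b102 kernel: bnx2i_ep_connect, alloc_ep succeeded",
    "Mar 12 14:26:22 elm3b102 kernel: bnx2i_ep_connect, hba adapter_state 0",
    "Mar 12 14:26:23 elm3b102 iscsid: Received iferror -1",
]
print(adapter_states(sample))
```

To run it against a real log, pass an open file instead of the sample list, for example adapter_states(open("/var/log/messages")).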

Comment 117 IBM Bug Proxy 2010-03-15 23:25:20 UTC
Created attachment 400330 [details]
shell session


------- Comment on attachment From coschult@us.ibm.com 2010-03-15 19:20 EDT-------


My retry of the first sequence of module loading is not shown here.

Comment 118 IBM Bug Proxy 2010-03-15 23:36:06 UTC
Created attachment 400332 [details]
/var/log/messages output


------- Comment (attachment only) From coschult@us.ibm.com 2010-03-15 19:23 EDT-------

Comment 119 IBM Bug Proxy 2010-03-15 23:36:17 UTC
Created attachment 400333 [details]
debugging output from brcm_iscsiuio


------- Comment (attachment only) From coschult@us.ibm.com 2010-03-15 19:24 EDT-------

Comment 120 IBM Bug Proxy 2010-03-15 23:36:25 UTC
Created attachment 400334 [details]
debugging output from iscsid


------- Comment (attachment only) From coschult@us.ibm.com 2010-03-15 19:24 EDT-------

Comment 121 Michael Chan 2010-03-15 23:46:11 UTC
The kernel cannot find the route to the iSCSI target through eth3.  Do you have eth0 and eth3 configured for the same subnet?  If the shortest route to the target is through eth0 and not eth3, it will not connect.  Please configure eth3 only with the subnet that can reach the iSCSI target.

Comment 122 IBM Bug Proxy 2010-03-16 22:35:51 UTC
------- Comment From coschult@us.ibm.com 2010-03-16 18:21 EDT-------
It turns out that I did have eth0 configured (with the same address as eth3!). I did an ip address flush dev eth0 on it, and reloaded the modules, and still got the same result: get_route returns 0. I flushed usb0, which had an IPv6 address assigned to it, and eth4, which had been configured with the address 10.0.0.102.

I ran tcpdump while trying to connect and saw no network traffic.

Here's the output of ifconfig -a:

eth0      Link encap:Ethernet  HWaddr 00:21:5E:09:60:40
BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
Interrupt:169 Memory:96000000-96012800

eth1      Link encap:Ethernet  HWaddr 00:21:5E:09:60:42
BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
Interrupt:138 Memory:98000000-98012800

eth2      Link encap:Ethernet  HWaddr 00:10:18:57:0D:BC
BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
Interrupt:209 Memory:92000000-92012800

eth3      Link encap:Ethernet  HWaddr 00:10:18:57:0D:BE
inet addr:9.47.67.102  Bcast:9.47.67.255  Mask:255.255.254.0
inet6 addr: fe80::210:18ff:fe57:dbe/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:7479 errors:0 dropped:0 overruns:0 frame:0
TX packets:439 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:757738 (739.9 KiB)  TX bytes:43750 (42.7 KiB)
Interrupt:146 Memory:94000000-94012800

eth4      Link encap:Ethernet  HWaddr 00:14:5E:99:03:F4
BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:45 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b)  TX bytes:6820 (6.6 KiB)
Interrupt:185 Memory:9c800000-9c800fff

lo        Link encap:Local Loopback
inet addr:127.0.0.1  Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING  MTU:16436  Metric:1
RX packets:6363 errors:0 dropped:0 overruns:0 frame:0
TX packets:6363 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:9063624 (8.6 MiB)  TX bytes:9063624 (8.6 MiB)

sit0      Link encap:IPv6-in-IPv4
NOARP  MTU:1480  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

usb0      Link encap:Ethernet  HWaddr 02:21:5E:0A:60:43
BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:468 errors:0 dropped:0 overruns:0 frame:0
TX packets:29 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:30420 (29.7 KiB)  TX bytes:5831 (5.6 KiB)
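
As a side check on the addresses above, here is a sketch that tests whether the two portals returned by discovery fall inside eth3's subnet. The addresses and netmask are taken from this bug report; the on_link helper is just an illustrative name. The result only says what the address arithmetic implies (the portals would have to be reached through a gateway route); the thread itself does not confirm how routing is set up on this host.

```python
import ipaddress

# eth3 address and netmask as shown in the ifconfig output.
eth3 = ipaddress.ip_interface("9.47.67.102/255.255.254.0")

# Portals returned by the sendtargets discovery.
portals = ["9.47.69.22", "9.47.81.22"]


def on_link(iface, addr):
    """True if addr is inside the interface's directly connected subnet."""
    return ipaddress.ip_address(addr) in iface.network


print("eth3 subnet:", eth3.network)
for portal in portals:
    where = "on-link" if on_link(eth3, portal) else "needs a gateway route"
    print(portal, where)
```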

Comment 123 Michael Chan 2010-03-16 22:44:29 UTC
(In reply to comment #122)
> ------- Comment From coschult@us.ibm.com 2010-03-16 18:21 EDT-------
> It turns out that I did have eth0 configured (with the same address as eth3!).
> I did a ip address flush dev eth0 on it, and reloaded the modules, and still
> got the same result: get_route returns 0.

Yesterday, cnic_get_route() was returning -101 which was -ENETUNREACH.  Returning 0 means it is successful.  So we're making progress!  Can you post the logs or send them to me privately?
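
For reference, a quick sketch of how a negative kernel-style return value such as -101 maps to an errno name. It uses the errno table of the machine it runs on, so the -101 to ENETUNREACH mapping assumes Linux numbering; the describe helper is an illustrative name, not part of any driver or tool.

```python
import errno
import os


def describe(ret):
    """Map a kernel-style return value to its errno name and message."""
    if ret == 0:
        return "0 (success)"
    code = -ret
    name = errno.errorcode.get(code, "unknown")
    return "%d (%s: %s)" % (ret, name, os.strerror(code))


print(describe(-101))
print(describe(0))
```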

Comment 124 IBM Bug Proxy 2010-03-16 23:46:04 UTC
------- Comment From coschult@us.ibm.com 2010-03-16 19:40 EDT-------
By the way, the error message I got back from iscsiadm was a connection time out.

Comment 125 IBM Bug Proxy 2010-03-16 23:46:12 UTC
Created attachment 400598 [details]
output from /var/log/messages


------- Comment (attachment only) From coschult@us.ibm.com 2010-03-16 19:36 EDT-------

Comment 126 IBM Bug Proxy 2010-03-16 23:46:21 UTC
Created attachment 400599 [details]
output from brcm_iscsiuio


------- Comment (attachment only) From coschult@us.ibm.com 2010-03-16 19:38 EDT-------

Comment 127 IBM Bug Proxy 2010-03-16 23:46:29 UTC
Created attachment 400600 [details]
output from iscsid


------- Comment (attachment only) From coschult@us.ibm.com 2010-03-16 19:38 EDT-------

Comment 128 Benjamin Li 2010-03-17 00:27:42 UTC
Hi Corinna,

We are definitely getting further.  Now from the uIP daemon's point of view, we can see the ARP request come from CNIC.  It looks like the ARP request packet was placed on the wire, and uIP sees packets coming in from the wire, but they do not appear to be ARP packets because the uIP ARP table was not filled in.  I was wondering if you are able to provide a packet trace of when you try an iSCSI login.

Note:  Running wireshark on the L2 interface will only provide a partial view of the contents on the wire, because unicast traffic to the iSCSI offload interface is not sent to the L2 interface.  If you could do a network capture on the iSCSI target or in the middle of the connection, that would be best.

Thanks again.

-Ben

Comment 129 IBM Bug Proxy 2010-03-18 20:19:56 UTC
------- Comment From coschult@us.ibm.com 2010-03-18 16:01 EDT-------
I tried connecting to a target on a different machine, which had an ip address restriction on the target, restricted to my ethernet ip address. I was able to discover the target, but when I tried to log in, I received the error "non-retryable iSCSI login failure". When I had our admin add the iscsi ip address as well, then I was able to log in successfully.

But I was only able to successfully login when I loaded the modules in the order suggested by Michael. When I used the init.d script to start the service, I got "encountered connection failure".

I am now looking into why I am unable to log into the first machine.

Comment 130 Benjamin Li 2010-03-18 22:30:31 UTC
Hi Corinna,

For the problem where you are not able to log in if you use the init.d scripts to start the service: the iscsid script will load the cnic and bnx2i drivers only.  It will not load the bnx2 driver or bring that interface up.

Does your test system automatically run the iscsid init script when booted?  If so, could you list the directory contents of the SysV init scripts of the runlevel you are having trouble with?  (i.e., list the contents of '/etc/rc.d/rc<run level>.d')

Also, were you able to provide a wire trace of the problem described in comment 124?

Thanks again.

-Ben
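
To make the directory listing easier to read, a sketch of how the start order can be worked out from the link names in an rc<run level>.d directory. The S* links are run in sorted name order. The listing below is hypothetical and for illustration only (it is not taken from the test system), and start_order and starts_before are made-up helper names; the name[3:] slice assumes the usual two-digit priority.

```python
def start_order(entries):
    """Return the S* links in the order the rc script would run them."""
    # Start links run in sorted (lexical) name order; K* links are kill links.
    return sorted(name for name in entries if name.startswith("S"))


def starts_before(entries, first, second):
    """True if the service `first` starts before the service `second`."""
    services = [name[3:] for name in start_order(entries)]  # strip "Snn"
    return services.index(first) < services.index(second)


# Hypothetical listing, for illustration only.
listing = ["K89iscsi", "S07iscsid", "S10network", "S13iscsi"]
print(start_order(listing))
print(starts_before(listing, "iscsid", "network"))
```

If iscsid turns out to start before network on the affected runlevel, the bnx2i driver would be loaded before the interface comes up, which is worth comparing against the manual module order tried earlier.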

Comment 131 IBM Bug Proxy 2010-03-22 23:35:56 UTC
------- Comment From coschult@us.ibm.com 2010-03-22 19:22 EDT-------
Since I was able to log into a different target, on a machine with a different configuration, I'm going to not worry about my failure to log into the first machine. Likely it is a configuration problem, that we lack sufficient experience to diagnose. In any case, that machine is a Netapp, and I can't easily tap into its network traffic.

Comment 133 errata-xmlrpc 2010-03-30 07:21:28 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html

Comment 134 IBM Bug Proxy 2010-05-21 21:36:50 UTC
------- Comment From coschult@us.ibm.com 2010-05-21 17:23 EDT-------
Verified on rhel5.5 rc2 (the release referred to by http://rhn.redhat.com/errata/RHSA-2010-0178.html )

