Bug 469774

Summary: RHEL53 Beta1: network installation through cxgb3 interface fails if the adapter firmware doesn't match the firmware level requested by the cxgb3 device driver in rhel53.
Product: Red Hat Enterprise Linux 5
Reporter: IBM Bug Proxy <bugproxy>
Component: kernel
Assignee: Andy Gospodarek <agospoda>
Status: CLOSED ERRATA
QA Contact: Martin Jenner <mjenner>
Severity: high
Priority: medium
Version: 5.3
CC: agospoda, ahecox, cward, ddumas, dzickus, jkachuck, lwang, mgahagan, peterm, syeghiay, tao
Target Milestone: rc   
Target Release: ---   
Hardware: other   
OS: All   
Doc Type: Bug Fix
Last Closed: 2009-01-20 20:06:40 UTC Type: ---

Description IBM Bug Proxy 2008-11-03 23:30:31 UTC
=Comment: #0=================================================
 
---Problem Description---
RHEL53 Beta1: network installation through cxgb3 interface fails if the adapter firmware doesn't
match the firmware level requested by the cxgb3 device driver in rhel53.
 
Contact Information = wenxiong.com 
 
---uname output---
2.6.18-120.el5
 
Machine Type = eClipz L4 
 
---Debugger---
A debugger is not configured
 
---Steps to Reproduce---
 none
 
---Installation and Packaging Component Data--- 
Install disk info: SAS
 
Install method: NIM
 
Install ISO Information: rhel53 beta1
=================================================================================

This bug was found while verifying our feature 201290 (LTC bugzilla #43344).

[Bug 43344] LTC:5.3:201290:Enable the Chelsio 10Gb adapter - driver cxgb3 - as a
bootable/installable device


The cxgb3 device driver is bound to a specific adapter firmware level.

For example, in rhel52 the cxgb3 driver is bound to 5.0.0 level adapter firmware. In rhel53, the cxgb3
driver is bound to 6.0.0 level adapter firmware.
When the cxgb3 interface is brought up, the cxgb3 device driver checks the level of the
adapter firmware. If the level that the cxgb3 driver requests doesn't match the level
on the adapter, bringing the cxgb3 interface up fails.
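
The check described above can be modelled with a short sketch. This is only an illustration of the matching rule as stated in this report, not the driver's actual code; the names parse_version and interface_can_come_up are hypothetical.

```python
from typing import NamedTuple


class FwVersion(NamedTuple):
    major: int
    minor: int
    micro: int


def parse_version(text: str) -> FwVersion:
    """Parse a dotted level such as '6.0.0' into a comparable tuple."""
    major, minor, micro = (int(part) for part in text.split("."))
    return FwVersion(major, minor, micro)


def interface_can_come_up(driver_wants: str, adapter_has: str) -> bool:
    """Model of the rule in this report: the two levels must match exactly."""
    return parse_version(driver_wants) == parse_version(adapter_has)


# rhel53 driver (wants 6.0.0) against a 6.0.0 adapter and a 5.0.0 adapter
print(interface_can_come_up("6.0.0", "6.0.0"))
print(interface_can_come_up("6.0.0", "5.0.0"))
```

In this model the second call corresponds to the failing case: a rhel53 driver on an adapter that still carries 5.0.0 firmware.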

When I verified this install/boot feature with 6.0.0 adapter firmware,
I could do a network installation through the cxgb3 interface with no problem.

The issue is: if the adapter firmware is 5.0.0, the cxgb3 driver module used during network installation
requests 6.0.0 level firmware and the cxgb3 interface fails at ifconfig up. So network installation
through the cxgb3 interface fails.

Can you add the /lib/firmware/t3fw-6.0.0.bin file during the network installation? Then, if the adapter
firmware doesn't match the cxgb3 driver requirement during a cxgb3 network installation, the cxgb3
driver/modules can flash the adapter with /lib/firmware/t3fw-6.0.0.bin, the correct level for rhel53.

If you have any questions, please let me know.

Thanks for your help

Comment 1 Andy Gospodarek 2008-11-04 18:56:16 UTC
If we include the cxgb3-firmware rpm as part of the initrd this will be solved.  That firmware is not part of the supplemental disc, so there should be no problem adding this.

Comment 2 IBM Bug Proxy 2008-11-05 02:50:25 UTC
Andy, Thank you very much! Can we expect the fix in snapshot2?

Comment 4 Issue Tracker 2008-11-06 22:10:12 UTC
------- Comment From daisyc.com 2008-11-06 16:36 EDT-------
Red Hat,

Just to clarify the regression situation.

It is true that the cxgb3-firmware-level package is a new package for 5.3.
It did not exist and was not installed in 5.2. This is because the
requirement for this package is a 5.3 feature - RH BZ:366861 - Ship the
adapter firmware image with driver cxgb3.

On RHEL5.2, the cxgb3 driver used to come up and run
without any problem. But with this new requirement, testers found that the
cxgb3 driver cannot be configured on 5.3 Betas. This is a regression from
5.2 from a testing perspective, which we are afraid will also be the
case from a customer's perspective if it is not fixed in 5.3.

Thanks for linking this bug to your BZ 469774. I just raised this bug to
"ship issue" just to make sure that the proposed change will be picked up
in the next snapshot.

Please let us know if you have any other concerns about this. Thanks.


This event sent from IssueTracker by jkachuck 
 issue 234962

Comment 5 Andy Gospodarek 2008-11-06 22:24:00 UTC
Thanks for the feedback.  We definitely recognize this as an issue that we need to fix and I think the proposed change will help keep the kernel and firmware in sync between releases.  New systems that come from the manufacturer might have the new firmware and would not see this issue, but systems that are older and being installed with 5.3 cleanly or ones that are being updated from 5.2 really need to make sure they are getting the right firmware.  I think our change will make this all work correctly.

Comment 6 IBM Bug Proxy 2008-11-06 23:10:45 UTC
*** Bug 43344 has been marked as a duplicate of this bug. ***

Comment 7 Andy Gospodarek 2008-11-07 02:06:47 UTC
My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel5

Please test them and report back your results.

Comment 8 IBM Bug Proxy 2008-11-07 18:20:27 UTC
Andy, Wendy has tested the test kernel and she put her response in RIT234692 (IBM BZ 49507)

Hi Andy,

I installed 2.6.18-122.el5.gtest.60.ppc64.rpm and rebooted with the -122.el5.gtest.60
kernel.  After the reboot, I didn't see the /lib/firmware/t3fw-6.0.0.bin file.

Which package includes /lib/firmware/t3fw-6.0.0.bin for this
-122.el5.gtest.60 rpm?

#rpm -ql 2.6.18-122.el5.gtest.60
/boot/System.map-2.6.18-122.el5.gtest.60
/boot/config-2.6.18-122.el5.gtest.60
/boot/initrd-2.6.18-122.el5.gtest.60.img
/boot/symvers-2.6.18-122.el5.gtest.60.gz
/boot/vmlinuz-2.6.18-122.el5.gtest.60
/etc/modprobe.d/blacklist-firewire
/lib/modules/2.6.18-122.el5.gtest.60
/lib/modules/2.6.18-122.el5.gtest.60/build
/lib/modules/2.6.18-122.el5.gtest.60/extra
/lib/modules/2.6.18-122.el5.gtest.60/kernel
/lib/modules/2.6.18-122.el5.gtest.60/kernel/arch
/lib/modules/2.6.18-122.el5.gtest.60/kernel/arch/powerpc
/lib/modules/2.6.18-122.el5.gtest.60/kernel/arch/powerpc/oprofile

Will attach initrd output.

Is something wrong with my test?

Thanks,
Wendy


Andy, Please also see the initrd output that Wendy attached under RIT234692. It should contain more information for you to diagnose.

Thanks for your help.

Comment 9 Andy Gospodarek 2008-11-07 18:36:54 UTC
Wendy, how did you install the kernel rpm?  Did you use --force or --nodeps to install the kernel?  It should not have installed unless you have the cxgb3-firmware rpm installed on your system.  The point of this is that:

1.  When upgrading, yum will note that it is needed and make sure to download that package too.

2.  When composing an initrd the cxgb3-firmware rpm will get added to the initrd to satisfy the kernel's requirement.

Comment 10 Andy Gospodarek 2008-11-07 19:45:13 UTC
I just confirmed that my test kernels cannot be installed from the command line (rpm -i) without the proper version of cxgb3-firmware rpm installed.  The conflicts line added to the rpm spec-file is what makes this work properly.  If you installed with --force and/or --nodeps this check would have been skipped and I suspect your cxgb3 card would still not work.

Comment 12 IBM Bug Proxy 2008-11-07 20:00:38 UTC
>>Wendy, how did you install the kernel rpm?  Did you use --force or --nodeps to
>>install the kernel?  It should not have installed unless you have the
>>cxgb3-firmware rpm installed on your system.  The point of this is that:

Hi Andy,

#rpm -ivh kernel-2.6.18-122.el5.gtest.60.ppc64.rpm

I didn't use --force or --nodeps. Also I didn't have cxgb3-firmware rpm
installed on my system.

Thanks,
Wendy

Comment 13 Andy Gospodarek 2008-11-07 20:20:46 UTC
That's interesting, because this is what I get:

# rpm -ivh kernel-2.6.18-122.el5.gtest.60.x86_64.rpm
error: Failed dependencies:
        cxgb3-firmware < 6.0.0-6 conflicts with kernel-2.6.18-122.el5.gtest.60.x86_64

Is this some ppc64 oddity?  (That doesn't seem logical to me.)

Comment 14 IBM Bug Proxy 2008-11-07 20:20:53 UTC
>>I just confirmed that my test kernels cannot be installed from the command line
>>(rpm -i) without the proper version of cxgb3-firmware rpm installed.  The
>>conflicts line added to the rpm spec-file are what makes this work properly.
>>If you installed with --force and/or --nodeps this check would have been
>>skipped and I suspect your cxgb3 card would still not work.

I tried again; here are the detailed commands I just used.

Is your test kernel kernel-2.6.18-122.el5.gtest.60.rpm?

First, check whether cxgb3-firmware and kernel-2.6.18-122.el5.gtest.60.rpm are on the system.

[root@IO79 ~]# uname -r
2.6.18-120.el5

[root@IO79 firmware]# rpm -qa|grep cxgb3
libcxgb3-1.2.2-1.el5
libcxgb3-1.2.2-1.el5
[root@IO79 firmware]# rpm -qa|grep 122
kernel-2.6.18-122.el5

Second,

[root@IO79 ~]# rpm -ivh kernel-2.6.18-122.el5.gtest.60.ppc64.rpm
Preparing...                ########################################### [100%]
1:kernel                 ########################################### [100%]

Third, check again,
[root@IO79 firmware]# rpm -qa|grep cxgb3
libcxgb3-1.2.2-1.el5
libcxgb3-1.2.2-1.el5
[root@IO79 firmware]# rpm -qa|grep 122
kernel-2.6.18-122.el5
kernel-2.6.18-122.el5.gtest.60

I didn't see cxgb3-firmware installed.

Am I missing something?

Comment 15 Andy Gospodarek 2008-11-07 20:36:45 UTC
OK, someone just re-explained the difference between requires and conflicts in rpm spec-files, so I'll have to re-do this.  This should hopefully still make snap3.
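
The difference mentioned here can be sketched as a small model. It is a simplification: versions are compared as plain integer tuples rather than with rpm's own version comparison, and the function names are hypothetical. It shows why an install can succeed under a Conflicts line when cxgb3-firmware is not installed at all (as on the ppc64 system in comment 14), while an installed but older package would presumably trigger the error seen in comment 13.

```python
from typing import Optional, Tuple

Version = Tuple[int, ...]


def conflicts_blocks(installed: Optional[Version], below: Version) -> bool:
    """'Conflicts: pkg < below' only fires if pkg is installed and too old."""
    return installed is not None and installed < below


def requires_blocks(installed: Optional[Version], at_least: Version) -> bool:
    """'Requires: pkg >= at_least' fires if pkg is missing or too old."""
    return installed is None or installed < at_least


# 6.0.0-6 written as a tuple; this encoding is an assumption of the sketch.
needed = (6, 0, 0, 6)

# cxgb3-firmware not installed at all
print(conflicts_blocks(None, needed))  # False: the install is not blocked
print(requires_blocks(None, needed))   # True: the install is blocked
```

With a Requires line, a dependency resolver such as yum can also pull the missing package in, which matches the first goal listed in comment 9.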

Comment 16 IBM Bug Proxy 2008-11-07 22:50:44 UTC
Thanks! Let me know when you have your kernel ready for me to test. Just to confirm with you, I expect the fixes you make to ensure that:

(1) cxgb3-firmware is automatically installed after installing rhel53.

(2) we can do a network installation through the cxgb3 interface (even if the firmware on the adapter doesn't match the firmware requested by the cxgb3 device driver).

Thanks for your hard work! We want to get the fixes into Snapshot3. We understand that it will be hard to get the fixes in after Snapshot3.

Wendy

Comment 18 IBM Bug Proxy 2008-11-15 03:40:32 UTC
Andy, do you have a test kernel available somewhere for this bug? Thanks!

Comment 20 IBM Bug Proxy 2008-11-21 04:40:57 UTC
Since this bug happens during network installation, can you add t3fw-6.0.0.bin to the install initrd or to ramdisk.image?

If you can build it into the next snapshot, we can give it a try.

Thanks,
Wendy

Comment 21 Andy Gospodarek 2008-11-21 05:23:11 UTC
Unfortunately it's not that easy.  The installer doesn't have a way to respond appropriately to the request_firmware action, so we can't just put the files in /lib/firmware.

We are working up some ways to make this happen and hope to have something soon.  We will let you know.
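
For context, here is a rough sketch of what a userspace helper normally does to answer a request_firmware call through the kernel firmware loader's sysfs files ("loading" and "data"). It is general background rather than anything taken from the installer or from the eventual fix; the function name feed_firmware and the directory arguments are hypothetical.

```python
import os


def feed_firmware(sysfs_dir: str, firmware_file: str) -> bool:
    """Answer one firmware request via the 'loading' and 'data' files.

    sysfs_dir is the per-request directory the kernel creates; the caller
    is assumed to have learned it from the hotplug/udev event.
    """
    loading = os.path.join(sysfs_dir, "loading")
    data = os.path.join(sysfs_dir, "data")

    if not os.path.isfile(firmware_file):
        # Tell the kernel the request cannot be satisfied.
        with open(loading, "w") as f:
            f.write("-1")
        return False

    with open(loading, "w") as f:
        f.write("1")  # start of transfer
    with open(firmware_file, "rb") as src, open(data, "wb") as dst:
        dst.write(src.read())  # copy the image
    with open(loading, "w") as f:
        f.write("0")  # end of transfer
    return True
```

If nothing in the environment performs these steps, placing the image in /lib/firmware does not help, which is the limitation described above.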

Comment 23 Don Zickus 2008-12-02 22:19:34 UTC
in kernel-2.6.18-125.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 25 Andy Gospodarek 2008-12-03 16:04:57 UTC
*Sigh* this patch seems to be causing problems, but I know what's wrong. :-/

Comment 26 Andy Gospodarek 2008-12-04 12:54:25 UTC
My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel5

Please test them and report back your results.

Comment 27 IBM Bug Proxy 2008-12-04 15:41:14 UTC
Andy, I tested this new kernel by bringing up the cxgb3 interface with ifconfig, without /lib/firmware/t3fw-6.0.0.bin present, and the adapter was flashed with the embedded firmware.
Does this mean it will work during network installation?
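
The behaviour observed in this test can be pictured with the sketch below. It is an assumption-laden model, not the patch itself: the report only confirms that the adapter was flashed with embedded firmware when the file was absent, so the "file first, then embedded" order, the placeholder image, and the name pick_firmware are all hypothetical.

```python
import os
from typing import Tuple

# Hypothetical stand-in for a firmware image built into the driver.
EMBEDDED_FW = b"placeholder-embedded-image"


def pick_firmware(fw_dir: str, name: str = "t3fw-6.0.0.bin") -> Tuple[str, bytes]:
    """Return (source, image): the file if readable, else the embedded copy."""
    path = os.path.join(fw_dir, name)
    try:
        with open(path, "rb") as f:
            return "file", f.read()
    except OSError:
        # No usable file, as in an install environment without /lib/firmware.
        return "embedded", EMBEDDED_FW
```

Under this model, a network installation no longer depends on the firmware file being present in the install image.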

Comment 28 Andy Gospodarek 2008-12-04 20:24:57 UTC
Thanks for testing!  I'm glad to hear that you get the same results that I get when I've been trying it.  

The cxgb3-based devices should now work just fine during network installation of 5.3.  We will need you to test it for sure (to be sure the short delay to load the firmware isn't a problem for anaconda), but I don't imagine it will be any problem.

Comment 31 Don Zickus 2008-12-09 21:04:40 UTC
in kernel-2.6.18-126.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 32 Don Zickus 2008-12-09 21:54:18 UTC
Andy has posted another cleanup patch. :-)  Reverting back to POST to pick it up.

Comment 34 Don Zickus 2008-12-16 19:15:19 UTC
in kernel-2.6.18-127.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 36 IBM Bug Proxy 2008-12-17 03:20:41 UTC
I have tried this in snap6. The adapter had 7.0.0 level firmware; when I picked the cxgb3 interface as the interface during network installation, the cxgb3 driver was loaded and the adapter firmware was flashed to the 6.0.0 level. It looks like the fixes are in snap6.

Why was a -127 kernel posted again? Let us know if you made new changes in -127 for the cxgb3 driver.

Thanks,
Wendy

Comment 37 Andy Gospodarek 2008-12-17 05:19:02 UTC
Just a small patch that has a 2-line bug-fix in an error path.

Nothing that should affect you if your system is working now, but something that's good to have.

Comment 38 IBM Bug Proxy 2008-12-17 16:21:47 UTC
*** Bug 50647 has been marked as a duplicate of this bug. ***

Comment 39 IBM Bug Proxy 2008-12-17 20:02:22 UTC
Andy, Thanks for letting us know!

Comment 41 errata-xmlrpc 2009-01-20 20:06:40 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html