Created attachment 517444 [details]
src rpm for the HP provided Qlogic Network Card Driver/Module

Description of problem:

The netxen_nic driver does not seem to work correctly. When a network cable is unplugged from a quad-port NIC and then plugged back in, the link light does not come back on and the eth port stays down.

The firmware was updated from HP's support driver site to:
firmware-version: 4.0.556

Version-Release number of selected component (if applicable):

HP DL380 G6 Server

The problem is seen on both of these RHEL5 kernels:

Linux ustchscaeflx09 2.6.18-274.el5 #1 SMP Fri Jul 8 17:36:59 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
Linux ustchscbeflx09 2.6.18-238.12.1.el5 #1 SMP Sat May 7 20:18:50 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

firmware-version: 4.0.556

0a:00.0 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
        Subsystem: Hewlett-Packard Company NC375T PCI Express Quad Port Gigabit Server Adapter

How reproducible:

Steps to Reproduce:
1. Put the quad-port card in place
2. Cable up
3. Reboot
4. Pull a network cable from the quad-port NIC and plug it back in

Actual results:
The link light does not come back on and the eth port stays down.

Expected results:
The link light should come back on.

Additional info:

I pulled down the driver/module for this network card from the HP support driver site and built two rpms. I loaded the rpms, which was required to flash the firmware up to 4.0.556, and then removed them to try the default netxen_nic driver with the updated firmware. Same issue.

I resolved the issue by reinstalling the two rpms provided by HP, which blacklist the netxen_nic and netxen_xport modules and use the nx_nic module. Once I did this, the link light comes back on as expected when I pull a cable on this card and plug it back in.

We have been trying to use only drivers provided by the native Red Hat kernel, but this issue will force us to use the one provided by HP.
# These are the rpms that were built from src (provided on HP's support driver site) and installed to make things work
rpm -ivh hp-nx_nic-tools-4.0.556-2.x86_64.rpm kmod-hp-nx_nic-4.0.556-2.x86_64.rpm
ACK Have notified the HP networking group.
Created attachment 520304 [details]
Phantom core to take fw dump

Hi Dave,

I tried the test on a "Hewlett-Packard Company NC375T PCI Express Quad Port Gigabit Server Adapter" with 4.0.556 firmware and the RHEL 5.7 inbox netxen_nic driver (2.6.18-274.el5). For me the link came up properly after the unplug/plug cable test.

Before running the test, load the module with auto_fw_reset disabled:

modprobe netxen_nic auto_fw_reset=0

I am attaching a phantomcore_p3 binary to take a firmware dump. After you hit the issue, run ./phantomcore_p3 -i <interface>.

Also send the output of the following commands before and after the test:
a. ethtool -i <interface>
b. ethtool <interface>
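For anyone repeating this, the before/after ethtool captures requested above could be scripted roughly as follows. This is only a sketch: the function name collect_nic_state, the "before"/"after" tags and the output file names are illustrative assumptions, not anything the driver or HP tooling provides.

```shell
#!/bin/sh
# Sketch: save `ethtool -i` and `ethtool` output for one interface.
# The function name, tag values and file layout are illustrative assumptions.
collect_nic_state() {
    iface=$1          # interface name, e.g. eth0
    tag=$2            # label for this capture, e.g. "before" or "after"
    outdir=${3:-.}    # directory for the capture files (default: current dir)

    mkdir -p "$outdir" || return 1
    # Driver name, driver version, firmware version and bus address
    ethtool -i "$iface" > "$outdir/$iface-$tag-ethtool-i.txt" 2>&1
    # Link state, speed, duplex and autonegotiation settings
    ethtool "$iface" > "$outdir/$iface-$tag-ethtool.txt" 2>&1
}

# Usage (as root):
#   collect_nic_state eth0 before /tmp/nicdiag
#   ... unplug and replug the cable ...
#   collect_nic_state eth0 after /tmp/nicdiag
```

Keeping the two captures in separate files makes it easy to diff the "before" and "after" state when attaching them to the bug.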
Hi Dave,

Can you send the dmesg output of the failure and success cases? One after loading netxen_nic and one after loading nx_nic.

Is the interface connected to a switch? If yes, can you send the details of the switch: what type of switch it is, its name, etc.

Also send the ethtool -i output in both cases.
Hi all,

I have/had similar problems.

Card information:
Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
        Subsystem: Hewlett-Packard Company NC375i Integrated Quad Port Multifunction Gigabit Server Adapter

# ethtool -i eth0
driver: netxen_nic
version: 4.0.74
firmware-version: 4.0.544
bus-info: 0000:04:00.0

The netxen_nic driver version 4.0.74 wasn't working with firmware 4.0.544:
 - couldn't set speed, duplex, autoneg through ethtool
 - I also wasn't able to bring up the interface

I then upgraded to RHEL 5.7, where netxen_nic driver version 4.0.75 is available:
 - was able to set speed, duplex, autoneg through ethtool
 - link is detected and the interface comes up
 - setting the interface to autoneg isn't working! I need to set ETHTOOL_OPTS in ifcfg-eth0
 - network performance is poor compared with the situation we had in the past running driver version 4.0.74 and firmware version 4.0.534!

It's probably worth mentioning that our setup is a bit special, since we are connected to a 100 Mb/s switch and not a 1000 Mb/s one, but I don't see why this should cause such massive problems.
(In reply to comment #3)
> Hi Dave,
>
> Can you send the dmesg output of failure and success case ?
> One after loading netxen_nic and one after loading nx_nic.
>
> Is the interface connected to a switch ?
> If yes then can you send the details of switch, why type of switch
> and its name etc.
>
> Also send ethtool -i output in both cases.

I think it's kind of strange that we would ship a driver that requires this option to function correctly:

modprobe netxen_nic auto_fw_reset=0

I added

options netxen_nic auto_fw_reset=0

to modprobe.conf and rebuilt the initrd for good measure. Rebooted, and now when I pull a cable and plug it back in the link light stays on as expected.

However, when I ran ./phantomcore_p3 -i eth0 to take a firmware dump, it hung the system. I logged in to the remote console, restarted the network (service network restart) and saw some nasty errors.

I rebooted the system, thinking maybe the firmware dump had something to do with this, and it looks like when I restart the network (service network restart) all is good. So I think we are good to go.
fyi, I'm not too worried about these messages, since they were caused by the firmware dump:

Sep 1 13:30:54 ustchscaeflx09 kernel: eth6: firmware hang detected
Sep 1 13:41:07 ustchscaeflx09 kernel: netxen_nic: card response timeout

Here's the requested ethtool output:

[root@ustchscaeflx09 ~]# !984
for i in 0 1 2 3 4 5 6 7; do echo "eth$i"; ethtool -i eth$i; done
eth0
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.556
bus-info: 0000:0a:00.0
eth1
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.556
bus-info: 0000:0a:00.1
eth2
driver: bnx2
version: 2.0.21
firmware-version: bc 4.6.4 NCSI 1.0.3
bus-info: 0000:02:00.0
eth3
driver: bnx2
version: 2.0.21
firmware-version: bc 4.6.4 NCSI 1.0.3
bus-info: 0000:02:00.1
eth4
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.556
bus-info: 0000:0a:00.2
eth5
driver: bnx2
version: 2.0.21
firmware-version: bc 4.6.4
bus-info: 0000:03:00.1
eth6
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.556
bus-info: 0000:0a:00.3
eth7
driver: bnx2
version: 2.0.21
firmware-version: bc 4.6.4
bus-info: 0000:03:00.0

One issue that I see is that in order to load the updated firmware version from HP, I had to have the nx_nic module loaded. To do this, I installed the following rpms, compiled from src (pulled from the HP support site):

hp-nx_nic-tools-4.0.556-2.x86_64.rpm
kmod-hp-nx_nic-4.0.556-2.x86_64.rpm

I then got the required module (nx_xport.ko) loaded to run the firmware update:

./CP015529.scexe -s

Then I yum removed the two rpms above. This seems like a somewhat painful process. If Red Hat's and HP's drivers are both the same upstream qlogic driver, why do we have different module names? It would be nice for the two to get in sync.

Simon, we do 100Mb/s so you shouldn't have a problem, assuming you update the firmware to 4.0.556.
[root@ustchscaeflx09 nicswap]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ETHTOOL_OPTS="autoneg off speed 100 duplex full"
ONBOOT=yes
MASTER=bond1
SLAVE=yes
USERCTL=no
BOOTPROTO=none
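As far as I understand the RHEL 5 network scripts, ETHTOOL_OPTS is handed to `ethtool -s <device>` when the interface is brought up, so the ifcfg file above amounts to forcing 100 Mb/s full duplex by hand. The sketch below just prints the equivalent command for a given ifcfg file; ethtool_cmd_from_ifcfg is a hypothetical helper written for illustration and is not part of initscripts.

```shell
#!/bin/sh
# Sketch: print the `ethtool -s` command implied by an ifcfg file.
# ethtool_cmd_from_ifcfg is a hypothetical helper, not part of initscripts.
ethtool_cmd_from_ifcfg() {
    cfg=$1   # path to an ifcfg-<device> file (use a path containing a slash)
    (
        # Run in a subshell so the sourced variables do not leak out.
        DEVICE= ETHTOOL_OPTS=
        . "$cfg" || exit 1
        # Both settings are needed to build a meaningful command.
        [ -n "$DEVICE" ] && [ -n "$ETHTOOL_OPTS" ] || exit 1
        echo "ethtool -s $DEVICE $ETHTOOL_OPTS"
    )
}

# Usage:
#   ethtool_cmd_from_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth0
```

Running the printed command manually is a quick way to see whether the driver accepts the settings without going through a full ifdown/ifup cycle.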
Created attachment 521095 [details] Firmware Dump eth0 netxen_nic
Perhaps this HP CA is related?

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=3913537&prodTypeId=329290&objectID=c02964542

Chad, do you know what changes would have been made to HP's nx_nic driver for the updated firmware the CA references, and whether the same changes are in our netxen_nic driver?
(In reply to comment #5)
> I think it's kind of strange that we would ship a driver that requires this
> option to function correctly:
>
> modprobe netxen_nic auto_fw_reset=0
>
> I added options netxen_nic auto_fw_reset=0 to modprobe.conf and rebuilt the
> initrd for good measure. Rebooted, and now when I pull a cable and plug back
> in the link light stays on as expected.
>
> However, when I ran the ./phantomcore_p3 -i eth0 to take a firmware dump it
> hung up the system. I logged in to the remote console and restarted the
> network, service network restart and saw some nasty errors.

The auto_fw_reset=0 option only disables firmware recovery in case of a firmware hang, so do not set it for normal operation. phantomcore_p3 shuts down the firmware in order to take the dump, so auto recovery has to be disabled in that case. That is also why you saw the nasty messages in dmesg.

If you are getting link up now, then you should also get it without the auto_fw_reset=0 option. Please try again without setting this option. If you still see the problem, send me dmesg, /var/log/messages, and the output of ethtool <interface> and ethtool -i <interface>.
(In reply to comment #5)
> Simon we do 100Mb/s so you shouldn't have a problem assuming you update the
> firmware to 4.0.556.
> [root@ustchscaeflx09 nicswap]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
> DEVICE=eth0
> ETHTOOL_OPTS="autoneg off speed 100 duplex full"
> ONBOOT=yes
> MASTER=bond1
> SLAVE=yes
> USERCTL=no
> BOOTPROTO=none

Dave,

The problem is not that it does not work. The problem is that it doesn't perform as expected, that is, as it did before the firmware and OS upgrade (note: the OS upgrade was necessary, since driver version 4.0.74 was not able to work with firmware version 4.0.544).

Just to give you an idea: we have a second server that runs the same hardware, but with the old firmware/drivers. Both are attached to the same switch and are running 100 Mb/s full duplex.

Legend:
external server: server1
server running RHEL 5.6, firmware version 4.0.534: serverA
server running RHEL 5.7, firmware version 4.0.544: serverX

For testing purposes, I created the following file and copied it over the network to an external server with scp:

dd if=/dev/zero of=/home/~/test.file2 bs=1024 count=100000

----------
[ Server running RHEL 5.7 and is having performance issues ]

server1:~ $ time scp serverX:~/test.file .
test.file 100% 98MB 4.9MB/s 00:20

real 0m20.821s
user 0m3.278s
sys 0m0.854s
server1:~ $
server1:~ $ tracepath serverX
 1: server1 (XX.XX.XX.218) 0.072ms pmtu 1500
 1: SECRET (XX.XX.XX.254) 0.823ms
 2: SECRET (XX.XX.XX.102) 1.049ms
 3: SECRET (XX.XX.XX.5) 1.230ms
 4: SECRET (XX.XX.XX.14) 1.669ms
 5: serverX (XX.XX.XX.53) 2.389ms reached
     Resume: pmtu 1500 hops 5 back 5

---------
[ Server running RHEL 5.6 and does not show any network issue ]

server1:~ $ time scp serverA:~/test.file2 .
test.file2 100% 98MB 12.2MB/s 00:08

real 0m9.482s
user 0m3.417s
sys 0m0.864s
server1:~ $
server1:~ $ tracepath serverA
 1: server1 (XX.XX.XX.218) 0.072ms pmtu 1500
 1: SECRET (XX.XX.XX.254) 0.745ms
 2: SECRET (XX.XX.XX.102) 1.214ms
 3: SECRET (XX.XX.XX.5) 1.032ms
 4: SECRET (XX.XX.XX.14) 1.039ms
 5: serverA (XX.XX.XX.49) 0.600ms reached
     Resume: pmtu 1500 hops 5 back 5
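As a rough cross-check of the two scp runs above, the average rate can be recomputed from the file size (bs=1024 count=100000, so 102400000 bytes) and the "real" time. This is only a sketch; throughput_mib is an illustrative helper name. Because "real" includes ssh session setup, the averages come out a little lower than the rates scp prints. For reference, a 100 Mb/s link tops out at 12.5 MB/s before protocol overhead.

```shell
#!/bin/sh
# Sketch: average throughput in MiB/s from a byte count and a duration.
# throughput_mib is an illustrative helper name.
throughput_mib() {
    bytes=$1      # number of bytes transferred
    seconds=$2    # wall-clock duration in seconds (may be fractional)
    awk -v b="$bytes" -v s="$seconds" \
        'BEGIN { printf "%.1f\n", b / s / 1048576 }'
}

# Size of the file created by: dd ... bs=1024 count=100000
size=$((1024 * 100000))

throughput_mib "$size" 20.821   # serverX (RHEL 5.7, firmware 4.0.544)
throughput_mib "$size" 9.482    # serverA (RHEL 5.6, firmware 4.0.534)
```

By this measure serverA runs at a bit more than twice the rate of serverX over the same five-hop path.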
(In reply to comment #8)
> The option auto_fw_reset=0 only disables firmware recovery in case of firmware
> hang. So do not set this option for normal operations. phantomcore_p3 takes a
> dump of firmware so it shuts down the firmware therefore in this case we have
> to disable auto recovery. Therefore you were having nasty messages in dmesg.
>
> If you are getting linkup now then you should get it even after not setting
> auto_fw_reset=0 option. Please try now without setting this option, if you
> still see the problem send me the dmesg, /var/log/messages/, ethtool
> <interface>, ethtool -i <interface> output.

Rajesh, I already ran that test without this option, and that's when I noticed that unplugging the cable and plugging it back in didn't bring the link light back on. I can try to re-run the test today to make sure. I'm working with application folks who are using this machine, so I have to get permission to run the test.

Simon, this is why I was saying that it didn't work. What I notice with netxen_nic is that when I unplug and replug, the link light doesn't come back right away. After some time it does, though. I haven't had the opportunity to performance test this nic/driver set. Apparently the auto_fw_reset=0 option corrects the link light issue, but according to Rajesh we shouldn't run in this mode. There obviously have to be some differences between the code sets for the HP nx_nic and the RH netxen_nic; I'm not sure why, since it's the same upstream qlogic driver. But I assume this happens from time to time.
I already attached the hp nx_nic src rpm, should be fairly trivial to compare that to what's in the 2.6.18-274 kernel src rpm.
(In reply to comment #10)
> Rajesh, I already ran that test without this option and that's when I noticed
> unplugging the cable and plugging it back in didn't bring the link light back
> on. I can try to re-run the test again today to make sure. I'm working with
> application folks who are using this machine so I have to get permission to run
> the test.

Hmm, ok, we just ran the test (pulling cable/replugging) again and it seems to work just fine. So now I am a bit confused. This test was without the auto_fw_reset=0 setting in modprobe.conf. Maybe I somehow got the kernel versions and firmware versions mixed up. So let me run some netperf tests on this nic.

[root@ustchscaeflx09 nicswap]# for i in 0 1 2 3 4 5 6 7; do echo "eth$i"; ethtool -i eth$i; done
eth0
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.556
bus-info: 0000:0a:00.0
eth1
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.556
bus-info: 0000:0a:00.1
eth2
driver: bnx2
version: 2.0.21
firmware-version: bc 4.6.4 NCSI 1.0.3
bus-info: 0000:02:00.0
eth3
driver: bnx2
version: 2.0.21
firmware-version: bc 4.6.4 NCSI 1.0.3
bus-info: 0000:02:00.1
eth4
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.556
bus-info: 0000:0a:00.2
eth5
driver: bnx2
version: 2.0.21
firmware-version: bc 4.6.4
bus-info: 0000:03:00.1
eth6
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.556
bus-info: 0000:0a:00.3
eth7
driver: bnx2
version: 2.0.21
firmware-version: bc 4.6.4
bus-info: 0000:03:00.0

[root@ustchscaeflx09 nicswap]# cat /etc/modprobe.conf
alias scsi_hostadapter cciss
alias scsi_hostadapter1 ata_piix
alias scsi_hostadapter2 lpfc
alias scsi_hostadapter3 usb-storage
options lpfc lpfc_lun_queue_depth=16 lpfc_devloss_tmo=10 lpfc_discovery_threads=32
#options netxen_nic auto_fw_reset=0
alias eth0 netxen_nic
alias eth1 netxen_nic
alias eth2 bnx2
alias eth3 bnx2
alias eth4 netxen_nic
alias eth5 bnx2
alias eth6 netxen_nic
alias eth7 bnx2
alias bond0 bonding mode=1 miimon=100
alias bond1 bonding mode=1 miimon=100

#new initrd
-rw------- 1 root root 3768901 Sep 2 11:10 initrd-2.6.18-274.el5.img
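Since the confusion here came down to which settings were actually in effect for a given run, a small check like the following could be run before each cable-pull test to confirm whether auto_fw_reset=0 is active in modprobe.conf. It is only a sketch: has_auto_fw_reset_off is an illustrative helper name, and it only inspects the config file, not the initrd or the options of an already loaded module.

```shell
#!/bin/sh
# Sketch: succeed if an uncommented "options netxen_nic ... auto_fw_reset=0"
# line is present. has_auto_fw_reset_off is an illustrative helper name.
has_auto_fw_reset_off() {
    conf=${1:-/etc/modprobe.conf}   # config file to inspect
    # Lines starting with '#' do not match because "options" must come first.
    grep -Eq '^[[:space:]]*options[[:space:]]+netxen_nic[[:space:]].*auto_fw_reset=0' "$conf"
}

# Usage:
#   if has_auto_fw_reset_off; then
#       echo "auto_fw_reset=0 is set: firmware auto recovery is disabled"
#   else
#       echo "auto_fw_reset=0 is not set"
#   fi
```

Recording this result next to the `uname -r` and `ethtool -i` output for each run would make it easier to tell the test cases apart afterwards.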
Are any of the failing configurations using half-duplex settings? Since the hardware does not support half duplex, there is a workaround in the 4.0.556 firmware that declares link down to the host when it detects half duplex.
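One way to answer the half-duplex question on the host side is to look at the "Duplex:" line that `ethtool <interface>` prints. The sketch below assumes that output format; reports_half_duplex is an illustrative helper name. Note that if the firmware has already declared link down, ethtool may not report a duplex value at all, so the switch port settings are worth checking as well.

```shell
#!/bin/sh
# Sketch: read `ethtool <interface>` output on stdin and succeed if it
# reports half duplex. reports_half_duplex is an illustrative helper name.
reports_half_duplex() {
    grep -Eiq '^[[:space:]]*Duplex:[[:space:]]*Half'
}

# Usage (as root), for the netxen_nic ports listed earlier in this bug:
#   for i in eth0 eth1 eth4 eth6; do
#       ethtool "$i" | reports_half_duplex && echo "$i: half duplex"
#   done
```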
(In reply to comment #13)
> Is any of the failing configurations using half duplex settings? Since the
> hardware does not support half duplex, there is a workaround in the 4.0.556
> firmware to declare link down to the host, when it detects half duplex.

From my perspective all is good with the 4.0.556 firmware update. I transferred a 1GB file, once using bnx2 and once using netxen_nic, and I'm seeing about the same performance. I also ran netperf tests over the weekend across the netxen_nic/driver and all seems ok.

[root@deruescaeflx05 tmp]# #bnx2 nic
[root@deruescaeflx05 tmp]# time scp ./1gb.dat xxx.xx.xx.xx:/tmp/
Warning: Permanently added '121.74.251.7' (RSA) to the list of known hosts.
Subject to applicable law, anyone using the Network expressly consents to: 1) having his/her network activity monitored and recorded; and, 2) using the Network only in accordance with the terms of the applicable Acceptable Use Practices (www.NetworkAUP.com). Your work product created, transmitted or stored on GM networks or systems, including your name or other personally identifiable information, may be shared with other GM entities, suppliers and third parties around the globe when required for business or legal purposes. BE ADVISED, that improper usage of the network and/or computing systems and equipment may result in disciplinary action, up to and including termination of employment. If possible criminal activity is detected, system records may be provided to law enforcement officials.
1gb.dat 100% 1024MB 51.2MB/s 00:20

real 0m20.606s
user 0m19.453s
sys 0m1.592s

[root@deruescaeflx05 tmp]# #netxen_nic
[root@deruescaeflx05 tmp]# time scp ./1gb.dat xx.xx.xx.xx:/tmp/
Warning: Permanently added '10.22.5.2' (RSA) to the list of known hosts.
1gb.dat 100% 1024MB 48.8MB/s 00:21

real 0m21.078s
user 0m19.353s
sys 0m1.422s
My conclusion:

If you are deploying on 5.7 (2.6.18-274 kernel), go with the netxen_nic driver provided by Red Hat. However, if you have to deploy on a 5.6 (2.6.18-238.XX) kernel, use the nx_nic driver. We built the updated nx_nic driver rpms for that specific kernel. netxen_nic with the 2.6.18-238 kernel has negotiation issues, whereas nx_nic does not.

nx_nic testing 2.6.18-274
............................................................
[root@ustchscaeflx09 nicswap]# time scp /tmp/1gb.dat 192.168.109.2:/tmp/
1gb.dat 100% 1024MB 11.1MB/s 01:32

real 1m31.707s
user 0m19.856s
sys 0m2.036s
[root@ustchscaeflx09 nicswap]# ethtool -i eth6
driver: nx_nic
version: 4.0.556
firmware-version: 4.0.556
bus-info: 0000:0a:00.3
[root@ustchscaeflx09 nicswap]# ifdown eth6
[root@ustchscaeflx09 nicswap]# ifup eth6
[root@ustchscaeflx09 nicswap]# uname -a
Linux ustchscaeflx09 2.6.18-274.el5 #1 SMP Fri Jul 8 17:36:59 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

netxen_nic testing
............................................................
[root@ustchscaeflx09 nicswap]# time scp /tmp/1gb.dat 192.168.109.2:/tmp/
1gb.dat 100% 1024MB 11.1MB/s 01:32

real 1m31.843s
user 0m20.007s
sys 0m1.966s
[root@ustchscaeflx09 nicswap]# ethtool -i eth6
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.556
bus-info: 0000:0a:00.3
[root@ustchscaeflx09 nicswap]# uname -a
Linux ustchscaeflx09 2.6.18-274.el5 #1 SMP Fri Jul 8 17:36:59 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

nx_nic 2.6.18-238.xx
............................................................
[root@ustchscbeflx09 nicswap]# ethtool -i eth6
driver: nx_nic
version: 4.0.550
firmware-version: 4.0.556
bus-info: 0000:0a:00.3
[root@ustchscbeflx09 nicswap]# ifdown eth6
[root@ustchscbeflx09 nicswap]# ifup eth6
[root@ustchscbeflx09 nicswap]# uname -a
Linux ustchscbeflx09 2.6.18-238.12.1.el5 #1 SMP Sat May 7 20:18:50 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@ustchscbeflx09 nicswap]# history | grep time
  694 time cp zsa_tables01.dbf /var/crash
  696 time scp zsa_tables01.dbf nodea:/var/crash
 1055 time scp /tmp/1gb.dat 192.168.109.1:/tmp/
 1068 history | grep time
 1069 time scp /tmp/1gb.dat 192.168.109.1:/tmp/
 1189 history | grep time
[root@ustchscbeflx09 nicswap]# time scp /tmp/1gb.dat 192.168.109.1:/tmp/
1gb.dat 100% 1024MB 11.3MB/s 01:31

real 1m31.914s
user 0m19.518s
sys 0m2.141s

netxen_nic 2.6.18-238.xx
............................................................
[root@ustchscbeflx09 nicswap]# ethtool -i eth6
driver: netxen_nic
version: 4.0.74
firmware-version: 4.0.556
bus-info: 0000:0a:00.3
[root@ustchscbeflx09 nicswap]# ifdown eth6
[root@ustchscbeflx09 nicswap]# ifup eth6
Cannot set new settings: Input/output error
  not setting speed
  not setting duplex
  not setting autoneg
[root@ustchscbeflx09 nicswap]# uname -a
Linux ustchscbeflx09 2.6.18-238.12.1.el5 #1 SMP Sat May 7 20:18:50 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@ustchscbeflx09 nicswap]# time scp /tmp/1gb.dat 192.168.109.1:/tmp/
1gb.dat 100% 1024MB 11.3MB/s 01:31

real 1m31.695s
user 0m19.778s
sys 0m2.235s

nx_nic 2.6.18.238.xx
............................................................
[root@ustchscbeflx09 nicswap]# time scp /tmp/1gb.dat 192.168.109.1:/tmp/ Subject to applicable law, anyone using the Network expressly consents to: 1) having his/her network activity monitored and recorded; and, 2) using the Network only in accordance with the terms of the applicable Acceptable Use Practices (www.NetworkAUP.com). Your work product created, transmitted or stored on GM networks or systems, including your name or other personally identifiable information, may be shared with other GM entities, suppliers and third parties around the globe when required for business or legal purposes. BE ADVISED, that improper usage of the network and/or computing systems and equipment may result in disciplinary action, up to and including termination of employment. If possible criminal activity is detected, system records may be provided to law enforcement officials. 1gb.dat 100% 1024MB 11.1MB/s 01:32 real 1m32.078s user 0m19.623s sys 0m1.815s You have new mail in /var/spool/mail/root [root@ustchscbeflx09 nicswap]# ethtool -i eth6 driver: nx_nic version: 4.0.556 firmware-version: 4.0.530 bus-info: 0000:0a:00.3 [root@ustchscbeflx09 nicswap]# ifdown eth6 [root@ustchscbeflx09 nicswap]# ifup eth6 [root@ustchscbeflx09 nicswap]# uname -a Linux ustchscbeflx09 2.6.18-238.12.1.el5 #1 SMP Sat May 7 20:18:50 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux netxen_nic 2.6.18.238.xx ........................................................................................................................................... [root@ustchscbeflx09 nicswap]# time scp /tmp/1gb.dat 192.168.109.1:/tmp/ Subject to applicable law, anyone using the Network expressly consents to: 1) having his/her network activity monitored and recorded; and, 2) using the Network only in accordance with the terms of the applicable Acceptable Use Practices (www.NetworkAUP.com). 
Your work product created, transmitted or stored on GM networks or systems, including your name or other personally identifiable information, may be shared with other GM entities, suppliers and third parties around the globe when required for business or legal purposes. BE ADVISED, that improper usage of the network and/or computing systems and equipment may result in disciplinary action, up to and including termination of employment. If possible criminal activity is detected, system records may be provided to law enforcement officials. 1gb.dat 100% 1024MB 11.1MB/s 01:32 real 1m31.828s user 0m19.584s sys 0m1.923s You have new mail in /var/spool/mail/root [root@ustchscbeflx09 nicswap]# ifdown eth6 [root@ustchscbeflx09 nicswap]# ifup eth6 Cannot set new settings: Input/output error not setting speed not setting duplex not setting autoneg [root@ustchscbeflx09 nicswap]# ethtool -i eth6 driver: netxen_nic version: 4.0.74 firmware-version: 4.0.530 bus-info: 0000:0a:00.3
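For anyone repeating these runs, the driver, driver version and firmware version can be collapsed onto one line so the combinations are easier to compare side by side. This is only a sketch: summarize_ethtool_info is a hypothetical helper name, and it assumes the `key: value` layout of `ethtool -i` shown in the logs above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ethtool): reads `ethtool -i` output on
# stdin and prints "driver version firmware-version" on a single line.
# Assumes the "key: value" layout seen in the logs in this report.
summarize_ethtool_info() {
    awk -F': *' '
        $1 == "driver"           { drv = $2 }
        $1 == "version"          { ver = $2 }
        $1 == "firmware-version" { fw  = $2 }
        END { printf "%s %s %s\n", drv, ver, fw }
    '
}
```

Usage would be something like `ethtool -i eth6 | summarize_ethtool_info`, run once per driver/kernel/firmware combination.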
If I understood you correctly, are you suggesting that I update the NIC firmware to version 4.0.556 but keep using the netxen_nic module?

If so, can someone please outline the changes that were made from firmware version 4.0.544 to 4.0.556? (I can't find any helpful information on www.hp.com.)
(In reply to comment #16)
> If I understood you correctly, would you suggest to update the NIC firmware to
> version 4.0.556 but keep using netxen_nic module?
>
> If so, can someone please outline the changes that were made from firmware
> version 4.0.544 to 4.0.556 (can find any helpful information on www.hp.com)

I'm saying upgrade the firmware to 4.0.556 and use the netxen_nic driver in 5.7 (2.6.18-274).

I still see an issue with firmware 4.0.556 and netxen_nic in 5.6 (2.6.18-238.xx) when it negotiates network settings, although I get the same results performance-wise.

If you have to remain on 5.6 (2.6.18-238.xx), I would still recommend upgrading the firmware to 4.0.556, but you will have to pull the HP nx_nic driver and compile it for your kernel version, as I noted in my first post. This is ultimately the same driver upstream (provided by QLogic); I'm not sure why HP and Red Hat don't use the same name.
(In reply to comment #7)
> Perhaps this HP CA is related?
>
> http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=3913537&prodTypeId=329290&objectID=c02964542
>
> Chad, do you know what changes would have been made to hp's nx_nic driver for
> the updated firmware the CA references and if the same changes are in our
> netxen_nic driver?

The change in the drivers is simply to check for the minimum firmware revision in the HP advisory.
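As a rough illustration of what a minimum firmware revision check amounts to, here is a shell sketch that compares dotted version strings field by field. It is not the driver's code: fw_at_least is a hypothetical helper name, and the minimum revision is passed in as an argument because the value from the HP advisory is not restated in this report.

```shell
#!/bin/sh
# Hypothetical helper: fw_at_least HAVE WANT
# Succeeds (exit 0) if dotted version HAVE is >= WANT, comparing each
# dot-separated field numerically; missing fields are treated as 0.
fw_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(want, w, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = (i <= n) ? h[i] + 0 : 0
            b = (i <= m) ? w[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}
```

For example, `fw_at_least "$installed" "$minimum" || echo "firmware too old"`, where both variables are placeholders filled in from `ethtool -i` and the advisory respectively.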
The fixes for duplex settings went upstream after the RHEL 5.6 submission. That is why it works with the RHEL 5.7 inbox driver and not with RHEL 5.6. Can we close this bug now?
Rajesh, from my perspective I would say yes.

But Simon mentioned some performance problems. However, I didn't notice any performance issues with my testing on 5.7 and the updated firmware.

-Dave
(In reply to comment #22)
> Rajesh, from my perspective I would say yes.
>
> But Simon mentioned some performance problems. However I didn't notice any
> performance issues with my testing on 5.7 and the updated firmware.
>
> -Dave

Indeed, I do have some performance issues, which were not resolved by upgrading to firmware version 4.0.556. I've now changed the cable and also the switch port, and the performance is still very bad.

Red Hat Professional Support suggests updating the kernel to version 2.6.18-274.3.1, since they have put driver updates into this version. Let's see if this helps.

The strange thing is that I see lots of retransmitted packets as well as lost segments, which usually indicates cable, MTU or physical issues. But as mentioned, I did change everything, and the issue still isn't resolved!

Anyway, I think the main issue (for which the case was created) has been resolved, and I think it's OK to close the case (since I also have an open case with Red Hat Professional Support).

Simon
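One way to put a number on the retransmissions is to read the TCP counters the kernel exposes in /proc/net/snmp before and after a transfer. The sketch below is an assumption about how one might do that, not something taken from this case: tcp_retrans_percent is a hypothetical helper name, and the ratio of RetransSegs to OutSegs is only a rough indicator.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/net/snmp-style text on stdin and prints
# RetransSegs as a percentage of OutSegs. The first "Tcp:" line is the
# header naming the columns; the second carries the counter values.
tcp_retrans_percent() {
    awk '
        $1 == "Tcp:" && !seen {
            for (i = 2; i <= NF; i++) col[$i] = i
            seen = 1
            next
        }
        $1 == "Tcp:" {
            out = $(col["OutSegs"])
            re  = $(col["RetransSegs"])
            if (out > 0) printf "%.2f\n", 100 * re / out
        }
    '
}
```

Run as `tcp_retrans_percent < /proc/net/snmp`. The counters are cumulative since boot, so comparing readings taken before and after a test transfer is more telling than a single value.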
Closing. Resolved with RHEL 5.7 inbox driver and netxen firmware version 4.0.556.