Bug 677579

Summary: NetXen NX3031 Multifunction 1-Gigabit Adapter doesn't work in rhel 6.1
Product: Red Hat Enterprise Linux 6 Reporter: Adam Okuliar <aokuliar>
Component: kernel Assignee: bob picco <bpicco>
Status: CLOSED DUPLICATE QA Contact: Adam Okuliar <aokuliar>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 6.1 CC: amit.salecha, arozansk, borgan, bpicco, cdupuis, GR-Linux-NIC-Dev, he.wu, jiayin.shao, prarit, sucheta.chakraborty
Target Milestone: rc Flags: GR-Linux-NIC-Dev: needinfo+
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-03-23 12:49:30 UTC Type: ---
Attachments:
Description Flags
compressed content of /var/log directory none

Description Adam Okuliar 2011-02-15 09:49:08 UTC
Description of problem:
NetXen NX3031 Multifunction 1-Gigabit Adapter doesn't work in RHEL 6.1.

Version-Release number of selected component (if applicable):
RHEL6.1-20110211.n.0 
Linux hp-dl380g7-01.lab.eng.brq.redhat.com 2.6.32-114.0.1.el6.x86_64 #1 SMP Thu Feb 10 16:04:24 EST 2011 x86_64 x86_64 x86_64 GNU/Linux


How reproducible:
100%


Steps to Reproduce:
1. Grab a machine with a NetXen NX3031
2. Provision it with RHEL6.1-20110211.n.0
3. Assign an IP address to a NetXen interface
4. Try to ping any other host via the NetXen interface
  
Actual results:
Ping fails; no data flows through the interface.

Expected results:
Correct functionality of interface

Additional info:
The following messages appear on the serial console:

netxen_nic: card response timeout.
netxen_nic: Failed to destroy rx ctx in firmware
netxen_nic: failed card response code:0xc
netxen_nic: Failed to destroy tx ctx in firmware
netxen1: Error in setting hw resources
net netxen3: firmware hang detected
net netxen2: firmware hang detected
net netxen0: firmware hang detected
net netxen1: firmware hang detected
netxen_nic 0000:15:00.1: firmware: requesting phanfw.bin
netxen_nic 0000:15:00.1: loading firmware from phanfw.bin
netxen_nic 0000:15:00.1: using 64-bit dma mask
netxen_nic 0000:15:00.1: firmware v4.0.534 [legacy]
netxen_nic 0000:15:00.3: using 64-bit dma mask
netxen_nic 0000:15:00.3: firmware v4.0.534 [legacy]
netxen_nic 0000:15:00.2: using 64-bit dma mask
netxen_nic 0000:15:00.2: firmware v4.0.534 [legacy]
netxen_nic 0000:15:00.0: using 64-bit dma mask
netxen_nic: Quad Gig LP Board S/N QG96BK0056  Chip rev 0x42
netxen_nic 0000:15:00.0: firmware v4.0.534 [legacy]
pcieport 0000:00:05.0: AER: Corrected error received: id=0028
pcieport 0000:00:05.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0028(Transmitter ID)
pcieport 0000:00:05.0:   device [8086:340c] error status/mask=00001000/00000000
pcieport 0000:00:05.0:    [12] Replay Timer Timeout  
pcieport 0000:00:05.0: AER: Corrected error received: id=0028
pcieport 0000:00:05.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0028(Transmitter ID)
pcieport 0000:00:05.0:   device [8086:340c] error status/mask=00001000/00000000
pcieport 0000:00:05.0:    [12] Replay Timer Timeout  
pcieport 0000:00:05.0: AER: Corrected error received: id=0028
pcieport 0000:00:05.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0028(Transmitter ID)
pcieport 0000:00:05.0:   device [8086:340c] error status/mask=00001000/00000000
pcieport 0000:00:05.0:    [12] Replay Timer Timeout  
netxen_nic: card response timeout.
Failed to create rx ctx in firmware17
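
For triage, the console output above can be reduced to a count per failure signature. This is only a rough sketch: the patterns are copied from the messages in this report, while the names `SIGNATURES` and `summarize_console` are made up for illustration and are not part of any driver tooling.

```python
import re
from collections import Counter

# Message fragments taken from the console output in this report.
# The dictionary keys are illustrative labels, not official names.
SIGNATURES = {
    "response_timeout": re.compile(r"netxen_nic: card response timeout"),
    "ctx_failure": re.compile(r"Failed to (create|destroy) [rt]x ctx in firmware"),
    "firmware_hang": re.compile(r"firmware hang detected"),
    "aer_corrected": re.compile(r"AER: Corrected error received"),
}


def summarize_console(lines):
    """Count how many lines match each known failure signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "netxen_nic: card response timeout.",
        "netxen_nic: Failed to destroy rx ctx in firmware",
        "net netxen3: firmware hang detected",
        "pcieport 0000:00:05.0: AER: Corrected error received: id=0028",
    ]
    print(dict(summarize_console(sample)))
```

Running it over a full serial console capture would show whether the firmware hangs and the AER replay timeouts appear together or independently.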

Comment 1 Adam Okuliar 2011-02-15 09:55:12 UTC
Created attachment 478829 [details]
compressed content of /var/log directory

Comment 3 Chad Dupuis (Cavium) 2011-02-21 21:30:50 UTC
This is rather strange.  Looks like the driver is having trouble communicating with the hardware.  Is this a stand-up card or is this Lan on Motherboard (LOM)?  

Would it be possible to maybe seat the card in another slot in the server?

Comment 4 Adam Okuliar 2011-02-22 09:42:31 UTC
It is a standalone QLE3044-RJ-CK card in a PCI-E slot. I'll try putting it into another slot and let you know the result. But please note that this card works fine in RHEL 6.0.

Comment 5 sucheta.chakraborty 2011-02-22 20:24:07 UTC
What firmware version is flashed on the card? Please paste the output of "ethtool -i <eth>".
Also, can I get the output of "lspci -v"?

Thanks,
Sucheta.
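
If the "ethtool -i" output needs to be collected from several interfaces, a small parser helps. The sketch below assumes the usual "key: value" layout of "ethtool -i"; the sample text is assembled from values mentioned elsewhere in this report (driver 4.0.75, firmware 4.0.534, PCI address 0000:15:00.0) and is not a capture from the affected machine. The function names are hypothetical.

```python
import subprocess


def parse_ethtool_info(text):
    """Turn 'key: value' lines from `ethtool -i` into a dictionary."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as a PCI
        # address (which contains colons) stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def ethtool_info(interface):
    """Run `ethtool -i <interface>` and parse its output."""
    result = subprocess.run(
        ["ethtool", "-i", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_ethtool_info(result.stdout)


# Illustrative sample only, built from values quoted in this report.
SAMPLE_OUTPUT = """\
driver: netxen_nic
version: 4.0.75
firmware-version: 4.0.534
bus-info: 0000:15:00.0
"""

if __name__ == "__main__":
    print(parse_ethtool_info(SAMPLE_OUTPUT)["firmware-version"])
```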

Comment 6 Adam Okuliar 2011-02-23 09:27:13 UTC
Hi,

We had some problems with this adapter in the past. Please check:

https://bugzilla.redhat.com/show_bug.cgi?id=640228

These are the same cards in the same machines. You can also find the lspci -vv output and the firmware revision there.

Thanks, 
Adam

Comment 7 sucheta.chakraborty 2011-02-25 01:28:45 UTC
Thanks Adam for the info.

I tried the same driver (4.0.75) and firmware (4.0.534) versions on a dl380g7 machine, with the following two kernels:

2.6.32-81.el6.bz562940.x86_64 - older RHEL6.1 kernel. I don't see any problem with this kernel.

2.6.32-118.el6.x86_64 - current RHEL6.1 kernel. Here I see the problem described in the bug report.

So, can you tell me whether the AER-related code path has changed between these two kernels? If yes, what are the changes?
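
For reference, the "error status/mask=00001000/00000000" lines in the description can be decoded by hand. The sketch below assumes the value is the AER Correctable Error Status register (the log says severity=Corrected) and uses the bit positions from the PCI Express specification; the function names are made up for illustration.

```python
import re

# Correctable Error Status bits as defined by the PCI Express
# specification (same positions as the PCI_ERR_COR_* masks in Linux).
COR_STATUS_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
}

STATUS_RE = re.compile(r"error status/mask=([0-9a-fA-F]+)/([0-9a-fA-F]+)")


def decode_cor_status(status, mask=0):
    """Name the correctable errors set in status and not hidden by mask."""
    reported = status & ~mask
    return [name for bit, name in sorted(COR_STATUS_BITS.items())
            if reported & (1 << bit)]


def decode_log_line(line):
    """Decode one 'error status/mask=' log line; None if it does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    status, mask = (int(group, 16) for group in match.groups())
    return decode_cor_status(status, mask)


if __name__ == "__main__":
    line = ("pcieport 0000:00:05.0:   device [8086:340c] "
            "error status/mask=00001000/00000000")
    print(decode_log_line(line))
```

Status 0x00001000 has only bit 12 set, which agrees with the "[12] Replay Timer Timeout" line that the kernel prints right after it.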

Comment 8 Adam Okuliar 2011-02-28 10:12:54 UTC
I have done some investigation, and it looks like this happens only on systems with Xeon CPUs; AMD systems are unaffected. So this issue is probably related to the BIOS or chipset somehow. I can give you access to an affected machine so that you can investigate.

Thanks,
Adam.

Comment 9 sucheta.chakraborty 2011-03-01 18:08:57 UTC
Thanks Adam for the additional info.
However, I can reproduce the bug here.
But if I go back to the previous kernel version on the same machine - 2.6.32-81 - I don't see any problem.

So, it has something to do with the newer kernel.
Can you tell me what the differences between these two kernels are?

Comment 10 Adam Okuliar 2011-03-02 09:39:39 UTC
Hi. 

The complete kernel changelog is available in brew underneath the rpm links. I'm not familiar with the AER code, but I'm guessing that our problem might be here:

[pci] Fix KABI breakage (Prarit Bhargava) [661301]
[pci] PCIe/AER: Disable native AER service if BIOS has precedence (Prarit Bhargava) [661301]
[pci] aerdrv: fix uninitialized variable warning (Prarit Bhargava) [661301]
[pci] hotplug: Fix build with CONFIG_ACPI unset (Prarit Bhargava) [661301]
[pci] PCIe: Ask BIOS for control of all native services at once (Prarit Bhargava) [661301]
[pci] PCIe: Introduce commad line switch for disabling port services (Prarit Bhargava) [661301]
[pci] ACPI/PCI: Negotiate _OSC control bits before requesting them (Prarit Bhargava) [661301]
[pci] ACPI/PCI: Make acpi_pci_query_osc() return control bits (Prarit Bhargava) [661301]

or here:

[pci] PCIe AER: use pci_is_pcie() (Prarit Bhargava) [661301]
[pci] introduce pci_is_pcie() (Prarit Bhargava) [661301]
[pci] PCIe AER: use pci_pcie_cap() (Prarit Bhargava) [661301]
[pci] fix memory leak in aer_inject (Prarit Bhargava) [661301]
[pci] use better error return values in aer_inject (Prarit Bhargava) [661301]
[pci] add support for PCI domains to aer_inject (Prarit Bhargava) [661301]

Please see the full changelog at:
https://brewweb.devel.redhat.com/buildinfo?buildID=157683

I'm cc-ing Prarit Bhargava; maybe he can provide further detail.

Thanks,
Adam
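
To narrow a long changelog down to candidate patches, the entries quoted above can be parsed by subsystem tag and bug number. This is a sketch that assumes the "[subsystem] summary (Author) [bug]" layout seen in the entries above; `parse_changelog` is a hypothetical helper, not a brew tool.

```python
import re

# One changelog entry: [subsystem] summary (Author Name) [bug number]
ENTRY_RE = re.compile(
    r"\[(?P<subsys>\w+)\]\s+"
    r"(?P<summary>.*?)\s+"
    r"\((?P<author>[^()]+)\)\s+"
    r"\[(?P<bz>\d+)\]"
)


def parse_changelog(text):
    """Return one dict per entry, even if several entries share a line."""
    return [match.groupdict() for match in ENTRY_RE.finditer(text)]


def entries_for(text, subsys=None, bz=None):
    """Filter parsed entries by subsystem tag and/or bug number."""
    return [
        entry for entry in parse_changelog(text)
        if (subsys is None or entry["subsys"] == subsys)
        and (bz is None or entry["bz"] == bz)
    ]


if __name__ == "__main__":
    text = (
        "[pci] PCIe/AER: Disable native AER service if BIOS has precedence "
        "(Prarit Bhargava) [661301] - "
        "[pci] introduce pci_is_pcie() (Prarit Bhargava) [661301]"
    )
    for entry in entries_for(text, subsys="pci", bz="661301"):
        print(entry["summary"])
```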

Comment 11 Marvell Linux NIC Driver 2011-03-04 07:56:58 UTC
Thanks Adam.

But I need to understand exactly what these changes are and why they were made in order to root-cause this issue.

Prarit, please respond.

Thanks,
Sucheta.

Comment 12 Prarit Bhargava 2011-03-09 14:55:02 UTC
(In reply to comment #11)
> Thanks Adam.
> 
> But I need to understand exactly what these changes are and why they were made
> - in order to root cause this issue.
> 
> Prarit please respond.
> 
> Thanks,
> Sucheta.

These are updates to the existing AER code in RHEL6.  Can you boot with pci=noaer to see if you still have a problem?

P.
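
Before drawing conclusions from a pci=noaer boot, it is worth confirming that the option actually reached the kernel. A minimal sketch, assuming the standard /proc/cmdline file and the comma-separated form of the pci= parameter; the helper names are made up for illustration.

```python
def pci_options(cmdline):
    """Collect the comma-separated options of every pci= parameter."""
    options = []
    for token in cmdline.split():
        if token.startswith("pci="):
            options.extend(
                opt for opt in token[len("pci="):].split(",") if opt
            )
    return options


def aer_disabled(cmdline):
    """True if the command line carries the pci=noaer option."""
    return "noaer" in pci_options(cmdline)


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    print(aer_disabled("ro root=/dev/sda1 console=ttyS0 pci=noaer"))
```

On the test machine, `aer_disabled(read_cmdline())` would report whether the running kernel was booted with the option.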

Comment 13 Adam Okuliar 2011-03-09 18:58:09 UTC
pci=noaer did not help. I can give you access to this system. Can you please investigate it?

Adam

Comment 14 Prarit Bhargava 2011-03-09 19:05:57 UTC
(In reply to comment #13)
> pci=noaer did not help. I can give you access to this system. Can you please
> investigate it?
> 

Sure ... but if pci=noaer is a boot option then AER is completely disabled -- which means the problem is likely not in the AER code.

But I can help to bisect.  Can you please loan the system to me?  

Thanks,

P.
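
The bisection mentioned above can be sketched as a binary search over kernel build numbers, using the known-good -81 and known-bad -118 builds from comment 7 as bounds. The `is_bad` callback stands in for "install that build and run the network test" and is purely hypothetical; the threshold in the demo is arbitrary and is not a finding from this bug.

```python
def first_bad_build(good, bad, is_bad):
    """Binary search for the first failing build number in (good, bad].

    `good` must be a build known to work and `bad` one known to fail.
    `is_bad(build)` is expected to test one build and return True if
    the problem reproduces on it.
    """
    if good >= bad:
        raise ValueError("good build must be older than bad build")
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2
        tested.append(mid)
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad, tested


if __name__ == "__main__":
    # Arbitrary stand-in for real testing: pretend every build from
    # number 95 onward fails. This value is invented for the demo.
    PRETEND_FIRST_BAD = 95
    build, tested = first_bad_build(81, 118, lambda n: n >= PRETEND_FIRST_BAD)
    print("first bad build:", build, "after testing", tested)
```

With 37 builds between the two bounds, about five or six boots are enough to isolate a single build.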

Comment 15 Adam Okuliar 2011-03-09 19:23:10 UTC
...loaned.

To test functionality, please assign IP addresses to the cards with these MACs:

00:0e:1e:02:05:a6
00:0e:1e:02:05:82

and test connectivity. Sometimes a small ping passes without problems, but a larger data transfer fails. You can use netcat or the netperf benchmark:

https://brewweb.devel.redhat.com/buildinfo?buildID=135954
netperf -L local.ip -H remote.ip

Only the hp-dl380g7 is affected. On the 385 the same card works fine.

If any problems occur, please let me know.

Big thanks.
Adam
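
Since small pings may pass while larger transfers fail, a test run should cover both. The sketch below only wraps the standard ping options (-c, -s, -I) and the netperf invocation quoted above; the helper names, the interface name, and the addresses in the demo are placeholders, not values from the lab.

```python
import subprocess


def ping_cmd(host, interface, payload_bytes, count=5):
    """Build a ping command bound to one interface with a given payload."""
    return ["ping", "-c", str(count), "-s", str(payload_bytes),
            "-I", interface, host]


def netperf_cmd(local_ip, remote_ip):
    """Build the netperf invocation quoted in this comment."""
    return ["netperf", "-L", local_ip, "-H", remote_ip]


def succeeds(cmd, timeout=120):
    """Run a command; True only if it exits with status 0 in time."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder interface and addresses, for illustration only.
    for size in (56, 1400):
        print(" ".join(ping_cmd("192.0.2.20", "netxen0", size)))
    print(" ".join(netperf_cmd("192.0.2.10", "192.0.2.20")))
```

Passing each command to `succeeds()` on the test machine gives a simple pass/fail per payload size.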

Comment 16 Prarit Bhargava 2011-03-10 13:44:24 UTC
There seem to be other problems with the netxen driver on this system.  If I turn off PCIE AER (pci=noaer) and boot, an idle system eventually starts streaming:

NMI: IOCK error (debug interrupt?)
CPU 0 
Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 dm_mirror dm_region_hash dm_log power_meter microcode serio_raw iTCO_wdt iTCO_vendor_support hpilo bnx2 i7core_edac edac_core ixgbe mdio igb dca sg netxen_nic ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ata_generic pata_acpi ata_piix hpsa radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mod [last unloaded: scsi_wait_scan]

<snip -- sorry couldn't capture the whole thing>

DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process events/0 (pid: 75, threadinfo ffff8800bbe66000, task ffff8800bbe65500)
Stack:
 0000000000000046 00000b65778379d7 00000b657787f040 00000b65778379d7
<0> 00000b657787f040 0000000000000080 00000b65778379d7 ffff880028203f58
<0> ffff880028203f78 ffff88002820db40 0000000000000000 ffffc900068b2140
Call Trace:
 <IRQ> 
 [<ffffffff814e117b>] smp_apic_timer_interrupt+0x6b/0x9b
 [<ffffffff8100bc93>] apic_timer_interrupt+0x13/0x20
 <EOI> 
 [<ffffffffa025ef2e>] ? netxen_nic_hw_read_wx_2M+0x8e/0x150 [netxen_nic]
 [<ffffffffa0262330>] ? netxen_fw_poll_work+0x0/0x2e0 [netxen_nic]
 [<ffffffffa02623e5>] netxen_fw_poll_work+0xb5/0x2e0 [netxen_nic]
 [<ffffffff8108dfce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0262330>] ? netxen_fw_poll_work+0x0/0x2e0 [netxen_nic]
 [<ffffffff810883b0>] worker_thread+0x170/0x2a0
 [<ffffffff8108dce0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81088240>] ? worker_thread+0x0/0x2a0
 [<ffffffff8108d976>] kthread+0x96/0xa0
 [<ffffffff8100c1ca>] child_rip+0xa/0x20
 [<ffffffff8108d8e0>] ? kthread+0x0/0xa0
 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
Code: 45 b0 0f 8d 15 ff ff ff 48 89 45 b0 48 89 45 a0 e9 08 ff ff ff 49 89 9d b0 00 00 00 eb 8d 48 8b 45 b0 48 89 45 a0 e9 f2 fe ff ff <41> c7 85 94 00 00 00 00 00 00 00 48 83 c4 48 5b 41 5c 41 5d 41 

ie) The netxen driver is broken.
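
To see at a glance which frames of such a trace belong to the driver, the "Call Trace" lines can be parsed. This sketch assumes the frame layout shown above, where a leading "?" marks an address the unwinder found on the stack but could not confirm as part of the call chain; the function names are invented for illustration.

```python
import re

FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_trace(lines):
    """Extract stack frames; lines that are not frames are skipped."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            frames.append({
                "symbol": match.group("symbol"),
                "module": match.group("module"),  # None for core kernel
                "reliable": match.group("unreliable") is None,
            })
    return frames


def module_frames(lines, module, reliable_only=True):
    """Symbols that belong to one module, optionally only reliable ones."""
    return [
        frame["symbol"] for frame in parse_trace(lines)
        if frame["module"] == module
        and (frame["reliable"] or not reliable_only)
    ]


if __name__ == "__main__":
    trace = [
        " <EOI> ",
        " [<ffffffffa025ef2e>] ? netxen_nic_hw_read_wx_2M+0x8e/0x150 [netxen_nic]",
        " [<ffffffffa02623e5>] netxen_fw_poll_work+0xb5/0x2e0 [netxen_nic]",
        " [<ffffffff810883b0>] worker_thread+0x170/0x2a0",
    ]
    print(module_frames(trace, "netxen_nic"))
```

For the trace above, the only confirmed driver frame is netxen_fw_poll_work, the firmware polling work item running on the events/0 worker thread.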

Comment 17 Prarit Bhargava 2011-03-10 14:06:21 UTC
Adam -- I cannot find 00:0e:1e:02:05:82 on the system ... are you sure that's correct?

P.

Comment 18 Adam Okuliar 2011-03-10 14:44:56 UTC
00:0e:1e:02:05:82 is in hp-dl385g7-01.lab.eng.brq.redhat.com.

[root@hp-dl385g7-01 ~]# ip l | grep 00:0e:1e:02:05:82
    link/ether 00:0e:1e:02:05:82 brd ff:ff:ff:ff:ff:ff

Do you have access to this system?
Adam
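
When the same check has to be repeated on several hosts, the "ip l | grep" lookup above can be scripted. The sketch assumes the usual two-line layout of "ip link" output; the sample text and the interface name netxen0 are illustrative and were not captured from either machine.

```python
import re

HEADER_RE = re.compile(r"^\d+:\s+([^:@\s]+)")
ETHER_RE = re.compile(r"^\s+link/ether\s+([0-9a-fA-F:]{17})")


def interfaces_by_mac(ip_link_output):
    """Map each lower-cased MAC address to its interface name."""
    result = {}
    current = None
    for line in ip_link_output.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            continue
        ether = ETHER_RE.match(line)
        if ether and current is not None:
            result[ether.group(1).lower()] = current
    return result


def find_interface(ip_link_output, mac):
    """Return the interface owning `mac`, or None if it is not present."""
    return interfaces_by_mac(ip_link_output).get(mac.lower())


# Illustrative sample only; the interface name is an assumption.
SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: netxen0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:0e:1e:02:05:82 brd ff:ff:ff:ff:ff:ff
"""

if __name__ == "__main__":
    print(find_interface(SAMPLE, "00:0E:1E:02:05:82"))
```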

Comment 19 Adam Okuliar 2011-03-14 10:07:48 UTC
Ameen, do you have any other ideas?
I would be glad to solve this before 6.1GA.

Thanks,
Adam

Comment 20 Marvell Linux NIC Driver 2011-03-15 17:44:33 UTC
Adam 

We are looking into the PCI traces collected for this issue. We are seeing some weird things going on on the PCI bus. I have asked the ASIC team to take a further look at the traces and will provide an update once I hear back from them.

As we indicated in comment #7, when we tested with the 2.6.32-81.el6.bz562940.x86_64 kernel everything was fine. The 2.6.32-118.el6.x86_64 kernel seems to bring out these issues. Something must have changed (not necessarily in the AER code) that triggers them. Do you have a team at Red Hat that can help us understand which change/patch in the kernel is triggering this issue?

Thanks,
-Ameen

Comment 21 Prarit Bhargava 2011-03-15 17:59:35 UTC
Adding Bob.

P.

Comment 22 Marvell Linux NIC Driver 2011-03-15 18:08:32 UTC
(In reply to comment #16)

Prarit,

We are seeing weird things going on on the PCI bus with the 2.6.32-118.el6.x86_64 kernel. When we tested with the 2.6.32-81.el6.bz562940.x86_64 kernel, everything was working fine. What kernel version did you use in this experiment? All we have to do to reproduce this issue is boot the machine with pci=noaer and leave the system idle, right? Please let us know if there were any other variables involved that we are not aware of.

We will try to reproduce this failure and collect a PCI trace. This way we can verify whether the footprints here are the same as in the other issues or something different.

Comment 23 bob picco 2011-03-16 13:52:18 UTC
Adam,

Can I have this machine for a day or possibly two?

The netxen driver changed very little since 2.6.32-81.el6.bz562940.x86_64.
2.6.32-81.el6.bz562940.x86_64 is a brew/cvs build of bz562940. The patches
entered RHEL6.1 at kernel-2.6.32-100.el6. Chad's bz667194 changes arrived
at kernel-2.6.32-101.

I have kernel-2.6.32-122.el6 on hp-nehalem-02 without any issues.

So I'd like to try kernel-2.6.32-100.el6 and kernel-2.6.32-101.

From bz640228 it also appears this netxen card has a bad history?

thanx,

bob

Comment 24 Marvell Linux NIC Driver 2011-03-17 06:01:02 UTC
From the PCI traces we don't see any downstream TLPs from the host bridge when the issue occurs. It would be good to understand from which kernel version onwards we see this issue and what changes went into that kernel version.

Comment 25 Marvell Linux NIC Driver 2011-03-17 06:02:34 UTC
(In reply to comment #22)

The footprints in the PCI trace for this issue are the same as what we have seen for the other issues.

Comment 26 Adam Okuliar 2011-03-17 16:15:34 UTC
(In reply to comment #23)

Hi Bob,

I can give you access to those machines. Can you please give me an estimate of when you will have time to investigate this? I need it for scheduling my work.

Big Thanks
Adam

Comment 27 bob picco 2011-03-17 16:47:36 UTC
(In reply to comment #26)
> (In reply to comment #23)
> 
> Hi Bob,
> 
> I can give you access to those machines. Can you please give me some estimation
> about when you will have time to investigate this? I need it for scheduling my
> work.
> 
> Big Thanks
> Adam

Hi Adam,

I assume you mean hp-dl380g7-01.lab.eng.brq.redhat.com? How about one or two days
next week?

You're welcome,

bob

Comment 28 Adam Okuliar 2011-03-17 17:05:52 UTC
OK, so what about Monday and Tuesday? Are you OK with it?
Adam

Comment 29 Marvell Linux NIC Driver 2011-03-17 17:20:49 UTC
HP has reported the same issue on
https://bugzilla.redhat.com/show_bug.cgi?id=688489

Comment 30 bob picco 2011-03-17 17:38:43 UTC
(In reply to comment #28)
> OK, so what about Monday and Tuesday? Are you OK with it?
> Adam

Monday and Tuesday will work. 

I saw the HP issue you mentioned in comment 29.

thanx,

bob

Comment 31 Adam Okuliar 2011-03-21 12:06:14 UTC
Hi Bob,

I loaned two machines to you in beaker. Their hostnames are:

hp-dl380g7-01.lab.eng.brq.redhat.com
hp-dl385g7-01.lab.eng.brq.redhat.com

Both of them have a QLogic card inside. The interfaces are named netxen[0-4]. This configuration disappears after a reboot, so please run /root/prepare_sys.py to restore it. On the 380 there are problems communicating with the card; on the 385 the card works fine. You can test throughput between the 380 and the 385 over the QLogic interfaces with netperf: netperf -H 172.16.25.20. Please let me know if you need any assistance.

Thanks,
Adam

Comment 32 bob picco 2011-03-23 12:49:30 UTC
Hi Adam,

Thanks for your machines. I've returned them. Because of the success of the
git tag 101 kernel, I'm fairly confident that this bug is a duplicate of
bug 681870. I can't confirm this without access to our git tree. Also, I don't
want to tie your machines up longer.

Adam, there is a patch attached to bug 681870. Prarit would like you to test
it and report back to him, please.

Oh, the git tagged 101 kernel is the last netxen commit thus far in r6.1.

Ameen,

Thanks for your help too.

bob

*** This bug has been marked as a duplicate of bug 681870 ***

Comment 33 Marvell Linux NIC Driver 2011-03-23 15:19:58 UTC
Bob, Thanks for the update. I don't have permission to view bug 681870. Can you please add me to this bug?

Comment 34 bob picco 2011-03-23 15:55:17 UTC
(In reply to comment #33)
> Bob, Thanks for the update. I don't have permission to view bug 681870. Can you
> please add me to this bug?

Done.