Bug 468922

Summary:

bnx2x + 57711 MCA on BL870c

Product:

Red Hat Enterprise Linux 5

Reporter:

Alex Chiang <achiang>

Component:

kernel

Assignee:

Andy Gospodarek <agospoda>

Status:

CLOSED ERRATA

QA Contact:

Martin Jenner <mjenner>

Severity:

high

Priority:

medium

Version:

5.3

CC:

adaora.onyia, agospoda, bill.hayes, dchapman, eilong, luyu, mgahagan, peterm, rick.hester, rpacheco, syeghiay, tao, tim.moore

Hardware:

ia64

OS:

Linux

Doc Type:

Bug Fix

Last Closed:

2009-01-20 20:06:31 UTC

Attachments:

Description	Flags
script to reproduce MCA	none
bnx2x debug patch	none
reproducer script, v2	none
console log with debug patch and reproducer v2	none
debug patch, v2	none
console log with debug patch v2	none
console log with debug patch, v3	none
bnx2x-4-fixes.patch	none

Description Alex Chiang 2008-10-28 20:51:39 UTC

Created attachment 321725
script to reproduce MCA

Description of problem:
The bnx2x driver triggers an MCA (machine check abort) with the BCM57711 on various ia64 systems. Typically the MCA occurs once traffic starts flowing over the card, but we have also seen it occur at boot.

Version-Release number of selected component (if applicable):
From the RHEL5.3a1 sources:

drivers/net/bnx2x.c:
#define DRV_MODULE_VERSION      "1.40.22"
#define DRV_MODULE_RELDATE      "2007/11/27"
#define BNX2X_BC_VER            0x040200

How reproducible:
100%

Steps to Reproduce:
1. Install RHEL5.3 + bcm57711
2. Boot
3. Start pushing network traffic over card
4. MCA
  
Actual results:
MCA within 10 minutes.

Expected results:
No MCA.

Additional info:
The bnx2x driver consistently MCAs on a BL870c while under load, running
RHEL5.3 alpha 1.

MCAs have been seen on multiple network interfaces (typically eth0 and eth2).

Running the MCA errdump from EFI through the analyzer, we see this pattern
repeatedly:

MOD_TARGET_IDENTIFIER             0x00000000c680a518
  (rope 4 LMMIO)

	or

MOD_TARGET_IDENTIFIER             0x00000000c780a510
  (rope 4 LMMIO)

We also consistently see these IIP and XIP values:
+IIP                 0xa0000001004109f0 
+XIP                 0xa0000001002ddcc0 

I rebuilt the 5.3a1 kernel using the Red Hat default .config for ia64. I did
make one modification, which was to build the bnx2x driver as a kernel built-in
to help make debugging slightly easier.

Examining the IIP and XIP addresses, we see that they map to:

[root@localhost linux-2.6.18.ia64]# addr2line -e vmlinux a0000001004109f0
/usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.ia64/drivers/net/bnx2x_main.c:1717

 1693 static int bnx2x_acquire_hw_lock(struct bnx2x *bp, u32 resource)
 ....
 1709         if (func <= 5) {
 1710                 hw_lock_control_reg = (MISC_REG_DRIVER_CONTROL_1 + func*8);
 1711         } else {
 1712                 hw_lock_control_reg =
 1713                                 (MISC_REG_DRIVER_CONTROL_7 + (func - 6)*8);
 1714         }
 1715 
 1716         /* Validating that the resource is not already taken */
 1717         lock_status = REG_RD(bp, hw_lock_control_reg);
 1718         if (lock_status & resource_bit) {
 1719                 DP(NETIF_MSG_HW, "lock_status 0x%x  resource_bit 0x%x\n",
 1720                    lock_status, resource_bit);
 1721                 return -EEXIST;
 1722         }

[root@localhost linux-2.6.18.ia64]# addr2line -e vmlinux a0000001002ddcc0
/usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.ia64/include/asm/io.h:367

364 static inline unsigned int
365 ___ia64_readl (const volatile void __iomem *addr)
366 {
367         return *(volatile unsigned int __force *) addr;
368 }

Examining drivers/net/bnx2x_reg.h, we see the following value for
MISC_REG_DRIVER_CONTROL_1:

#define MISC_REG_DRIVER_CONTROL_1                                0xa510

The next thing I looked at is the EFI memmap for this address range:
MemMapIO   0000000080000000-00000000FDFFFFFF  000000000007E000 0000000000000003

The "3" in the last column means that SFW has declared this range as UC|WC.
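
As a quick cross-check of that memmap row, here is a minimal sketch that decodes it. It assumes the third column is a count of 4 KiB EFI pages and that the last column uses the UEFI attribute bits (UC = 0x1, WC = 0x2); the function name is made up for illustration.

```python
# Attribute bits as defined by the UEFI specification.
EFI_MEMORY_UC = 0x1   # uncacheable
EFI_MEMORY_WC = 0x2   # write-combining
EFI_PAGE_SIZE = 4096  # EFI pages are 4 KiB


def decode_memmap_entry(start, end, pages, attr):
    """Return (size_matches, attribute_names) for one EFI memmap row.

    Hypothetical helper: checks that the page count agrees with the
    address range and names the cacheability bits that are set.
    """
    size_matches = (end - start + 1) == pages * EFI_PAGE_SIZE
    names = [name
             for name, bit in (("UC", EFI_MEMORY_UC), ("WC", EFI_MEMORY_WC))
             if attr & bit]
    return size_matches, names


# Values copied from the MemMapIO row above.
print(decode_memmap_entry(0x80000000, 0xFDFFFFFF, 0x7E000, 0x3))
```

The page count and the address range agree, and both MCA target addresses fall inside this range.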

Here are the contents of /proc/iomem:
c0000000-dfffffff : PCI Bus 0000:05
  c0000000-dfffffff : PCI Bus #06
    ...
    c6800000-c6ffffff : 0000:06:00.1
      c6800000-c6ffffff : bnx2x
    c7800000-c7ffffff : 0000:06:00.0
      c7800000-c7ffffff : bnx2x

So far, all the data seem to be at least internally consistent. Recall that
the various MCAing addresses were: 

	0x00000000c680a518
	0x00000000c780a510

And the value of hw_lock_control_reg is given by:
	hw_lock_control_reg = (MISC_REG_DRIVER_CONTROL_1 + func*8)
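
The arithmetic can be checked with a short sketch. It assumes that "func" in the driver equals the PCI function number, so 06:00.0 is func 0 and 06:00.1 is func 1, and that the register offset is added to the BAR base shown in /proc/iomem. The helper name is made up for illustration.

```python
MISC_REG_DRIVER_CONTROL_1 = 0xA510  # from drivers/net/bnx2x_reg.h, quoted above

# BAR base per PCI function, taken from the /proc/iomem excerpt above.
# Assumption: driver "func" == PCI function number.
BAR_BASE = {0: 0xC7800000, 1: 0xC6800000}


def hw_lock_control_addr(func):
    """Physical address of the hw lock control register for func 0..5."""
    if func > 5:
        # Functions 6 and 7 use MISC_REG_DRIVER_CONTROL_7; not covered here.
        raise ValueError("only functions 0..5 are handled in this sketch")
    return BAR_BASE[func] + MISC_REG_DRIVER_CONTROL_1 + func * 8


for func in sorted(BAR_BASE):
    print(func, hex(hw_lock_control_addr(func)))
```

Under those assumptions, func 0 gives 0xc780a510 and func 1 gives 0xc680a518, which are exactly the two MOD_TARGET_IDENTIFIER values.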

[root@localhost ~]# lspci -vt
-+-[0000:07]---00.0-[0000:08]--
 +-[0000:05]---00.0-[0000:06]--+-00.0  Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
 |                             +-00.1  Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
 |                             +-00.2  Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
 |                             +-00.3  Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
 |                             +-00.4  Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
 |                             +-00.5  Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
 |                             +-00.6  Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
 |                             \-00.7  Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
 +-[0000:03]---00.0-[0000:04]--+-00.0  Intel Corporation 82571EB Dual Port Gigabit Mezzanine Adapter
 |                             \-00.1  Intel Corporation 82571EB Dual Port Gigabit Mezzanine Adapter

I got some help from a firmware guy, who says:
From Rope 4 RPIO data:
.... HP_RPIO_DATA ....
+VALIDATION_BITS                         0x0000000000000001
+CELL_NUMBER                             0x0000000000000000
+SBA_NUMBER                              0x0000000000000000
+ROPE_NUMBER                             0x0000000000000004 # Rope 4
+RP_BUS_NUMBER                           0x0000000000060605
+RP_ERROR_CONFIG_ID                      0x0000000002810040
+RP_ERROR_MODE_AND_STATUS                0x0000000000000000
+RP_PIO_HARDFAIL_ENABLE                  0x0000000000070404
+RP_PIO_ERROR_STATUS                     0x0000000000010000 # Memory UR
+RP_PIO_ERROR_MASKS                      0x0000000000000303 # not masked
+RP_PIO_ERROR_SEVERITY                   0x0000000000030000 # configured as fatal
+RP_PIO_FIRST_ERROR_POINTER              0x0000000000000000
+RP_PIO_ADDRESS_LOG1                     0x00000000c780a510
+RP_PIO_ADDRESS_LOG2                     0x0000000000000000

So for some reason the adapter is returning an unsupported request (UR) for that
memory address?

Appendix
--------
Random bits of data, not sure if/how they fit into the big picture just yet.

Shell> ioconfig
Fast initialization: Enabled
MPS optimization:    Disabled
System Wake-On-LAN:  Enabled

I've also seen some MCAs with this signature:

IIP                 0xa000000100431020
XIP                 0xa000000100430fc0
MOD_TARGET_IDENTIFIER             0x00000000c58afff0
  (rope 4 LMMIO)

	or

MOD_TARGET_IDENTIFIER             0x00000000c78aff98
  (rope 4 LMMIO)


Continuing with our addr2line trick, we see:

[root@localhost linux-2.6.18.ia64]# addr2line -e vmlinux a000000100431020
/usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.ia64/drivers/net/bnx2x_main.c:3966

 3937 static void bnx2x_timer(unsigned long data)
 ....
 3963                 drv_pulse = bp->fw_drv_pulse_wr_seq;
 3964                 SHMEM_WR(bp, func_mb[func].drv_pulse_mb, drv_pulse);
 3965 
 3966                 mcp_pulse = (SHMEM_RD(bp, func_mb[func].mcp_pulse_mb) &
 3967                              MCP_PULSE_SEQ_MASK);

	and

[root@localhost linux-2.6.18.ia64]# addr2line -e vmlinux a000000100430fc0
/usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.ia64/include/asm/io.h:391

388 static inline void
389 __writel (unsigned int val, volatile void __iomem *addr)
390 {
391         *(volatile unsigned int __force *) addr = val;
392 }

Note here that the XIP and the IIP are mismatched. The IIP indicates we
should be MCAing due to the read, but the XIP says we were doing a writel.

I'm not entirely sure how this fits into the MCA we are seeing, but this
signature appears less often, so maybe it doesn't mean anything.

Comment 1 Alex Chiang 2008-10-29 05:50:42 UTC

Created attachment 321751
bnx2x debug patch

This patch applies against RHEL5.3b1 bnx2x driver, and produces the debug console output that I plan on posting in a later comment.

Comment 2 Alex Chiang 2008-10-29 05:52:39 UTC

Created attachment 321752
reproducer script, v2

It turns out you don't even really need to push much traffic over the card to get the machine to MCA. Simply taking ports up and down seems to do it.

This version of the reproducer comments out all the wget statements and simply does a bunch of ifup and ifdowns.

Comment 3 Alex Chiang 2008-10-29 06:03:58 UTC

Created attachment 321754
console log with debug patch and reproducer v2

This is the console output with the debug patch applied and the shorter version of the reproducer.

The one thing that's confusing me a bit is, we're in the middle of doing an ifup on eth4 when we MCA, but we see eth0 (function 0) attempting to acquire a lock as well. Why is it doing that? And are there any known problems in this area?

As far as I can tell from the log, it sure looks like the hw lock acquires/releases are balanced, so again, if something's broke, it's not obvious to me.

I've been staring at this for too long and am going to knock off for some sleep. It would be great to hear some input from Eilon on this.

Thanks.

Comment 4 Alex Chiang 2008-10-29 06:05:57 UTC

Oops, just realized that the original description is wrong wrt driver revision. That is the version from the wrong file.

It should read:

drivers/net/bnx2x_main.c:
#define DRV_MODULE_VERSION      "1.45.21"
#define DRV_MODULE_RELDATE      "2008/09/03"
#define BNX2X_BC_VER            0x040200

Thanks.

Comment 5 Eilon Greenstein 2008-10-29 15:30:03 UTC

Hi Alex,

Thank you so much for the detailed research. I’m trying to get hold of an IA64 system, since I don’t have anything to add to your thorough analysis. From the amount of effort you put into it, I can tell that this is a critical issue. Can you please let me know just how critical it is, and who is asking for it?

I hope to get a system soon but I’m not sure if I can add something today.

Thanks,
Eilon

Comment 6 Alex Chiang 2008-10-30 00:09:02 UTC

Created attachment 321884
debug patch, v2

I changed up the debug patch to make the output slightly easier to read. This will become apparent in later attachments.

Comment 7 Alex Chiang 2008-10-30 00:44:09 UTC

Created attachment 321887
console log with debug patch v2

This console log is pretty huge, so apologies in advance. :(

It was created by applying my debug patch v2, and then booting with bnx2x.debug=0x2007. For those playing at home, it turns on the following debug messages:

        NETIF_MSG_DRV           = 0x0001,
        NETIF_MSG_PROBE         = 0x0002,
        NETIF_MSG_LINK          = 0x0004,
        ...
        NETIF_MSG_HW            = 0x2000,

For what it's worth, I did _not_ turn on:

        NETIF_MSG_INTR          = 0x0200,

Because it was too noisy. Oh well.
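
For anyone decoding other debug values, here is a small sketch that splits a msglevel into the standard NETIF_MSG_* names from include/linux/netdevice.h. Bits outside that standard set (for example driver-private ones) are returned as a leftover number rather than guessed at. The function name is made up for illustration.

```python
# Standard netif_msg bits from include/linux/netdevice.h.
NETIF_MSG = {
    "DRV": 0x0001, "PROBE": 0x0002, "LINK": 0x0004, "TIMER": 0x0008,
    "IFDOWN": 0x0010, "IFUP": 0x0020, "RX_ERR": 0x0040, "TX_ERR": 0x0080,
    "TX_QUEUED": 0x0100, "INTR": 0x0200, "TX_DONE": 0x0400,
    "RX_STATUS": 0x0800, "PKTDATA": 0x1000, "HW": 0x2000, "WOL": 0x4000,
}


def decode_msglevel(mask):
    """Return (names_of_set_standard_bits, leftover_unknown_bits)."""
    names = [name for name, bit in NETIF_MSG.items() if mask & bit]
    leftover = mask & ~sum(NETIF_MSG.values())
    return names, leftover


print(decode_msglevel(0x2007))
```

For 0x2007 this yields DRV, PROBE, LINK and HW with nothing left over, and INTR is indeed not set.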

It may take a little while to understand what's going on in the log, but essentially, the outline is this:

1) testport does an ifdown on all ethX interfaces
2) we then start the following loop, per interface:
2a) ifup interface
2b) ifconfig interface
2c) ping a net target 100x
2d) ifconfig interface
2e) ifdown interface
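
The outline above can be sketched as a command list. This is not the attached reproducer script, only an illustration of its shape. The interface names and ping target below are placeholders, and the sketch prints the commands instead of running them.

```python
def reproducer_commands(interfaces, ping_target):
    """Build the command sequence from steps 1 and 2a-2e (hypothetical helper)."""
    # Step 1: take every interface down first.
    cmds = [["ifdown", iface] for iface in interfaces]
    # Step 2: per-interface up / inspect / ping / inspect / down loop.
    for iface in interfaces:
        cmds += [
            ["ifup", iface],                       # 2a
            ["ifconfig", iface],                   # 2b
            ["ping", "-c", "100", ping_target],    # 2c: 100 pings
            ["ifconfig", iface],                   # 2d
            ["ifdown", iface],                     # 2e
        ]
    return cmds


# Placeholder interfaces and target; substitute real ones to use this.
for cmd in reproducer_commands(["eth0", "eth1"], "192.0.2.1"):
    print(" ".join(cmd))
```

Each printed line could be passed to subprocess.run() on a test machine, but note that on an affected system this sequence is expected to end in an MCA.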

So if you search in the file for the first instance of 'ifup', that will be the first iteration of step 2a, which is ifup eth0.

[I did look at this log file quite a bit today, and actually found it a bit easier to follow if you split the output from 2a--2e into 6 separate files, one for each interface. Then, things like doing gvimdiff eth0.txt eth6.txt become possible. I didn't upload those files here because they're easy to recreate and I didn't want to spam this bz any more than I already have.]

If you do end up comparing the loop for the various interfaces, you'll see that only the odd numbered interfaces (eth1, eth3, and eth5) were ifup'ed successfully. I don't know why that would be. Maybe Bill H. can chime in here if he has a clue...

The other thing you'll see is that as eth6 is in the loop, it seems to be chugging along quite merrily, and then all of a sudden, after a final bnx2x_get_settings(), we see that eth0 is trying to acquire the hardware lock (0xc780a510).

Recall from the first MCA analysis:

+RP_PIO_ERROR_STATUS                     0x0000000000010000 # Memory UR
+RP_PIO_ERROR_MASKS                      0x0000000000000303 # not masked
+RP_PIO_ERROR_SEVERITY                   0x0000000000030000 # cfg'd as fatal

The RP_PIO_ERROR_STATUS is the card itself telling the rest of the system I/O fabric that access to 0xc780a510 is an unsupported request.

The RP_PIO_ERROR_SEVERITY indicates that the fabric went fatal, and that's why we saw the MCA.

I am seeing this pattern 100% of the time now, where some interface will be in the middle of that loop, then eth0 attempts to acquire the hw lock, and we die.

At this point, my two questions are:

1) Why is eth0 trying to acquire this lock? It seems to be coming out of nowhere. I would have expected to see bnx2x_attn_int_asserted() by eth0 appear in the trace, but it does not. Maybe I didn't have the correct level of debugging turned on to see what async event is causing eth0 to try and acquire the lock, so maybe if Eilon has a suggestion here, I can try that.

2) Even if eth0 has some mysterious event that's waking it up and causing it to try and grab the lock, why is that making the card go fatal? That doesn't make sense from a kernel / driver point of view -- an interface should be able to contend for the lock without causing the card to die.

I'm really hoping Eilon gets access to an ia64 system and can reproduce this in his lab, maybe with some hardware analyzer hooked up, because from a software point of view, I'm just not understanding it.

As for me, my next step is to try my debug patch on an x86 system with the same card and try to compare the debug traces. It may take me a day or two to get a test bed set up.

ps, in the debug trace, if you see 0xc in the high nibble of the address, that is simply the ia64 convention for an uncacheable memory address (aka I/O).
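
To make that convention concrete, here is a minimal sketch that splits an ia64 kernel virtual address into its region number (the top three bits) and the remaining offset. It assumes the debug trace prints the uncached region 6 base, 0xc000000000000000, plus the physical address. The function name is made up for illustration.

```python
IA64_REGION_SHIFT = 61      # the region number lives in address bits 63..61
IA64_UNCACHED_REGION = 6    # 0b110, which shows up as a leading 0xc nibble


def split_ia64_address(vaddr):
    """Return (region, offset) for a 64-bit ia64 virtual address."""
    region = vaddr >> IA64_REGION_SHIFT
    offset = vaddr & ((1 << IA64_REGION_SHIFT) - 1)
    return region, offset


# Uncached mapping of the hw lock register that eth0 reads.
region, offset = split_ia64_address(0xC0000000C780A510)
print(region == IA64_UNCACHED_REGION, hex(offset))
```

The offset that falls out is the same 0xc780a510 reported in RP_PIO_ADDRESS_LOG1.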

Comment 8 Alex Chiang 2008-10-30 22:49:01 UTC

Created attachment 321992
console log with debug patch, v3

As per Eilon's (offline) request, here is a console log with bnx2x.debug=0xf70f7.

This time, it's a full bootlog that captures the card init as well.

It's also pretty huge, so beware. :-/

Comment 9 Alex Chiang 2008-10-31 18:27:13 UTC

Well now, this is very interesting. I added a call to dump_stack() in the bnx2x_acquire_hw_lock() code, and received this trace.

[bnx2x_set_storm_rx_mode:4590(eth6)]rx mode 1  mask 0x1000
[bnx2x_attn_int:2832(eth6)]attn_bits 0  attn_ack 100  asserted 0  deasserted 100
[bnx2x_attn_int_deasserted:2759(eth6)]attn: 00000000 00000000 021a0000 00105400
[bnx2x_attn_int_deasserted:2794(eth6)]about to mask 0xfffffeff at HC addr 0x1081
[bnx2x_get_settings:7645(eth6)]ethtool_cmd: cmd 1
  supported 0xf460  advertising 0x9460  speed 900
  duplex 1  port 3  phy_address 1  transceiver 0
  autoneg 1  maxtxpkt 0  maxrxpkt 0
[bnx2x_acquire_hw_lock:1723(eth6)]REG_RD(0xc0000000c180a3c8) lck 0xa3c8 fn 6 rb
[bnx2x_acquire_hw_lock:1727(eth6)]+ read success
[bnx2x_acquire_hw_lock:1740(eth6)]+ write success
[bnx2x_attn_int_deasserted:2807(eth6)]aeu_mask f7  newly deasserted 100
[bnx2x_attn_int_deasserted:2809(eth6)]new mask f7
[bnx2x_release_hw_lock:1774(eth6)]REG_RD(0xc0000000c180a3c8) lck 0xa3c8 fn 6 rb
[bnx2x_release_hw_lock:1778(eth6)]- read success
[bnx2x_release_hw_lock:1786(eth6)]- write success
[bnx2x_attn_int_deasserted:2814(eth6)]attn_state 100
[bnx2x_attn_int_deasserted:2816(eth6)]new state 0
[bnx2x_set_rx_mode:9793(eth6)]dev->flags = 1003
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 33:33:ff:77:c4:48
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 33:33:00:00:00:01
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 01:00:5e:00:00:01
[bnx2x_set_storm_rx_mode:4590(eth6)]rx mode 1  mask 0x1000
[bnx2x_set_rx_mode:9793(eth6)]dev->flags = 1003
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 33:33:ff:77:c4:48
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 33:33:00:00:00:01
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 01:00:5e:00:00:01
[bnx2x_set_storm_rx_mode:4590(eth6)]rx mode 1  mask 0x1000
[bnx2x_set_rx_mode:9793(eth6)]dev->flags = 1003
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 33:33:00:00:00:fb
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 33:33:ff:77:c4:48
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 33:33:00:00:00:01
[bnx2x_set_rx_mode:9885(eth6)]Adding mcast MAC: 01:00:5e:00:00:01
[bnx2x_set_storm_rx_mode:4590(eth6)]rx mode 1  mask 0x1000

Call Trace:
 [<a000000100013ba0>] show_stack+0x40/0xa0
                                sp=e0000100e6b4fb00 bsp=e0000100e6b493c8
 [<a000000100013c30>] dump_stack+0x30/0x60
                                sp=e0000100e6b4fcd0 bsp=e0000100e6b493b0
 [<a000000100414d10>] bnx2x_acquire_hw_lock+0x130/0x580
                                sp=e0000100e6b4fcd0 bsp=e0000100e6b49368
 [<a000000100415660>] bnx2x_acquire_phy_lock+0x60/0x80
                                sp=e0000100e6b4fcd0 bsp=e0000100e6b49340
 [<a00000010042f150>] bnx2x_get_drvinfo+0xd0/0x240
                                sp=e0000100e6b4fcd0 bsp=e0000100e6b49300
 [<a00000010058b780>] dev_ethtool+0x480/0x2320
                                sp=e0000100e6b4fce0 bsp=e0000100e6b492b0
 [<a000000100587860>] dev_ioctl+0x920/0xd60
                                sp=e0000100e6b4fdb0 bsp=e0000100e6b49260
 [<a000000100568f30>] sock_ioctl+0x5d0/0x620
                                sp=e0000100e6b4fe10 bsp=e0000100e6b49230
 [<a0000001001a0110>] do_ioctl+0x90/0x180
                                sp=e0000100e6b4fe10 bsp=e0000100e6b491e8
 [<a0000001001a0a80>] vfs_ioctl+0x880/0x8e0
                                sp=e0000100e6b4fe10 bsp=e0000100e6b491a0
 [<a0000001001a0bb0>] sys_ioctl+0xd0/0x140
                                sp=e0000100e6b4fe20 bsp=e0000100e6b49120
 [<a00000010000bdd0>] __ia64_trace_syscall+0xd0/0x110
                                sp=e0000100e6b4fe30 bsp=e0000100e6b49120
 [<a000000000010620>] __start_ivt_text+0xffffffff00010620/0x400
                                sp=e0000100e6b50000 bsp=e0000100e6b49120
[bnx2x_acquire_hw_lock:1723(eth0)]REG_RD(0xc0000000c780a510) lck 0xa510 fn 0 rb
Entered OS MCA handler. PSP=20010000fff21120 cpu=0 monarch=1
All OS MCA slaves have reached rendezvous
mlogbuf_finish: printing switched to urgent mode, MCA/INIT might be dodgy or fai
Delaying for 5 seconds...


So now we know that the stray eth0 hw lock acquire is coming from ethtool.

Is there some sort of daemon that runs ethtool periodically?

It remains unknown as to why calling ethtool on eth0 would cause the card to go fatal.
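
When sifting through large console logs like this one, a small parser for the lock lines can help. The sketch below matches the line format produced by the debug patch as it appears in the trace above. It assumes each function's BAR is 8 MiB and naturally aligned, as the /proc/iomem ranges in the description suggest. The names are made up for illustration.

```python
import re

# Format taken from the debug-patch output in the trace above.
LOCK_LINE = re.compile(
    r"\[(?P<caller>\w+):(?P<line>\d+)\((?P<iface>\w+)\)\]"
    r"REG_RD\((?P<addr>0x[0-9a-f]+)\) lck (?P<lck>0x[0-9a-f]+) fn (?P<fn>\d+)"
)

BAR_SIZE = 0x800000  # 8 MiB per function (assumption from /proc/iomem)


def parse_lock_line(text):
    """Extract interface, function, lock register and BAR offset, or None."""
    m = LOCK_LINE.search(text)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    return {
        "iface": m.group("iface"),
        "func": int(m.group("fn")),
        "lck": int(m.group("lck"), 16),
        "bar_offset": addr & (BAR_SIZE - 1),
    }


last = parse_lock_line(
    "[bnx2x_acquire_hw_lock:1723(eth0)]"
    "REG_RD(0xc0000000c780a510) lck 0xa510 fn 0 rb"
)
print(last)
```

For both the eth6 and the eth0 lines in the trace, the BAR offset recovered from the address equals the logged lck value, so the driver is at least reading the register it intends to read.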

Comment 10 Alex Chiang 2008-10-31 21:09:59 UTC

I reconfigured my system to single function mode. I cannot reproduce the problem in that mode, but the card freaked out and claims it can't get link:


Fri Oct 31 21:54:57 MDT 2008
ifup eth0

Determining IP information for eth0...[bnx2x_fw_command:5722(eth0)]FW failed to respond!
bnx2x: begin fw dump (mark 0x800f7d0)
<3>>0x100
evnt[0] 0x100->0x0
evnt[0<3>] 0x0->0x100
evnt[0] 0x100->0x0
evnt[0] 0x0->0x100
evnt[0] 0x100<3>->0x0
evnt[0] 0x0->0x100
evnt[0]<3> 0x100->0x0
evnt[0] 0x0->0x100
e<3>vnt[0] 0x100->0x0
evnt[0] 0x0->0<3>x100
evnt[0] 0x100->0x0
evnt[0] <3>0x0->0x100
evnt[0] 0x100->0x0
ev<3>nt[0] 0x0->0x100
evnt[0] 0x100-><3>0x0
evnt[0] 0x0-
>0x100
evnt[0] 0x100->0x0
evnt[0<3>] 0x0->0x100
evnt[0] 0x100->0x0
evnt[0] 0x0->0x100
evnt[0] 0x100<3>->0x0
evnt[0] 0x0->0x100
evnt[0]<3> 0x100->0x0
evnt[0] 0x0->0x100
e<3>vnt[0] 0x100->0x0
evnt[0] 0x0->0<3>x100
evnt[0] 0x100->0x0
evnt[0] <3>0x0->0x100
evnt[0] 0x100->0x0
ev<3>nt[0] 0x0->0x100
evnt[0] 0x100-><3>0x0
evnt[0] 0x0->0x100
evnt[0] 0<3>x100->0x0
evnt[0] 0x0->0x100
evn<3>t[0] 0x100->0x0
evnt[0] 0x0->0x1<3>00
evnt[0] 0x100->0x0
evnt[0] 0x<3>0->0x100
evnt[0] 0x100->0x0
evnt<3>[0] 0x0->0x100
evnt[0] 0x100->0x<3>0
evnt[0] 0x0->0x100
evnt[0] 0x1<3>00->0x0
evnt[0] 0x0->0x100
evnt[<3>0] 0x100->0x0
evnt[0] 0x0->0x100<3>
evnt[0] 0x100->0x0
evnt[0] 0x0-<3>>0x100
evnt[0] 0x100->0x0
evnt[0<3>] 0x0->0x100
evnt[0] 0x100->0x0
evnt[0] 0x0->0x100
evnt[0] 0x100<3>->0x0
evnt[0] 0x0->0x100
evnt[0]<3> 0x100->0x0
f0: BIOS VIRT_MAC_PR<3>IM
link_init[0]: settings 0x4
f0<3>: BIOS VIRT_MAC_PRIM
link_init[0<3>]: settings 0x4
p0: link_init fa<3>il (rc 20006)
init_phy[0]: done
f0: LOAD_REQ
f0: LOAD_L2B_PRAM
i<3>mage 0x70000000 loaded  size 0x2<3>1a4  addr 0x601c0000
f0: LOAD_L2<3>B_PRAM
image 0x80000000 loaded  <3>size 0x8c4  addr 0x60240000
f0: <3>LOAD_L2B_PRAM
image 0x90000000 l<3>oaded  size 0x23b4  addr 0x602c0<3>000
f0: LOAD_L2B_PRAM
image 0xa0<3>000000 loaded  size 0x18c4  addr<3> 0x60340000
evnt[0] 0x0->0x1000
f0: LOAD_DONE
p0: PMF ->f0
init_<3>phy[0]: done
ASSERT! drv_hsi.c #<3>0x341
mcp intr[p0]: 0x1:GRC TIME<3>OUT => 0x115c00b PC 0x8014ce4
mc<3>p intr[p0]: 0x1:GRC TIMEOUT => 0<3>x115c00b PC 0x8014ce4
mcp intr[p<3>0]: 0x1:GRC TIMEOUT => 0x135c00b<3> PC 0x8014cc8
mcp intr[p0]: 0x1:<3>GRC TIMEOUT => 0x135c00b PC 0x80<3>14cc8
mcp intr[p0]: 0x1:GRC TIME<3>OUT => 0x155c00b PC 0x8014c30
mc<3>p intr[p0]: 0x1:GRC TIMEOUT => 0<3>x155c00b PC 0x8014c30
mcp intr[p<3>0]: 0x1:GRC TIMEOUT => 0x175c00b<3> PC 0x8014c48
mcp intr[p0]: 0x1:<3>GRC TIMEOUT => 0x175c00b PC 0x80<3>14c48

bnx2x: end of fw dump
[bnx2x_nic_load:6293(eth0)]MCP response failure, aborting
 failed; no link present.  Check cable?
Fri Oct 31 21:55:05 MDT 2008
ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:17:A4:77:C4:28  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:55 Memory:f7800000-f7ffffff 

Fri Oct 31 21:55:05 MDT 2008
ping -c 100 -i 0.1 10.0.0.1
connect: Network is unreachable
Fri Oct 31 21:55:05 MDT 2008
ifconfig eth0 - note the traffic counts before taking down port
eth0      Link encap:Ethernet  HWaddr 00:17:A4:77:C4:28  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:55 Memory:f7800000-f7ffffff 

Fri Oct 31 21:55:05 MDT 2008
ifdown eth0
Fri Oct 31 21:55:05 MDT 2008
Fri Oct 31 21:55:05 MDT 2008
ifup eth1

Determining IP information for eth1...[bnx2x_fw_command:5722(eth1)]FW failed to respond!
bnx2x: begin fw dump (mark 0x800f7d0)
<3>>0x100
evnt[0] 0x100->0x0
evnt[0<3>] 0x0->0x100
evnt[0] 0x100->0x0
evnt[0] 0x0->0x100
evnt[0] 0x100<3>->0x0
evnt[0] 0x0->0x100
evnt[0]<3> 0x100->0x0
evnt[0] 0x0->0x100
e<3>vnt[0] 0x100->0x0
evnt[0] 0x0->0<3>x100
evnt[0] 0x100->0x0
evnt[0] <3>0x0->0x100
evnt[0] 0x100->0x0
ev<3>nt[0] 0x0->0x100
evnt[0] 0x100-><3>0x0
evnt[0] 0x0-
>0x100
evnt[0] 0x100->0x0
evnt[0<3>] 0x0->0x100
evnt[0] 0x100->0x0
evnt[0] 0x0->0x100
evnt[0] 0x100<3>->0x0
evnt[0] 0x0->0x100
evnt[0]<3> 0x100->0x0
evnt[0] 0x0->0x100
e<3>vnt[0] 0x100->0x0
evnt[0] 0x0->0<3>x100
evnt[0] 0x100->0x0
evnt[0] <3>0x0->0x100
evnt[0] 0x100->0x0
ev<3>nt[0] 0x0->0x100
evnt[0] 0x100-><3>0x0
evnt[0] 0x0->0x100
evnt[0] 0<3>x100->0x0
evnt[0] 0x0->0x100
evn<3>t[0] 0x100->0x0
evnt[0] 0x0->0x1<3>00
evnt[0] 0x100->0x0
evnt[0] 0x<3>0->0x100
evnt[0] 0x100->0x0
evnt<3>[0] 0x0->0x100
evnt[0] 0x100->0x<3>0
evnt[0] 0x0->0x100
evnt[0] 0x1<3>00->0x0
evnt[0] 0x0->0x100
evnt[<3>0] 0x100->0x0
evnt[0] 0x0->0x100<3>
evnt[0] 0x100->0x0
evnt[0] 0x0-<3>>0x100
evnt[0] 0x100->0x0
evnt[0<3>] 0x0->0x100
evnt[0] 0x100->0x0
evnt[0] 0x0->0x100
evnt[0] 0x100<3>->0x0
evnt[0] 0x0->0x100
evnt[0]<3> 0x100->0x0
f0: BIOS VIRT_MAC_PR<3>IM
link_init[0]: settings 0x4
f0<3>: BIOS VIRT_MAC_PRIM
link_init[0<3>]: settings 0x4
p0: link_init fa<3>il (rc 20006)
init_phy[0]: done
f0: LOAD_REQ
f0: LOAD_L2B_PRAM
i<3>mage 0x70000000 loaded  size 0x2<3>1a4  addr 0x601c0000
f0: LOAD_L2<3>B_PRAM
image 0x80000000 loaded  <3>size 0x8c4  addr 0x60240000
f0: <3>LOAD_L2B_PRAM
image 0x90000000 l<3>oaded  size 0x23b4  addr 0x602c0<3>000
f0: LOAD_L2B_PRAM
image 0xa0<3>000000 loaded  size 0x18c4  addr<3> 0x60340000
evnt[0] 0x0->0x1000
f0: LOAD_DONE
p0: PMF ->f0
init_<3>phy[0]: done
ASSERT! drv_hsi.c #<3>0x341
mcp intr[p0]: 0x1:GRC TIME<3>OUT => 0x115c00b PC 0x8014ce4
mc<3>p intr[p0]: 0x1:GRC TIMEOUT => 0<3>x115c00b PC 0x8014ce4
mcp intr[p<3>0]: 0x1:GRC TIMEOUT => 0x135c00b<3> PC 0x8014cc8
mcp intr[p0]: 0x1:<3>GRC TIMEOUT => 0x135c00b PC 0x80<3>14cc8
mcp intr[p0]: 0x1:GRC TIME<3>OUT => 0x155c00b PC 0x8014c30
mc<3>p intr[p0]: 0x1:GRC TIMEOUT => 0<3>x155c00b PC 0x8014c30
mcp intr[p<3>0]: 0x1:GRC TIMEOUT => 0x175c00b<3> PC 0x8014c48
mcp intr[p0]: 0x1:<3>GRC TIMEOUT => 0x175c00b PC 0x80<3>14c48

bnx2x: end of fw dump
[bnx2x_nic_load:6293(eth1)]MCP response failure, aborting
 failed; no link present.  Check cable?

Comment 11 Eilon Greenstein 2008-10-31 21:26:02 UTC

Hi,

Now things are starting to make sense. I have seen in the past that “ethtool -i” is called every ~5 minutes, and looking at that code again, there is a potential issue there. Only one function is allowed to access the PHY, and that function is marked as the PMF. Since eth0 is down, eth6 is now the active PMF and eth0 should not access the PHY. The problem is that “ethtool -i” is called while the function is down, so the existing check is not good enough: when a function goes down, it should clear the PMF flag.

Please try the following patch and see if it helps:
diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index fce7451..61152e1 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -6481,6 +6481,7 @@ load_int_disable:
        bnx2x_free_irq(bp);
 load_error:
        bnx2x_free_mem(bp);
+       bp->port.pmf = 0;
 
        /* TBD we really need to reset the chip
           if we want to recover from this */
@@ -6791,6 +6792,7 @@ unload_error:
        /* Report UNLOAD_DONE to MCP */
        if (!BP_NOMCP(bp))
                bnx2x_fw_command(bp, DRV_MSG_CODE_UNLOAD_DONE);
+       bp->port.pmf = 0;
 
        /* Free SKBs, SGEs, TPA pool and driver internals */
        bnx2x_free_skbs(bp);

Note that I still do not understand why this (wrong) scenario can cause the system to hang.
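
The stale-flag scenario can be illustrated with a toy model. This is not driver code: the class and function names are invented, and it assumes that the "ethtool -i" path checks only the PMF flag before touching the PHY and that a function becomes PMF on load when no other function that is up holds the flag.

```python
class Function:
    """Toy stand-in for one PCI function of the NIC (not driver code)."""

    def __init__(self, name):
        self.name = name
        self.up = False
        self.pmf = False


def load(func, port_functions):
    """Bring a function up; it becomes PMF if no other live function is."""
    func.up = True
    others = [f for f in port_functions if f is not func]
    func.pmf = not any(f.pmf and f.up for f in others)


def unload(func, clear_pmf):
    """Take a function down; clear_pmf models the patch above."""
    func.up = False
    if clear_pmf:
        func.pmf = False


def drvinfo_touches_phy(func):
    # Assumption: only the PMF flag gates PHY access in this path.
    return func.pmf


def scenario(clear_pmf):
    """eth0 goes up and down, eth6 comes up, then ethtool -i hits both."""
    eth0, eth6 = Function("eth0"), Function("eth6")
    port = [eth0, eth6]
    load(eth0, port)
    unload(eth0, clear_pmf)
    load(eth6, port)
    return drvinfo_touches_phy(eth0), drvinfo_touches_phy(eth6)


print("without patch:", scenario(clear_pmf=False))
print("with patch:   ", scenario(clear_pmf=True))
```

In this model, without the patch both eth0 and eth6 believe they may access the PHY; with the flag cleared on unload, only eth6 does.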

Regarding the single-function mode: from the debug output that you posted, it seems that the bootcode and the MBA (multi-boot agent) are out of sync. Can you please provide the versions of those two components (they are burned into the NVRAM)?

Thanks,
Eilon

Comment 12 Alex Chiang 2008-11-03 22:30:44 UTC

Hi Eilon,

Thanks for the patch. I applied it and have tested for several hours, and cannot reproduce the MCA. I would say that it is a good fix.

I ACK'ed your patch on netdev.

I can try and keep tracking down some more of that other information you asked for.

Thanks!

Comment 13 Eilon Greenstein 2008-11-05 09:58:41 UTC

Alex - thank you so much for all your support on this issue. Now the only thing left is to make sure that RH will upgrade the driver to 1.45.23 - is that possible?

Thanks,
Eilon

Comment 14 Doug Chapman 2008-11-05 15:55:14 UTC

Can somebody point to the upstream fix?  Either a git commit ID or a ptr to the upstream mailing list posting?

Comment 15 Alex Chiang 2008-11-05 16:01:49 UTC

The minimal patch we need to fix this MCA is:

9a0354405feb0f8bd460349a93db05e4cca8d166

Eilon posted some other patches in his series that were accepted as well. I didn't test them, and we haven't seen the issues his other patches fixed, but I figured I'd point them out in case you like to keep the series together in your tree. Consider these two patches as "fyi" only.

7d96567ac0527703cf1b80043fc0ebd7f21a10ad
12b56ea89e70d4b04f2f5199750310e82894ebbd

Comment 16 Eilon Greenstein 2008-11-05 17:03:48 UTC

Hi,

Though this BZ is not related to the other two issues, I ask that you consider taking these two as well, since they are critical bug fixes with minimal code changes.

Thanks,
Eilon

Comment 17 Doug Chapman 2008-11-06 14:19:30 UTC

Luming,

This is NOT a tracking BZ.  Tracking bugs are used to group together a collection of other BZs.  Please refrain from adding inappropriate keywords to HP's bugs.

Comment 18 Luming Yu 2008-11-06 15:23:10 UTC

Doug,

I thought adding the keyword "Tracking" was the least intrusive way to get a bug removed from my predefined query when I think it has a solution or I cannot do anything for it. I'm sorry if it breaks anything else. The predefined query is very useful for me to track and monitor ia64 kernel bugs. Do you have a suggestion for a proper keyword that I can use for that purpose?

Thanks,
Luming

Comment 19 Doug Chapman 2008-11-06 15:31:58 UTC

(In reply to comment #18)
> Doug,
> 
> I thought adding keyword "Tracking" was least intrusive to get a bug removed
> from my predefined query when I think it has a solution or I cannot do anything
> for it. I'm sorry if it breaks anything else.. The predefined query is very
> useful for me to track and monitor ia64 kernel bugs.  Do you have suggestion
> for a proper keyword that I can use for that purpose.
> 
> Thanks,
> Luming

If I recall correctly the last time this issue came up you were asked to use the whiteboard field.

Comment 20 Luming Yu 2008-11-06 15:50:21 UTC

I don't remember ever encountering this problem. My tiger4 and Hitachi Cold Fusion-3e 4s4u don't have a Broadcom network card installed. I probably need to double check.

Comment 21 Ronald Pacheco 2008-11-06 19:19:54 UTC

Fails QA

Comment 22 Adaora Onyia 2008-11-06 19:22:31 UTC

A request for this driver was made in feature request bz442026.

Comment 23 Ronald Pacheco 2008-11-07 15:04:48 UTC

Given comment #22, we should move this error report to BZ442026 or it will get lost.

gospo, do you agree?

Comment 25 Alex Chiang 2008-11-10 15:47:34 UTC

Adaora, Ron,

Unfortunately for us, bug 442026 talks about driver version 1.45.21, which does _not_ contain the fix for our MCA. If we do want this fix for 5.3, I agree with Ron's comment #23 that we need to mention our bug in the other bugzilla.

Ron, re: comment #21, what do you mean it fails QA? This patch broke during testing?

Thanks.

Comment 26 Andy Gospodarek 2008-11-10 16:10:54 UTC

Created attachment 323088
bnx2x-4-fixes.patch

This is the patch I will propose for inclusion in RHEL5.3

Comment 27 Andy Gospodarek 2008-11-10 16:12:42 UTC

Alex, if you could test the patch in comment#26 on one of your systems for a while that would really help us out.  I presume it will be fine, but it would be nice to have the verification.  Thanks!

Comment 29 Alex Chiang 2008-11-11 22:56:21 UTC

Andy,

I patched my system -- I was running RHEL5.3a1, which is based on this kernel: 2.6.18-118.el5 -- with your patch from comment #26.

My reproducer script has been running successfully (no MCA) for the last 45 minutes or so. Before, I was always able to see an MCA within 10 minutes.

I'll let this run overnight, and then send a Tested-by: to RHKL.

Thanks.

[root@localhost ~]# modinfo bnx2x
filename:       /lib/modules/2.6.18-prep/kernel/drivers/net/bnx2x.ko
version:        1.45.23
license:        GPL
description:    Broadcom NetXtreme II BCM57710 Driver
author:         Eliezer Tamir
srcversion:     0C32B7E9049C6393144912E
alias:          pci:v000014E4d00001650sv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Fsv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Esv*sd*bc*sc*i*
depends:        
vermagic:       2.6.18-prep SMP mod_unload ia64gcc-4.1
parm:           disable_tpa:disable the TPA (LRO) feature (int)
parm:           use_inta:use INT#A instead of MSI-X (int)
parm:           poll:use polling (for debug) (int)
parm:           debug:default debug msglevel (int)

Comment 30 Andy Gospodarek 2008-11-11 23:41:36 UTC

Thanks for testing, Alex.  I really appreciate that.

Comment 33 Don Zickus 2008-11-18 21:21:04 UTC

in kernel-2.6.18-124.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 35 Alex Chiang 2008-12-09 01:13:08 UTC

I actually tested 2.6.18-125.el5.

It worked, as expected. No MCA.

Comment 38 errata-xmlrpc 2009-01-20 20:06:31 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html