499355 – e1000_clean_tx_irq: Detected Tx Unit Hang

Bug 499355 - e1000_clean_tx_irq: Detected Tx Unit Hang

Summary: e1000_clean_tx_irq: Detected Tx Unit Hang

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	4.8
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Dean Nelson
QA Contact:	Network QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	596374
TreeView+	depends on / blocked

Reported:	2009-05-06 10:34 UTC by Qian Cai
Modified:	2018-12-01 16:13 UTC (History)
CC List:	12 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	578492 578497 (view as bug list)
Environment:
Last Closed:	2011-02-16 15:28:13 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)
patch against -89.EL reverting MMRBC bits (4.02 KB, patch) 2009-11-24 15:35 UTC, Flavio Leitner	no flags	Details \| Diff
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2011:0263	0	normal	SHIPPED_LIVE	Important: Red Hat Enterprise Linux 4.9 kernel security and bug fix update	2011-02-16 15:14:55 UTC

Description Qian Cai 2009-05-06 10:34:43 UTC

Description of problem:
When mounting a corrupted HFS+ image in a loop as the part of bug verification of BZ 469635, several machines with e1000 networking cards seem deadlock with the following similar messages in the console,

hfs: catalog name length corrupted
hfs: catalog name length corrupted
hfs: catalog name length corrupted
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <51>
  TDT                  <51>
  next_to_use          <51>
  next_to_clean        <a6>
buffer_info[next_to_clean]
  time_stamp           <100d2713e>
  next_to_watch        <a6>
  jiffies              <100d292e4>
  next_to_watch.status <1>
Badness in local_bh_enable at kernel/softirq.c:141
Call Trace:
[c000000073c26bc0] [0000001000000040] 0x1000000040 (unreliable)
[c000000073c26c40] [c00000000030b100] ._spin_unlock_bh+0x1c/0x30
[c000000073c26cc0] [c000000000297dec] .poll_napi+0xb0/0xd4
[c000000073c26d50] [c000000000297f00] .netpoll_poll_dev+0xa4/0xe4
[c000000073c26de0] [c000000000298434] .netpoll_send_skb+0x168/0x300
[c000000073c26ea0] [c000000000298840] .netpoll_send_udp+0x274/0x2a0
[c000000073c26f80] [d0000000001fc1ac] .write_msg+0x1ac/0x1ec [netconsole]
[c000000073c27040] [c000000000061974] .__call_console_drivers+0x94/0xc0
[c000000073c270d0] [c000000000061b9c] .call_console_drivers+0x120/0x16c
[c000000073c27170] [c000000000062088] .release_console_sem+0x80/0x13c
[c000000073c27210] [c000000000061f08] .vprintk+0x1cc/0x208
[c000000073c272a0] [c000000000061d28] .printk+0x30/0x44
[c000000073c27320] [d00000000043535c] .hfsplus_find_cat+0x7c/0xb8 [hfsplus]
[c000000073c275b0] [d0000000004310bc] .hfsplus_read_inode+0xb8/0x1bc [hfsplus]
[c000000073c27680] [d000000000431dcc] .hfsplus_fill_super+0x384/0x64c [hfsplus]
[c000000073c27980] [c0000000000cee78] .get_sb_bdev+0x1b8/0x258
[c000000073c27a50] [d00000000043213c] .hfsplus_get_sb+0x1c/0x34 [hfsplus]
[c000000073c27ad0] [c0000000000cf2a8] .do_kern_mount+0xfc/0x29c
[c000000073c27b80] [c0000000000efed0] .do_new_mount+0x90/0xf0
[c000000073c27c30] [c0000000000f09e4] .do_mount+0x1e4/0x22c
[c000000073c27d60] [c0000000001002f0] .compat_sys_mount+0x188/0x258
[c000000073c27e30] [c000000000011280] syscall_exit+0x0/0x18

Version-Release number of selected component (if applicable):
kernel-2.6.9-89.EL

How reproducible:
Seen one in four attempts.

Steps to Reproduce:
1. add "options e1000 InterruptThrottleRate=8000" to modprobe.conf and reboot. If there are three e1000 NICs in the system, then this module option,

InterruptThrottleRate=8000,8000,8000.

2. install the test from the RHTS tests repository,

yum -y install rh-tests-kernel-errata-4_7_z-469635.noarch

3. run the test. 

cd /mn/tests/kernel/errata/4.7.z/469635

make run

Comment 1 Andy Gospodarek 2009-05-12 15:33:22 UTC

Which NIC is being used in this test?

Comment 2 Qian Cai 2009-05-13 00:41:56 UTC

One of affected systems had 4 NICs,





alias eth0 e1000
alias eth1 e1000
alias eth2 e1000
alias eth3 e1000

But the primary one is,

eth0 -- Intel(R) PRO/1000 Network Driver - version 7.3.20-k2-NAP
e1000: 0000:11:02.0: e1000_validate_option: Interrupt Throttling Rate (ints/sec) set to 8000
e1000: 0000:11:02.0: e1000_probe: (PCI-X:100MHz:64-bit) 00:14:5e:0c:08:bc
divert: allocating divert_blk for eth0

11:02.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)

Comment 3 Hogan Whittall 2009-05-15 23:05:50 UTC

This is a known bug with the Intel 82544 and 82546 chips that is being triggered with the current e1000 driver.  Check out https://bugzilla.redhat.com/show_bug.cgi?id=334411 for some more info, there are some new module options you can try to see if they help with the problem.

Ran into TX hangs with the 82546 NICs myself when upgrading from 4.3 to 4.7.

Comment 4 Andy Gospodarek 2009-08-31 12:30:15 UTC

Were you using netconsole with this device?  Those messages:

Badness in local_bh_enable at kernel/softirq.c:141
Call Trace:
[c000000073c26bc0] [0000001000000040] 0x1000000040 (unreliable)
[c000000073c26c40] [c00000000030b100] ._spin_unlock_bh+0x1c/0x30
[c000000073c26cc0] [c000000000297dec] .poll_napi+0xb0/0xd4
[c000000073c26d50] [c000000000297f00] .netpoll_poll_dev+0xa4/0xe4

look similar to ones that show up when we get deadlocks with netconsole and napi.

Comment 5 Qian Cai 2009-09-01 11:09:49 UTC

Yes, it was using netconsole through setting up a netdump server.

Comment 6 Andy Gospodarek 2009-10-19 17:05:11 UTC

(In reply to comment #0)
> Description of problem:
> When mounting a corrupted HFS+ image in a loop as the part of bug verification
> of BZ 469635, several machines with e1000 networking cards seem deadlock with
> the following similar messages in the console,
> 
> hfs: catalog name length corrupted
> hfs: catalog name length corrupted
> hfs: catalog name length corrupted
> e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
>   Tx Queue             <0>
>   TDH                  <51>
>   TDT                  <51>
>   next_to_use          <51>
>   next_to_clean        <a6>
> buffer_info[next_to_clean]
>   time_stamp           <100d2713e>
>   next_to_watch        <a6>
>   jiffies              <100d292e4>
>   next_to_watch.status <1>
> Badness in local_bh_enable at kernel/softirq.c:141
> Call Trace:
> [c000000073c26bc0] [0000001000000040] 0x1000000040 (unreliable)
> [c000000073c26c40] [c00000000030b100] ._spin_unlock_bh+0x1c/0x30
> [c000000073c26cc0] [c000000000297dec] .poll_napi+0xb0/0xd4
> [c000000073c26d50] [c000000000297f00] .netpoll_poll_dev+0xa4/0xe4
> [c000000073c26de0] [c000000000298434] .netpoll_send_skb+0x168/0x300
> [c000000073c26ea0] [c000000000298840] .netpoll_send_udp+0x274/0x2a0
> [c000000073c26f80] [d0000000001fc1ac] .write_msg+0x1ac/0x1ec [netconsole]
> [c000000073c27040] [c000000000061974] .__call_console_drivers+0x94/0xc0
> [c000000073c270d0] [c000000000061b9c] .call_console_drivers+0x120/0x16c
> [c000000073c27170] [c000000000062088] .release_console_sem+0x80/0x13c
> [c000000073c27210] [c000000000061f08] .vprintk+0x1cc/0x208
> [c000000073c272a0] [c000000000061d28] .printk+0x30/0x44
> [c000000073c27320] [d00000000043535c] .hfsplus_find_cat+0x7c/0xb8 [hfsplus]
> [c000000073c275b0] [d0000000004310bc] .hfsplus_read_inode+0xb8/0x1bc [hfsplus]
> [c000000073c27680] [d000000000431dcc] .hfsplus_fill_super+0x384/0x64c [hfsplus]
> [c000000073c27980] [c0000000000cee78] .get_sb_bdev+0x1b8/0x258
> [c000000073c27a50] [d00000000043213c] .hfsplus_get_sb+0x1c/0x34 [hfsplus]
> [c000000073c27ad0] [c0000000000cf2a8] .do_kern_mount+0xfc/0x29c
> [c000000073c27b80] [c0000000000efed0] .do_new_mount+0x90/0xf0
> [c000000073c27c30] [c0000000000f09e4] .do_mount+0x1e4/0x22c
> [c000000073c27d60] [c0000000001002f0] .compat_sys_mount+0x188/0x258
> [c000000073c27e30] [c000000000011280] syscall_exit+0x0/0x18
> 

The system continues running normally after printing this message right?  Does this warning:

> Badness in local_bh_enable at kernel/softirq.c:141

and the backtrace always appear immediately after these messages:

> e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
>   Tx Queue             <0>
>   TDH                  <51>
>   TDT                  <51>
>   next_to_use          <51>
>   next_to_clean        <a6>
> buffer_info[next_to_clean]
>   time_stamp           <100d2713e>
>   next_to_watch        <a6>
>   jiffies              <100d292e4>
>   next_to_watch.status <1>

I suspect that the attempt to print these messages to the console (and netconsole) causes softirq_count to be incorrect and results in the messages being printed.  Here is the code from softirq.c:

138 void local_bh_enable(void)
139 {
140         __local_bh_enable();
141         WARN_ON(!softirq_count() && irqs_disabled());
142         if (unlikely(!in_interrupt() &&
143                      local_softirq_pending()))
144                 invoke_softirq();
145         preempt_check_resched();
146 }
147 EXPORT_SYMBOL(local_bh_enable);

I'm also curious if you can look at the log on the remote system and see if any of the messages that appear on the console are not in the netconsole log.

Comment 8 Flavio Leitner 2009-11-24 15:33:21 UTC

Hi Andy,

I did a 'bisect' manually in the update patch[1] because customer said that 
55.0.12 works perfectly and -89.EL and beyond doesn't work, so newer 
kernels fails printing 'Detected Tx Unit Hang problem' message.

[1] Don Howard <dhoward> [2.6.9-55.20]:
e1000: update to version 7.3.20-k2 (Andy Gospodarek) [233609 216769 231676]

Turns out that applying this chunk to revert the update has 
fixed the problem:


@@ -968,8 +972,27 @@ e1000_init_hw(struct e1000_hw *hw)
         break;
     default:
         /* Workaround for PCI-X problem when BIOS sets MMRBC incorrectly. */
-       if (hw->bus_type == e1000_bus_type_pcix && e1000_pcix_get_mmrbc(hw) > 2048)
-               e1000_pcix_set_mmrbc(hw, 2048);
+        if (hw->bus_type == e1000_bus_type_pcix) {
+            int mmrbc =  e1000_pcix_get_mmrbc(hw);
+            e1000_read_pci_cfg(hw, PCIX_COMMAND_REGISTER, &pcix_cmd_word);
+            e1000_read_pci_cfg(hw, PCIX_STATUS_REGISTER_HI,
+                &pcix_stat_hi_word);
+            cmd_mmrbc = (pcix_cmd_word & PCIX_COMMAND_MMRBC_MASK) >>
+                PCIX_COMMAND_MMRBC_SHIFT;
+            stat_mmrbc = (pcix_stat_hi_word & PCIX_STATUS_HI_MMRBC_MASK) >>
+                PCIX_STATUS_HI_MMRBC_SHIFT;
+            printk(KERN_WARNING "e1000: pcix mmrbc=0x%x, stat=0x%x, cmd=0x%x\n",
+                mmrbc, stat_mmrbc, cmd_mmrbc);
+            if (stat_mmrbc == PCIX_STATUS_HI_MMRBC_4K)
+                stat_mmrbc = PCIX_STATUS_HI_MMRBC_2K;
+            if (cmd_mmrbc > stat_mmrbc) {
+                pcix_cmd_word &= ~PCIX_COMMAND_MMRBC_MASK;
+                pcix_cmd_word |= stat_mmrbc << PCIX_COMMAND_MMRBC_SHIFT;
+                e1000_write_pci_cfg(hw, PCIX_COMMAND_REGISTER,
+                    &pcix_cmd_word);
+                printk(KERN_WARNING "e1000: writing cmd=0x%x\n", pcix_cmd_word);
+            }
+        }
        break;
     }

This is the upstream patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=007755eb86c3953bb8615bd016246fc99056580c

The code looks very different to me in the way it works if you compare 
before and after the patch. Even the registers updated seems different
to me, but as I'm not familiar with this area, I'll let that for you.

I'll attach the patch used.to revert those bits tested by customer.

Comment 9 Flavio Leitner 2009-11-24 15:35:58 UTC

Created attachment 373470 [details]
patch against -89.EL reverting MMRBC bits 

Brew build:
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2085481

Customer said that the problem didn't happen with this test kernel.

Flavio

Comment 11 Andy Gospodarek 2009-11-30 21:45:43 UTC

If we need to revert this upstream patch from e1000:

commit 007755eb86c3953bb8615bd016246fc99056580c
Author: Peter Oruba <peter.oruba>
Date:   Fri Sep 28 22:42:06 2007 -0700

    PCI-X/PCI-Express read control interfaces: use them in e1000
    
    These driver changes incorporate the proposed PCI-X / PCI-Express read byte
    count interface.  Reading and setting those valuse doesn't take place
    "manually", instead wrapping functions are called to allow quirks for some
    PCI bridges.
    
we can do it, but we really need to understand the implication and if the 'quirks' mentioned in the changelog are real problems.

Comment 18 Jesse Brandeburg 2010-02-02 17:29:52 UTC

so what is the status on this?  It is easy enough to check with lspci -vvv -xxxx that the MSI-X settings took correctly.  If the MMRBC=4k on an e1000 pci-x part, then you run smack into one of our errata.

Comment 19 Dean Nelson 2010-02-02 18:31:59 UTC

(In reply to comment #18)
> so what is the status on this?  It is easy enough to check with lspci -vvv
> -xxxx that the MSI-X settings took correctly.  If the MMRBC=4k on an e1000
> pci-x part, then you run smack into one of our errata.    

As mentioned in comment #17, I've been investigating what could possibly be wrong with the section of code in e1000_init_hw() that was added by the backport of commit 007755eb86c3953bb8615bd016246fc99056580, and being reverted by Flavio's patch mentioned in comment #8.

  if (hw->bus_type == e1000_bus_type_pcix && e1000_pcix_get_mmrbc(hw) > 2048)
          e1000_pcix_set_mmrbc(hw, 2048);

In the last couple of days, I discovered that e1000_pcix_get_mmrbc() is always returning a value of 135. The only valid values for mmrbc are 512, 1024, 2048, 4096. The 135 reflects an error return of PCIBIOS_BAD_REGISTER_NUMBER from pci_bus_read_config_dword(,, cap + PCI_X_CMD,).

This is because for a dword, the following portion of the PCI_OP_READ() macro:

        if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER;

expands to:

        if (pos & 3) return PCIBIOS_BAD_REGISTER_NUMBER;

And is always true for 'cap + PCI_X_CMD', which is 0xe4 + 2 = 0xe6. ('cap' is the the result of calling pci_find_capability(, PCI_CAP_ID_PCIX).)

The same problem exists for pci_bus_write_config_dword(,, cap + PCI_X_CMD,).  In both cases, instead of calling _dword(), _word() should be called.

This problem exists in RHEL4, RHEL5, RHEL6 and upstream. I've got three patches that I'm intending to push upstream to linux-pci. I sent them to Andy (gospo) for his review before I do, since this whole area is new to me.

I've got an additional patch for RHEL4 that fixes a bug in the backport of the fore-mentioned commit. This bug isn't something the customer is seeing. I'm also waiting on feedback from Andy on this patch.

Comment 20 Dean Nelson 2010-02-05 04:00:49 UTC

I have updated my RHEL4 test kernel rpms to include a proposed fix for this bugzilla.

http://people.redhat.com/dnelson/#rhel4

The actual fix consists of the first four patches listed with my test kernel rpms. The fifth patch is for debug purposes only. And is intended to verify the assumptions I have about the problem seen by the customer.

The fifth patch (entitled 0005-generate-debug-info-to-verify-assumptions) has an explanation of what the debug info looks like and describes how I'd like the info copied from /var/log/messages and added as an attachment to this bugzilla.

It would be great if the customer would run the appropriate test kernel rpm at
their site and verify whether the problem is fixed by my patches.

Dean

Comment 26 RHEL Program Management 2010-03-31 13:53:29 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 37 Vivek Goyal 2010-06-02 15:40:06 UTC

Committed in 89.27.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/

Comment 38 Igor Zhang 2010-06-14 14:13:05 UTC

Hi Dean,
With your patch, will the "Call Trace" not occur any more?
We find in kernel 2.6.9-89.0.26.EL.x86_64 call trace still appears, althoug TX hang doesn't appear. Is this right? Or a new problem?
https://bugzilla.redhat.com/show_bug.cgi?id=596374#c5

Comment 39 Dean Nelson 2010-06-17 11:53:15 UTC

(In reply to comment #38)
> Hi Dean,
> With your patch, will the "Call Trace" not occur any more?
> We find in kernel 2.6.9-89.0.26.EL.x86_64 call trace still appears, althoug TX
> hang doesn't appear. Is this right? Or a new problem?
> https://bugzilla.redhat.com/show_bug.cgi?id=596374#c5    

I would say this is a separate problem, and should be tracked in another bug. Perhaps bug 516076 or bug 549494 have some relationship to this.

Comment 40 Andy Gospodarek 2010-06-17 17:36:59 UTC

Igor can you test with 2.6.9-89.27.EL?  This is a RHEL4 development kernel (not a 4.8 z-stream kernel) and should contain the fix for bug 516076 and this bug.  Hopefully the call-trace seen does not appear on that test.  Thanks!

Comment 41 Igor Zhang 2010-06-18 07:32:24 UTC

(In reply to comment #40)
> Igor can you test with 2.6.9-89.27.EL?  This is a RHEL4 development kernel (not
> a 4.8 z-stream kernel) and should contain the fix for bug 516076 and this bug. 
> Hopefully the call-trace seen does not appear on that test.  Thanks!    

Yes, I have done this.
I ran the test /mn/tests/kernel/errata/4.7.z/469635 for three times and got no Call Trace any more.
From my previous experience on this problem this week, if it exists, for a three-times run call trace will definitely appear. So I guess the problem is fixed in 2.6.9-89.27.EL.

Comment 47 errata-xmlrpc 2011-02-16 15:28:13 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html

Note You need to log in before you can comment on or make changes to this bug.