Description of problem: When mounting a corrupted HFS+ image in a loop as the part of bug verification of BZ 469635, several machines with e1000 networking cards seem deadlock with the following similar messages in the console, hfs: catalog name length corrupted hfs: catalog name length corrupted hfs: catalog name length corrupted e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang Tx Queue <0> TDH <51> TDT <51> next_to_use <51> next_to_clean <a6> buffer_info[next_to_clean] time_stamp <100d2713e> next_to_watch <a6> jiffies <100d292e4> next_to_watch.status <1> Badness in local_bh_enable at kernel/softirq.c:141 Call Trace: [c000000073c26bc0] [0000001000000040] 0x1000000040 (unreliable) [c000000073c26c40] [c00000000030b100] ._spin_unlock_bh+0x1c/0x30 [c000000073c26cc0] [c000000000297dec] .poll_napi+0xb0/0xd4 [c000000073c26d50] [c000000000297f00] .netpoll_poll_dev+0xa4/0xe4 [c000000073c26de0] [c000000000298434] .netpoll_send_skb+0x168/0x300 [c000000073c26ea0] [c000000000298840] .netpoll_send_udp+0x274/0x2a0 [c000000073c26f80] [d0000000001fc1ac] .write_msg+0x1ac/0x1ec [netconsole] [c000000073c27040] [c000000000061974] .__call_console_drivers+0x94/0xc0 [c000000073c270d0] [c000000000061b9c] .call_console_drivers+0x120/0x16c [c000000073c27170] [c000000000062088] .release_console_sem+0x80/0x13c [c000000073c27210] [c000000000061f08] .vprintk+0x1cc/0x208 [c000000073c272a0] [c000000000061d28] .printk+0x30/0x44 [c000000073c27320] [d00000000043535c] .hfsplus_find_cat+0x7c/0xb8 [hfsplus] [c000000073c275b0] [d0000000004310bc] .hfsplus_read_inode+0xb8/0x1bc [hfsplus] [c000000073c27680] [d000000000431dcc] .hfsplus_fill_super+0x384/0x64c [hfsplus] [c000000073c27980] [c0000000000cee78] .get_sb_bdev+0x1b8/0x258 [c000000073c27a50] [d00000000043213c] .hfsplus_get_sb+0x1c/0x34 [hfsplus] [c000000073c27ad0] [c0000000000cf2a8] .do_kern_mount+0xfc/0x29c [c000000073c27b80] [c0000000000efed0] .do_new_mount+0x90/0xf0 [c000000073c27c30] [c0000000000f09e4] .do_mount+0x1e4/0x22c [c000000073c27d60] [c0000000001002f0] .compat_sys_mount+0x188/0x258 [c000000073c27e30] [c000000000011280] syscall_exit+0x0/0x18 Version-Release number of selected component (if applicable): kernel-2.6.9-89.EL How reproducible: Seen one in four attempts. Steps to Reproduce: 1. add "options e1000 InterruptThrottleRate=8000" to modprobe.conf and reboot. If there are three e1000 NICs in the system, then this module option, InterruptThrottleRate=8000,8000,8000. 2. install the test from the RHTS tests repository, yum -y install rh-tests-kernel-errata-4_7_z-469635.noarch 3. run the test. cd /mn/tests/kernel/errata/4.7.z/469635 make run
Which NIC is being used in this test?
One of affected systems had 4 NICs, alias eth0 e1000 alias eth1 e1000 alias eth2 e1000 alias eth3 e1000 But the primary one is, eth0 -- Intel(R) PRO/1000 Network Driver - version 7.3.20-k2-NAP e1000: 0000:11:02.0: e1000_validate_option: Interrupt Throttling Rate (ints/sec) set to 8000 e1000: 0000:11:02.0: e1000_probe: (PCI-X:100MHz:64-bit) 00:14:5e:0c:08:bc divert: allocating divert_blk for eth0 11:02.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
This is a known bug with the Intel 82544 and 82546 chips that is being triggered with the current e1000 driver. Check out https://bugzilla.redhat.com/show_bug.cgi?id=334411 for some more info, there are some new module options you can try to see if they help with the problem. Ran into TX hangs with the 82546 NICs myself when upgrading from 4.3 to 4.7.
Were you using netconsole with this device? Those messages: Badness in local_bh_enable at kernel/softirq.c:141 Call Trace: [c000000073c26bc0] [0000001000000040] 0x1000000040 (unreliable) [c000000073c26c40] [c00000000030b100] ._spin_unlock_bh+0x1c/0x30 [c000000073c26cc0] [c000000000297dec] .poll_napi+0xb0/0xd4 [c000000073c26d50] [c000000000297f00] .netpoll_poll_dev+0xa4/0xe4 look similar to ones that show up when we get deadlocks with netconsole and napi.
Yes, it was using netconsole through setting up a netdump server.
(In reply to comment #0) > Description of problem: > When mounting a corrupted HFS+ image in a loop as the part of bug verification > of BZ 469635, several machines with e1000 networking cards seem deadlock with > the following similar messages in the console, > > hfs: catalog name length corrupted > hfs: catalog name length corrupted > hfs: catalog name length corrupted > e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang > Tx Queue <0> > TDH <51> > TDT <51> > next_to_use <51> > next_to_clean <a6> > buffer_info[next_to_clean] > time_stamp <100d2713e> > next_to_watch <a6> > jiffies <100d292e4> > next_to_watch.status <1> > Badness in local_bh_enable at kernel/softirq.c:141 > Call Trace: > [c000000073c26bc0] [0000001000000040] 0x1000000040 (unreliable) > [c000000073c26c40] [c00000000030b100] ._spin_unlock_bh+0x1c/0x30 > [c000000073c26cc0] [c000000000297dec] .poll_napi+0xb0/0xd4 > [c000000073c26d50] [c000000000297f00] .netpoll_poll_dev+0xa4/0xe4 > [c000000073c26de0] [c000000000298434] .netpoll_send_skb+0x168/0x300 > [c000000073c26ea0] [c000000000298840] .netpoll_send_udp+0x274/0x2a0 > [c000000073c26f80] [d0000000001fc1ac] .write_msg+0x1ac/0x1ec [netconsole] > [c000000073c27040] [c000000000061974] .__call_console_drivers+0x94/0xc0 > [c000000073c270d0] [c000000000061b9c] .call_console_drivers+0x120/0x16c > [c000000073c27170] [c000000000062088] .release_console_sem+0x80/0x13c > [c000000073c27210] [c000000000061f08] .vprintk+0x1cc/0x208 > [c000000073c272a0] [c000000000061d28] .printk+0x30/0x44 > [c000000073c27320] [d00000000043535c] .hfsplus_find_cat+0x7c/0xb8 [hfsplus] > [c000000073c275b0] [d0000000004310bc] .hfsplus_read_inode+0xb8/0x1bc [hfsplus] > [c000000073c27680] [d000000000431dcc] .hfsplus_fill_super+0x384/0x64c [hfsplus] > [c000000073c27980] [c0000000000cee78] .get_sb_bdev+0x1b8/0x258 > [c000000073c27a50] [d00000000043213c] .hfsplus_get_sb+0x1c/0x34 [hfsplus] > [c000000073c27ad0] [c0000000000cf2a8] .do_kern_mount+0xfc/0x29c > [c000000073c27b80] [c0000000000efed0] .do_new_mount+0x90/0xf0 > [c000000073c27c30] [c0000000000f09e4] .do_mount+0x1e4/0x22c > [c000000073c27d60] [c0000000001002f0] .compat_sys_mount+0x188/0x258 > [c000000073c27e30] [c000000000011280] syscall_exit+0x0/0x18 > The system continues running normally after printing this message right? Does this warning: > Badness in local_bh_enable at kernel/softirq.c:141 and the backtrace always appear immediately after these messages: > e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang > Tx Queue <0> > TDH <51> > TDT <51> > next_to_use <51> > next_to_clean <a6> > buffer_info[next_to_clean] > time_stamp <100d2713e> > next_to_watch <a6> > jiffies <100d292e4> > next_to_watch.status <1> I suspect that the attempt to print these messages to the console (and netconsole) causes softirq_count to be incorrect and results in the messages being printed. Here is the code from softirq.c: 138 void local_bh_enable(void) 139 { 140 __local_bh_enable(); 141 WARN_ON(!softirq_count() && irqs_disabled()); 142 if (unlikely(!in_interrupt() && 143 local_softirq_pending())) 144 invoke_softirq(); 145 preempt_check_resched(); 146 } 147 EXPORT_SYMBOL(local_bh_enable); I'm also curious if you can look at the log on the remote system and see if any of the messages that appear on the console are not in the netconsole log.
Hi Andy, I did a 'bisect' manually in the update patch[1] because customer said that 55.0.12 works perfectly and -89.EL and beyond doesn't work, so newer kernels fails printing 'Detected Tx Unit Hang problem' message. [1] Don Howard <dhoward> [2.6.9-55.20]: e1000: update to version 7.3.20-k2 (Andy Gospodarek) [233609 216769 231676] Turns out that applying this chunk to revert the update has fixed the problem: @@ -968,8 +972,27 @@ e1000_init_hw(struct e1000_hw *hw) break; default: /* Workaround for PCI-X problem when BIOS sets MMRBC incorrectly. */ - if (hw->bus_type == e1000_bus_type_pcix && e1000_pcix_get_mmrbc(hw) > 2048) - e1000_pcix_set_mmrbc(hw, 2048); + if (hw->bus_type == e1000_bus_type_pcix) { + int mmrbc = e1000_pcix_get_mmrbc(hw); + e1000_read_pci_cfg(hw, PCIX_COMMAND_REGISTER, &pcix_cmd_word); + e1000_read_pci_cfg(hw, PCIX_STATUS_REGISTER_HI, + &pcix_stat_hi_word); + cmd_mmrbc = (pcix_cmd_word & PCIX_COMMAND_MMRBC_MASK) >> + PCIX_COMMAND_MMRBC_SHIFT; + stat_mmrbc = (pcix_stat_hi_word & PCIX_STATUS_HI_MMRBC_MASK) >> + PCIX_STATUS_HI_MMRBC_SHIFT; + printk(KERN_WARNING "e1000: pcix mmrbc=0x%x, stat=0x%x, cmd=0x%x\n", + mmrbc, stat_mmrbc, cmd_mmrbc); + if (stat_mmrbc == PCIX_STATUS_HI_MMRBC_4K) + stat_mmrbc = PCIX_STATUS_HI_MMRBC_2K; + if (cmd_mmrbc > stat_mmrbc) { + pcix_cmd_word &= ~PCIX_COMMAND_MMRBC_MASK; + pcix_cmd_word |= stat_mmrbc << PCIX_COMMAND_MMRBC_SHIFT; + e1000_write_pci_cfg(hw, PCIX_COMMAND_REGISTER, + &pcix_cmd_word); + printk(KERN_WARNING "e1000: writing cmd=0x%x\n", pcix_cmd_word); + } + } break; } This is the upstream patch: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=007755eb86c3953bb8615bd016246fc99056580c The code looks very different to me in the way it works if you compare before and after the patch. Even the registers updated seems different to me, but as I'm not familiar with this area, I'll let that for you. I'll attach the patch used.to revert those bits tested by customer.
Created attachment 373470 [details] patch against -89.EL reverting MMRBC bits Brew build: http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2085481 Customer said that the problem didn't happen with this test kernel. Flavio
If we need to revert this upstream patch from e1000: commit 007755eb86c3953bb8615bd016246fc99056580c Author: Peter Oruba <peter.oruba> Date: Fri Sep 28 22:42:06 2007 -0700 PCI-X/PCI-Express read control interfaces: use them in e1000 These driver changes incorporate the proposed PCI-X / PCI-Express read byte count interface. Reading and setting those valuse doesn't take place "manually", instead wrapping functions are called to allow quirks for some PCI bridges. we can do it, but we really need to understand the implication and if the 'quirks' mentioned in the changelog are real problems.
so what is the status on this? It is easy enough to check with lspci -vvv -xxxx that the MSI-X settings took correctly. If the MMRBC=4k on an e1000 pci-x part, then you run smack into one of our errata.
(In reply to comment #18) > so what is the status on this? It is easy enough to check with lspci -vvv > -xxxx that the MSI-X settings took correctly. If the MMRBC=4k on an e1000 > pci-x part, then you run smack into one of our errata. As mentioned in comment #17, I've been investigating what could possibly be wrong with the section of code in e1000_init_hw() that was added by the backport of commit 007755eb86c3953bb8615bd016246fc99056580, and being reverted by Flavio's patch mentioned in comment #8. if (hw->bus_type == e1000_bus_type_pcix && e1000_pcix_get_mmrbc(hw) > 2048) e1000_pcix_set_mmrbc(hw, 2048); In the last couple of days, I discovered that e1000_pcix_get_mmrbc() is always returning a value of 135. The only valid values for mmrbc are 512, 1024, 2048, 4096. The 135 reflects an error return of PCIBIOS_BAD_REGISTER_NUMBER from pci_bus_read_config_dword(,, cap + PCI_X_CMD,). This is because for a dword, the following portion of the PCI_OP_READ() macro: if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER; expands to: if (pos & 3) return PCIBIOS_BAD_REGISTER_NUMBER; And is always true for 'cap + PCI_X_CMD', which is 0xe4 + 2 = 0xe6. ('cap' is the the result of calling pci_find_capability(, PCI_CAP_ID_PCIX).) The same problem exists for pci_bus_write_config_dword(,, cap + PCI_X_CMD,). In both cases, instead of calling _dword(), _word() should be called. This problem exists in RHEL4, RHEL5, RHEL6 and upstream. I've got three patches that I'm intending to push upstream to linux-pci. I sent them to Andy (gospo) for his review before I do, since this whole area is new to me. I've got an additional patch for RHEL4 that fixes a bug in the backport of the fore-mentioned commit. This bug isn't something the customer is seeing. I'm also waiting on feedback from Andy on this patch.
I have updated my RHEL4 test kernel rpms to include a proposed fix for this bugzilla. http://people.redhat.com/dnelson/#rhel4 The actual fix consists of the first four patches listed with my test kernel rpms. The fifth patch is for debug purposes only. And is intended to verify the assumptions I have about the problem seen by the customer. The fifth patch (entitled 0005-generate-debug-info-to-verify-assumptions) has an explanation of what the debug info looks like and describes how I'd like the info copied from /var/log/messages and added as an attachment to this bugzilla. It would be great if the customer would run the appropriate test kernel rpm at their site and verify whether the problem is fixed by my patches. Dean
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Committed in 89.27.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/
Hi Dean, With your patch, will the "Call Trace" not occur any more? We find in kernel 2.6.9-89.0.26.EL.x86_64 call trace still appears, althoug TX hang doesn't appear. Is this right? Or a new problem? https://bugzilla.redhat.com/show_bug.cgi?id=596374#c5
(In reply to comment #38) > Hi Dean, > With your patch, will the "Call Trace" not occur any more? > We find in kernel 2.6.9-89.0.26.EL.x86_64 call trace still appears, althoug TX > hang doesn't appear. Is this right? Or a new problem? > https://bugzilla.redhat.com/show_bug.cgi?id=596374#c5 I would say this is a separate problem, and should be tracked in another bug. Perhaps bug 516076 or bug 549494 have some relationship to this.
Igor can you test with 2.6.9-89.27.EL? This is a RHEL4 development kernel (not a 4.8 z-stream kernel) and should contain the fix for bug 516076 and this bug. Hopefully the call-trace seen does not appear on that test. Thanks!
(In reply to comment #40) > Igor can you test with 2.6.9-89.27.EL? This is a RHEL4 development kernel (not > a 4.8 z-stream kernel) and should contain the fix for bug 516076 and this bug. > Hopefully the call-trace seen does not appear on that test. Thanks! Yes, I have done this. I ran the test /mn/tests/kernel/errata/4.7.z/469635 for three times and got no Call Trace any more. From my previous experience on this problem this week, if it exists, for a three-times run call trace will definitely appear. So I guess the problem is fixed in 2.6.9-89.27.EL.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-0263.html