Bug 482822
| Field | Value |
|---|---|
| Summary | Intel E1000 doesn't work on NVIDIA MCP51 motherboards |
| Product | Red Hat Enterprise Linux 4 |
| Component | kernel |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | low |
| Version | 4.9 |
| Target Milestone | rc |
| Target Release | 4.8 |
| Hardware | All |
| OS | Linux |
| Reporter | Laurent Jean-Rigaud <laurent.jean-rigaud> |
| Assignee | Andy Gospodarek <agospoda> |
| QA Contact | Red Hat Kernel QE team <kernel-qe> |
| CC | andriusb, cward, jtluka, peterm, rdoty, vgoyal |
| Doc Type | Bug Fix |
| Last Closed | 2009-05-18 19:35:04 UTC |
| Bug Blocks | 450285 |

Description
Laurent Jean-Rigaud
2009-01-28 13:15:16 UTC
It seems that a quick fix would be to blacklist MCP51 in the PCI quirks to avoid this problem.

Is it possible to know whether this bug will be fixed by Red Hat in a future RHEL4 update, or whether my configuration (NVIDIA chipset + e1000) is not a RHEL4 target (from a commercial point of view ;-))?

Regards

The fix I mentioned above does not fix the e1000 problem (and it introduces a problem with the local APIC if the high definition timer is set in the BIOS).

Can you try the test kernels located here:

http://people.redhat.com/agospoda/#rhel4

These kernels include a patch for e1000 that should detect that MSI interrupts are not working with e1000 and switch to INTx mode. This is much better than disabling MSI on the entire system.

The patch mentioned above is also included in the version of the e1000 driver (7.5.6) that is part of dkms, so I suspect this will resolve your issue.

I will try, if I can get the machine again... but the e1000 MSI patch seems to have been dropped?!

$ rpm -qpl ../SRPMS/kernel-2.6.9-80.EL.src.rpm -v | grep -i e1000
-rw-r--r-- 1 mockbuilmockbuil 50348 May 4 2005 linux-2.6.10-net-e1000-update.patch
-rw-r--r-- 1 mockbuilmockbuil 1248783 Oct 3 2007 linux-2.6.11-net-e1000-update.patch
-rw-r--r-- 1 mockbuilmockbuil 4058 Jan 23 21:46 linux-2.6.9-e1000-add-parameter-to-set-transmit-descriptor-size.patch
-rw-r--r-- 1 mockbuilmockbuil 953 Mar 26 2008 linux-2.6.9-e1000-disable-pci-e-completion-timeouts-on-pseries.patch
-rw-r--r-- 1 mockbuilmockbuil 440 Dec 16 17:17 linux-2.6.9-e1000-remove-e1000_clean_tx_irq-call-from-e1000_net.patch
-rw-r--r-- 1 mockbuilmockbuil 2713 Dec 16 17:17 linux-2.6.9-e1000-restart-receive-unit-on-esb2-hardware.patch
-rw-r--r-- 1 mockbuilmockbuil 32832 Apr 3 2008 linux-2.6.9-e1000-upstream-update-and-alternate-mac-address-sup.patch
-rw-r--r-- 1 mockbuilmockbuil 1864 Jan 23 21:46 linux-2.6.9-e1000e-add-reboot-notifier-so-wol-will-work.patch
-rw-r--r-- 1 mockbuilmockbuil 202142 Mar 26 2008 linux-2.6.9-e1000e-update-to-latest-upstream.patch
-rw-r--r-- 1 mockbuilmockbuil 380792 Jan 14 16:41 linux-2.6.9-e1000e-update-to-upstream-version-0.3.3.3-k6.patch
-rw-r--r-- 1 mockbuilmockbuil 1758 Jan 16 23:03 linux-2.6.9-enable-entropy-generation-from-e1000-and-bnx2-networ.patch
-rw-r--r-- 1 mockbuilmockbuil 1233 Dec 11 2004 linux-2.6.9-net-e1000-64k-align-check-dma.patch
-rw-r--r-- 1 mockbuilmockbuil 7611 Apr 30 2005 linux-2.6.9-net-e1000-avoid-sleep-in-timer-context.patch
-rw-r--r-- 1 mockbuilmockbuil 14468 Dec 8 2004 linux-2.6.9-net-e1000-erratum23.patch
-rw-r--r-- 1 mockbuilmockbuil 605 Mar 22 2005 linux-2.6.9-net-e1000-flush-rmmod.patch
-rw-r--r-- 1 mockbuilmockbuil 3901 Dec 8 2004 linux-2.6.9-net-e1000-post-mature-writeback.patch
-rw-r--r-- 1 mockbuilmockbuil 1114 Dec 2 2004 linux-2.6.9-net-e1000-rx-mini-jumbo-inval.patch
-rw-r--r-- 1 mockbuilmockbuil 533745 Sep 26 2007 linux-2.6.9-net-e1000e.patch
$ rpm -qpl ../SRPMS/kernel-2.6.9-80.EL.gtest.57.src.rpm -v | grep -i e1000
-rw-r--r-- 1 mockbuilmockbuil 50348 May 4 2005 linux-2.6.10-net-e1000-update.patch
-rw-r--r-- 1 mockbuilmockbuil 1248783 Oct 3 2007 linux-2.6.11-net-e1000-update.patch
-rw-r--r-- 1 mockbuilmockbuil 4058 Jan 23 21:46 linux-2.6.9-e1000-add-parameter-to-set-transmit-descriptor-size.patch
-rw-r--r-- 1 mockbuilmockbuil 953 Mar 26 2008 linux-2.6.9-e1000-disable-pci-e-completion-timeouts-on-pseries.patch
-rw-r--r-- 1 mockbuilmockbuil 440 Dec 16 17:17 linux-2.6.9-e1000-remove-e1000_clean_tx_irq-call-from-e1000_net.patch
-rw-r--r-- 1 mockbuilmockbuil 2713 Dec 16 17:17 linux-2.6.9-e1000-restart-receive-unit-on-esb2-hardware.patch
-rw-r--r-- 1 mockbuilmockbuil 32832 Apr 3 2008 linux-2.6.9-e1000-upstream-update-and-alternate-mac-address-sup.patch
-rw-r--r-- 1 mockbuilmockbuil 1864 Jan 23 21:46 linux-2.6.9-e1000e-add-reboot-notifier-so-wol-will-work.patch
-rw-r--r-- 1 mockbuilmockbuil 202142 Mar 26 2008 linux-2.6.9-e1000e-update-to-latest-upstream.patch
-rw-r--r-- 1 mockbuilmockbuil 380792 Jan 14 16:41 linux-2.6.9-e1000e-update-to-upstream-version-0.3.3.3-k6.patch
-rw-r--r-- 1 mockbuilmockbuil 1758 Jan 16 23:03 linux-2.6.9-enable-entropy-generation-from-e1000-and-bnx2-networ.patch
-rw-r--r-- 1 mockbuilmockbuil 1233 Dec 11 2004 linux-2.6.9-net-e1000-64k-align-check-dma.patch
-rw-r--r-- 1 mockbuilmockbuil 7611 Apr 30 2005 linux-2.6.9-net-e1000-avoid-sleep-in-timer-context.patch
-rw-r--r-- 1 mockbuilmockbuil 14468 Dec 8 2004 linux-2.6.9-net-e1000-erratum23.patch
-rw-r--r-- 1 mockbuilmockbuil 605 Mar 22 2005 linux-2.6.9-net-e1000-flush-rmmod.patch
-rw-r--r-- 1 mockbuilmockbuil 3901 Dec 8 2004 linux-2.6.9-net-e1000-post-mature-writeback.patch
-rw-r--r-- 1 mockbuilmockbuil 1114 Dec 2 2004 linux-2.6.9-net-e1000-rx-mini-jumbo-inval.patch
-rw-r--r-- 1 mockbuilmockbuil 533745 Sep 26 2007 linux-2.6.9-net-e1000e.patch
$

The e1000 patches are identical between 80 and 80.gtest. The same goes for the *msi* patches... Can you confirm that the e1000 MSI patch is included?

Well, my test kernels use linux-kernel-test.patch to hold all of my experimental patches, so even if you compared the files by searching for 'e1000' you might not find any differences using your methods.

For the record, it looks like I dropped this patch:

http://people.redhat.com/agospoda/rhel4/0005-e1000-msi-test-and-switch-to-intx.patch

from my test kernels. I can add that back or you can try it manually if you like.

OK. I've taken 2.6.9-80.EL and added the 0005-e1000-msi* patch. I am rebuilding now and need to get hold of an NVIDIA PC for the test.

Created attachment 330887 [details]
Dmesg with e1000-msi patch
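
The SRPM comparison earlier in this thread filtered on 'e1000' in the file names, which cannot reveal a change that lives inside a catch-all file such as linux-kernel-test.patch. A minimal sketch of comparing two `rpm -qplv`-style listings by name and size, without a name filter, is shown here. The helper names (`parse_listing`, `diff_listings`, `only`) are hypothetical, and the sizes and dates given for linux-kernel-test.patch in the sample data are invented for illustration; they are not taken from the real packages.

```python
def parse_listing(text):
    """Map file name -> size for lines shaped like `rpm -qplv` output.

    Counting fields from the right keeps this working even when the
    owner and group columns are fused together, as in the listing above.
    """
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8 or not fields[-5].isdigit():
            continue  # not a file line (prompt, blank line, ...)
        entries[fields[-1]] = int(fields[-5])
    return entries


def only(entries, needle):
    """Keep only entries whose name contains `needle` (like `grep -i`)."""
    return {n: s for n, s in entries.items() if needle in n.lower()}


def diff_listings(old, new):
    """Return (added, removed, resized) file names between two listings."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    resized = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return added, removed, resized


# Sample data: the erratum23 line is from the thread; the
# linux-kernel-test.patch lines use made-up sizes and dates.
BASE = """\
-rw-r--r-- 1 mockbuilmockbuil 14468 Dec 8 2004 linux-2.6.9-net-e1000-erratum23.patch
-rw-r--r-- 1 mockbuilmockbuil 100 Jan 1 2009 linux-kernel-test.patch
"""
TEST = """\
-rw-r--r-- 1 mockbuilmockbuil 14468 Dec 8 2004 linux-2.6.9-net-e1000-erratum23.patch
-rw-r--r-- 1 mockbuilmockbuil 5000 Jan 1 2009 linux-kernel-test.patch
"""

if __name__ == "__main__":
    base, test = parse_listing(BASE), parse_listing(TEST)
    print("filtered on e1000:", diff_listings(only(base, "e1000"), only(test, "e1000")))
    print("unfiltered:       ", diff_listings(base, test))
```

A size difference only hints that a file changed; confirming which patches a test kernel carries would still require unpacking the SRPM and reading the file.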
With the patch, the e1000 module falls back to legacy interrupts, but there is no traffic at all (stats at 0, tcpdump empty).
Module version: 7.3.20-k3-NAPI
e1000 binds eth0 and eth1 (disabled). forcedeth binds eth2.
/proc/interrupts:
CPU0
0: 388428 IO-APIC-edge timer
1: 2337 IO-APIC-edge i8042
7: 3 IO-APIC-edge parport0
8: 1 IO-APIC-edge rtc
9: 0 IO-APIC-level acpi
12: 2355 IO-APIC-edge i8042
15: 3054 IO-APIC-edge ide1
177: 0 IO-APIC-level eth0
185: 0 IO-APIC-level libata
193: 235 IO-APIC-level HDA Intel, ohci_hcd
201: 35190 IO-APIC-level ehci_hcd, eth2
209: 7823 IO-APIC-level libata
NMI: 0
LOC: 388274
ERR: 0
MIS: 0
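
The key observation in the output above is that the eth0 line shows a count of 0 while eth2 (forcedeth) is receiving interrupts. A small sketch of extracting per-device counts from text in this layout follows. The function name `irq_counts` is hypothetical, and the parser assumes the 2.6.9-era format shown here (numeric IRQ, one counter per CPU, a single controller-type token, then comma-separated device names); other kernels may lay the file out differently.

```python
def irq_counts(text):
    """Return {device_name: interrupt_total} from /proc/interrupts-style text."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 [CPU1 ...]
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Skip summary rows such as NMI/LOC/ERR/MIS and malformed lines.
        if not label.strip().isdigit() or len(fields) < ncpus + 2:
            continue
        total = sum(int(f) for f in fields[:ncpus])
        # fields[ncpus] is the controller type; the rest are device names.
        for dev in " ".join(fields[ncpus + 1:]).split(","):
            counts[dev.strip()] = total
    return counts


# A few lines copied from the output above.
SAMPLE = """\
CPU0
0: 388428 IO-APIC-edge timer
177: 0 IO-APIC-level eth0
193: 235 IO-APIC-level HDA Intel, ohci_hcd
201: 35190 IO-APIC-level ehci_hcd, eth2
NMI: 0
"""

if __name__ == "__main__":
    for dev, n in sorted(irq_counts(SAMPLE).items()):
        print(f"{dev}: {n}")
```

Devices sharing an IRQ line (ehci_hcd and eth2 here) are reported with the same total, since the file does not split the count between them.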
To summarize:
- 2.6.9-80.EL (7.3.20-k2-NAPI): not OK
- 2.6.9-80.EL + e1000-patch (7.3.20-k3-NAPI): not OK
- 2.6.9-80.EL + e1000-dkms (7.5.6-NAPI): OK (legacy IRQ)

Regards

Created attachment 330927 [details]
nvidia-fix.patch
Laurent, thanks for the feedback. I have one more patch that might be interesting to try. This is an attempt to address some MSI/HT issues that have become apparent lately. I haven't tested this patch at all since I don't have an offending system, but I think it will be OK.
This patch makes the compilation fail with the following errors:

drivers/pci/quirks.c: In function `quirk_find_ht_capability':
drivers/pci/quirks.c:1713: error: 'pos' redeclared as different kind of symbol
drivers/pci/quirks.c:1711: error: previous definition of 'pos' was here
drivers/pci/quirks.c:1730: warning: passing arg 3 of `pci_read_config_byte' from incompatible pointer type
drivers/pci/quirks.c: In function `nv_msi_ht_cap_quirk':
drivers/pci/quirks.c:1746: warning: implicit declaration of function `pci_get_bus_and_slot'
drivers/pci/quirks.c:1746: warning: assignment makes pointer from integer without a cast
make[2]: *** [drivers/pci/quirks.o] Error 1
make[1]: *** [drivers/pci] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [drivers] Error 2
error: Bad exit status from /home/buildsys/rpmbuild/tmp/rpm-tmp.55710 (%build)

The redeclaration of pos does not look right:

+static int __devinit quirk_find_ht_capability(struct pci_dev *dev, int pos, int ht_cap)
+{
+ u8 pos;

After removing the redefinition, the function pci_get_bus_and_slot is undefined!

../..
CHK include/linux/compile.h
UPD include/linux/compile.h
drivers/built-in.o(.text+0x4540): In function `nv_msi_ht_cap_quirk':
drivers/pci/quirks.c:1746: undefined reference to `pci_get_bus_and_slot'

By the way, since the dkms version of e1000 runs well, a patch against the e1000 sources should be sufficient. The current e1000-msi patch must be missing something ;-)

Created attachment 331054 [details]
e1000-msi-test-and-switch-to-intx.patch
I think I found the problem with my original patch. Please replace the previous 'e1000-msi' patch with this one and forget that non-compiling patch I uploaded yesterday. :-)
OK, it's better now with the last patch (+ 'adapter->have_msi = 1;'):

e1000: no version for "struct_module" found: kernel tainted.
Intel(R) PRO/1000 Network Driver - version 7.3.20-k3-NAPI
Copyright (c) 1999-2006 Intel Corporation.
ACPI: PCI interrupt 0000:02:00.0[A] -> GSI 16 (level, low) -> IRQ 177
PCI: Setting latency timer of device 0000:02:00.0 to 64
e1000: 0000:02:00.0: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:15:17:24:45:86
divert: allocating divert_blk for eth0
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
ACPI: PCI interrupt 0000:02:00.1[B] -> GSI 16 (level, low) -> IRQ 177
PCI: Setting latency timer of device 0000:02:00.1 to 64
e1000: 0000:02:00.1: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:15:17:24:45:87
divert: allocating divert_blk for eth1
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
ip_tables: (C) 2000-2002 Netfilter core team
e1000: eth1: e1000_test_msi: MSI interrupt test failed, using legacy interrupt.
ip_tables: (C) 2000-2002 Netfilter core team
e1000: eth0: e1000_test_msi: MSI interrupt test failed, using legacy interrupt.
e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
e1000: eth0: e1000_watchdog_task: 10/100 speed: disabling TSO
device eth0 entered promiscuous mode
device eth0 left promiscuous mode
ip_tables: (C) 2000-2002 Netfilter core team
e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
e1000: eth0: e1000_watchdog_task: 10/100 speed: disabling TSO
e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
e1000: eth0: e1000_watchdog_task: 10/100 speed: disabling TSO

And eth0 can retrieve its dynamic address. I will try static speed and bonding. Smells good!

Regards

Excellent! I noticed that my patch was missing one line that allowed the previously requested interrupt to be correctly disabled, so I was convinced this would fix the problem.
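
As a rough illustration of the behaviour discussed in this thread (test whether MSI is delivered, and if not, release the MSI interrupt before requesting a legacy INTx one), here is a small user-space simulation. It is not the e1000 driver code: the class `FakeNic`, the function `bring_up`, and all method names are hypothetical, and treating the "missing line" as the release step is only one plausible reading of the comment above.

```python
class FakeNic:
    """Hypothetical stand-in for a network adapter (not the e1000 driver)."""

    def __init__(self, msi_delivers):
        self.msi_delivers = msi_delivers  # does the platform deliver MSI?
        self.mode = None                  # "msi" or "intx"
        self.requested = []               # interrupts currently requested
        self.log = []

    def request_irq(self, mode):
        self.requested.append(mode)
        self.mode = mode

    def free_irq(self, mode):
        self.requested.remove(mode)

    def fire_test_interrupt(self):
        """Return True if a self-generated test interrupt would arrive."""
        return self.mode == "intx" or self.msi_delivers


def bring_up(nic):
    """Try MSI first; fall back to legacy INTx if the test interrupt is lost."""
    nic.request_irq("msi")
    if nic.fire_test_interrupt():
        return "msi"
    nic.log.append("MSI interrupt test failed, using legacy interrupt.")
    # Release the MSI interrupt before falling back, so that only the
    # legacy interrupt remains requested afterwards.
    nic.free_irq("msi")
    nic.request_irq("intx")
    return "intx"


if __name__ == "__main__":
    for works in (True, False):
        nic = FakeNic(msi_delivers=works)
        print(works, bring_up(nic), nic.requested, nic.log)
```

The point of the per-device test is the one made earlier in the thread: only the adapter that fails the test gives up MSI, rather than MSI being disabled for the whole system.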
I will work to get this added to the upcoming update.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Also, bonding and static negotiation both work. Thanks

Requested exception and added to tracker BZ.

Committed in 82.EL. RPMS are available at http://people.redhat.com/vgoyal/rhel4/

~~ Attention Partners! Snap 1 Released ~~

RHEL 4.8 Snapshot 1 has been released on partners.redhat.com. There should be a fix present, which addresses this bug. NOTE: there is only a short time left to test, please test and report back results on this bug at your earliest convenience.

If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have found a NEW bug, clone this bug and describe the issues you encountered. Further questions can be directed to your Red Hat Partner Manager.

If you have VERIFIED the bug fix, please select your PartnerID from the Verified field above. Please leave a comment with your test result details. Include which arches were tested, the package version and any applicable logs.

- Red Hat QE Partner Management

~~ Attention! Snap 4 Released ~~

RHEL 4.8 Snapshot 4 has been released on partners.redhat.com. There should be a fix present that addresses this bug. NOTE: there is only a short time left to test, please test and report back results on this bug ASAP.

The latest kernel build can be obtained here: http://people.redhat.com/vgoyal/rhel4/

If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have found a NEW bug, clone this bug and describe the issues you encountered. Further questions can be directed to your Red Hat Partner Manager.

If you have VERIFIED the bug fix, please select your PartnerID from the Verified field above. Please leave a comment with your test result details. Include which arches were tested, the package version and any applicable logs.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html