Description of problem:
The interface cannot be used because the driver fails to probe the NIC. See the messages log below:

Sep 6 15:30:52 kernel: ACPI: PCI interrupt for device 0000:0d:00.0 disabled
Sep 6 15:30:52 kernel: e1000e: probe of 0000:0d:00.0 failed with error -5
Sep 6 15:30:52 kernel: PCI: Enabling device 0000:0d:00.1 (0040 -> 0043)
Sep 6 15:30:52 kernel: ACPI: PCI Interrupt 0000:0d:00.1[B] -> GSI 31 (level, low) -> IRQ 193
Sep 6 15:30:52 kernel: 0000:0d:00.1: 0000:0d:00.1: Failed to initialize MSI interrupts. Falling back to legacy interrupts.
Sep 6 15:30:55 kernel: 0000:0d:00.1: 0000:0d:00.1: The NVM Checksum Is Not Valid
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sep 6 15:30:55 kernel: ACPI: PCI interrupt for device 0000:0d:00.1 disabled
Sep 6 15:30:55 kernel: e1000e: probe of 0000:0d:00.1 failed with error -5

HP ProLiant DL785 G6
https://hardware.redhat.com/show.cgi?id=538598
Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 05) 8086:105e

Looking at the source code, the driver does the following:

drivers/net/e1000e/netdev.c:
5079          * systems with ASPM and others may see the checksum fail on the first
5080          * attempt. Let's give it a few tries
5081          */
5082         for (i = 0;; i++) {
5083                 if (e1000_validate_nvm_checksum(&adapter->hw) >= 0)
5084                         break;
5085                 if (i == 2) {
5086                         e_err("The NVM Checksum Is Not Valid\n");
5087                         err = -EIO;
5088                         goto err_eeprom;
5089                 }
5090         }

To get the ethtool -e output, a test kernel with lines #5087 and #5088 commented out was provided, but unfortunately that kernel hangs and the watchdog prints the following:

BUG: soft lockup - CPU#45 stuck for 10s! [insmod:14756]
...
Pid: 14756, comm: insmod Tainted: P 2.6.18-194.el5.gss00320226.1 #1
RIP: 0010:[<ffffffff887edcd9>] [<ffffffff887edcd9>] :e1000e:e1000e_poll_eerd_eewr_done+0x17/0x43
...
Call Trace:
 [<ffffffff887edcf5>] :e1000e:e1000e_poll_eerd_eewr_done+0x33/0x43
 [<ffffffff887ef13c>] :e1000e:e1000e_read_nvm_eerd+0x52/0x85
 [<ffffffff887ee2b4>] :e1000e:e1000e_validate_nvm_checksum_generic+0x26/0x50
 [<ffffffff887e9273>] :e1000e:e1000_validate_nvm_checksum_82571+0x8d/0x94
 [<ffffffff887e9c73>] :e1000e:e1000_reset_hw_82571+0xd6/0x145
 [<ffffffff887f9eb5>] :e1000e:e1000_probe+0x554/0xb7f
 [<ffffffff8015e733>] pci_device_probe+0x104/0x184
 [<ffffffff801c8873>] driver_probe_device+0x52/0xaa
 [<ffffffff801c89a2>] __driver_attach+0x65/0xb6
 [<ffffffff801c893d>] __driver_attach+0x0/0xb6
 [<ffffffff801c817a>] bus_for_each_dev+0x43/0x6e
 [<ffffffff801c7db6>] bus_add_driver+0x76/0x110
 [<ffffffff8015ea4f>] __pci_register_driver+0x51/0xa6
 [<ffffffff800a7fe0>] sys_init_module+0xaf/0x1f2
 [<ffffffff8005e28d>] tracesys+0xd5/0xe0

The kernel and the patch (linux-kernel-test.patch) are in CVS.

Version-Release number of selected component (if applicable):
kernel-2.6.18-164.el5 (rhel-x86_64-server-5)

How reproducible:
Always

Steps to Reproduce:
1. Not known
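For reference, the check that fails three times in the probe loop above can be modelled in userspace. The following is only a sketch, assuming the generic e1000e rule as I understand it: the 16-bit sum of NVM words 0x00 through 0x3F must equal 0xBABA. Those two constants come from my reading of the driver and are not shown in this report. The function names are hypothetical and do not exist in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed constants: not quoted in this report, taken from my reading
 * of the e1000e generic checksum routine. */
#define NVM_CHECKSUM_WORD 0x3F   /* last word covered by the checksum */
#define NVM_SUM           0xBABA /* value the 16-bit sum must equal */

/* Sum `count` words with 16-bit wraparound. */
uint16_t nvm_sum_words(const uint16_t *words, size_t count)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum = (uint16_t)(sum + words[i]);
    return sum;
}

/* Return 1 if words 0x00..0x3F sum to NVM_SUM, 0 otherwise.
 * `words` must hold at least 64 entries. */
int nvm_checksum_ok(const uint16_t *words)
{
    return nvm_sum_words(words, NVM_CHECKSUM_WORD + 1) == NVM_SUM;
}

/* Value that word 0x3F would need for the image to pass,
 * computed from words 0x00..0x3E only. */
uint16_t nvm_expected_checksum_word(const uint16_t *words)
{
    return (uint16_t)(NVM_SUM - nvm_sum_words(words, NVM_CHECKSUM_WORD));
}
```

Given a 64-word image, comparing the stored word 0x3F against nvm_expected_checksum_word() would show whether only the checksum word is off or whether the image as a whole looks corrupted.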
Created attachment 462421: 194.el5-e1000e-disables-NVR-check.patch
Created attachment 462422: dmesg while loading the patched driver
We will try to get a dump using flashrom utility.
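If a dump can be obtained, a first sanity check could look like the sketch below. It assumes the dump is a raw byte image in which each NVM word is stored low byte first, starting at word 0. That layout is an assumption and should be verified against the actual tool output. The helper names are hypothetical. The "blank" test is only a heuristic: an image that reads back as all 0xFFFF or all 0x0000 suggests an erased or unreadable part, not merely a bad checksum word.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode little-endian byte pairs into 16-bit words.
 * Assumption: the dump stores each word low byte first.
 * Returns the number of words written to `words`. */
size_t dump_to_words(const uint8_t *bytes, size_t len,
                     uint16_t *words, size_t max_words)
{
    size_t n = len / 2;
    if (n > max_words)
        n = max_words;
    for (size_t i = 0; i < n; i++)
        words[i] = (uint16_t)(bytes[2 * i] | (bytes[2 * i + 1] << 8));
    return n;
}

/* Heuristic: return 1 if every word is 0xFFFF or every word is 0x0000
 * (or the image is empty), 0 otherwise. */
int nvm_looks_blank(const uint16_t *words, size_t count)
{
    if (count == 0)
        return 1;
    uint16_t first = words[0];
    if (first != 0x0000 && first != 0xFFFF)
        return 0;
    for (size_t i = 1; i < count; i++)
        if (words[i] != first)
            return 0;
    return 1;
}
```

Running nvm_looks_blank() on the first 64 decoded words before looking at the checksum helps separate "the EEPROM content is gone" from "the content is there but the checksum does not match".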
Flavio, Sorry for the delayed response. Do you still have this issue?
Hi Tushar, The customer has replaced the card and it is working for now. Flavio