Bug 1115562
Summary: | e1000e: it takes random and sometimes very long time for carrier to appear | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Jiri Koten <jkoten>
Component: | kernel | Assignee: | John Greene <jogreene>
kernel sub component: | NIC Drivers | QA Contact: | Network QE <network-qe>
Status: | CLOSED DUPLICATE | |
Severity: | medium | |
Priority: | medium | CC: | dcbw, dnelson, jklimes, jogreene, jpirko, vhumpa
Version: | 7.0 | |
Target Milestone: | rc | |
Hardware: | x86_64 | |
OS: | Linux | |
Doc Type: | Bug Fix | |
Last Closed: | 2014-07-07 18:52:40 UTC | Type: | Bug

I have managed to reproduce the problem with:

$ nmcli dev dis enp0s25
$ nmcli con add type bridge ifname BR0
$ nmcli con add type bridge-slave ifname enp0s25 master BR0
$ nmcli con up bridge-slave-enp0s25

$ dmesg
[10067.835422] device enp0s25 entered promiscuous mode
[10067.835482] BR0: port 1(enp0s25) entered listening state
[10067.835487] BR0: port 1(enp0s25) entered listening state
[10069.480249] e1000e: enp0s25 NIC Link is Down
[10069.480316] BR0: port 1(enp0s25) entered disabled state

$ nmcli d
DEVICE             TYPE      STATE                                  CONNECTION
enp0s25            ethernet  connected                              bridge-slave-enp0s25
BR0                bridge    connecting (getting IP configuration)  bridge-BR0
00:17:EA:82:E5:83  bt        disconnected

$ ip link
186: enp0s25: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master BR0 state DOWN mode DEFAULT qlen 1000
    link/ether 3c:97:0e:18:2e:a1 brd ff:ff:ff:ff:ff:ff
188: BR0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT
    link/ether 3c:97:0e:18:2e:a1 brd ff:ff:ff:ff:ff:ff

However, further debugging showed that there is a problem with carrier detection in the kernel/driver in general. Sometimes it works fine, but often it doesn't. To test this, I took NetworkManager and bridges out of the picture and focused on the Ethernet device alone.

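The manual carrier check described in this comment (bring the interface up, then read /sys/class/net/<iface>/carrier) could be scripted roughly as follows. This is only a sketch: the function name wait_for_carrier and the SYSFS_NET override are hypothetical names introduced here for illustration, and the elapsed time is an approximation based on counting one-second sleeps.

```shell
#!/bin/sh
# Sketch only: poll an interface's carrier attribute until it reads "1"
# and report roughly how long that took.
# wait_for_carrier and SYSFS_NET are hypothetical names, not from the report.

wait_for_carrier() {
    iface=$1
    timeout=${2:-300}     # seconds to wait before giving up
    carrier_file="${SYSFS_NET:-/sys/class/net}/$iface/carrier"
    elapsed=0
    while :; do
        # Reading carrier fails with "Invalid argument" while the interface
        # is administratively down, so errors count as "no carrier yet".
        state=$(cat "$carrier_file" 2>/dev/null)
        if [ "$state" = "1" ]; then
            echo "carrier up after ${elapsed}s"
            return 0
        fi
        [ "$elapsed" -ge "$timeout" ] && break
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "no carrier after ${timeout}s"
    return 1
}

# Example (as root, after reloading the driver):
#   ip link set up dev enp0s25 && wait_for_carrier enp0s25 300
```

Running this several times after "rmmod e1000e; modprobe e1000e" would give a rough distribution of the delay instead of a single observation.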
Here are my findings:
=====================

$ sudo systemctl mask NetworkManager.service
$ sudo systemctl stop NetworkManager.service
$ brctl show
bridge name     bridge id               STP enabled     interfaces

$ sudo rmmod e1000e
$ sudo modprobe e1000e
$ ip link
166: enp0s25: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
    link/ether 3c:97:0e:18:2e:a1 brd ff:ff:ff:ff:ff:ff
$ cat /sys/class/net/enp0s25/operstate
down
$ cat /sys/class/net/enp0s25/flags
0x1002
$ cat /sys/class/net/enp0s25/carrier
cat: /sys/class/net/enp0s25/carrier: Invalid argument

$ sudo ip link set up dev enp0s25
$ ip link
166: enp0s25: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT qlen 1000
    link/ether 3c:97:0e:18:2e:a1 brd ff:ff:ff:ff:ff:ff
$ cat /sys/class/net/enp0s25/carrier
0

At this point the interface cannot pass traffic (DHCP won't work). Sometimes, after a very long time, the carrier does appear:

$ cat /sys/class/net/enp0s25/carrier
1
$ ip link
166: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 3c:97:0e:18:2e:a1 brd ff:ff:ff:ff:ff:ff

$ dmesg
[13290.277467] e1000e 0000:00:19.0 enp0s25: removed PHC
[13295.844982] e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
[13295.844985] e1000e: Copyright(c) 1999 - 2013 Intel Corporation.
[13295.845133] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[13295.845153] e1000e 0000:00:19.0: irq 41 for MSI/MSI-X
[13296.040269] e1000e 0000:00:19.0 eth0: registered PHC clock
[13296.040273] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 3c:97:0e:18:2e:a1
[13296.040275] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
[13296.040308] e1000e 0000:00:19.0 eth0: MAC: 10, PHY: 11, PBA No: 1000FF-0FF
[13296.052193] systemd-udevd[6499]: renamed network interface eth0 to enp0s25
[13306.244860] e1000e 0000:00:19.0: irq 41 for MSI/MSI-X
[13306.345836] e1000e 0000:00:19.0: irq 41 for MSI/MSI-X
[13306.346071] IPv6: ADDRCONF(NETDEV_UP): enp0s25: link is not ready
[13393.371338] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[13393.371378] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s25: link becomes ready

or

[14891.291289] e1000e 0000:00:19.0: irq 41 for MSI/MSI-X
[14891.391922] e1000e 0000:00:19.0: irq 41 for MSI/MSI-X
[14891.392106] IPv6: ADDRCONF(NETDEV_UP): enp0s25: link is not ready
[15077.664101] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[15077.664140] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s25: link becomes ready

Sometimes the carrier is detected quickly (as it should be):

$ dmesg
[13561.089504] e1000e 0000:00:19.0 enp0s25: removed PHC
[13567.200951] e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
[13567.200956] e1000e: Copyright(c) 1999 - 2013 Intel Corporation.
[13567.201170] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[13567.201758] e1000e 0000:00:19.0: irq 41 for MSI/MSI-X
[13567.398571] e1000e 0000:00:19.0 eth0: registered PHC clock
[13567.398576] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 3c:97:0e:18:2e:a1
[13567.398578] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
[13567.398609] e1000e 0000:00:19.0 eth0: MAC: 10, PHY: 11, PBA No: 1000FF-0FF
[13567.412471] systemd-udevd[6711]: renamed network interface eth0 to enp0s25
[13626.424210] e1000e 0000:00:19.0: irq 41 for MSI/MSI-X
[13626.524721] e1000e 0000:00:19.0: irq 41 for MSI/MSI-X
[13626.524924] IPv6: ADDRCONF(NETDEV_UP): enp0s25: link is not ready
[13630.042548] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[13630.042587] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s25: link becomes ready

$ modinfo e1000e
filename:       /lib/modules/3.10.0-125.el7.x86_64/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
version:        2.3.2-k
license:        GPL
description:    Intel(R) PRO/1000 Network Driver
author:         Intel Corporation, <linux.nics>
srcversion:     E9F7E754F6F3A1AD906634C
alias:          pci:v00008086d000015A3sv*sd*bc*sc*i*
alias:          pci:v00008086d000015A2sv*sd*bc*sc*i*
alias:          pci:v00008086d000015A1sv*sd*bc*sc*i*
alias:          pci:v00008086d000015A0sv*sd*bc*sc*i*
alias:          pci:v00008086d00001559sv*sd*bc*sc*i*
alias:          pci:v00008086d0000155Asv*sd*bc*sc*i*
alias:          pci:v00008086d0000153Bsv*sd*bc*sc*i*
alias:          pci:v00008086d0000153Asv*sd*bc*sc*i*
alias:          pci:v00008086d00001503sv*sd*bc*sc*i*
alias:          pci:v00008086d00001502sv*sd*bc*sc*i*
alias:          pci:v00008086d000010F0sv*sd*bc*sc*i*
alias:          pci:v00008086d000010EFsv*sd*bc*sc*i*
alias:          pci:v00008086d000010EBsv*sd*bc*sc*i*
alias:          pci:v00008086d000010EAsv*sd*bc*sc*i*
alias:          pci:v00008086d00001525sv*sd*bc*sc*i*
alias:          pci:v00008086d000010DFsv*sd*bc*sc*i*
alias:          pci:v00008086d000010DEsv*sd*bc*sc*i*
alias:          pci:v00008086d000010CEsv*sd*bc*sc*i*
alias:          pci:v00008086d000010CDsv*sd*bc*sc*i*
alias:          pci:v00008086d000010CCsv*sd*bc*sc*i*
alias:          pci:v00008086d000010CBsv*sd*bc*sc*i*
alias:          pci:v00008086d000010F5sv*sd*bc*sc*i*
alias:          pci:v00008086d000010BFsv*sd*bc*sc*i*
alias:          pci:v00008086d000010E5sv*sd*bc*sc*i*
alias:          pci:v00008086d0000294Csv*sd*bc*sc*i*
alias:          pci:v00008086d000010BDsv*sd*bc*sc*i*
alias:          pci:v00008086d000010C3sv*sd*bc*sc*i*
alias:          pci:v00008086d000010C2sv*sd*bc*sc*i*
alias:          pci:v00008086d000010C0sv*sd*bc*sc*i*
alias:          pci:v00008086d00001501sv*sd*bc*sc*i*
alias:          pci:v00008086d00001049sv*sd*bc*sc*i*
alias:          pci:v00008086d0000104Dsv*sd*bc*sc*i*
alias:          pci:v00008086d0000104Bsv*sd*bc*sc*i*
alias:          pci:v00008086d0000104Asv*sd*bc*sc*i*
alias:          pci:v00008086d000010C4sv*sd*bc*sc*i*
alias:          pci:v00008086d000010C5sv*sd*bc*sc*i*
alias:          pci:v00008086d0000104Csv*sd*bc*sc*i*
alias:          pci:v00008086d000010BBsv*sd*bc*sc*i*
alias:          pci:v00008086d00001098sv*sd*bc*sc*i*
alias:          pci:v00008086d000010BAsv*sd*bc*sc*i*
alias:          pci:v00008086d00001096sv*sd*bc*sc*i*
alias:          pci:v00008086d0000150Csv*sd*bc*sc*i*
alias:          pci:v00008086d000010F6sv*sd*bc*sc*i*
alias:          pci:v00008086d000010D3sv*sd*bc*sc*i*
alias:          pci:v00008086d0000109Asv*sd*bc*sc*i*
alias:          pci:v00008086d0000108Csv*sd*bc*sc*i*
alias:          pci:v00008086d0000108Bsv*sd*bc*sc*i*
alias:          pci:v00008086d0000107Fsv*sd*bc*sc*i*
alias:          pci:v00008086d0000107Esv*sd*bc*sc*i*
alias:          pci:v00008086d0000107Dsv*sd*bc*sc*i*
alias:          pci:v00008086d000010B9sv*sd*bc*sc*i*
alias:          pci:v00008086d000010D5sv*sd*bc*sc*i*
alias:          pci:v00008086d000010DAsv*sd*bc*sc*i*
alias:          pci:v00008086d000010D9sv*sd*bc*sc*i*
alias:          pci:v00008086d00001060sv*sd*bc*sc*i*
alias:          pci:v00008086d000010A5sv*sd*bc*sc*i*
alias:          pci:v00008086d000010BCsv*sd*bc*sc*i*
alias:          pci:v00008086d000010A4sv*sd*bc*sc*i*
alias:          pci:v00008086d0000105Fsv*sd*bc*sc*i*
alias:          pci:v00008086d0000105Esv*sd*bc*sc*i*
depends:        ptp
intree:         Y
vermagic:       3.10.0-125.el7.x86_64 SMP mod_unload modversions
signer:         Red Hat Enterprise Linux kernel signing key
sig_key:        00:88:B6:91:0C:E8:48:58:4D:81:B5:E0:4D:EE:EE:9A:DF:A9:6D:72
sig_hashalgo:   sha256
parm:           debug:Debug level (0=none,...,16=all) (int)
parm:           copybreak:Maximum size of packet that is copied to a new buffer on receive (uint)
parm:           TxIntDelay:Transmit Interrupt Delay (array of int)
parm:           TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm:           RxIntDelay:Receive Interrupt Delay (array of int)
parm:           RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm:           InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm:           IntMode:Interrupt Mode (array of int)
parm:           SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm:           KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm:           WriteProtectNVM:Write-protect NVM [WARNING: disabling this can lead to corrupted NVM] (array of int)
parm:           CrcStripping:Enable CRC Stripping, disable if your BMC needs the CRC (array of int)

Transferring to kernel for the moment so that we have an analysis from the driver's perspective.

This clearly looks like a NIC/driver issue. Note that the issue happens even when the bridge is out of the picture.

Jirka & Jirka, could you please provide the output of lspci and "ethtool -i"? Thanks.

Created attachment 914472 [details]
lspci and ethtool output
# lspci -vnn
01:00.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme BCM5764M Gigabit Ethernet PCIe [14e4:1684] (rev 10)
    Subsystem: Hewlett-Packard Company Device [103c:1309]
    Flags: bus master, fast devsel, latency 0, IRQ 68
    Memory at f6000000 (64-bit, non-prefetchable) [size=64K]
    Capabilities: [48] Power Management version 3
    Capabilities: [40] Vital Product Data
    Capabilities: [60] Vendor Specific Information: Len=6c <?>
    Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [cc] Express Endpoint, MSI 00
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [13c] Virtual Channel
    Capabilities: [160] Device Serial Number d4-85-64-ff-fe-a8-92-c0
    Capabilities: [16c] Power Budgeting <?>
    Kernel driver in use: tg3

# ethtool -i enp1s0
driver: tg3
version: 3.136
firmware-version: 5764m-v3.35
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

Created attachment 914231 [details]
NetworkManager journal log

Description of problem:
When I configure the bridge using nmcli and then activate the bridge slave connection, it gets stuck on "getting IP configuration". NM never gets an IP address configured for the bridge. The slave is a physical Ethernet device.

The Ethernet device seems to be down; at least that is what I get when I try to start the bridge with "ifup bridge-br0" -> "... no link present. Check cable?"

# nmcli c
NAME                 UUID                                  TYPE            DEVICE
bridge-br0           01bf26e2-eed3-4ac3-a6cf-240d19dddda1  bridge          br0
virbr0               4d147531-bd2a-489d-888b-de097ea0aeec  bridge          virbr0
ethernet-1           57b6fd48-9ac5-4190-8743-e9f305e5710c  802-3-ethernet  --
bridge-slave-enp1s0  aa953e70-adb2-41d6-8ab2-2c1c82d33574  802-3-ethernet  enp1s0

# nmcli d
DEVICE  TYPE      STATE                                  CONNECTION
enp1s0  ethernet  connected                              bridge-slave-enp1s0
br0     bridge    connecting (getting IP configuration)  bridge-br0
virbr0  bridge    connecting (getting IP configuration)  virbr0
lo      loopback  unmanaged                              --

# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.d48564a892c0       yes             enp1s0
virbr0          8000.000000000000       yes

Version-Release number of selected component (if applicable):
kernel-3.10.0-123.4.2.el7
NetworkManager-0.9.9.1-23.git20140326.4dba720.el7_0

How reproducible:
100%

Steps to Reproduce:
1. # nmcli con add type bridge ifname br0
2. # nmcli con add type bridge-slave ifname enp1s0 master br0
3. # nmcli dev disconnect enp1s0
4. # nmcli con up bridge-slave-enp1s0

Actual results:
Bridge doesn't get an IP address.

Expected results:
Bridge obtains an IP address successfully.

Additional info:
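
The carrier delays in the e1000e dmesg excerpts quoted earlier in this report can be put into numbers by subtracting the timestamp of "link is not ready" from that of the following "NIC Link is Up". The sketch below does this with awk; the function name carrier_delays is a hypothetical helper introduced for illustration, and it assumes the default "[seconds.microseconds]" dmesg timestamp prefix.

```shell
#!/bin/sh
# Sketch only: read dmesg-style lines on stdin and print, in seconds, the gap
# between each "link is not ready" message and the next "NIC Link is Up".
# carrier_delays is a hypothetical name, not part of any existing tool.

carrier_delays() {
    awk '
        # Return the numeric timestamp from a "[12345.678901]" line prefix.
        function ts(line) {
            sub(/^\[ */, "", line)
            sub(/\].*/, "", line)
            return line + 0
        }
        /link is not ready/ { start = ts($0); waiting = 1 }
        /NIC Link is Up/ && waiting {
            printf "%.3f\n", ts($0) - start
            waiting = 0
        }
    '
}

# Example:
#   dmesg | carrier_delays
```

Applied to the three excerpts in this report, the gaps come out as 87.025 s and 186.272 s for the slow cases and 3.518 s for the fast one.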