Bug 1067126
Summary: Virt-manager doesn't configure bridge for VM
Product: Red Hat Enterprise Linux 7
Component: kernel
Status: CLOSED DUPLICATE
Severity: urgent
Priority: medium
Version: 7.0
Target Milestone: rc
Target Release: 7.0
Hardware: Unspecified
OS: Unspecified
Reporter: Martin <mholec>
Assignee: John Greene <jogreene>
QA Contact: zenghui.shi <zshi>
CC: bruce.w.allan, codong, davidx.m.ertman, dyuan, fkluknav, gscrivan, hyao, jasowang, jaster, jkoten, jogreene, john.ronciak, juzhang, kernel-qe, knoel, kzhang, lilu, michele, mprivozn, mst, mzhan, ohudlick, peterm, qiguo, toracat, tpelka, tzheng, vyasevic, zshi
Keywords: Regression, Reopened, TestBlocker
Flags: jogreene: needinfo-
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-01-23 13:34:43 UTC
Bug Blocks: 1083244, 1113511

Description
Martin
2014-02-19 18:44:25 UTC
Regression - This is working on RHEL 6.x.
TestBlocker - This breaks the QE workflow when installing VMs from PXE.

journalctl:

    Feb 19 19:53:54 dhcp130-132.brq.redhat.com NetworkManager[1138]: <info> (macvtap0): link connected
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com NetworkManager[1138]: <info> (macvtap0): carrier is ON
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com NetworkManager[1138]: <info> (macvtap0): new Macvlan device (driver: 'unknown' ifindex: 11)
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com NetworkManager[1138]: <info> (macvtap0): exported as /org/freedesktop/NetworkManager/Devices/10
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com libvirtd[1688]: User record for user '107' was not found: No such file or directory
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com libvirtd[1688]: Group record for user '107' was not found: No such file or directory
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com systemd[1]: Starting Virtual Machine qemu-RHEL7-Beta.
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com systemd-machined[5512]: New machine qemu-RHEL7-Beta.
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com systemd[1]: Started Virtual Machine qemu-RHEL7-Beta.
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com libvirtd[1688]: User record for user '107' was not found: No such file or directory
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com libvirtd[1688]: Group record for user '107' was not found: No such file or directory
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com libvirtd[1688]: User record for user '107' was not found: No such file or directory
    Feb 19 19:53:54 dhcp130-132.brq.redhat.com libvirtd[1688]: Group record for user '107' was not found: No such file or directory
    Feb 19 19:53:55 dhcp130-132.brq.redhat.com libvirtd[1688]: User record for user '107' was not found: No such file or directory
    Feb 19 19:53:55 dhcp130-132.brq.redhat.com libvirtd[1688]: Group record for user '107' was not found: No such file or directory
    Feb 19 19:53:55 dhcp130-132.brq.redhat.com libvirtd[1688]: User record for user '107' was not found: No such file or directory
    Feb 19 19:53:55 dhcp130-132.brq.redhat.com libvirtd[1688]: Group record for user '107' was not found: No such file or directory
    Feb 19 19:53:55 dhcp130-132.brq.redhat.com libvirtd[1688]: User record for user '107' was not found: No such file or directory
    Feb 19 19:53:55 dhcp130-132.brq.redhat.com libvirtd[1688]: Group record for user '107' was not found: No such file or directory
    Feb 19 19:53:55 dhcp130-132.brq.redhat.com kvm[9399]: 1 guest now active
    Feb 19 19:53:56 dhcp130-132.brq.redhat.com avahi-daemon[952]: Registering new address record for fe80::5054:ff:fefa:cb07 on macvtap0.*.

Tested with: virt-manager-0.10.0-16.el7.noarch

1. Create a new guest from virt-manager.
2. In "Step 1" choose "Network Boot (PXE)".
3. In "Step 2" choose "OS type" and "Version".
4. In "Step 5" check the "Customize configuration before install" checkbox to customize the network settings later. Click "Finish".
5. Click "Virtual network interface" (icon of two blue monitors) and set "Source device" to "Host device em1:macvtap" (or eth0 instead of em1, or whatever you normally use to access the network with the PXE server).
6. Set "Source mode" to "Bridge".
7. Apply changes and begin installation; PXE boot is successful.

It seems that on my machine the macvtap0@em1 interface is in UNKNOWN state, but I can boot the guest from PXE successfully.

On the host:

    # ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    3: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
        link/ether 52:54:00:e7:8d:e4 brd ff:ff:ff:ff:ff:ff
        inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
           valid_lft forever preferred_lft forever
    4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 500
        link/ether 52:54:00:e7:8d:e4 brd ff:ff:ff:ff:ff:ff
    8: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:25:64:a6:fe:df brd ff:ff:ff:ff:ff:ff
        inet 10.66.5.145/22 brd 10.66.7.255 scope global dynamic em1
           valid_lft 50633sec preferred_lft 50633sec
        inet6 fe80::225:64ff:fea6:fedf/64 scope link
           valid_lft forever preferred_lft forever
    32: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN qlen 500
        link/ether fe:54:00:16:3e:a0 brd ff:ff:ff:ff:ff:ff
        inet6 fe80::fc54:ff:fe16:3ea0/64 scope link
           valid_lft forever preferred_lft forever
    33: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN qlen 500
        link/ether fe:54:00:08:38:d6 brd ff:ff:ff:ff:ff:ff
        inet6 fe80::fc54:ff:fe08:38d6/64 scope link
           valid_lft forever preferred_lft forever
    34: macvtap0@em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 500
        link/ether 52:54:00:ee:0e:6d brd ff:ff:ff:ff:ff:ff
        inet6 fe80::5054:ff:feee:e6d/64 scope link
           valid_lft forever preferred_lft forever

Refer to comment 5: in my environment, the macvtap network with bridge mode did work.
Is there anything I misunderstand, or would you please try again with a new environment? Thanks.

I've followed the steps in comment 5 and I've noticed some bad UDP checksums (through tcpdump -vvv):

    192.168.125.96.1024 > localhost.tftp: [udp sum ok] 40 RRQ "pxelinux.0" octet blksize 1432 tsize 0
    18:49:55.555196 IP (tos 0x0, ttl 64, id 26210, offset 0, flags [none], proto UDP (17), length 55)
        localhost.34118 > 192.168.125.96.1024: [bad udp cksum 0x7be7 -> 0x9e0a!] UDP, length 27
    18:49:55.591686 IP (tos 0x0, ttl 64, id 26211, offset 0, flags [none], proto UDP (17), length 55)
        localhost.59917 > 192.168.125.96.1024: [bad udp cksum 0x7be7 -> 0x3943!] UDP, length 27
    18:49:55.887819 IP (tos 0x0, ttl 64, id 26212, offset 0, flags [none], proto UDP (17), length 55)

Martin, can you please check if it is the same problem you are having? The following command seems to fix the problem here:

    $ sudo ethtool --offload virbr0 tx off

After that, I can use PXE without problems.

Any update on this? I am going to close this as I couldn't reproduce it here.

I have installed a clean RHEL-7.0-20140226.0 and went through the steps in comment #5. I still can't boot from PXE and I don't see any virbr or vnet interfaces. Have you tried reproducing this on a cleanly installed RHEL 7?

Interfaces on the host after starting the VM:

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether f0:de:f1:6b:1d:b8 brd ff:ff:ff:ff:ff:ff
        inet 10.34.131.127/23 brd 10.34.131.255 scope global dynamic enp0s25
           valid_lft 77638sec preferred_lft 77638sec
        inet6 2620:52:0:2282:f2de:f1ff:fe6b:1db8/64 scope global dynamic
           valid_lft 2591998sec preferred_lft 604798sec
        inet6 fe80::f2de:f1ff:fe6b:1db8/64 scope link
           valid_lft forever preferred_lft forever
    3: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
        link/ether 00:24:d7:cd:a6:d0 brd ff:ff:ff:ff:ff:ff
        inet 10.200.136.206/22 brd 10.200.139.255 scope global dynamic wlp3s0
           valid_lft 5145sec preferred_lft 5145sec
        inet6 fe80::224:d7ff:fecd:a6d0/64 scope link
           valid_lft forever preferred_lft forever
    7: macvtap0@enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 500
        link/ether 52:54:00:f5:8d:7c brd ff:ff:ff:ff:ff:ff
        inet6 fe80::5054:ff:fef5:8d7c/64 scope link
           valid_lft forever preferred_lft forever

Created attachment 870010 [details]
screenshot

My network configuration in virt-manager.
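
For readers without the screenshot, the interface settings from the steps in comment 5 ("Host device em1:macvtap", source mode "Bridge") can also be written as a command-line argument. The sketch below is an illustration under assumptions, not something posted in this bug: it assumes virt-install is installed and accepts the `type=direct,source=...,source_mode=...` form of `--network`. The helper name `macvtap_network_arg` and the guest name, memory, and disk values in the usage comment are hypothetical.

```shell
#!/bin/sh

# Build the value for virt-install's --network option for a macvtap
# ("direct") interface. Hypothetical helper, for illustration only.
# Arguments: HOST_IFACE [MODE]; MODE defaults to "bridge".
macvtap_network_arg() {
    iface=$1
    mode=${2:-bridge}
    printf 'type=direct,source=%s,source_mode=%s\n' "$iface" "$mode"
}

# Example invocation (not run here; name, memory and disk are placeholders):
#   virt-install --name pxe-test --memory 2048 --disk size=8 --pxe \
#       --network "$(macvtap_network_arg em1 bridge)"
```

Replace em1 with whichever host interface reaches the PXE server, as in step 5.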
Have you tried tcpdump (as in comment 7) to check if something is going wrong on the network?

When I run tcpdump -vvv something interesting happens: I can suddenly boot from PXE. But it's only temporary; when I interrupt tcpdump, I can't boot from PXE again. I can't run "sudo ethtool --offload virbr0 tx off" because I don't have this interface, see above.

Does it make a difference if you try that command, replacing virbr0 with the interface you use to connect to the network (enp0s25/wlp3s0)?

No change.

    # ethtool --offload enp0s25 tx off
    Actual changes:
    tx-checksumming: off
        tx-checksum-ip-generic: off
    tcp-segmentation-offload: off
        tx-tcp-segmentation: off [requested on]
        tx-tcp6-segmentation: off [requested on]

Comment 14 kind of confirms that it is a problem with your network. Can you also try "rx off" with ethtool?

Created attachment 870491 [details]
journalctl log

I've run "ethtool --offload enp0s25 rx off" as suggested by Giuseppe. After that, I lost the connection and was unable to get an IP address using the NetworkManager applet. I had to run dhclient manually from the console.
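
A likely explanation for PXE working only while tcpdump runs is that tcpdump, by default, puts the capture interface into promiscuous mode for as long as it is running. A minimal sketch for checking this, assuming an iproute2 whose `ip -d link show` prints a `promiscuity` counter; the helper name `promiscuity_from_ip_output` is hypothetical and enp0s25 is the interface name from the output above:

```shell
#!/bin/sh

# Read `ip -d link show dev IFACE` output on stdin and print the value of
# the first "promiscuity N" field found. Prints nothing if there is none.
promiscuity_from_ip_output() {
    sed -n 's/.*promiscuity \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage on a live host (not run here):
#   ip -d link show dev enp0s25 | promiscuity_from_ip_output
#
# Manual stand-in for "keep tcpdump running" while testing PXE:
#   sudo ip link set dev enp0s25 promisc on
#   ... start the guest and try PXE ...
#   sudo ip link set dev enp0s25 promisc off
```

If PXE boot works with promiscuous mode switched on by hand and fails again once it is switched off, that would point at receive address filtering on the physical NIC rather than at checksum offload.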
It seems like this issue has been reported earlier:

https://bugzilla.redhat.com/show_bug.cgi?id=844006

However, this seems like a kernel issue to me. I mean, whenever a macvtap device is created, the lower interface needs to be put into promisc mode (it's gonna be receiving packets with different MAC addresses and it cannot drop them). However, this kind of promisc mode is not exposed to userspace, as this e-mail suggests:

http://www.spinics.net/lists/netdev/msg174700.html

Can you please provide the following information:
1) Output of 'ethtool -i enp0s25'
2) Output of 'bridge fdb show dev enp0s25'
3) Output of 'ip -d link show dev macvtap0'

Thanks
-vlad

Also, please provide the output of lspci. At this point, this sounds like it might be a duplicate of Bug 1040315, but we need lspci data to verify.

Thanks
-vlad

    00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
    00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
    00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
    00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
    00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
    00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04)
    00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4)
    00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 2 (rev b4)
    00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 (rev b4)
    00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b4)
    00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
    00:1f.0 ISA bridge: Intel Corporation QM67 Express Chipset Family LPC Controller (rev 04)
    00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller (rev 04)
    00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04)
    03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
    05:00.0 System peripheral: Ricoh Co Ltd MMC/SD Host Controller (rev 07)
    0d:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04)

    # ethtool -i enp0s25
    driver: e1000e
    version: 2.3.2-k
    firmware-version: 0.13-3
    bus-info: 0000:00:19.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no

    # bridge fdb show dev enp0s25
    52:54:00:f5:8d:7c self permanent
    01:00:5e:00:00:01 self permanent
    33:33:00:00:00:01 self permanent
    33:33:ff:6b:1d:b8 self permanent
    01:00:5e:00:00:fb self permanent
    33:33:ff:f5:8d:7c self permanent

    # ip -d link show dev macvtap0
    5: macvtap0@enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 500
        link/ether 52:54:00:f5:8d:7c brd ff:ff:ff:ff:ff:ff promiscuity 0
        macvtap mode vepa

(In reply to Martin Holec from comment #22)
> 00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)

Yes, you have a NIC that was recently broken in RHEL 7 as well as upstream. This is an already known issue. See Bug 1040315. Feel free to close as a duplicate.

-vlad

*** This bug has been marked as a duplicate of bug 1040315 ***

I'm reopening this bug because it seems to be still broken.
I tested it on my laptop, a Lenovo T440s, with this ethernet card:

    00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I218-LM (rev 04)

When I tried to boot a virtual machine from PXE, it did not work. After I ran tcpdump on the host machine on the physical ethernet interface, it started to work, but only if promiscuous mode was enabled.

I tried server kernels: 110, 116, 117, 119
and virt-manager: virt-manager-0.10.0-20.el7.

Could you try this test? On your host, create and bring up a macvlan device:

    # ip l a link enp0s25 type macvlan
    # ip l s macvlan0 up
    # dhclient -1 macvlan0

If this doesn't work, then there are still driver issues. If this works, we'll have to debug.

-vlad

(In reply to Jaroslav Aster from comment #26)
> I'm reopening this bug because it seems to be still broken. I tested it on my laptop Lenovo T440s with this ethernet card:
>
> 00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I218-LM (rev 04)
>
> When I tried to boot virtual machine from pxe, it did not work, after I run tcpdump on host machine on physical ethernet interface, it started to work, but only if promiscuous mode was enabled.
>
> I tried server kernels: 110, 116, 117, 119
> and virt-manager: virt-manager-0.10.0-20.el7.

Could you provide the PCI IDs for this board also (lspci -n)? We'll need to isolate the version that might be impacted. I just retested with the board from the original report and the problem is fixed. It seems likely that other board versions were also affected.
Thanks
-vlad

    [root@newton ~]# lspci -n
    00:00.0 0600: 8086:0a04 (rev 0b)
    00:02.0 0300: 8086:0a16 (rev 0b)
    00:03.0 0403: 8086:0a0c (rev 0b)
    00:14.0 0c03: 8086:9c31 (rev 04)
    00:16.0 0780: 8086:9c3a (rev 04)
    00:16.3 0700: 8086:9c3d (rev 04)
    00:19.0 0200: 8086:155a (rev 04)
    00:1b.0 0403: 8086:9c20 (rev 04)
    00:1c.0 0604: 8086:9c1a (rev e4)
    00:1c.1 0604: 8086:9c14 (rev e4)
    00:1d.0 0c03: 8086:9c26 (rev 04)
    00:1f.0 0601: 8086:9c43 (rev 04)
    00:1f.2 0106: 8086:9c03 (rev 04)
    00:1f.3 0c05: 8086:9c22 (rev 04)
    02:00.0 ff00: 10ec:5227 (rev 01)
    03:00.0 0280: 8086:08b2 (rev 83)

We tested it on the ethernet card from the original report too, and I agree: it seems to be fixed for this type of card. The kernel and kernel driver were the same, version 3.10.0-119.el7.x86_64. I agree, it seems to be a driver bug.

Right now, I'm running the e1000e code to determine which variants of e1000e may still be at risk from an incomplete change that Bug 1040315 apparently did fix for the 82579. That patch was confined to the 82579. I'm working on the assumption that this is something along the same lines as that patch and just needs propagating more widely to the other family members handled by the e1000e driver. That may not be the case, but it's a good start.

I need the QA guys to please quickly get an assessment of how many different flavors of e1000e are affected by the problem. Let's try the test on as many variants in Beaker as possible before the 1PM call with the vendor, please. I'll take a look at the code during that time.

It seems at this point, from my read, that the following are at issue:

82579 seems ok.

So far I see issues with these:
From Bug 1040315: Intel Corporation Ethernet Connection I217-LM (rev 05) (PCI ID 8086:153a)
From this Bug: Intel Corporation Ethernet Connection I218-LM (rev 04)

Please update this with anything I've missed.

Looking at the driver code, the special handling for Manageability locked addresses happens only for PCH2 and PCH-LPT variants. The originally reported 82579 is a PCH2 chip.
The I217 and I218 are PCH-LPT variants.

My guess is that Intel Manageability is locking some registers in the PCH-LPT variant as well and we fail to program the registers for macvlan.

Something of note is that for I217 and I218 cards, the number of programmable receive registers is defined as:

    #define E1000_PCH_LPT_RAR_ENTRIES 12 /* RAR[0], SHRA[0-10] */

whereas the 82579 definition has been changed to:

    #define E1000_PCH2_RAR_ENTRIES 5 /* RAR[0], SHRA[0-3] */

and accounts for registers reserved for manageability.

-vlad

(In reply to Vlad Yasevich from comment #36)
> Looking at the driver code, the special handling for Manageability locked addresses happens only for PCH2 and PCH-LPT variants. The originally reported 82579 is a PCH2 chip. The I217 and I218 are PCH-LPT variants.
>
> My guess is that Intel Manageability is locking some registers in the PCH-LPT variant as well and we fail to program the registers for macvlan.
>
> Something of note is that for I217 and I218 cards, the number of programmable receive registers is defined as:
> #define E1000_PCH_LPT_RAR_ENTRIES 12 /* RAR[0], SHRA[0-10] */
>
> whereas the 82579 definition has been changed to:
> #define E1000_PCH2_RAR_ENTRIES 5 /* RAR[0], SHRA[0-3] */
>
> and accounts for registers reserved for manageability.
>
> -vlad

Vlad,
I checked the same code and think you are right. Furthermore, I have a theory I'm working on now, based on debugging the previous issue. It's true that the I217 and I218 PCH-LPT variants have 12 registers: 1 unicast register for this device's MAC and 11 shared. The piece of code in

    e1000_rar_set_pch_lpt()
    {
    ...
        if (index < hw->mac.rar_entry_count) {
            wlock_mac = er32(FWSM) & E1000_FWSM_WLOCK_MAC_MASK;
            wlock_mac >>= E1000_FWSM_WLOCK_MAC_SHIFT;

            /* Check if all SHRAR registers are locked */
            if (wlock_mac == 1)
                goto out;

            if ((wlock_mac == 0) || (index <= wlock_mac)) {
                                    ^^^^^^^^^^^^^^^^^^^^
                s32 ret_val;
                ...
                /* go program a RAR with a MAC, success for the VM dhcp */
            else
                /* error can't do it msg, no dhcp */

Debugging the previous issues shows that when macvtap loads, the request for a RAR register is always made at the end of the list, i.e. for this device it'll be 11 (MAX - 1). The highlighted section of code above shows that a wlock_mac value other than 0 or 1 assumes that firmware-owned shared RAR registers will be allocated top down also. So if firmware is using this value, it will fail. Upper layers will retry the same location about 10 times, then quit. It should change to another location.

So either we fix the driver or the macvtap requestor.

Again, this is my working theory, not proven yet.

John

(In reply to John Greene from comment #37)
> (In reply to Vlad Yasevich from comment #36)
> > Looking at the driver code, the special handling for Manageability locked addresses happens only for PCH2 and PCH-LPT variants. The originally reported 82579 is a PCH2 chip. The I217 and I218 are PCH-LPT variants.
> >
> > My guess is that Intel Manageability is locking some registers in the PCH-LPT variant as well and we fail to program the registers for macvlan.
> >
> > Something of note is that for I217 and I218 cards, the number of programmable receive registers is defined as:
> > #define E1000_PCH_LPT_RAR_ENTRIES 12 /* RAR[0], SHRA[0-10] */
> >
> > whereas the 82579 definition has been changed to:
> > #define E1000_PCH2_RAR_ENTRIES 5 /* RAR[0], SHRA[0-3] */
> >
> > and accounts for registers reserved for manageability.
> >
> > -vlad
>
> Vlad,
> I checked the same code and think you are right. Furthermore, I have a theory I'm working on now, based on debugging the previous issue. It's true that the I217 and I218 PCH-LPT variants have 12 registers: 1 unicast register for this device's MAC and 11 shared. The piece of code in
> e1000_rar_set_pch_lpt()
> {
> ...
>     if (index < hw->mac.rar_entry_count) {
>         wlock_mac = er32(FWSM) & E1000_FWSM_WLOCK_MAC_MASK;
>         wlock_mac >>= E1000_FWSM_WLOCK_MAC_SHIFT;
>
>         /* Check if all SHRAR registers are locked */
>         if (wlock_mac == 1)
>             goto out;
>
>         if ((wlock_mac == 0) || (index <= wlock_mac)) {
>                                 ^^^^^^^^^^^^^^^^^^^^
>             s32 ret_val;
>             ...
>             /* go program a RAR with a MAC, success for the VM dhcp */
>         else
>             /* error can't do it msg, no dhcp */
>
> Debugging the previous issues shows that when macvtap loads, the request for a RAR register is always made at the end of the list, i.e. for this device it'll be 11 (MAX - 1). The highlighted section of code above shows that a wlock_mac value other than 0 or 1 assumes that firmware-owned shared RAR registers will be allocated top down also. So if firmware is using this value, it will fail. Upper layers will retry the same location about 10 times, then quit. It should change to another location.

I don't see upper layers retrying the same location. The upper layer will jump to the next address and try to write it to the next location.

Here is an interesting test. Create 11 macvlan devices. Then bring them up one by one and try to get the address. The one where it starts working will let you know which offset finally started working.

-vlad

> So either we fix the driver or the macvtap requestor.
>
> Again, this is my working theory, not proven yet.

(In reply to Vlad Yasevich from comment #38)
> Here is an interesting test. Create 11 macvlan devices. Then bring them up one by one and try to get the address. The one where it starts working will let you know which offset finally started working.

Hi,

I ran this test:

    #!/bin/sh

    # Create 11 macvlan devices (macvlan0 .. macvlan10) on top of enp0s25.
    for i in $(seq 0 10); do
        ip l add link enp0s25 type macvlan
    done

    # Bring all of them up.
    for i in $(seq 0 10); do
        ip l set dev macvlan$i up
    done

    # Try DHCP on each device and report the ones that got an IPv4 address.
    for i in $(seq 0 10); do
        iface="macvlan${i}"
        pid_file="/var/run/dhclient-$iface.pid"

        dhclient -1 "$iface" -pf "$pid_file"
        ip a s dev "$iface" | grep -q 'inet ' && echo "${iface} succeed"
        kill $(cat "$pid_file") 2>/dev/null
    done

    # Clean up.
    for i in $(seq 0 10); do
        ip l del macvlan$i
    done

and I got these results:

    macvlan7 succeed
    macvlan8 succeed
    macvlan9 succeed
    macvlan10 succeed

So, the first 7 devices failed and the others succeeded.

(In reply to Jaroslav Aster from comment #39)
> (In reply to Vlad Yasevich from comment #38)
> > Here is an interesting test. Create 11 macvlan devices. Then bring them up one by one and try to get the address. The one where it starts working will let you know which offset finally started working.
>
> Hi,
>
> I run this test:
>
> #!/bin/sh
>
> for i in $(seq 0 10); do
>     ip l add link enp0s25 type macvlan
> done
>
> for i in $(seq 0 10); do
>     ip l set dev macvlan$i up
> done
>
> for i in $(seq 0 10); do
>     iface="macvlan${i}"
>     pid_file="/var/run/dhclient-$iface.pid"
>
>     dhclient -1 "$iface" -pf "$pid_file"
>     ip a s dev "$iface" | grep -q 'inet ' && echo "${iface} succeed"
>     kill $(cat "$pid_file") 2>/dev/null
> done
>
> for i in $(seq 0 10); do
>     ip l del macvlan$i
> done
>
> and I got this results.
>
> macvlan7 succeed
> macvlan8 succeed
> macvlan9 succeed
> macvlan10 succeed
>
> So, first 7 devices failed, others succeed.

Excellent information. We can confirm the bug is a RAR issue with this and one more small piece. Run this command, then repeat the test please.

Clean start, then

    sudo echo -n 'module e1000e +p' > /sys/kernel/debug/dynamic_debug/control

run the above script. Then check dmesg for this output:

    e_dbg("Failed to write receive address at index %d\n", index);

Given the above, I'd expect to see these for the six or seven failures. This will either confirm the issue or point us elsewhere.
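
The debugging step above and the register check quoted in comment 37 can be sketched in shell. This is only a model under stated assumptions: `rar_write_allowed` mirrors the quoted `wlock_mac` conditions for the shared registers and does not touch hardware, `tally_rar_failures` counts log lines that match the `e_dbg` format string, and both function names are invented for this example.

```shell
#!/bin/sh

# Model of the quoted check in e1000_rar_set_pch_lpt() for a shared
# receive-address register (INDEX >= 1). Prints "ok" or "locked".
#   wlock_mac == 1                        -> all SHRAR registers locked
#   wlock_mac == 0 or index <= wlock_mac  -> the write goes ahead
# Arguments: INDEX WLOCK_MAC
rar_write_allowed() {
    if [ "$2" -eq 1 ]; then
        echo locked
    elif [ "$2" -eq 0 ] || [ "$1" -le "$2" ]; then
        echo ok
    else
        echo locked
    fi
}

# Read kernel log text on stdin and print "INDEX COUNT" for every
# "Failed to write receive address at index N" message, sorted by index.
tally_rar_failures() {
    sed -n 's/.*Failed to write receive address at index \([0-9][0-9]*\).*/\1/p' |
        sort -n | uniq -c | awk '{ print $2, $1 }'
}

# Usage on the affected host (needs root and a mounted debugfs; not run here):
#   echo 'module e1000e +p' | sudo tee /sys/kernel/debug/dynamic_debug/control
#   ... run the macvlan test script ...
#   dmesg | tally_rar_failures
#   echo 'module e1000e -p' | sudo tee /sys/kernel/debug/dynamic_debug/control
```

The `tee` form is used in the usage comment because with `sudo echo ... > file` the redirection is performed by the calling shell, so that form only works from a shell that is already root.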

    sudo echo -n 'module e1000e -p' > /sys/kernel/debug/dynamic_debug/control

will turn off the e1000e debug prints when you are done.

(In reply to John Greene from comment #40)
> Excellent information. We can confirm the bug is a RAR issue with this and one more small piece. Run this cmd, the repeat the test please.
>
> Clean start, then
> sudo echo -n 'module e1000e +p' > /sys/kernel/debug/dynamic_debug/control
>
> run the above script.
> Then check the dmesg for this output.
>
> e_dbg("Failed to write receive address at index %d\n", index);
>
> Given the above, I'd expect to see these for the six or seven failures. This will either confirm the issue or point us elsewhere.
>
> sudo echo -n 'module e1000e -p' > /sys/kernel/debug/dynamic_debug/control
> will turn off the e1000e debug prints when you are done.

Hi,

I ran the test again with debug information turned on. My script had the same results and there were 931 lines like the line below in dmesg:

    [197019.349809] e1000e 0000:00:19.0 enp0s25: Failed to write receive address at index 11

The index number was not the same in each line; it ranged from 5 to 11.

    # dmesg | grep 'Failed to write' | cut -d ' ' -f 12 | sort | uniq -c
        157 10
        169 11
         97 5
        109 6
        121 7
        133 8
        145 9

After looking at the data sheet for the I218 device, I am really surprised that any of the devices above worked. It appears that for I217 and I218, the Intel Management layer can actually reserve all of the SHRAR registers, and that would completely preclude macvlans from working.

There really needs to be something in the driver that dynamically limits the number of available registers based on what's reserved by manageability. Then, if the number of addresses desired in the unicast filter is larger than the available limit, promiscuous mode needs to be enabled.

This would also allow us to start programming the registers at the proper index.

-vlad

(In reply to Vlad Yasevich from comment #42)
> After looking at the data sheet for I218 device, I am really surprised that any of the devices above worked. It appears that for I217 and I218, the Intel Management layer can actually reserve all of the SHRAR registers and that would completely preclude macvlans from working.
>
> There really needs to be something in the driver that should dynamically limit the number of available registers based on what's reserved by manageability. Then, if the number of addresses desired in the unicast filter is larger then available limit, promiscuous mode needs to be enabled.
>
> This would also allow us to start programming the registers at the proper index.
>
> -vlad

Two problems I see with your approach (although I agree it's the root cause). The driver doesn't "see" the allocation by manageability; it doesn't use the driver code, best I can tell. Access to those registers is just refused by the FWSM check of them being locked. Without a single point of allocation, we've got a problem.

Second is the opposite-side issue: if we have no RAR and just open promisc mode to allow it all, it works. But when stuff shuts down that may free RARs, we can't really ever leave promisc mode without possibly killing net access to something. At least not without implementing a software RAR table to supplement hardware; that seems a slippery slope.

Lemme check the latest upstream Intel code I got a couple days back for any changes to this. Doubtful, but worth a look to see.

Does this system have manageability support? dmidecode should tell you. Haven't tried this, but if it can be disabled in the BIOS or elsewhere somehow, I wonder if this goes away.

(In reply to John Greene from comment #43)
> (In reply to Vlad Yasevich from comment #42)
> > After looking at the data sheet for I218 device, I am really surprised that any of the devices above worked.
> > It appears that for I217 and I218, the Intel Management layer can actually reserve all of the SHRAR registers and that would completely preclude macvlans from working.
> >
> > There really needs to be something in the driver that should dynamically limit the number of available registers based on what's reserved by manageability. Then, if the number of addresses desired in the unicast filter is larger then available limit, promiscuous mode needs to be enabled.
> >
> > This would also allow us to start programming the registers at the proper index.
> >
> > -vlad
>
> Two problems I see with your approach (although I agree it's root cause). The driver doesn't "see" the allocation by manageability, it doesn't use the driver code best I can can tell. Access to those registers is just refused by the FWSM check of them being locked. Without a single point allocation, we got a problem.

You are right in that there is no single access point between the kernel and IME; however, from just the kernel side, there is one, and that is the rar_set methods. We already have custom methods for manageability-enabled cards. Those methods can simply check the FWSM register to see which index they can start programming at, and if they have more data to write than available registers, they enter promisc mode.

> Second, is the opposite side issue: if we have no RAR and just open promisc mode to allow it all it works. But when stuff shuts down that may free RARs, we can't really ever leave promisc mode without possibly killing net access to something. At least not without implementing a software RAR table to supplement hardware; that seems a slippery slope.

We already keep a software version as a device unicast filter list and already support turning on promisc when the software list exceeds the HW table.

This also brings up another question. Can the manageability layer overwrite SHRAR programming? If yes, anything we do at the driver is really pointless.
If no, then we should be OK. If the manageability layer releases resources, we'll pick them up the next time the RAR table needs to change due to software list changes.

> Lemme check the latest upstream Intel code I got a couple days back for any changes to this. Doubtful, but worth a look to see.
>
> Does this system have manageability support? dmidecode should tell you. Haven't tried this but if it can be disabled in the BIOS or elsewhere somehow, wonder if this goes away.

That's a question for Jaroslav. I currently don't have a system with this HW.

-vlad

I checked the latest Intel out-of-tree driver; we have the same code here as e1000e 3.0.4. No joy there. No upstream patches (linus/net-next/stable) of interest either, so either we craft a fix, or work with Intel.

(In reply to Vlad Yasevich from comment #44)
> > Lemme check the latest upstream Intel code I got a couple days back for any changes to this. Doubtful, but worth a look to see.
> >
> > Does this system have manageability support? dmidecode should tell you. Haven't tried this but if it can be disabled in the BIOS or elsewhere somehow, wonder if this goes away.
>
> That's a question for Jaroslav. I currently don't have a system with this HW.

Hi,

I think not, but I don't know what manageability support is exactly. I ran this command, and I got nothing.

    # dmidecode |grep -i manage

If you want me to do something more, please give me a note.

(In reply to Jaroslav Aster from comment #46)
> (In reply to Vlad Yasevich from comment #44)
> > > Lemme check the latest upstream Intel code I got a couple days back for any changes to this. Doubtful, but worth a look to see.
> > >
> > > Does this system have manageability support? dmidecode should tell you. Haven't tried this but if it can be disabled in the BIOS or elsewhere somehow, wonder if this goes away.
> >
> > That's a question for Jaroslav. I currently don't have a system with this HW.
>
> Hi,
>
> I think not, but I don't know what is manageability support exactly. I run this command, and I got nothing.
>
> # dmidecode |grep -i manage
>
> If you want me to do something more, please give me a note.

dmidecode will show the presence of management as a BIOS extension. The system here looks like this when dumped with dmidecode:

    Handle 0x0054, DMI type 221, 26 bytes
    OEM-specific Type
        Header and Data:
            DD 1A 54 00 03 01 00 00 07 01 00 00 02 00 09 00
            00 13 00 03 04 09 00 00 CD 04
        Strings:
            Reference Code - ME 9.0
            MEBx version
            ME Firmware Version
            5MB SKU

Does the failing system have this too?

As an aside: my little knowledge here is from an Intel paper:
An Introduction to Intel® Active Management Technology (Intel® AMT).

(In reply to John Greene from comment #47)
> (In reply to Jaroslav Aster from comment #46)
> > (In reply to Vlad Yasevich from comment #44)
> > > > Lemme check the latest upstream Intel code I got a couple days back for any changes to this. Doubtful, but worth a look to see.
> > > >
> > > > Does this system have manageability support? dmidecode should tell you. Haven't tried this but if it can be disabled in the BIOS or elsewhere somehow, wonder if this goes away.
> > >
> > > That's a question for Jaroslav. I currently don't have a system with this HW.
> >
> > Hi,
> >
> > I think not, but I don't know what is manageability support exactly. I run this command, and I got nothing.
> >
> > # dmidecode |grep -i manage
> >
> > If you want me to do something more, please give me a note.
>
> dmidecode will show the presence of management as a BIOS extension: the system here looks like this when dumped with dmidecode.
>
> Handle 0x0054, DMI type 221, 26 bytes
> OEM-specific Type
>     Header and Data:
>         DD 1A 54 00 03 01 00 00 07 01 00 00 02 00 09 00
>         00 13 00 03 04 09 00 00 CD 04
>     Strings:
>         Reference Code - ME 9.0
>         MEBx version
>         ME Firmware Version
>         5MB SKU
>
> Does the failing system has this too?
>
> As an aside: my little knowledge here is from an Intel paper:
> An Introduction to Intel® Active Management Technology (Intel® AMT).

No, it does not. It has several 'OEM-specific Type' structures in dmidecode's output, but nothing with these strings. The hardware is a Lenovo T440s notebook.

Created attachment 885015 [details]
Test patch

Jaroslav

Can you give this brew build a try:
https://brewweb.devel.redhat.com/taskinfo?taskID=7325535

I've also attached a test patch to the BZ, just as an idea of what I was talking about. I was able to create 10 macvlans with this patch and they all worked.

Thanks
-vlad

(In reply to Vlad Yasevich from comment #49)
> Created attachment 885015 [details]
> Test patch
>
> Jaroslav
>
> Can you give this brew build a try:
> https://brewweb.devel.redhat.com/taskinfo?taskID=7325535
>
> I've also attached a test patch to the BZ, just as an idea of what I was talking about. I was able to create a 10 macvlans with this patch and they all worked.
>
> Thanks
> -vlad

Nice work Vlad! This patch will probably fix the 217 and 218 e1000e flavors, assuming we understand the behaviour of management in full.

The other part of this is trying to understand what other variants might be subject to this issue. Jburke was helping make new automated tests as well:
Bug 1086285 Support macvtap-based interfaces in virt install test

Has anybody found other e1000e variants with similar problems manually as yet?

I think that the ME engine is a black box and only Intel folks really understand it.

I didn't test that patch extensively and I think there might be an off-by-1 error when management is consuming registers, but I didn't have much time/inclination to prove it. :)

Feel free to post upstream to get feedback (you can add my: Signed-off-by: Vlad Yasevich <vyasevic>).

-vlad

(In reply to Vlad Yasevich from comment #51)
> I think that ME engine is black box and only Intel folks really understand it.
> I didn't test that patch extensively and I think there might be an off-by-1 > error when management is consuming registers, but I didn't have much > time/inclination to prove it. :) > > Feel free to post upstream to get feedback (you can add my: > Signed-off-by: Vlad Yasevich <vyasevic>). > > -vlad Agreed on the black box notion. Not sure when or where the allocations come and leave either, making me unsure how to safely account for them. Happy to post.. will keep you in the loop. (In reply to Vlad Yasevich from comment #49) > Created attachment 885015 [details] > Test patch > > Jaroslav > > Can you give this brew build a try: > https://brewweb.devel.redhat.com/taskinfo?taskID=7325535 > > I've also attached a test patch to the BZ, just as an idea of what I was > talking > about. I was able to create a 10 macvlans with this patch and they all > worked. Good job, it works. I run my test and all eleven macvlan interfaces succeeded. # ./test-driver.sh macvlan0 succeed macvlan1 succeed macvlan2 succeed macvlan3 succeed macvlan4 succeed macvlan5 succeed macvlan6 succeed macvlan7 succeed macvlan8 succeed macvlan9 succeed macvlan10 succeed Then, I run virt-manager and it worked too :-). (In reply to Jaroslav Aster from comment #53) > (In reply to Vlad Yasevich from comment #49) > > Created attachment 885015 [details] > > Test patch > > > > Jaroslav > > > > Can you give this brew build a try: > > https://brewweb.devel.redhat.com/taskinfo?taskID=7325535 > > > > I've also attached a test patch to the BZ, just as an idea of what I was > > talking > > about. I was able to create a 10 macvlans with this patch and they all > > worked. > > Good job, it works. I run my test and all eleven macvlan interfaces > succeeded. 
> > # ./test-driver.sh > macvlan0 succeed > macvlan1 succeed > macvlan2 succeed > macvlan3 succeed > macvlan4 succeed > macvlan5 succeed > macvlan6 succeed > macvlan7 succeed > macvlan8 succeed > macvlan9 succeed > macvlan10 succeed > > Then, I run virt-manager and it worked too :-). Any further testing before I post upstream you guys can suggest? Or any comment on implementation? Looks ok to me.. Can you double check the index that we get from ME and what we use to program the registers. I took a guess at its usage... I don't want to turn on promisc on the card if we still have available registers. -vlad I've posted the test patch to Intel for review, as it touches some files they wish to review. Will followup to get it upstream. (In reply to John Greene from comment #56) > I've posted the test patch to Intel for review, as it touches some files > they wish to review. Will followup to get it upstream. I have the patch and am reviewing it. Please allow us to push this patch upstream and give you credit for it, as the files it touches require us to maintain ownership of changes so as to avoid GPL violations. Thanks, Dave Ertman (In reply to Dave Ertman from comment #57) > (In reply to John Greene from comment #56) > > I've posted the test patch to Intel for review, as it touches some files > > they wish to review. Will followup to get it upstream. > > I have the patch and am reviewing it. Please allow us to push this patch > upstream and give you credit for it, as the files it touches require us to > maintain ownership of changes so as to avoid GPL violations. Dave, Thanks! But credit is not due me, I'm just passing it along. I'll be doing the backport to RHEL when it's upstream. Please credit my talented co-worker: >Feel free to post upstream to get feedback (you can add my: >Signed-off-by: Vlad Yasevich <vyasevic>). 
John (In reply to John Greene from comment #58) > (In reply to Dave Ertman from comment #57) > > (In reply to John Greene from comment #56) > > > I've posted the test patch to Intel for review, as it touches some files > > > they wish to review. Will followup to get it upstream. > > > > I have the patch and am reviewing it. Please allow us to push this patch > > upstream and give you credit for it, as the files it touches require us to > > maintain ownership of changes so as to avoid GPL violations. > Dave, > > Thanks! But credit is not due me, I'm just passing it along. I'll be doing > the backport to RHEL when it's upstream. Please credit my talented > co-worker: > > >Feel free to post upstream to get feedback (you can add my: > >Signed-off-by: Vlad Yasevich <vyasevic>). > > John This patch impacts quite a few of our parts (client and 1Gig server), and also impacts other drivers. We are evaluating other approaches to see if we can lessen the impact. Thanks for your patience! Created attachment 890558 [details]
Test patch 2 - with return value added
Please evaluate whether this approach will still fix your issue.
Yes, technically this still fixes the issue, but at the expense of turning on promiscuous mode much sooner than necessary most of the time. With this patch my tests worked, but the interface was made promiscuous with a single macvlan configured. None of the SHRARs could be used in this case.

-vlad

(In reply to Vlad Yasevich from comment #62)
> Yes, technically this still fixes the issue, but at the expense of turning
> on promiscuous mode much sooner then necessary most of the time. With this
> patch,
> my tests have worked, but the interface was made promiscuous with a single
> macvlan configured. None of the SHRARs could be used in this case.
>
> -vlad

Vlad, Dave,

I tend to agree with Vlad. We need something that tracks which registers are in use by the driver/OS and which by ME, so that we avoid stepping on each other and still use the available hardware resources. ME is a black box here. Does it have its own notion of which indexes are in use, separate from the driver? It seems to.

Created attachment 891627 [details]
Test Patch 3
This patch incorporates both approaches to this problem: the implementation of rar_get_count, and the addition of a return value to rar_set for LPT devices.
The redundant check of FWSM using the E1000_FWSM_WLOCK_MAC_MASK serves two purposes:
1) serves as a double check for the possible timing issue of ME grabbing a SHRA between check and use (Still trying to find ME folks that can answer on this - and it could change without alerting the driver).
2) maintains functionality for drivers that do not want to utilize rar_get_count.
Please let me know what you think of this approach.
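The write-time check from point 1) can be pictured as a predicate that a rar_set implementation consults before touching a register. The following is only a sketch in plain C, not the attached patch: the function name is hypothetical, and the SHRA numbering (0 to 10) and the mapping from the wlock field to locked indexes are assumptions taken from the FWSM table quoted later in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for this sketch: SHRA[0]..SHRA[10], as in the spec table
 * quoted later in this thread. */
#define SKETCH_NUM_SHRA 11u

/* Hypothetical predicate: may the driver write SHRA[index] for a given
 * 3-bit wlock value?
 *   wlock 0     : no SHRA is locked
 *   wlock 1     : every SHRA is locked
 *   wlock 2..7  : SHRA[wlock]..SHRA[10] are locked
 */
static bool sketch_shra_writable(uint32_t wlock, uint32_t index)
{
    if (index >= SKETCH_NUM_SHRA)
        return false;
    if (wlock == 0)
        return true;
    if (wlock == 1)
        return false;
    return index < wlock;
}
```

A rar_set-style function could return an error when this predicate is false, which is the kind of return value the patch description refers to. Re-reading FWSM at that moment narrows, but does not close, the window in which ME could grab a register.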
(In reply to Dave Ertman from comment #64)
> Created attachment 891627 [details]
> Test Patch 3
>
> This patch encorporates both methodologies for approaching this problem.
> Both the implementation of rar_get_count, and adding of a return value to
> rar_set for LPT devices.
>
> The redundant check of FWSM using the E1000_FWSM_WLOCK_MAC_MASK serves two
> purposes:
>
> 1) serves as a double check for the possible timing issue of ME grabbing a
> SHRA between check and use (Still trying to find ME folks that can answer on
> this - and it could change without alerting the driver).
>
> 2) maintains functionality for drivers that do not want to utilize
> rar_get_count.
>
> please let me know what you think of this approach.

I agree with this approach. I was debating whether to remove the redundant checks when I wrote the initial patch, and I didn't consider point 1) above.

The one question that still worries me a little is whether ME checks that a SHRAR is available before writing to it. It would be bad if ME could overwrite values that the kernel wrote.

Thanks
-vlad

And another one, which I think Vlad sensed in C51:

If e1000_rar_get_count_pch_lpt() returns 0 (entries full) at
https://bugzilla.redhat.com/attachment.cgi?id=891627&action=diff#a/drivers/net/ethernet/intel/e1000e/ich8lan.c_sec5
then e1000e_write_uc_addr_list() blindly decrements the count from zero to -1, possibly causing issues at line 3320 and beyond in that function in netdev.c.

This isn't an issue with the other families, just ich8lan.c.

An easy fix would be:

if (rar_entries == 0)
        return -ENOMEM;

--rar_entries;

or similar.
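To make the wrap-around concern concrete, here is a minimal standalone sketch of the guard suggested above. It is not the driver function: `sketch_plan_uc_filter` and its parameters are hypothetical names, and treating -ENOMEM as the signal for the caller to fall back to promiscuous mode is an assumption based on the discussion in this thread.

```c
#include <errno.h>

/* Hypothetical helper, not e1000e_write_uc_addr_list(): returns how many
 * unicast addresses can be programmed into hardware filters, or -ENOMEM
 * when they do not fit (the caller would then enable promiscuous mode). */
static int sketch_plan_uc_filter(unsigned int rar_entries,
                                 unsigned int uc_count)
{
    /* Guard first: decrementing an unsigned zero would wrap around. */
    if (rar_entries == 0)
        return -ENOMEM;

    /* One entry is kept for the primary MAC address (RAR[0]). */
    rar_entries--;

    if (uc_count > rar_entries)
        return -ENOMEM;

    return (int)uc_count;
}
```

With 12 reported entries, 11 secondary addresses fit and a twelfth is refused; with 0 reported entries the function refuses immediately instead of underflowing.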
(In reply to John Greene from comment #66)
> And another one I think Vlad sensed in C51:
>
> if e1000_rar_get_count_pch_lpt() returns 0 (entries full) at
> https://bugzilla.redhat.com/attachment.cgi?id=891627&action=diff#a/drivers/
> net/ethernet/intel/e1000e/ich8lan.c_sec5
>
> then e1000e_write_uc_addr_list() blindly decrements to zero to -1 number,
> possibly causing issues at line 3320 and beyond in that function in netdev.c
>
>
> Isn't an issue with other family, just ich8lan.c
>
> Easy fix would be ..
> if ( rar_entries == 0)
> return -ENOMEM;
>
> --rar_entries;
>
> of similar..

Actually, now that I look specifically at that part: rar_entry_count counts RAR[0] as an address register. Even with a wlock value of 1, RAR[0] is open (as it has to be). This is why the value of E1000_PCH_LPT_RAR_ENTRIES is 12 instead of 11. e1000e_write_uc_addr_list() accounts for this (rar_entries--).

I need to take a quick run through the logic and account for this issue. I will submit a new patch if necessary. I will use the same basic logic that we have agreed on so far, but I might have to tweak it to account for RAR[0].

Created attachment 891643 [details]
Test Patch 4
That was an excellent catch, John! Since e1000e_write_uc_addr_list() is used by multiple parts, we need to keep its logic intact. This means that e1000_rar_get_count_pch_lpt() should never return zero (since RAR[0] is always available). Since the Write Lock MAC Addresses bits in FWSM only account for SHRA registers, the return value needs to look like this:
wlock 0 - return hw->mac.rar_entry_count
(defined as E1000_PCH_LPT_RAR_ENTRIES or 12)
wlock 1 - return 1
(only RAR[0] available)
wlock default - return [wlock + 1]
(number of SHRA unlocked + 1 for RAR[0])
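The mapping listed above can be sketched as a small standalone C function. This is an illustration under stated assumptions, not the code from Test Patch 4: the `sketch_` names and constants are hypothetical, and the mask and shift assume the wlock field sits in FWSM bits 7 to 9, as the spec excerpt quoted later in this thread indicates.

```c
#include <stdint.h>

/* Hypothetical constants for this sketch only. */
#define SKETCH_FWSM_WLOCK_MASK  0x0380u  /* assumed: FWSM bits 7..9 */
#define SKETCH_FWSM_WLOCK_SHIFT 7
#define SKETCH_LPT_RAR_ENTRIES  12u      /* RAR[0] plus 11 SHRA */

/* Extract the 3-bit wlock field from a raw FWSM register value. */
static uint32_t sketch_wlock_from_fwsm(uint32_t fwsm)
{
    return (fwsm & SKETCH_FWSM_WLOCK_MASK) >> SKETCH_FWSM_WLOCK_SHIFT;
}

/* Number of usable receive-address entries, always counting RAR[0]:
 *   wlock 0 -> every entry is usable
 *   wlock 1 -> only RAR[0]
 *   else    -> wlock unlocked SHRA plus RAR[0]
 */
static uint32_t sketch_rar_get_count(uint32_t wlock, uint32_t rar_entry_count)
{
    switch (wlock) {
    case 0:
        return rar_entry_count;
    case 1:
        return 1;
    default:
        return wlock + 1;
    }
}
```

Because the result is never zero, a caller that subtracts one for RAR[0] ends up with the number of free SHRA and cannot underflow.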
(In reply to Dave Ertman from comment #68) > Created attachment 891643 [details] > Test Patch 4 > > That was an excellent catch John! Since e1000e_write_uc_addr_list() is used > by multiple parts, we need to keep its logic intact. This means that > e1000_rar_get_count_pch_lpt() should never return zero (since RAR[0] is > always available). Since the Write Lock MAC Adresses bits in FWSM only > account for SHRA registers, the return value needs to look like this: > > wlock 0 - return hw->mac.rar_entry_count > (defined as E1000_PCH_LPT_RAR_ENTRIES or 12) > wlock 1 - return 1 > (only RAR[0]) available > wlock default - return [wlock + 1] > (number of SHRA unlocked + 1 for RAR[0]) I forgot to adjust the comments in e1000_rar_get_count_pch_lpt(), will do so before I submit upstream if this is acceptable. (In reply to Dave Ertman from comment #69) > (In reply to Dave Ertman from comment #68) > > Created attachment 891643 [details] > > Test Patch 4 > > > > That was an excellent catch John! Since e1000e_write_uc_addr_list() is used > > by multiple parts, we need to keep its logic intact. This means that > > e1000_rar_get_count_pch_lpt() should never return zero (since RAR[0] is > > always available). Since the Write Lock MAC Adresses bits in FWSM only > > account for SHRA registers, the return value needs to look like this: > > > > wlock 0 - return hw->mac.rar_entry_count > > (defined as E1000_PCH_LPT_RAR_ENTRIES or 12) > > wlock 1 - return 1 > > (only RAR[0]) available > > wlock default - return [wlock + 1] > > (number of SHRA unlocked + 1 for RAR[0]) > > I forgot to adjust the comments in e1000_rar_get_count_pch_lpt(), will do so > before I submit upstream if this is acceptable. Thanks Dave. Good idea: the comments were in need of more clarity in regards to what wlock means and the above sure helps. I was wondering: wlock value means the number registers available or number in use? 
Must we cycle through them looking at the enable bit to determine which are available, or do we assume we fill bottom up or top down, taking the first one not enabled?

Debugging one of these episodes, I noticed that both the driver and the ME seemed to allocate from SHRA[max] downward, which caused issues.

A short note in the source as to what the driver should AVOID doing would be a great help. It's not obvious otherwise that ME is involved.

Getting it right now will rid us of a class of BZs. Thanks for your efforts!

(In reply to John Greene from comment #70)
> (In reply to Dave Ertman from comment #69)
> > (In reply to Dave Ertman from comment #68)
> > > Created attachment 891643 [details]
> > > Test Patch 4
> > >
> > > That was an excellent catch John! Since e1000e_write_uc_addr_list() is used
> > > by multiple parts, we need to keep its logic intact. This means that
> > > e1000_rar_get_count_pch_lpt() should never return zero (since RAR[0] is
> > > always available). Since the Write Lock MAC Adresses bits in FWSM only
> > > account for SHRA registers, the return value needs to look like this:
> > >
> > > wlock 0 - return hw->mac.rar_entry_count
> > > (defined as E1000_PCH_LPT_RAR_ENTRIES or 12)
> > > wlock 1 - return 1
> > > (only RAR[0]) available
> > > wlock default - return [wlock + 1]
> > > (number of SHRA unlocked + 1 for RAR[0])
> >
> > I forgot to adjust the comments in e1000_rar_get_count_pch_lpt(), will do so
> > before I submit upstream if this is acceptable.
>
> Thanks Dave. Good idea: the comments were in need of more clarity in
> regards to what wlock means and the above sure helps. I was wondering:
> wlock value means the number registers available or number in use?
> Must we cycle through them looking at the enable bit to determine which are
> available or do we assume we fill bottom up or top down, first not enabled?
>
> Debugging one of these episodes, I noted it seemed both driver and the ME
> alloc'ed SHAR[max] downward, which caused issues.
>
> A short note in the source as to what the driver should AVOID doing would be
> great help. It's not obvious that ME is involvled otherwise.
>
> Getting it right now will rid us of a class of BZs. Thanks for your efforts!

As far as what wlock means... the spec has that written in typical HW speak (i.e. clear as mud):

FWSM-7:9   SHRA that are WR locked
--------   -----------------------
000        None of the SHRA are locked
001        0,1,2,3,4,5,6,7,8,9,10
010        2,3,4,5,6,7,8,9,10
011        3,4,5,6,7,8,9,10
100        4,5,6,7,8,9,10
101        5,6,7,8,9,10
110        6,7,8,9,10
111        7,8,9,10

So, on a platform with ME actively taking SHRA registers, the largest number of SHRA that will be free is 7 (0-6). The driver fills top down, and ME also fills top down (at least on an i217 part), so if there are free SHRA, they are the lower-indexed ones. But with this change to rar_entries, the driver will start its top-down filling with the highest *free* SHRA, as long as the number of free SHRA is >= netdev_uc_count(netdev). So now, if the driver needs, for example, 3 SHRA and wlock is 101, it will use the last 3 *free* SHRA (4, 3, and 2).

The reason for writing them top down is that there was a bug where register writes would be combined if they were seen as being in contiguous memory (e.g. the SHRA registers). I believe that this is a compiler issue, but I only have it second hand, as it was found before my time on e1000e started.

This gives me an idea for a change to the e1000e_write_uc_addr_list() function that might help all parts "dodge" ME-locked registers: if we need 3 SHRA, just take the bottom 3, but start with the third (2, 1, 0), and then zero out what we believe to be the remaining SHRA (e.g. 11..3). I will need to connect with an ME resource before I do this, though, as I cannot seem to find documentation on how ME grabs SHRA on other parts (i.e. how the wlock works on other parts), and there is no guarantee that making this change wouldn't make it worse for other parts.

Dave, excellent work on getting this.
Not intuitive, is it?

An issue (as I see it at least): these things get alloc'ed one at a time, with no way of knowing up front that I need 3 total. RAR0 is always taken for unicast data, then comes any multicast usage (where I am not sure whether the hardware is used to assist). After that, the macvlans come along later (even on demand) and ask for them. Does that cause you a problem?

I wish your suggestion was more usable (it's not a problem to do that), but they will still come in one at a time. Any problem with that on your end?

Hi Dave,

Are you waiting on us for results for test patch 4 here, or something else? I've been distracted with other stuff, but I haven't seen this upstream as yet.

I would like to get this into an upcoming release if possible.

(In reply to John Greene from comment #73)
> Hi Dave,
>
> Are you waiting on us for results for test patch 4 here or something else?
> Been distracted with other stuff. But haven't seen this upstream as yet.
>
> Would like to get this in a release upcoming if possible.

This patch was submitted to our internal queue on 2014-05-05. It is currently assigned to a Validation Engineer who is looking for a good way to finalize his validation of the patch. He has done quite a bit of work on it and I think he is just about done.

I will point him to this bugzilla to see if anything here might complete his testing.

(In reply to Dave Ertman from comment #74)
> (In reply to John Greene from comment #73)
> > Hi Dave,
> >
> > Are you waiting on us for results for test patch 4 here or something else?
> > Been distracted with other stuff. But haven't seen this upstream as yet.
> >
> > Would like to get this in a release upcoming if possible.
>
> This patch was submitted to our internal queue on 2014-05-05, it is
> currently assigned to a Validation Engineer who is looking for a good way to
> finalize his validation of the patch. He has done quite a bit of work on it
> and I think he is just about done.
> > I will point him to this bugzilla to see if anything here might complete his > testing. Thanks for the update Dave.. I'm awaiting it now. Hi Dave, I checked upstream for this this morning and didn't see it. Can you or someone at Intel check on the expected release time of this patch for me? I'm on a deadline to get an OS update out by end of the month and I REALLY want this one if I can get it. Is this going to make it out this week? I need to plan for this one way or other. Thanks. (In reply to John Greene from comment #76) > Hi Dave, > > I checked upstream for this this morning and didn't see it. Can you or > someone at Intel check on the expected release time of this patch for me? > I'm on a deadline to get an OS update out by end of the month and I REALLY > want this one if I can get it. Is this going to make it out this week? I > need to plan for this one way or other. > > Thanks. Oops, I just found it..never mind Dave. http://patchwork.ozlabs.org/patch/352824/ *** Bug 1115562 has been marked as a duplicate of this bug. *** |