Bug 1720157
Summary: | [Azure] Add a udev rule so that multiple SR-IOV NICs can all get IP addresses | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Yuhui Jiang <yujiang>
Component: | NetworkManager | Assignee: | Rick Barry <ribarry>
Status: | CLOSED NOTABUG | QA Contact: | Yuxin Sun <yuxisun>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 8.0 | CC: | alsin, atragler, bgalvani, decui, fgiudici, haiyangz, jjarvis, jopoulso, josalisb, mikelley, mmorsy, ribarry, rkhan, sukulkar, thaller, till, vkuznets, yacao, yuxisun
Target Milestone: | rc | Keywords: | Reopened, TestOnly
Target Release: | 8.0 | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-07-10 07:28:30 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description (Yuhui Jiang, 2019-06-13 09:34:18 UTC)
Hi Josh, the Microsoft Azure build team needs to add a udev rule to their RHEL 8 image builds so that multiple SR-IOV NICs can obtain IP addresses. At the moment only eth0 gets configured correctly. This was confirmed by QE using the "RedHat:RHEL:8:8.0.2019050711" image. Can you pass this along to the Azure build team?

Hey Rick - a couple questions on this for you.

1. What is the exact udev rule we should add?
2. Does this need to be added into RHEL 7 images as well?

(In reply to Alfred Sin from comment #3)
> Hey Rick - a couple questions on this for you.
>
> 1. What is the exact udev rule we should add?
> 2. Does this need to be added into RHEL 7 images as well?

Hi Alfred, Yuhui provided the udev rule in the description (https://bugzilla.redhat.com/show_bug.cgi?id=1720157#c0).

Yuhui, does this apply to RHEL 7 as well?

(In reply to Rick Barry from comment #4)
> (In reply to Alfred Sin from comment #3)
> > Hey Rick - a couple questions on this for you.
> >
> > 1. What is the exact udev rule we should add?
> > 2. Does this need to be added into RHEL 7 images as well?
>
> Hi Alfred, Yuhui provided the udev rule in the description
> (https://bugzilla.redhat.com/show_bug.cgi?id=1720157#c0).
>
> Yuhui, does this apply to RHEL 7 as well?

I see - my bad. I did a quick check in a RHEL 7 VM and thought I had checked in RHEL 8. I just double-checked in a RHEL 8.0 VM and the rule is indeed not in there. We can add it during our image build process.

Hello. Bug #1661574 got closed because of the lack of communication. Beniamino asked perfectly reasonable questions but didn't get a response. This is a bit concerning.

The udev rule in question seems very obviously wrong -- why would disabling a particular interface make any other interface get an IP address? Upon a closer look from Beniamino it became apparent that it's because there are multiple interfaces with the same MAC address. NetworkManager attempts to generate a connection profile for devices that don't have a matching one, and because in RHEL 8.0 the MAC address is used in the generated profile, one of the devices "wins" at random. This was changed in RHEL 8.1 for unrelated reasons. [1]

[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/merge_requests/137/

Therefore, what happens in RHEL 8.1 is that *all* devices attempt to connect. I suppose that's still not what you want; the ep* devices on hv_pci still don't get connected. In order to solve that we need to understand why. It's far from clear to us that completely disallowing network devices on hv_pci would be a good idea.

A side note: this didn't affect eth0 because you ship a configuration file /etc/sysconfig/network-scripts/ifcfg-eth0 that sticks to eth0, so the automatism doesn't kick in. Why would you do this? It's just inconsistent and unnecessary. Also, there's a /etc/sysconfig/network-scripts/ifcfg-ens3 file, but the Azure installations don't even have ens3. Why?

The following udev rule exists on the Azure image of RHEL 7.x:

cat /etc/udev/rules.d/68-azure-sriov-nm-unmanaged.rules
# Accelerated Networking on Azure exposes a new SRIOV interface to the VM.
# This interface is transparently bonded to the synthetic interface,
# so NetworkManager should just ignore any SRIOV interfaces.
SUBSYSTEM=="net", DRIVERS=="hv_pci", ACTION=="add", ENV{NM_UNMANAGED}="1"

The same rule needs to be added to the Azure image of RHEL 8.
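For illustration, a minimal sketch of applying this rule to an already-running RHEL 8 VM and checking the result. It assumes NetworkManager manages the networking, commands are run as root, and uses enP40072s1 (the VF name from the udevadm output further down) as a stand-in for whatever name the VM actually assigns:

# Drop the rule into place (same content as on the RHEL 7.x image).
cat << 'EOF' > /etc/udev/rules.d/68-azure-sriov-nm-unmanaged.rules
# Accelerated Networking on Azure exposes a new SRIOV interface to the VM.
# This interface is transparently bonded to the synthetic interface,
# so NetworkManager should just ignore any SRIOV interfaces.
SUBSYSTEM=="net", DRIVERS=="hv_pci", ACTION=="add", ENV{NM_UNMANAGED}="1"
EOF
# Re-apply udev rules to the existing net devices (a reboot also works and
# may be simpler).
udevadm control --reload-rules
udevadm trigger --subsystem-match=net --action=add
# The VF NIC (enP*) should now be listed as "unmanaged", while the synthetic
# eth* devices keep their connections and IP addresses.
nmcli -f DEVICE,TYPE,STATE,CONNECTION device status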
With this udev rule we are not disabling the VF NIC (from hv_pci). On Hyper-V or Azure hosts, VF NICs have the same MAC as their matching synthetic NICs -- by design. And a VF NIC is automatically bonded to its matching synthetic NIC as a slave NIC, so VF NICs don't need an IP address and shouldn't be managed by NetworkManager.

I was able to reproduce the issue with multiple VF NICs on RHEL 8 -- only eth0 has an IP. After adding that udev rule, the problem is solved.

(In reply to Haiyang Zhang from comment #7)
> With this udev rule we are not disabling the VF NIC (from hv_pci). On
> Hyper-V or Azure hosts, VF NICs have the same MAC as their matching
> synthetic NICs -- by design.

I'm just curious -- is the design documented anywhere?

> And a VF NIC is automatically bonded to its matching synthetic NIC as a
> slave NIC, so VF NICs don't need an IP address and shouldn't be managed
> by NetworkManager.

What I'm interested in is solving this in a way that's not going to need any Azure-specific secret sauce. I'm wondering whether just blacklisting all NICs on a Hyper-V PCI bus isn't overkill. If the rule were applied outside of Azure, I suspect it would affect things like PCI passthrough on Virtual PC.
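As an aside, the synthetic/VF pairing described above can be made visible with a small sketch like the one below (interface names vary per VM; any Azure VM with Accelerated Networking will do). It groups interfaces by MAC address, so each synthetic ethN and its VF show up together:

# On an Accelerated Networking VM, each synthetic ethN and its VF (enP*)
# share one MAC address; the transparent bonding itself is done in the
# kernel by the hv_netvsc driver, which is why the VF needs no IP of its own.
ip -br link | awk '{ mac[$3] = mac[$3] " " $1 } END { for (m in mac) print m ":" mac[m] }'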
(In reply to Lubomir Rintel from comment #8)
> What I'm interested in is solving this in a way that's not going to need any
> Azure-specific secret sauce.

Perhaps not; there doesn't seem to be anything particularly specific to Azure in the sysfs attributes:

[root@az2 lkundrak]# udevadm info -a /sys/class/net/enP40072s1

Udevadm info starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/291fe2b6-0a3f-4450-9c88-3b494f14be71/pci9c88:00/9c88:00:02.0/net/enP40072s1':
    KERNEL=="enP40072s1"
    SUBSYSTEM=="net"
    DRIVER==""
    ATTR{addr_assign_type}=="0"
    ATTR{addr_len}=="6"
    ATTR{address}=="00:0d:3a:55:2c:9e"
    ATTR{broadcast}=="ff:ff:ff:ff:ff:ff"
    ATTR{carrier}=="1"
    ATTR{carrier_changes}=="1"
    ATTR{carrier_down_count}=="0"
    ATTR{carrier_up_count}=="1"
    ATTR{dev_id}=="0x0"
    ATTR{dev_port}=="0"
    ATTR{dormant}=="0"
    ATTR{duplex}=="full"
    ATTR{flags}=="0x1803"
    ATTR{gro_flush_timeout}=="0"
    ATTR{ifalias}==""
    ATTR{ifindex}=="4"
    ATTR{iflink}=="4"
    ATTR{link_mode}=="0"
    ATTR{mtu}=="1500"
    ATTR{name_assign_type}=="4"
    ATTR{netdev_group}=="0"
    ATTR{operstate}=="up"
    ATTR{proto_down}=="0"
    ATTR{speed}=="40000"
    ATTR{tx_queue_len}=="1000"
    ATTR{type}=="1"

  looking at parent device '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/291fe2b6-0a3f-4450-9c88-3b494f14be71/pci9c88:00/9c88:00:02.0':
    KERNELS=="9c88:00:02.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="mlx4_core"
    ATTRS{ari_enabled}=="0"
    ATTRS{broken_parity_status}=="0"
    ATTRS{class}=="0x020000"
    ATTRS{consistent_dma_mask_bits}=="64"
    ATTRS{current_link_speed}=="Unknown speed"
    ATTRS{current_link_width}=="0"
    ATTRS{d3cold_allowed}=="1"
    ATTRS{device}=="0x1004"
    ATTRS{dma_mask_bits}=="64"
    ATTRS{driver_override}=="(null)"
    ATTRS{enable}=="1"
    ATTRS{irq}=="0"
    ATTRS{local_cpulist}=="0-1"
    ATTRS{local_cpus}=="00000000,00000000,00000000,00000003"
    ATTRS{max_link_speed}=="8 GT/s"
    ATTRS{max_link_width}=="8"
    ATTRS{mlx4_port1}=="eth"
    ATTRS{mlx4_port1_mtu}=="-1"
    ATTRS{msi_bus}=="1"
    ATTRS{numa_node}=="0"
    ATTRS{revision}=="0x00"
    ATTRS{subsystem_device}=="0x61b0"
    ATTRS{subsystem_vendor}=="0x15b3"
    ATTRS{vendor}=="0x15b3"

  looking at parent device '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/291fe2b6-0a3f-4450-9c88-3b494f14be71/pci9c88:00':
    KERNELS=="pci9c88:00"
    SUBSYSTEMS==""
    DRIVERS==""

  looking at parent device '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/291fe2b6-0a3f-4450-9c88-3b494f14be71':
    KERNELS=="291fe2b6-0a3f-4450-9c88-3b494f14be71"
    SUBSYSTEMS=="vmbus"
    DRIVERS=="hv_pci"
    ATTRS{channel_vp_mapping}=="20:0"
    ATTRS{class_id}=="{44c4f61d-4444-4400-9d52-802e27ede19f}"
    ATTRS{client_monitor_conn_id}=="0"
    ATTRS{client_monitor_latency}=="0"
    ATTRS{client_monitor_pending}=="1985940810"
    ATTRS{device}=="0x5"
    ATTRS{device_id}=="{291fe2b6-0a3f-4450-9c88-3b494f14be71}"
    ATTRS{driver_override}=="(null)"
    ATTRS{id}=="20"
    ATTRS{in_intr_mask}=="0"
    ATTRS{in_read_bytes_avail}=="0"
    ATTRS{in_read_index}=="1016"
    ATTRS{in_write_bytes_avail}=="12288"
    ATTRS{in_write_index}=="1016"
    ATTRS{monitor_id}=="255"
    ATTRS{out_intr_mask}=="0"
    ATTRS{out_read_bytes_avail}=="0"
    ATTRS{out_read_index}=="1136"
    ATTRS{out_write_bytes_avail}=="12288"
    ATTRS{out_write_index}=="1136"
    ATTRS{server_monitor_conn_id}=="0"
    ATTRS{server_monitor_latency}=="0"
    ATTRS{server_monitor_pending}=="1985940810"
    ATTRS{state}=="3"
    ATTRS{vendor}=="0x1414"

  looking at parent device '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01':
    KERNELS=="VMBUS:01"
    SUBSYSTEMS=="acpi"
    DRIVERS=="vmbus"
    ATTRS{hid}=="VMBUS"
    ATTRS{path}=="\_SB_.PCI0.SBRG.VMB8"
    ATTRS{power_state}=="D0"
    ATTRS{status}=="15"
    ATTRS{uid}=="0"

  looking at parent device '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07':
    KERNELS=="device:07"
    SUBSYSTEMS=="acpi"
    DRIVERS==""
    ATTRS{adr}=="0x00070000"
    ATTRS{path}=="\_SB_.PCI0.SBRG"

  looking at parent device '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00':
    KERNELS=="PNP0A03:00"
    SUBSYSTEMS=="acpi"
    DRIVERS==""
    ATTRS{adr}=="0x00000000"
    ATTRS{hid}=="PNP0A03"
    ATTRS{path}=="\_SB_.PCI0"
    ATTRS{uid}=="0"

  looking at parent device '/devices/LNXSYSTM:00/LNXSYBUS:00':
    KERNELS=="LNXSYBUS:00"
    SUBSYSTEMS=="acpi"
    DRIVERS==""
    ATTRS{hid}=="LNXSYBUS"
    ATTRS{path}=="\_SB_"

  looking at parent device '/devices/LNXSYSTM:00':
    KERNELS=="LNXSYSTM:00"
    SUBSYSTEMS=="acpi"
    DRIVERS==""
    ATTRS{hid}=="LNXSYSTM"
    ATTRS{path}=="\"

[root@az2 lkundrak]#

I guess an Azure-specific tweak is indeed the way to go. The other udev rules seem to be shipped by the WALinuxAgent package. Let's see if we can get this one added the same way: https://github.com/Azure/WALinuxAgent/pull/1622
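As a side note, the rule matches DRIVERS=="hv_pci", which in the walk above only shows up on the VF's vmbus parent device. A sketch of how one might confirm that the rule actually tags the VF, using the device from the output above (udevadm option syntax may vary slightly between versions):

# Simulate the udev "add" event for the VF and check that the rule fires;
# the printed properties should include NM_UNMANAGED=1.
udevadm test --action=add /sys/class/net/enP40072s1 2>&1 | grep NM_UNMANAGED
# Or query the properties already recorded in the udev database:
udevadm info -q property -p /sys/class/net/enP40072s1 | grep NM_UNMANAGED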
I'm wondering if anyone at Microsoft can get the WALinuxAgent maintainers to review the pull requests in their queue?

Let me bug them; email autoreplies inform me that there are a few of them on vacation, so maybe that's why it's taking a little while...

We discussed this BZ at our monthly MSFT-RH call. It seems that there are internal Microsoft discussions about which method they prefer to resolve this (adding a udev rule or doing this dynamically, as in Lubomir's upstream proposal).

(In reply to Rick Barry from comment #12)
> We discussed this BZ at our monthly MSFT-RH call. It seems that there are
> internal Microsoft discussions about which method they prefer to resolve
> this (adding a udev rule or doing this dynamically, as in Lubomir's
> upstream proposal).

What are you even talking about? My upstream proposal is also just to add a udev rule. The only difference is that I'm adding it in the place that actually makes at least some sense.

I'm not sure what you discussed with Microsoft, but my repeated attempts to get them to respond in any useful manner about this via e-mail or GitHub have all failed. I am not happy about this, but I am unable to do anything about it at least until Microsoft seriously reconsiders their approach to collaboration. When that happens, please feel free to reopen the bug.

Hi Michael, what was the final decision regarding how Microsoft was planning to resolve this bug? Lubomir Rintel submitted an upstream proposal a couple of months ago to add a udev rule in WALinuxAgent to resolve this, but apparently his pull requests were not reviewed/accepted. I don't have the details, but perhaps someone from the WALinuxAgent team can respond to Lubomir. In any event, do you know if this issue has been resolved?

(In reply to Lubomir Rintel from comment #13)

Rick & Michael, it's been brought to my attention by my colleagues that the tone of my comment was far from appropriate. Re-reading it, I wish I had not written it. I do apologize for it.

The message that I wanted to get across is that unless the communication with the partner around this issue improves, there isn't anything we can do about the issue other than shipping a downstream patch. We care about doing the right thing here, that is, involving the WALinuxAgent upstream. That is because we care about the upstream opinion, but also because that will fix the issue for other Linux images on Azure that are not necessarily running RHEL.

Thanks, Lubo.

I ping'ed the waagent team on this issue a couple of weeks ago, and since then the discussion has been active in the waagent GitHub pull request that Lubomir Rintel originally made. See https://github.com/Azure/WALinuxAgent/pull/1622.

I'm marking the "needs info" request as completed while the discussion is ongoing.
(In reply to Michael Kelley from comment #21)
> I ping'ed the waagent team on this issue a couple of weeks ago, and since
> then the discussion has been active in the waagent GitHub pull request that
> Lubomir Rintel originally made. See
> https://github.com/Azure/WALinuxAgent/pull/1622.
>
> I'm marking the "needs info" request as completed while the discussion is
> ongoing.

Thanks, Michael, I appreciate your help in getting that discussion kick-started again.

Hi Rick, what should we do about this bug? I feel the NetworkManager devel team doesn't have anything to fix here (or do we??). I'd propose to close this bug as NOTABUG. Alternatively, if you use this bug for tracking purposes, can we assign it to a different component? Thanks.

Hi Thomas - we have added the udev rule to our Azure RHEL image builds for RHEL 8.x, and all our RHEL 8.x images should contain it. We can probably close this now (and I do apologize for all the churn in the PR). Thanks!!

Closing thus, according to comment 23 and comment 24.
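For completeness, a quick sketch for verifying the fix on a freshly provisioned RHEL 8.x Azure VM with Accelerated Networking, assuming the rule ships under the same file name as on the 7.x images quoted above (interface names are examples):

# The rule file should be present in the image...
cat /etc/udev/rules.d/68-azure-sriov-nm-unmanaged.rules
# ...every synthetic ethN should have an IPv4 address...
ip -br -4 addr show
# ...and the VF (enP*) interfaces should show up as unmanaged.
nmcli device status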