Description of problem:
Using ML2 with the openvswitch and sriovnicswitch mechanism drivers and the same Physical Function (PF) for both SR-IOV and openvswitch. Launching an instance with an OVS interface just after the physical interface driver is loaded results in no IP for the instance. When a second instance is launched with a VF (vnic_type=direct) as an interface, it gets an IP from DHCP, and at that point the first instance also gets an IP.

Version-Release number of selected component (if applicable):
RHEL7.0
igb 3.10.0-123.13.1.el7.x86_64
openstack-neutron-2014.2.1-2.el7ost.noarch
openstack-neutron-openvswitch-2014.2.1-2.el7ost.noarch

How reproducible:

Steps to Reproduce:
1. Configure ML2 with the openvswitch and sriovnicswitch mechanism drivers.
2. Configure SR-IOV and openvswitch on each compute node.
3. Launch an instance and verify the instance has no IP.
4. Launch a second instance with a port with vnic_type=direct. The second instance gets an IP, as does the first one.

Actual results:

Expected results:

Additional info:
The above happens for instances with openvswitch interfaces after the igb driver is loaded. When launching instances with an SR-IOV interface using the same PF there is no problem until the driver is reloaded again.
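For reference, a setup along these lines matches the description above. This is a sketch only: the physical network name (physnet1), bridge name (br-vlan), VLAN range, and PCI vendor/device ID are assumptions for illustration, not taken from this report.

```ini
; /etc/neutron/plugins/ml2/ml2_conf.ini (fragment)
[ml2]
mechanism_drivers = openvswitch,sriovnicswitch
tenant_network_types = vlan

[ml2_type_vlan]
; VLAN range is a placeholder
network_vlan_ranges = physnet1:80:90

; openvswitch agent: the same PF's bridge mapped to physnet1
[ovs]
bridge_mappings = physnet1:br-vlan

; /etc/neutron/plugins/ml2/ml2_conf_sriov.ini (fragment)
[ml2_sriov]
; vendor:device pair is a placeholder for the igb VF
supported_pci_vendor_devs = 8086:10ca
```

The point of the configuration is that both mechanism drivers hand out ports on the same physical network, so OVS-attached and VF-attached instances share one PF.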
FWIW, I have no idea what ML2 and sriovnicswitch are. If this is suspected to be a kernel or openvswitch problem, please provide the networking configuration etc., as usual when reporting kernel bugs.
This behaviour results from misconfiguration of openvswitch by the reporter.
The reporter was not at fault! A nice catch, IMO. Itzik demonstrated the problem to me and I am convinced he is onto something.

If the order of operations is significant, I feel it is worthwhile to rule out manipulation of the root PCI device that is bridged to OVS as a source of the problem. There were several messages in dmesg about the PF being in a resetting state. While it isn't clear that they are related, I wonder if OVS ignores the interface until it is reset by configuring the VF. I wouldn't rule out igb or device issues either, but it seems a bit of a stretch.

Perhaps a useful test would be to not bridge the interface, assign it an IP (or a VLAN interface with an IP) on a network it has access to, and see if communication can occur before the VF is configured. If it doesn't work, it is a pretty clear indication that something is wrong with our use of the card. Unfortunately it is not nearly as definitive if it does work. A second test would be to configure it as is, manually add an interface to br-vlan with segment ID 80, give it an IP, and see if communications work over that.
In an attempt to corroborate Itzik's findings I went a little afield and tried my own test recommendations. Here is what I did (precondition: the system was in an unverified state, but presumably it was as Itzik left it after re-verifying his result):

1. Created an internal virtual port on br-vlan with VLAN ID 80 (brent1) to match the neutron network and gave it an arbitrary IP on that network (192.168.150.7).
2. Pinged the interface in the relevant DHCP namespace on the controller (192.168.150.2). This succeeded.
3. Rebooted the test compute node (puma48, fwiw).
4. After setting the IP address on brent1 again, tried pinging the controller DHCP namespace again. This did NOT succeed.
5. Manually configured a VF's MAC address: "ip link set eno1 vf 0 mac 8A:F4:D1:81:C1:CE"
6. Tried pinging again; this did not succeed.
7. Manually added a VLAN segment ID to the VF: "ip link set eno1 vf 0 vlan 80"
8. Tried pinging again. This succeeded.

Another tactic:

1. Rebooted.
2. Configured brent1 with an IP and tried pinging. This failed.
3. Removed the IP from brent1.
4. Created a VLAN interface off of eno1 with the proper segment ID: "ip link add link eno1 name eno1.80 type vlan id 80"
5. Gave the interface a relevant IP address: "ifconfig eno1.80 192.168.150.8"
6. Tried pinging the DHCP namespace again. This succeeded.

This is *without* any openstack instances on the compute node; "virsh list --all" returns an empty list, so there aren't any shut-down VMs or anything of that sort.

For some reason, traffic isn't propagating until a VF is given a segment ID. It also worked when I created an addressed link on eno1 with the appropriate VLAN ID (80). It could be the Intel card; it might also be some interaction between the card and the switch with respect to VLANs. The results are not entirely conclusive, but there is an issue there.
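The steps above can be condensed into a shell sketch (interface name eno1, VLAN 80, and the addresses are taken from the comment; the /24 prefix is an assumption; run as root on the compute node):

```shell
# Reproduce: OVS internal port on br-vlan, tagged with the neutron segment ID
ovs-vsctl add-port br-vlan brent1 tag=80 -- set Interface brent1 type=internal
ip addr add 192.168.150.7/24 dev brent1
ip link set brent1 up
ping -c 3 192.168.150.2        # fails after a fresh boot

# Workaround 1: give a VF on the same PF a MAC and a VLAN; traffic starts flowing
ip link set eno1 vf 0 mac 8a:f4:d1:81:c1:ce
ip link set eno1 vf 0 vlan 80
ping -c 3 192.168.150.2        # succeeds only after the vlan step

# Workaround 2: a plain 802.1q subinterface on the PF also unblocks it
ip link add link eno1 name eno1.80 type vlan id 80
ip addr add 192.168.150.8/24 dev eno1.80
ip link set eno1.80 up
ping -c 3 192.168.150.2        # succeeds
```

Either workaround suggests the card only begins handling the VLAN once something programs it into the NIC's embedded switch.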
From reading the above it seems like the issue is not neutron specific and can be reproduced with OVS and the SR-IOV driver without Neutron's help. I'm not sure where the issue is: OVS, the SR-IOV driver, or somewhere else. Moving the bug to the OVS component to see if they can provide more insight as to what the problem is.
Also moved priority to low, since to my understanding this bug only reproduces if the OVS is connected to the same PF used to expose the SR-IOV VFs. Itzik - if the above understanding is wrong we need to increase the priority of this bug.
(In reply to lpeer from comment #6)
> from reading the above it seems like the issue is not neutron specific and
> can be reproduced with OVS and SRIOV driver w/o Neutrons help.
>
> I'm not sure where the issue is, OVS, the SRIOV driver or somewhere else.
>
> Moving the bug to the OVS component to see if they can provide more insight
> as what the problem is.

The following is my experiment. Please let me know if I am tracking the same issue.

Test-bed: Two Dell PowerEdge 820 servers, ovs2 & ovs4, each with an ixgbe NIC, connected via a Dell switch.
VMs and VFs: Each server has a VM with an ixgbe VF assigned to it.
Testing: The two VMs can communicate with each other via ping and netperf over the ixgbe VF interfaces. But they can't ping each other via the OVS/VF interfaces.

So, are we talking about the same issue?
As I said in comment 1, I have no idea what the setup is. Could I have access to the system? If not, please install and run plotnetcfg[1] on the system and attach its output here; I'll follow up with more questions afterwards.

[1] http://file.rdu.redhat.com/~jbenc/plotnetcfg/plotnetcfg.x86_64.rpm
Basically you can follow Brent's flow without Launching VMs. I can provide step-by-step if needed.
The linked OpenStack patch states this:
------
When using SR-IOV, an unknown MAC address from a VF goes to the wire, not to the PF. This causes a problem if an SR-IOV VM needs connectivity to an OVS VM on the same network and on the same host, because in this case the traffic should go via the PF. This patch updates the ovs agent to add an fdb entry for each OVS VM MAC on the required physical networks. The fdb entry tells the eswitch to send the traffic via the PF and not onto the wire. A config option is added to the ovs agent to specify for which physical networks this is required.
------

I assume this is what this bug is about. I'm not completely sure, as I still don't have enough understanding of what the setup looks like and what is pinged from where.

If the problem is that frames sent from a VF to a MAC address that's not assigned to any other VF/PF are not broadcast to all VFs+PF, that's a limitation of the particular NIC you're using. Usually, NICs do not implement the full bridge for SR-IOV that would be needed for this. See e.g. this discussion: https://communities.intel.com/thread/38613

My understanding is this is exactly what the OpenStack patch is trying to implement. The important thing to note here is that this will inevitably be NIC dependent. In such a case, this bug should be closed, as there's nothing OVS can do.
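What the patch describes can be approximated by hand with the kernel's forwarding-database interface (a hedged sketch: the MAC address and device name are placeholders, and whether the NIC's eswitch honors static fdb entries is device dependent):

```shell
# Tell the embedded switch that this MAC (an OVS VM's tap MAC, placeholder
# here) lives behind the PF, so frames from a VF addressed to it are
# delivered to the PF instead of being sent out onto the wire.
bridge fdb add fa:16:3e:00:00:01 dev eno1

# Inspect the forwarding database on the PF to confirm the entry
bridge fdb show dev eno1
```

The ovs agent change is essentially automating this per OVS VM MAC on the configured physical networks.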
If the problem is something different than described in the previous comment, please provide setup details: step-by-step instructions without using openstack (comment 6 says that this is possible). Alternatively, with openstack configured, grab the plotnetcfg output (available in EPEL7 now: https://dl.fedoraproject.org/pub/epel/7/x86_64/p/plotnetcfg-0.4.1-2.el7.x86_64.rpm). As plotnetcfg won't capture everything needed here, please provide in addition:
- ovs-ofctl dump-flows (for the switches that are part of the path the ping packets traverse),
- an explanation of how the VFs were configured.
Please also state what address is pinged in the provided setup and from where (from the root netns of the host, I guess?).
The problem is the same as stated in the patch.
Do you want to reassign to a different component (openstack-neutron, perhaps?) or should I close the bug? There's nothing openvswitch can do, this is outside of ovs.
Could this issue be related? https://bugzilla.redhat.com/show_bug.cgi?id=1267030
We have an issue with Intel NICs and Dell R620s during introspection where the Intel NICs often time out obtaining an IP through DHCP. Broadcom NICs on the same H/W and in the same native VLAN get their IPs almost instantly. We noticed that SR-IOV was enabled in the BIOS of our R620s.
Regards,
(In reply to Vincent S. Cojot from comment #19)
> Could this issue be related?
> https://bugzilla.redhat.com/show_bug.cgi?id=1267030
> We have an issue with Intel NIC's and Dell R620's during introspection where
> the Intel NICs often timeout obtaining an IP through DHCP. Broadcom NICs on
> the same H/W and in the same native VLAN get their IP's almost instantly.
> We noticed that SRIOV was enabled in the BIOS of our R620's..

I don't know; I would need more details about the setup: plotnetcfg output (see comment 16 for instructions), ideally also sosreport -o networking.
Brent, please see whether we can revive the u/s patch for that (it's currently abandoned because Moshe does not have time for it).