Description of problem:

1. Deploy the NIC partitioning templates. The following bonds are configured:
- linux bond: enp130s0f0v0 and enp130s0f1v0
- linux bond: enp130s0f0v1 and enp130s0f1v1
- ovs bond: enp130s0f0v2 and enp130s0f1v2

I can see that the VFs are properly configured:

12: enp130s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:a5:40 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether f6:b3:8f:c1:9a:db brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether 3a:2e:2b:38:2c:2e brd ff:ff:ff:ff:ff:ff, vlan 122, spoof checking off, link-state auto, trust on
    vf 2 link/ether be:c6:46:8f:27:c7 brd ff:ff:ff:ff:ff:ff, vlan 121, spoof checking off, link-state auto, trust on
    vf 3 link/ether 4e:2b:bd:bf:26:dc brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 4 link/ether 9e:ed:26:7d:88:a3 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 5 link/ether ea:b0:85:fe:fc:fd brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 6 link/ether 0a:49:f7:f7:76:13 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 7 link/ether ea:c3:e5:69:f0:88 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 8 link/ether 5e:dc:17:f7:c1:fa brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 9 link/ether ca:b4:be:bc:43:b0 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
13: enp130s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:a5:42 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 1a:97:59:f4:0b:bc brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether 52:a7:06:26:7e:07 brd ff:ff:ff:ff:ff:ff, vlan 122, spoof checking off, link-state auto, trust on
    vf 2 link/ether 26:96:8a:1c:50:52 brd ff:ff:ff:ff:ff:ff, vlan 121, spoof checking off, link-state auto, trust on
    vf 3 link/ether 42:47:7e:b1:21:4d brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 4 link/ether 3e:b5:9a:71:1f:b4 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 5 link/ether 2e:85:b6:cf:c3:7a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 6 link/ether 9e:da:60:02:c2:96 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 7 link/ether a6:11:85:d3:85:4d brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 8 link/ether fe:2e:4c:7d:fb:e0 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 9 link/ether ca:1d:81:80:a0:37 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off

2. Execute the testcase nfv_tempest_plugin.tests.scenario.day2.test_hypervisor_usecases.TestHypervisorScenarios.test_hypervisor_reboot, which performs the following actions:
a. Create a VM with a geneve port for management, a VF in enp6s0f2 and a PF in enp6s0f3
b. Check that there is ping
c. Shut down the VM
d. Reboot the hypervisor
e. Start the VM
f. Check ping. There is no connectivity to the floating IP of the VM.

After the reboot, the VFs are in the state shown below. VF 2 is not configured properly. This is the VF used for tenant traffic through an ovs bond, so there is no connectivity to the floating IP.

12: enp130s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:a5:40 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 5e:d5:a2:42:4e:56 brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether 9e:2b:51:f7:f0:e5 brd ff:ff:ff:ff:ff:ff, vlan 122, spoof checking off, link-state auto, trust on
    vf 2 link/ether 7e:01:35:68:57:ac brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 3 link/ether 4e:2b:bd:bf:26:dc brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 4 link/ether 9e:ed:26:7d:88:a3 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 5 link/ether ea:b0:85:fe:fc:fd brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 6 link/ether 0a:49:f7:f7:76:13 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 7 link/ether ea:c3:e5:69:f0:88 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 8 link/ether 5e:dc:17:f7:c1:fa brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 9 link/ether ca:b4:be:bc:43:b0 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
13: enp130s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:a5:42 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 5e:d5:a2:42:4e:56 brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, vlan 122, spoof checking off, link-state auto, trust on
    vf 2 link/ether b2:45:3b:f0:ad:ce brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 3 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 4 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 5 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 6 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 7 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 8 link/ether fe:2e:4c:7d:fb:e0 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 9 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off

If I configure the VF properly by hand, connectivity is recovered.

I have seen these errors in the logs at the time the testcase was executed:

Jul 27 11:13:04 computeovndpdksriov-0 systemd[1]: Starting SR-IOV numvfs configuration...
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: [2021/07/27 11:13:17 AM] [ERROR] Failed to execute ip link set dev enp130s0f0v1 promisc off
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: Traceback (most recent call last):
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: File "/usr/bin/os-net-config-sriov", line 10, in <module>
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: sys.exit(main())
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: File "/usr/lib/python3.6/site-packages/os_net_config/sriov_config.py", line 617, in main
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: configure_sriov_vf()
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: File "/usr/lib/python3.6/site-packages/os_net_config/sriov_config.py", line 553, in configure_sriov_vf
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: 'promisc', item['promisc'])
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: File "/usr/lib/python3.6/site-packages/os_net_config/sriov_config.py", line 451, in run_ip_config_cmd
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: processutils.execute(*cmd, delay_on_retry=True, attempts=10, **kwargs)
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: File "/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py", line 431, in execute
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: cmd=sanitized_cmd)
Jul 27 11:13:17 computeovndpdksriov-0 os-net-config-sriov[1632]: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.

Version-Release number of selected component (if applicable):
RHOS-16.2-RHEL-8-20210722.n.0

How reproducible:
See description

Actual results:
Lost connectivity to the VM

Expected results:
Connectivity to the VM should not be lost

Additional info:
It appears that NetworkManager is attempting to manage the NIC:

Jul 27 11:13:11 computeovndpdksriov-0 NetworkManager[1980]: <info> [1627384391.4649] device (enp130s0f1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')

In general, when os-net-config is used to configure the NICs, NetworkManager is not used. Can you please attach the NIC config template that was used from the templates on the undercloud? I want to confirm that the templates do not include "nm_controlled: true" for the PF configuration. If that is not the case, do you know when these NICs were configured as NetworkManager connections?

It would also be helpful if you could paste the contents of these two files:
/etc/sysconfig/network-scripts/ifcfg-enp130s0f1
/etc/sysconfig/network-scripts/ifcfg-enp130s0f1v1
Hi, these are the templates I have used:
https://code.engineering.redhat.com/gerrit/gitweb?p=nfv-qe.git;a=tree;f=tht/ospd-16.2-geneve-ovn-dpdk-sriov-ctlplane-dataplane-bonding-hybrid;h=1c4e355418ca13163bb557e5df74d37171e56dcb;hb=HEAD

We do have nm_controlled: true. Could that be the issue? We did not have any issue with it before.

[root@computeovndpdksriov-1 heat-admin]# cat /etc/sysconfig/network-scripts/ifcfg-enp130s0f1
# This file is autogenerated by os-net-config
DEVICE=enp130s0f1
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=yes
PEERDNS=no
BOOTPROTO=none
MTU=9000
DEFROUTE=no

[root@computeovndpdksriov-1 heat-admin]# cat /etc/sysconfig/network-scripts/ifcfg-enp130s0f1v1
# This file is autogenerated by os-net-config
DEVICE=enp130s0f1v1
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
MASTER=storage_bond
SLAVE=yes
BOOTPROTO=none
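For anyone repeating this check on several nodes, the NM_CONTROLLED comparison can be scripted. The following is only a sketch: parse_ifcfg is a hypothetical helper (it is not part of os-net-config), and the sample text is the PF file pasted in this comment.

```python
def parse_ifcfg(text):
    """Parse KEY=VALUE lines of an ifcfg file into a dict (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


# Contents of ifcfg-enp130s0f1 as pasted in this comment.
IFCFG_PF = """\
# This file is autogenerated by os-net-config
DEVICE=enp130s0f1
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=yes
PEERDNS=no
BOOTPROTO=none
MTU=9000
DEFROUTE=no
"""

pf = parse_ifcfg(IFCFG_PF)
print(pf["DEVICE"], "NM_CONTROLLED =", pf["NM_CONTROLLED"])
```

On a node, the text would be read from the files under /etc/sysconfig/network-scripts/ instead of the inline sample.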
I configured nm_controlled: false, but I can reproduce the same issue, so it is not related to this parameter.
https://gitlab.cee.redhat.com/mnietoji/deployment_templates/-/commit/64c0c113b012761b5f907fad13ecbe154b23f175
I can see the following in /var/log/extra/failed_services.txt:

-- Logs begin at Wed 2021-07-28 09:14:38 UTC, end at Wed 2021-07-28 10:10:51 UTC. --
Jul 28 10:01:28 computeovndpdksriov-0 systemd[1]: Starting SR-IOV numvfs configuration...
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: [2021/07/28 10:01:41 AM] [ERROR] Failed to execute ip link set dev enp130s0f1v1 promisc off
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: Traceback (most recent call last):
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: File "/usr/bin/os-net-config-sriov", line 10, in <module>
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: sys.exit(main())
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: File "/usr/lib/python3.6/site-packages/os_net_config/sriov_config.py", line 617, in main
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: configure_sriov_vf()
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: File "/usr/lib/python3.6/site-packages/os_net_config/sriov_config.py", line 553, in configure_sriov_vf
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: 'promisc', item['promisc'])
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: File "/usr/lib/python3.6/site-packages/os_net_config/sriov_config.py", line 451, in run_ip_config_cmd
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: processutils.execute(*cmd, delay_on_retry=True, attempts=10, **kwargs)
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: File "/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py", line 431, in execute
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: cmd=sanitized_cmd)
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: Command: ip link set dev enp130s0f1v1 promisc off
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: Exit code: 1
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: Stdout: ''
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: Stderr: 'Cannot find device "enp130s0f1v1"\n'
Jul 28 10:01:41 computeovndpdksriov-0 systemd[1]: sriov_config.service: Main process exited, code=exited, status=1/FAILURE
Jul 28 10:01:41 computeovndpdksriov-0 systemd[1]: sriov_config.service: Failed with result 'exit-code'.
Jul 28 10:01:41 computeovndpdksriov-0 systemd[1]: Failed to start SR-IOV numvfs configuration.

os-net-config was executed successfully during the deployment; I do not understand why it is executed again on reboot.

[root@computeovndpdksriov-0 ~]# ip a | grep enp130s0f1
13: enp130s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
51: enp130s0f1v9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
52: enp130s0f1v3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
53: enp130s0f1v4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
54: enp130s0f1v5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
55: enp130s0f1v6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
56: enp130s0f1v7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
59: enp130s0f1v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond_api state UP group default qlen 1000

enp130s0f1v1 should not be missing: it is used by a linux bond and it is not used for DPDK. enp130s0f1v1 has PCI address 0000:82:06.1.

[root@computeovndpdksriov-0 ~]# driverctl list-overrides
0000:82:02.2 vfio-pci
0000:82:06.2 vfio-pci
0000:82:0a.0 vfio-pci
0000:82:0e.0 vfio-pci
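The lookup of the VF's PCI address in the override list can be written down as a short script. This is a sketch only: parse_driverctl_overrides is a hypothetical helper name, and it assumes each line of `driverctl list-overrides` is "<pci-address> <driver>", as in the output above.

```python
def parse_driverctl_overrides(output):
    """Map PCI address -> overridden driver from `driverctl list-overrides` text."""
    overrides = {}
    for line in output.splitlines():
        parts = line.split()
        # Assumed line format: "<pci-address> <driver>"; skip anything else.
        if len(parts) == 2:
            overrides[parts[0]] = parts[1]
    return overrides


# Output pasted in this comment.
OVERRIDES_OUTPUT = """\
0000:82:02.2 vfio-pci
0000:82:06.2 vfio-pci
0000:82:0a.0 vfio-pci
0000:82:0e.0 vfio-pci
"""

overrides = parse_driverctl_overrides(OVERRIDES_OUTPUT)
# 0000:82:06.1 is the address of enp130s0f1v1 given in this comment.
print("override for 0000:82:06.1:", overrides.get("0000:82:06.1"))
```

With the output above, 0000:82:06.1 has no override, which matches the expectation that enp130s0f1v1 is not meant to be bound to vfio-pci.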
I have seen in this doc [1] that the recommended way to define the tenant vlan is different, so I configured it as described there.

I replaced this:

- type: ovs_user_bridge
  name: br-link0
  use_dhcp: false
  addresses:
  - ip_netmask:
      get_param: TenantIpSubnet
  members:
  - type: ovs_dpdk_bond
    name: dpdkbond0
    mtu: 9000
    rx_queue: 1
    members:
    - type: ovs_dpdk_port
      name: dpdk0
      members:
      - type: sriov_vf
        device: nic3
        vfid: 2
        vlan_id:
          get_param: TenantNetworkVlanID

with this:

- type: ovs_user_bridge
  name: br-link0
  use_dhcp: false
  ovs_extra:
  - str_replace:
      template: set port br-link0 tag=_VLAN_TAG_
      params:
        _VLAN_TAG_:
          get_param: TenantNetworkVlanID
  addresses:
  - ip_netmask:
      get_param: TenantIpSubnet
  members:
  - type: ovs_dpdk_bond
    name: dpdkbond0
    mtu: 9000
    rx_queue: 1
    members:
    - type: ovs_dpdk_port
      name: dpdk0
      members:
      - type: sriov_vf
        device: nic3
        vfid: 2

But the same issue happens. After rebooting several times, I can see:
1. vf1 is unconfigured, vlan 121 is missing
2. In this case vf2 should not have a vlan configured, as we used the new configuration, but for some reason the tenant network is not working either

So this bug is not related to the way the tenant network is defined.

12: enp130s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:a5:40 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 0a:9d:6e:56:de:8c brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether 46:4d:5c:83:d1:71 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 2 link/ether e6:29:99:53:87:e7 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 3 link/ether ce:ca:f6:a7:19:df brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 4 link/ether fa:13:22:50:c7:49 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 5 link/ether a6:63:af:95:8c:71 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 6 link/ether b2:eb:b5:46:fa:2f brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 7 link/ether 86:7d:84:4d:f5:8a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 8 link/ether 36:7d:e3:3b:b6:96 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 9 link/ether 1e:ad:46:2e:68:48 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
13: enp130s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:a5:42 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 2 link/ether 1a:f9:19:46:d0:6e brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 3 link/ether f6:7b:74:83:09:6a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 4 link/ether 2a:0e:01:58:52:c1 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 5 link/ether ba:f1:d9:81:d7:e9 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/network_functions_virtualization_planning_and_configuration_guide/assembly_config-vxlan-dpdk-sriov-hybrid
Jul 28 10:01:41 computeovndpdksriov-0 os-net-config-sriov[1645]: Stderr: 'Cannot find device "enp130s0f1v1"\n'

This error happens either when VF creation is taking time and the VF is not yet available for the kernel to configure, or when the VF is bound to the vfio-pci driver. The command works for the other VFs and fails only for the VF attached to the ovs_bond (the other VFs are attached to a linux_bond), which rules out the delay scenario.

We need to confirm whether the ifup scripts were triggered before sriov_config.service completed. Can you share the complete boot logs after the reboot?

How do you apply the configuration after the reboot is completed to make it work?

Does this happen only with specific hardware, or is NIC partitioning failing after reboot on all types of nodes?
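To tell the two scenarios apart on a live node, one option is to poll sysfs for the VF netdev and to read which driver the VF's PCI device is bound to. The sketch below is not os-net-config code: wait_for_netdev and bound_driver are hypothetical helpers, and the timeout values are arbitrary. It relies only on the standard sysfs layout (/sys/class/net/<name> and the /sys/bus/pci/devices/<addr>/driver symlink); the sysfs roots are parameters so the logic can be exercised against a scratch directory.

```python
import os
import time


def wait_for_netdev(name, timeout=10.0, interval=0.5, sysfs_net="/sys/class/net"):
    """Return True once the netdev shows up in sysfs, False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(os.path.join(sysfs_net, name)):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def bound_driver(pci_addr, sysfs_pci="/sys/bus/pci/devices"):
    """Return the driver name a PCI device is bound to, or None if it has no driver."""
    link = os.path.join(sysfs_pci, pci_addr, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Device name and PCI address taken from the comments above.
    print("netdev present:", wait_for_netdev("enp130s0f1v1", timeout=5.0))
    print("bound driver:", bound_driver("0000:82:06.1"))
```

A VF that never appears as a netdev and reports vfio-pci as its driver points to the binding scenario; one that appears after a few seconds points to the delay scenario.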
(In reply to Miguel Angel Nieto from comment #7)
> I have seen in this doc [1] that the recommended way to define tenant vlan is
> different. I configured as in this doc.
>

The doc referenced above is specific to a DPDK port on the interface; it is not applicable to a DPDK port on a VF. As per the os-net-config sample, your earlier configuration is correct.
https://github.com/openstack/os-net-config/blob/master/etc/os-net-config/samples/sriov_pf_ovs_dpdk.yaml#L71

If this is not present in the documentation for NIC partitioning, it would be good to add it.
We did a couple of manual runs (reboot and validations) and did not find any issue. This may be a timing issue related to the automation. Miguel is going to test a few more times and update. Thanks
There must be some kind of race condition; this time I needed to execute the testcase 4 or 5 times to make it fail. After the failure, tenant traffic ping does not work and I can see that the VF configuration changed:

before reboot
12: enp130s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:b4:80 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether f2:54:9c:3b:63:da brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether ca:7e:f1:58:29:61 brd ff:ff:ff:ff:ff:ff, vlan 122, spoof checking off, link-state auto, trust on
    vf 2 link/ether 02:20:e0:a5:1a:67 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on

after reboot
12: enp130s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:b4:80 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 1e:8b:b3:38:b7:cb brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether 86:69:7e:62:85:ce brd ff:ff:ff:ff:ff:ff, vlan 122, spoof checking off, link-state auto, trust on
    vf 2 link/ether ca:49:53:a9:a4:09 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
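Since the failure is intermittent, the before/after comparison could be automated in the test. The following is a sketch under assumptions: parse_vfs and diff_vfs are hypothetical helpers, the parser expects one VF per line as printed by `ip link show <pf>`, and MAC addresses are ignored on purpose because they change on every reboot in the outputs above. The sample data is the enp130s0f0 output from this comment.

```python
import re

# One VF per line: "vf N link/ether <mac> brd <mac>, attr, attr, ..."
VF_RE = re.compile(r"^\s*vf (\d+)\s+link/ether (\S+) brd \S+, (.*)$")


def parse_vfs(ip_link_output):
    """Return {vf_index: {attribute: value}} from `ip link show <pf>` text."""
    vfs = {}
    for line in ip_link_output.splitlines():
        match = VF_RE.match(line)
        if not match:
            continue
        attrs = {}
        for field in match.group(3).split(", "):
            # "spoof checking off" -> key "spoof checking", value "off"
            key, _, value = field.strip().rpartition(" ")
            attrs[key] = value
        vfs[int(match.group(1))] = attrs
    return vfs


def diff_vfs(before, after):
    """Return {vf_index: {attribute: (before_value, after_value)}} for changed attributes."""
    changes = {}
    for vf in sorted(set(before) | set(after)):
        b, a = before.get(vf, {}), after.get(vf, {})
        changed = {k: (b.get(k), a.get(k))
                   for k in sorted(set(b) | set(a)) if b.get(k) != a.get(k)}
        if changed:
            changes[vf] = changed
    return changes


BEFORE = """\
    vf 0 link/ether f2:54:9c:3b:63:da brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether ca:7e:f1:58:29:61 brd ff:ff:ff:ff:ff:ff, vlan 122, spoof checking off, link-state auto, trust on
    vf 2 link/ether 02:20:e0:a5:1a:67 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on
"""
AFTER = """\
    vf 0 link/ether 1e:8b:b3:38:b7:cb brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether 86:69:7e:62:85:ce brd ff:ff:ff:ff:ff:ff, vlan 122, spoof checking off, link-state auto, trust on
    vf 2 link/ether ca:49:53:a9:a4:09 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
"""

print(diff_vfs(parse_vfs(BEFORE), parse_vfs(AFTER)))
```

For the data above, the only difference reported is on vf 2: spoof checking went from off to on and trust from on to off.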
I did not include enp130s0f1 in my previous comment, but it shows the same behaviour: vf 2 is unconfigured.

13: enp130s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:b4:82 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 1e:8b:b3:38:b7:cb brd ff:ff:ff:ff:ff:ff, vlan 120, spoof checking off, link-state auto, trust on
    vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, vlan 122, spoof checking off, link-state auto, trust on
    vf 2 link/ether 16:2f:f1:b5:14:a7 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off

There is something else I do not understand: for enp130s0f1, virtual functions are missing from the kernel.

[root@computeovndpdksriov-1 extra]# ip a | grep enp130s0f0
12: enp130s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
34: enp130s0f0v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond_api state UP group default qlen 1000
35: enp130s0f0v5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
36: enp130s0f0v6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
37: enp130s0f0v7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
38: enp130s0f0v4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
39: enp130s0f0v3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
40: enp130s0f0v9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
41: enp130s0f0v1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master storage_bond state UP group default qlen 1000
42: enp130s0f0v8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000

[root@computeovndpdksriov-1 extra]# ip a | grep enp130s0f1
13: enp130s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
44: enp130s0f1v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond_api state UP group default qlen 1000
Since we are seeing this issue randomly in our CI, where it causes random failures, I am marking this as a test blocker.
@hareshkhandelwal I tried with a reduced number of VFs and also with the changes listed in https://docs.google.com/document/d/1CbyPTQ7ZDpcIDaqYAobNGZ_92Iprfon0f7YoAm9GgVA/edit#heading=h.v3t78c1gzj59

However, all day2 tests still fail intermittently: I ran them 4 times and they failed once. This is with the compose RHOS-16.2-RHEL-8-20210804.n.0
RHEL Clone - https://bugzilla.redhat.com/show_bug.cgi?id=1993882
@sanjay, I checked the compose at http://rhos-qe-mirror-tlv.usersys.redhat.com/rcm-guest/puddles/OpenStack/16.2-RHEL-8/RHOS-16.2-RHEL-8-20220210.n.1/compose/OpenStack/x86_64/os/Packages/, and the patches (there are 2 patches for this BZ) are merged in it. I have updated the Fixed In Version field.
I have rebooted the server several times and I have not been able to reproduce the issue.

RHOS-16.2-RHEL-8-20220610.n.1
os-net-config-11.5.1-2.20220404114957.173ef73.el8ost.noarch

10: enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 04:3f:72:b8:be:f6 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, vlan 170, spoof checking off, link-state auto, trust on, query_rss off
    vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, vlan 172, spoof checking off, link-state auto, trust on, query_rss off
    vf 2 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 3 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 4 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 5 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 6 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 7 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 8 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 9 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
According to our records, this should be resolved by os-net-config-11.5.1-2.20220404114957.173ef73.el8ost. This build is available now.