Description of problem:
Only one guest FIP replies when 2 or more VMs are deployed on a single compute.

Version-Release number of selected component (if applicable):

How reproducible:
With puddle RHOS-16.2-RHEL-8-20210525.n.0

Steps to Reproduce:
1. Deploy OVN with hw-offload computes; refer to the following THTs:
https://gitlab.cee.redhat.com/yrachman/testing-testbed/-/tree/master/tht/ospd-16.2-geneve-ovn-hw-offload-ctlplane-dataplane-bonding-hybrid

2. Create 4 VMs, at least two on each compute, and attach floating IPs:

openstack server list --host computeovshwoffload-1.redhat.local --all -c 'Networks'
+-------------------------------------------------------+
| Networks                                              |
+-------------------------------------------------------+
| mellanox-geneve-provider=20.20.220.113, 10.35.141.162 |
| mellanox-geneve-provider=20.20.220.171, 10.35.141.165 |
+-------------------------------------------------------+

3. Send pings:

ping -c 1 10.35.141.162
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
64 bytes from 10.35.141.162: icmp_seq=1 ttl=61 time=0.703 ms

ping -c 1 10.35.141.165
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
64 bytes from 10.35.141.162: icmp_seq=1 ttl=61 time=0.703 ms

ping -c 5 10.35.141.162
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
^C
--- 10.35.141.162 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4115ms

Actual results:
Only one FIP is responsive.

Expected results:
Ping should be responsive for both floating IPs.

Additional info:
Once the FIP is mapped to a VLAN network, both FIPs are reachable and responsive.

sos report: rhos-release.virt.bos.redhat.com:/var/www/html/log/ovn-hw-fip/
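The reproduction check above can be automated. Below is a minimal, hypothetical sketch (the helper names and the FIP list are illustrative, not part of this report) that parses `ping` statistics output and flags unresponsive floating IPs:

```python
import re
import subprocess

def ping_loss_percent(output: str) -> float:
    """Extract the packet-loss percentage from `ping` statistics output."""
    m = re.search(r"(\d+(?:\.\d+)?)% packet loss", output)
    if m is None:
        raise ValueError("no ping statistics found in output")
    return float(m.group(1))

def fip_responsive(fip: str, count: int = 5) -> bool:
    """Ping a floating IP and report whether any reply came back."""
    proc = subprocess.run(
        ["ping", "-c", str(count), "-W", "2", fip],
        capture_output=True, text=True,
    )
    return ping_loss_percent(proc.stdout) < 100.0

# Offline check against the statistics lines seen in this report:
ok = "5 packets transmitted, 5 received, 0% packet loss, time 4115ms"
bad = "5 packets transmitted, 0 received, 100% packet loss, time 4115ms"
print(ping_loss_percent(ok), ping_loss_percent(bad))  # 0.0 100.0
```

Running `fip_responsive()` over both FIPs of a compute would reproduce the asymmetry described above (one responsive, one not).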
BZ opened due to a regression for hw-offload on RHEL 8.4.
This test is passing with the OVS backend for 16.2.

Another interesting issue:

(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.162
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
64 bytes from 10.35.141.162: icmp_seq=2 ttl=61 time=26.9 ms
^C
--- 10.35.141.162 ping statistics ---
2 packets transmitted, 1 received, 50% packet loss, time 1012ms
rtt min/avg/max/mdev = 26.877/26.877/26.877/0.000 ms

(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.163
PING 10.35.141.163 (10.35.141.163) 56(84) bytes of data.
^C
--- 10.35.141.163 ping statistics ---
8 packets transmitted, 0 received, 100% packet loss, time 7158ms

After waiting a few minutes:

(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.163
PING 10.35.141.163 (10.35.141.163) 56(84) bytes of data.
64 bytes from 10.35.141.163: icmp_seq=1 ttl=61 time=26.10 ms
64 bytes from 10.35.141.163: icmp_seq=2 ttl=61 time=0.619 ms
^C
--- 10.35.141.163 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.619/13.800/26.981/13.181 ms

(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.162
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
^C
--- 10.35.141.162 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1061ms
See description; sos report above: rhos-release.virt.bos.redhat.com:/var/www/html/log/ovn-hw-fip/

Adding the commands here:

[root@computeovshwoffload-0 ~]# ovs-vsctl show
edb712f9-21dd-4816-b0ab-e185994d2312
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port ovn-f5aa6c-0
            Interface ovn-f5aa6c-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.161.127"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}
        Port ovn-3ea0c7-0
            Interface ovn-3ea0c7-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.161.129"}
        Port enp4s0f0_8
            Interface enp4s0f0_8
        Port ovn-1c2cb7-0
            Interface ovn-1c2cb7-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.161.135"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}
        Port tap130d9819-20
            Interface tap130d9819-20
        Port enp4s0f0_0
            Interface enp4s0f0_0
        Port br-int
            Interface br-int
                type: internal
        Port ovn-ccf1be-0
            Interface ovn-ccf1be-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.161.125"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}
    Bridge br-link1
        fail_mode: standalone
        Port bond0
            Interface enp6s0f0
            Interface enp6s0f1
        Port br-link1
            Interface br-link1
                type: internal
    Bridge br-link0
        fail_mode: standalone
        Port br-link0
            Interface br-link0
                type: internal
        Port mx-bond
            Interface mx-bond
    ovs_version: "2.15.1"

[root@computeovshwoffload-0 ~]# ovs-vsctl list Open_vSwitch
_uuid               : edb712f9-21dd-4816-b0ab-e185994d2312
bridges             : [087d9b34-53da-42b1-9052-50dc7819cf0b, 1d9b091c-c8e8-4d50-8ee5-07b51d9f5b53, 8d8e1d31-df9a-4c03-9876-14de0828c795]
cur_cfg             : 69
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.2.0"
dpdk_initialized    : false
dpdk_version        : "DPDK 20.11.1"
external_ids        : {hostname=computeovshwoffload-0.redhat.local, ovn-bridge=br-int, ovn-bridge-mappings="mx-network:br-link0,mgmt:br-link1", ovn-encap-ip="10.10.161.101", ovn-encap-type=geneve, ovn-openflow-probe-interval="60", ovn-remote="tcp:10.10.160.115:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="ff44434d-03f6-4cd2-9d57-94dfa476ca32"}
iface_types         : [bareudp, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : [fb560279-de47-49d8-9883-bce497d67e1e]
next_cfg            : 69
other_config        : {hw-offload="true"}
ovs_version         : "2.15.1"
ssl                 : []
statistics          : {}
system_type         : rhel
system_version      : "8.4"

[root@computeovshwoffload-0 ~]# ovs-vsctl list Bridge _uuid : 1d9b091c-c8e8-4d50-8ee5-07b51d9f5b53 auto_attach : [] controller : [] datapath_id :
"00003cfdfe33a5c0" datapath_type : "" datapath_version : "<unknown>" external_ids : {} fail_mode : standalone flood_vlans : [] flow_tables : {} ipfix : [] mcast_snooping_enable: false mirrors : [] name : br-link1 netflow : [] other_config : {} ports : [74c7db58-b96b-4ae1-997a-4b4fe129eca4, d07bb466-f10a-4d4f-a30d-4bbc47d65f6b] protocols : [] rstp_enable : false rstp_status : {} sflow : [] status : {} stp_enable : false _uuid : 087d9b34-53da-42b1-9052-50dc7819cf0b auto_attach : [] controller : [] datapath_id : "0000ea39dbd27b00" datapath_type : system datapath_version : "<unknown>" external_ids : {ct-zone-130d9819-2725-4acb-bd71-08bcc4627bb5_dnat="3", ct-zone-130d9819-2725-4acb-bd71-08bcc4627bb5_snat="2", ct-zone-21fe1f0f-c5c4-4fec-96b6-f75f0d6d84ca="1", ct-zone-3f31490f-4a85-40b4-a55c-e404afe37a14_dnat="9", ct-zone-3f31490f-4a85-40b4-a55c-e404afe37a14_snat="8", ct-zone-5816989a-7d23-4baf-bcad-0f19d7556a0b="4", ct-zone-760c2c71-7f41-487b-ae42-97c5a8e68dbf="5", ct-zone-bacb5372-84c4-442d-8af9-6fe912043908_dnat="6", ct-zone-bacb5372-84c4-442d-8af9-6fe912043908_snat="7", ct-zone-provnet-9944dd33-b0ff-41d3-b1dc-503febf31976="10", ovn-nb-cfg="332"} fail_mode : secure flood_vlans : [] flow_tables : {} ipfix : [] mcast_snooping_enable: false mirrors : [] name : br-int netflow : [] other_config : {disable-in-band="true", hwaddr="ea:39:db:d2:7b:00"} ports : [1fd0ad86-4fc5-4d7f-ba75-a7d83b5ec2cc, 47b547c4-e7ac-40bf-b0c5-3850cbfd3c43, 4f0ab7f9-0757-4bb1-919d-a447eccf03ae, 6e22db8e-de6f-448a-800b-27b7cac5cb9d, cf999796-32d0-4758-b5a7-92c1a9d342ae, d01b3af0-e70a-4c5e-933f-e06e1877a053, dfa292bb-6a5b-4f2a-af64-1b9b22d970a2, f0a86926-91bf-48a6-89ac-7a2fc84ed099] protocols : [] rstp_enable : false rstp_status : {} sflow : [] status : {} stp_enable : false _uuid : 8d8e1d31-df9a-4c03-9876-14de0828c795 auto_attach : [] controller : [] datapath_id : "0000043f72b8bb5e" datapath_type : "" datapath_version : "<unknown>" external_ids : {} fail_mode : standalone flood_vlans : [] flow_tables 
: {} ipfix : [] mcast_snooping_enable: false mirrors : [] name : br-link0 netflow : [] other_config : {} ports : [46682d89-4531-4101-be78-00e743e37230, 65e8a46a-d494-486a-b1e1-a95613b46d11] protocols : [] rstp_enable : false rstp_status : {} sflow : [] status : {} stp_enable : false [root@computeovshwoffload-0 ~]# ovs-vsctl list Interface _uuid : f5e85f91-dd6b-4889-90ab-13db7dc572e1 admin_state : up bfd : {enable="true"} bfd_status : {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : [] error : [] external_ids : {} ifindex : 68 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : [] link_state : up lldp : {} mac : [] mac_in_use : "7a:1d:c7:73:af:9c" mtu : [] mtu_request : [] name : ovn-ccf1be-0 ofport : 4 ofport_request : [] options : {csum="true", key=flow, remote_ip="10.10.161.125"} other_config : {} statistics : {rx_bytes=445455, rx_packets=6513, tx_bytes=8671086, tx_packets=131274} status : {tunnel_egress_iface=vlan161, tunnel_egress_iface_carrier=up} type : geneve _uuid : 53cfde91-9e82-476c-8179-159e39ed810f admin_state : up bfd : {} bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : [] error : [] external_ids : {} ifindex : 62 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 1 link_speed : [] link_state : up lldp : {} mac : [] mac_in_use : "3c:fd:fe:33:a5:c0" mtu : 9000 mtu_request : [] name : br-link1 ofport : 65534 ofport_request : [] options : {} other_config : {} statistics : {collisions=0, rx_bytes=735902, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=15801, tx_bytes=3386, 
tx_dropped=0, tx_errors=0, tx_packets=47} status : {driver_name=openvswitch} type : internal _uuid : ebc8ded2-c187-4287-94cc-451519dd07bf admin_state : up bfd : {} bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : full error : [] external_ids : {iface-id="5816989a-7d23-4baf-bcad-0f19d7556a0b"} ifindex : 87 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : 10000000000 link_state : up lldp : {} mac : [] mac_in_use : "26:24:bc:47:5f:ca" mtu : 1500 mtu_request : [] name : tap130d9819-20 ofport : 28 ofport_request : [] options : {} other_config : {} statistics : {collisions=0, rx_bytes=23784, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=207, tx_bytes=52633, tx_dropped=0, tx_errors=0, tx_packets=740} status : {driver_name=veth, driver_version="1.0", firmware_version=""} type : "" _uuid : 5ab9c9e3-0efa-4e59-a404-27bf20fdfeb2 admin_state : up bfd : {enable="true"} bfd_status : {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : [] error : [] external_ids : {} ifindex : 68 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : [] link_state : up lldp : {} mac : [] mac_in_use : "b6:2a:c4:15:7c:9d" mtu : [] mtu_request : [] name : ovn-1c2cb7-0 ofport : 3 ofport_request : [] options : {csum="true", key=flow, remote_ip="10.10.161.135"} other_config : {} statistics : {rx_bytes=409134, rx_packets=6199, tx_bytes=8647254, tx_packets=131019} status : {tunnel_egress_iface=vlan161, tunnel_egress_iface_carrier=up} type : geneve _uuid : 869157d3-e7c3-4418-8c8d-5f65f9ca83d5 admin_state : up bfd : {} 
bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : full error : [] external_ids : {} ifindex : 18 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : 10000000000 link_state : up lldp : {} mac : [] mac_in_use : "04:3f:72:b8:bb:5e" mtu : 9000 mtu_request : [] name : mx-bond ofport : 1 ofport_request : [] options : {} other_config : {} statistics : {collisions=0, rx_bytes=80336928, rx_crc_err=0, rx_dropped=4497, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=1185360, tx_bytes=21290897, tx_dropped=0, tx_errors=0, tx_packets=311882} status : {driver_name=bonding, driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="2"} type : "" _uuid : 012981dc-10fd-4d48-ae20-24c22c7ba9e2 admin_state : down bfd : {} bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : [] error : [] external_ids : {} ifindex : 67 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : [] link_state : down lldp : {} mac : [] mac_in_use : "ea:39:db:d2:7b:00" mtu : 1500 mtu_request : [] name : br-int ofport : 65534 ofport_request : [] options : {} other_config : {} statistics : {collisions=0, rx_bytes=0, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=0, tx_bytes=0, tx_dropped=0, tx_errors=0, tx_packets=0} status : {driver_name=openvswitch} type : internal _uuid : 3526b34c-a22a-4c93-89ba-e5c4f09002e3 admin_state : up bfd : {} bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : [] error : [] external_ids : {} ifindex : 68 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed 
: [] link_state : up lldp : {} mac : [] mac_in_use : "7e:ae:de:54:55:90" mtu : [] mtu_request : [] name : ovn-3ea0c7-0 ofport : 1 ofport_request : [] options : {csum="true", key=flow, remote_ip="10.10.161.129"} other_config : {} statistics : {rx_bytes=5140, rx_packets=68, tx_bytes=4912, tx_packets=94} status : {tunnel_egress_iface=vlan161, tunnel_egress_iface_carrier=up} type : geneve _uuid : fc67aa9c-f166-4bbb-937d-d2b1889e8520 admin_state : down bfd : {} bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : [] error : [] external_ids : {} ifindex : 61 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 2 link_speed : [] link_state : down lldp : {} mac : [] mac_in_use : "04:3f:72:b8:bb:5e" mtu : 9000 mtu_request : [] name : br-link0 ofport : 65534 ofport_request : [] options : {} other_config : {} statistics : {collisions=0, rx_bytes=0, rx_crc_err=0, rx_dropped=15847, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=0, tx_bytes=0, tx_dropped=0, tx_errors=0, tx_packets=0} status : {driver_name=openvswitch} type : internal _uuid : 276bfcb5-1a91-404a-b5aa-7efe4c58926a admin_state : up bfd : {} bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : [] error : [] external_ids : {attached-mac="fa:16:3e:ae:f2:b1", iface-id="760c2c71-7f41-487b-ae42-97c5a8e68dbf", iface-status=active, ovn-installed="true", vm-uuid="fd5d78d2-dce2-4790-b446-f5a0a4e34249"} ifindex : 30 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : [] link_state : up lldp : {} mac : [] mac_in_use : "da:cc:cd:60:8a:e9" mtu : 8942 mtu_request : [] name : enp4s0f0_0 ofport : 29 ofport_request : [] options : {} other_config : {} statistics : {collisions=0, rx_bytes=133508, rx_crc_err=0, rx_dropped=0, 
rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=1679, tx_bytes=69898, tx_dropped=0, tx_errors=0, tx_packets=889} status : {driver_name=mlx5e_rep, driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="16.27.6120 (DEL0000000015)"} type : "" _uuid : 38dd099e-9c69-4910-9add-86636404c528 admin_state : up bfd : {} bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : full error : [] external_ids : {} ifindex : 7 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : 10000000000 link_state : up lldp : {} mac : [] mac_in_use : "3c:fd:fe:33:a5:c2" mtu : 9000 mtu_request : [] name : enp6s0f1 ofport : 2 ofport_request : [] options : {} other_config : {} statistics : {collisions=0, rx_bytes=43101385, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=718156, tx_bytes=12603, tx_dropped=0, tx_errors=0, tx_packets=106} status : {driver_name=i40e, driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="5.40 0x80002d36 18.0.17"} type : "" _uuid : f34e95c6-4d67-4a91-ae19-e4945249fd24 admin_state : up bfd : {} bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : [] error : [] external_ids : {attached-mac="fa:16:3e:35:1c:81", iface-id="21fe1f0f-c5c4-4fec-96b6-f75f0d6d84ca", iface-status=active, ovn-installed="true", vm-uuid="c0d00331-9ee5-4242-8494-f6ae60442d16"} ifindex : 38 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : [] link_state : up lldp : {} mac : [] mac_in_use : "0a:19:98:0d:ff:c6" mtu : 8942 mtu_request : [] name : enp4s0f0_8 ofport : 27 ofport_request : [] options : {} other_config : {} statistics : {collisions=0, rx_bytes=144624, rx_crc_err=0, rx_dropped=0, rx_errors=0, 
rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=1871, tx_bytes=80485, tx_dropped=0, tx_errors=0, tx_packets=1028} status : {driver_name=mlx5e_rep, driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="16.27.6120 (DEL0000000015)"} type : "" _uuid : 77fa407e-1b70-4f9f-8b7c-a0fbe207a433 admin_state : up bfd : {enable="true"} bfd_status : {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : [] error : [] external_ids : {} ifindex : 68 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : [] link_state : up lldp : {} mac : [] mac_in_use : "02:2e:4b:89:e5:50" mtu : [] mtu_request : [] name : ovn-f5aa6c-0 ofport : 2 ofport_request : [] options : {csum="true", key=flow, remote_ip="10.10.161.127"} other_config : {} statistics : {rx_bytes=499044, rx_packets=6693, tx_bytes=8728759, tx_packets=131496} status : {tunnel_egress_iface=vlan161, tunnel_egress_iface_carrier=up} type : geneve _uuid : 32e14c57-950b-431e-94ba-03be0cd3e80b admin_state : up bfd : {} bfd_status : {} cfm_fault : [] cfm_fault_status : [] cfm_flap_count : [] cfm_health : [] cfm_mpid : [] cfm_remote_mpids : [] cfm_remote_opstate : [] duplex : full error : [] external_ids : {} ifindex : 6 ingress_policing_burst: 0 ingress_policing_rate: 0 lacp_current : [] link_resets : 0 link_speed : 10000000000 link_state : up lldp : {} mac : [] mac_in_use : "3c:fd:fe:33:a5:c0" mtu : 9000 mtu_request : [] name : enp6s0f0 ofport : 1 ofport_request : [] options : {} other_config : {} statistics : {collisions=0, rx_bytes=43102825, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=718180, tx_bytes=15425, tx_dropped=0, tx_errors=0, tx_packets=147} status : {driver_name=i40e, 
driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="5.40 0x80002d36 18.0.17"} type : "" [root@computeovshwoffload-0 ~]#
Can you make sure that the system has the tc utilities installed? For example, the sosreport doesn't contain any of the 'tc' commands I would expect ('tc -s filter show {devname} ingress', etc.); maybe you can capture them.

I'm not sure about that errno, ENOENT. Usually I think it implies a generic error installing the flow along the hw datapath. I would expect that if too many flows got offloaded we would see ENOSPC, and if the flow wasn't supported we would see something like EOPNOTSUPP or similar.

Maybe mleitner can see something I don't.
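To help with the capture request above, here is a hedged sketch (the device names are examples taken from this report; the exact command set is an assumption of what such a capture might run) that builds the tc commands to execute on the compute node:

```python
# Hypothetical helper: build the `tc` capture commands requested in the
# comment above for a set of representor/uplink devices. Run the printed
# commands on the compute node and attach the output to the bug.

def tc_capture_commands(devices):
    cmds = []
    for dev in devices:
        # qdisc view with statistics for the device
        cmds.append(f"tc -s qdisc show dev {dev}")
        # ingress filters with statistics, as asked for in the comment
        cmds.append(f"tc -s filter show dev {dev} ingress")
    return cmds

for cmd in tc_capture_commands(["enp4s0f0_0", "enp4s0f0_8", "mx-bond"]):
    print(cmd)
```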
(In reply to Aaron Conole from comment #17)
> I'm not sure about that errno - ENOENT - usually I think it implies a
> generic error installing the flow along the hw datapath. I would expect
> if too many flows got offloaded, we would see ENOSPC, and if the flow
> wasn't supported we would see something like EOPNOTSUPP or similar.
>
> Maybe mleitner can see something I don't.

While we don't have https://bugzilla.redhat.com/show_bug.cgi?id=1916418, we can use a perf probe on https://github.com/torvalds/linux/commit/7e3ce05e7f650371061d0b9eec1e1cf74ed6fca0 to find exactly where and why this error was returned.

Btw, it is interesting how the 1st packet gets through and then the others don't. That pretty much means the upcall handles it, updates the datapath, and then things get broken somehow. But if the filter failed to be added in tc, it should have been added in dp:ovs. Weird.
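The perf-probe approach suggested above can be scripted. This is a sketch only: the traced function name below is a placeholder, not taken from the referenced commit; substitute the real candidate function before running on the compute node.

```python
# Sketch of the perf-probe debugging step suggested above. The traced
# function name is a PLACEHOLDER; pick the real candidate from the commit
# referenced in the comment before running this on the compute node.

def perf_probe_commands(func: str) -> list:
    return [
        # add a return probe that records the function's return value
        f"perf probe -a '{func}%return $retval'",
        # record probe hits system-wide for 30 seconds, then print them
        f"perf record -e probe:{func}__return -aR sleep 30",
        "perf script",
        # clean up the probe afterwards
        f"perf probe -d probe:{func}__return",
    ]

for cmd in perf_probe_commands("FIXME_candidate_function"):
    print(cmd)
```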
Bug not reproduced with the following puddle.

Same test is failing with this puddle:
RHOS-16.2-RHEL-8-20210811.n.1

Checking if the issue persists; will update.
(In reply to Yariv from comment #23)
> Bug not reproduced with the following puddle.
>
> Same test is failing with this puddle:
> RHOS-16.2-RHEL-8-20210811.n.1
>
> Checking if the issue persists; will update.

The problem still persists with RHOS-16.2-RHEL-8-20210811.n.1:

(overcloud) [stack@undercloud-0 ~]$ openstack server list --all --host computeovshwoffload-0.redhat.local
+--------------------------------------+------------------------------------------+--------+---------------------------------------------------------------------------------------------+---------------------------------------+--------+
| ID                                   | Name                                     | Status | Networks                                                                                    | Image                                 | Flavor |
+--------------------------------------+------------------------------------------+--------+---------------------------------------------------------------------------------------------+---------------------------------------+--------+
| 02f9e32d-7078-4f7b-869f-90b75f70dc56 | tempest-TestNfvOffload-server-1658772737 | ACTIVE | mellanox-geneve-provider=20.20.220.192, 10.35.141.167; mellanox-vlan-provider=30.30.220.182 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
| 8935550e-35d2-4177-a845-641e2a305c6e | tempest-TestNfvOffload-server-530477859  | ACTIVE | mellanox-geneve-provider=20.20.220.122, 10.35.141.172; mellanox-vlan-provider=30.30.220.125 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
+--------------------------------------+------------------------------------------+--------+---------------------------------------------------------------------------------------------+---------------------------------------+--------+

(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.167
PING 10.35.141.167 (10.35.141.167) 56(84) bytes of data.
64 bytes from 10.35.141.167: icmp_seq=1 ttl=61 time=17.9 ms
^C
--- 10.35.141.167 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 17.891/17.891/17.891/0.000 ms

(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.172
PING 10.35.141.172 (10.35.141.172) 56(84) bytes of data.

@mleitner, would you like to look at the machines?
(In reply to Yariv from comment #24)
> @mleitner, would you like to look at the machines?

Not really. :-) Haresh can debug OSP better than I do. I'm here if anything, though.
Restoring need-info that I cleared by mistake.
I have been debugging the issue together with Haresh and I think we have found the root cause.

The problem happens when there are 2 VMs on the same compute connected to the same provider network. There is an issue with the flow programming and packets go to the wrong VM, so the ping fails. It is not related to floating IPs: ping fails between 2 IPs in the same provider network. If that IP is used for a floating IP, then the floating IP will fail too. If there is a single VM per compute, there is no issue.

Here I provide an example. I create 4 VMs (2 on each compute). The VMs do not have floating IPs; I will use the console:

(venv) (overcloud) [stack@undercloud-0 ~]$ openstack server list --a
+--------------------------------------+------------------------------------------+--------+------------------------------------------------------------------------------+---------------------------------------+--------+
| ID                                   | Name                                     | Status | Networks                                                                     | Image                                 | Flavor |
+--------------------------------------+------------------------------------------+--------+------------------------------------------------------------------------------+---------------------------------------+--------+
| 048a6bcd-5ada-4127-b95a-9962cf31e80f | tempest-TestNfvOffload-server-1590230603 | ACTIVE | mellanox-geneve-provider=20.20.220.112; mellanox-vlan-provider=30.30.220.147 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
| f71166db-2a49-4d73-aa73-4573b3bf8db5 | tempest-TestNfvOffload-server-741569861  | ACTIVE | mellanox-geneve-provider=20.20.220.106; mellanox-vlan-provider=30.30.220.132 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
| c71d4bdf-4aa7-4b69-8405-6633c5864a7d | tempest-TestNfvOffload-server-922847220  | ACTIVE | mellanox-geneve-provider=20.20.220.196; mellanox-vlan-provider=30.30.220.172 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
| 1ac387cf-31cf-4cf6-97b4-acb3034cca31 | tempest-TestNfvOffload-server-1803927283 | ACTIVE | mellanox-geneve-provider=20.20.220.188; mellanox-vlan-provider=30.30.220.175 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
+--------------------------------------+------------------------------------------+--------+------------------------------------------------------------------------------+---------------------------------------+--------+

These are the ports and MACs:

(venv) (overcloud) [stack@undercloud-0 ~]$ openstack port list | egrep "220.112|220.106|220.196|220.188|220.147|220.132|220.172|220.175"
| 54397464-6866-40dd-98cd-8c3e1f48e018 | tempest-port-smoke-1207250115 | fa:16:3e:40:bf:a6 | ip_address='30.30.220.132', subnet_id='82969e14-51b1-4993-90d2-0269dd0bdf8d' | ACTIVE |
| 5f893749-f4ba-4230-b46a-9a1491b0bc6c | tempest-port-smoke-331751474  | fa:16:3e:05:48:82 | ip_address='20.20.220.106', subnet_id='8aa962a5-354f-4406-8fa1-baa73ff14f2b' | ACTIVE |
| 7052864e-8624-493d-b23c-6b8ecbf54d8f | tempest-port-smoke-1880359353 | fa:16:3e:2d:1c:71 | ip_address='30.30.220.175', subnet_id='82969e14-51b1-4993-90d2-0269dd0bdf8d' | ACTIVE |
| 883f8950-5930-4549-a5e8-c6a8d44cf9c8 | tempest-port-smoke-1206809988 | fa:16:3e:e7:e1:1e | ip_address='20.20.220.112', subnet_id='8aa962a5-354f-4406-8fa1-baa73ff14f2b' | ACTIVE |
| 92c7fd6f-929c-4f2a-b378-6ac69946ffc8 | tempest-port-smoke-441515917  | fa:16:3e:b6:fa:2c | ip_address='20.20.220.188', subnet_id='8aa962a5-354f-4406-8fa1-baa73ff14f2b' | ACTIVE |
| a31551fc-bc18-4c11-971f-9acc6e12d51c | tempest-port-smoke-1291037828 | fa:16:3e:4c:0c:21 | ip_address='30.30.220.147', subnet_id='82969e14-51b1-4993-90d2-0269dd0bdf8d' | ACTIVE |
| ec05e264-e102-4c00-a04c-9873d1c7c9b9 | tempest-port-smoke-612065238  | fa:16:3e:52:ed:2a | ip_address='30.30.220.172', subnet_id='82969e14-51b1-4993-90d2-0269dd0bdf8d' | ACTIVE |
| ffbae10d-2440-4af7-8a29-7da30d537f59 | tempest-port-smoke-1068018430 | fa:16:3e:37:96:8b | ip_address='20.20.220.196', subnet_id='8aa962a5-354f-4406-8fa1-baa73ff14f2b' | ACTIVE |

These are the representor ports used:

hypervisor 192.0.50.18
29: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether 98:03:9b:9c:50:58 brd ff:ff:ff:ff:ff:ff
    vf 8     link/ether fa:16:3e:37:96:8b brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:05:48:82 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
50: enp4s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether 98:03:9b:9c:50:58 brd ff:ff:ff:ff:ff:ff permaddr 98:03:9b:9c:50:59
    vf 8     link/ether fa:16:3e:52:ed:2a brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:40:bf:a6 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off

hypervisor 192.0.50.11
29: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether ec:0d:9a:7d:7d:32 brd ff:ff:ff:ff:ff:ff
    vf 8     link/ether fa:16:3e:b6:fa:2c brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:e7:e1:1e brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
50: enp4s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether ec:0d:9a:7d:7d:32 brd ff:ff:ff:ff:ff:ff permaddr ec:0d:9a:7d:7d:33
    vf 8     link/ether fa:16:3e:2d:1c:71 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:4c:0c:21 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off

Test: ping from 20.20.220.188 to 20.20.220.196 fails. If we run tcpdump on 20.20.220.106 while executing the ping, we can see the packets coming in, but they do not go to 20.20.220.196, so the ping fails.
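Matching a guest MAC to its representor by hand is error-prone. Here is a hedged sketch (the helper is hypothetical; it assumes the `<pf>_<vfnum>` representor naming that the interfaces in this report appear to follow, e.g. enp4s0f0 vf 8 -> enp4s0f0_8) that builds the MAC-to-representor map from `ip link` output:

```python
import re

# Hypothetical parser: map each VF MAC from `ip link show` output to its
# switchdev representor name (<pf>_<vf>), assuming the naming convention
# seen in this report (enp4s0f0 vf 8 -> enp4s0f0_8).

def vf_representors(ip_link_output: str) -> dict:
    reps = {}
    pf = None
    for line in ip_link_output.splitlines():
        m = re.match(r"\d+: (\S+?):", line)
        if m:
            pf = m.group(1)            # new PF section, e.g. "enp4s0f0"
            continue
        m = re.search(r"vf (\d+)\s+link/ether ([0-9a-f:]{17})", line)
        if m and pf:
            reps[m.group(2)] = f"{pf}_{m.group(1)}"
    return reps

sample = """29: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000
    vf 8 link/ether fa:16:3e:37:96:8b brd ff:ff:ff:ff:ff:ff, spoof checking off
    vf 9 link/ether fa:16:3e:05:48:82 brd ff:ff:ff:ff:ff:ff, spoof checking off"""

print(vf_representors(sample))
# {'fa:16:3e:37:96:8b': 'enp4s0f0_8', 'fa:16:3e:05:48:82': 'enp4s0f0_9'}
```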
Below are the flows, and we can see that traffic is being delivered to enp4s0f0_9 instead of enp4s0f0_8.

Flows:

Hypervisor 192.0.50.11

ufid:bb8da3b2-727e-415b-b55c-fb1f19310940, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_8),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:b6:fa:2c,dst=fa:16:3e:37:96:8b),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=20.20.220.192/255.255.255.224,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:22, bytes:3520, used:0.360s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x3,dst=10.10.121.176,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081

Hypervisor 192.0.50.18

ufid:734179ef-b530-42f8-aed6-e41ba813d5e7, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.146,dst=10.10.121.176,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x20004/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:b6:fa:2c,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:68, bytes:6664, used:0.290s, offloaded:yes, dp:tc, actions:enp4s0f0_9

ufid:6ecf939a-0d5c-4c69-a396-5bbc424e5cfb, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.146,dst=10.10.121.176,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x20004/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:b6:fa:2c,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0806),arp(sip=0.0.0.0/0.0.0.0,tip=0.0.0.0/0.0.0.0,op=0/0,sha=00:00:00:00:00:00/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00), packets:0, bytes:0, used:1.310s, offloaded:yes, dp:tc, actions:enp4s0f0_9
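The mismatch in those flow dumps can be spotted mechanically. A rough sketch (not a tool from this report) that pulls the input port and the final output action out of one dpctl/dump-flows line:

```python
import re

def flow_ports(flow_line: str):
    """Return (in_port, actions) parsed from one ovs dpctl/dump-flows line."""
    in_port = re.search(r"in_port\(([^)]+)\)", flow_line)
    actions = re.search(r"actions:(\S+)$", flow_line)
    return (in_port.group(1) if in_port else None,
            actions.group(1) if actions else None)

# One of the offending flows from hypervisor 192.0.50.18 (abbreviated):
flow = ("recirc_id(0),in_port(genev_sys_6081),"
        "eth(src=fa:16:3e:b6:fa:2c),eth_type(0x0800), "
        "packets:68, offloaded:yes, dp:tc, actions:enp4s0f0_9")
print(flow_ports(flow))  # ('genev_sys_6081', 'enp4s0f0_9')
```

Comparing the parsed output port against the MAC-to-representor mapping from `ip link` would flag this flow: dst fa:16:3e:37:96:8b belongs to vf 8 (enp4s0f0_8), yet the action delivers to enp4s0f0_9.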
The issue may be related to this fix, which solves a problem with HW offload of different geneve tunnels that have the same tunnel src/dst IP, id and port but different geneve options:

commit 929a2faddd55290fbb0b73f453b200ed1b2b2947
Author: Dima Chumak <dchumak>
Date:   Thu Feb 11 09:36:33 2021 +0200

    net/mlx5e: Consider geneve_opts for encap contexts

    Current algorithm for encap keys is legacy from initial vxlan
    implementation and doesn't take into account all possible fields of a
    tunnel. For example, for a Geneve tunnel, which may have additional TLV
    options, they are ignored when comparing encap keys and a rule can be
    attached to an incorrect encap entry.

    Fix that by introducing encap_info_equal() operation in struct
    mlx5e_tc_tunnel. Geneve tunnel type uses custom implementation, which
    extends generic algorithm and considers options if they are set.

    Fixes: 7f1a546e3222 ("net/mlx5e: Consider tunnel type for encap contexts")
    Signed-off-by: Dima Chumak <dchumak>
    Reviewed-by: Vlad Buslov <vladbu>
    Signed-off-by: Saeed Mahameed <saeedm>
Thanks Maor.

Hi Amir, is it possible to get a 4.18 kernel patch with the fix? We can try it out and, if it fixes the issue, request a backport.

Thanks
(In reply to Haresh Khandelwal from comment #56)
> Thanks Maor,
> Hi Amir, Is it possible to get 4.18 kernel patch with fix? We can try out
> and if it fixes the issue, can request for backport.
>
> Thanks

Hi, we already have this fix in kernel-4.18.0-324.el8 and above, from the RHEL 8.5 branch. Is that enough for your testing?
Thanks Amir. So, shall I assume that commit 929a2faddd55290fbb0b73f453b200ed1b2b2947 would fix this issue?

RHOSP 16.2.x will ship with RHEL 8.4 throughout its life. The latest compose has kernel version 4.18.0-305.19.1.el8_4. I am not aware of how RHEL picks kernel versions, or whether the next 8.4z kernel would include the fix; if not, we need to backport it. Marcelo, can you help here?

Thanks
8.4.z kernels will always be 4.18.0-305.*.el8_4. With that, yes, we would need to backport the fix to 8.4.z so that RHOSP can have it.

In theory we would need two tests here:
- one with a y-stream/8.5 kernel, to be sure the issue is fixed in y-stream; we don't want regressions for customers updating from 8.4.z to 8.5 or 8.6 later on.
- one with a test kernel on 8.4.z, to be sure that the fix is complete and no dependencies were missed; we don't want to backport something only to find out later "ooops, missed this other commit".

We can skip one of them if there's enough confidence, though. I think the patch is spot on. If Nvidia agrees, we can proceed with just the 2nd test, with a test kernel for 8.4.z.
Hi Marcelo,

(In reply to Marcelo Ricardo Leitner from comment #59)
> 8.4.z kernels will always be 4.18.0-305.*.el8_4. With that, yes, we would
> need to backport the fix to 8.4.z so that RHOSP can have it.

Good, thanks.

> In theory we would need two tests here:
> - one with y-stream/8.5 kernel, to be sure the issue is fixed in y-stream
>   we don't want regressions for customers updating from 8.4.z to 8.5 or 8.6
>   later on.

RHOSP has no plan to ever use RHEL 8.5; RHOSP 17 will be based on RHEL 9.

> - one with a test kernel on 8.4.z, to be sure that the fix is complete and
>   no dependencies were missed
>   we don't want to backport something that later on we find out "ooops,
>   missed this other commit".
>
> We can skip one of them if there's enough confidence, though.
> I think the patch is spot on. If Nvidia agrees, we can proceed with just the
> 2nd test, with a test kernel for 8.4.z.

Yes, this BZ was found in our CI, so it would be easy to validate quickly once we have a fix.

Thanks
(In reply to maord from comment #55)
> The issue can be related to this fix.
> This fix solves an issue with HW offload of different geneve tunnel with the
> same tunnel src/dst ip, id and port but different geneve options.
>
> commit 929a2faddd55290fbb0b73f453b200ed1b2b2947
> Author: Dima Chumak <dchumak>
> Date: Thu Feb 11 09:36:33 2021 +0200
>
> net/mlx5e: Consider geneve_opts for encap contexts

For the record, this fix was originally backported via https://bugzilla.redhat.com/show_bug.cgi?id=1915308, and that's where the 8.4.z backport will need to be requested, once the test confirms it.
@atzin any news on the test NVIDIA build?
(In reply to Karrar Fida from comment #63)
> @atzin any news on the test NVIDIA build?

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=40713341
Test kernel of RHEL-8.4 with 929a2faddd55 ("net/mlx5e: Consider geneve_opts for encap contexts").
We think this is a test-only bug for the NFV team. Please switch it to our DFG if you think we are wrong!
Hi,

I have tested with that patch and the problem is not solved.

I have installed the patch:

(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin.50.20 "uname -a"
Warning: Permanently added '192.0.50.20' (ECDSA) to the list of known hosts.
Linux computeovshwoffload-0 4.18.0-305.26.1.el8_4.UNSUPPORTED_1966157.x86_64 #1 SMP Sun Oct 31 06:28:11 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin.50.22 "uname -a"
Warning: Permanently added '192.0.50.22' (ECDSA) to the list of known hosts.
Linux computeovshwoffload-1 4.18.0-305.26.1.el8_4.UNSUPPORTED_1966157.x86_64 #1 SMP Sun Oct 31 06:28:11 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

I have checked that I still have problems with the ping:

(overcloud) [stack@undercloud-0 ~]$ ping -c 1 -w 1 10.35.141.53;sleep 12; ping -c 1 -w 1 10.35.141.53
PING 10.35.141.53 (10.35.141.53) 56(84) bytes of data.

--- 10.35.141.53 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

PING 10.35.141.53 (10.35.141.53) 56(84) bytes of data.
64 bytes from 10.35.141.53: icmp_seq=1 ttl=61 time=10.0 ms

--- 10.35.141.53 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 10.013/10.013/10.013/0.000 ms
Hi Miguel,

If that test had been positive we could have skipped the next one, but since it wasn't: please try the latest 8.5 kernel as well. Maybe something went sour in the backport to 8.4.z, such as a missed patch dependency. You can download it from here:
http://download.eng.bos.redhat.com/brewroot/packages/kernel/4.18.0/348.4.el8/

Thanks.
Hi,

I tried with the previous patch and it does not solve the problem either.

Apart from installing the rpms and rebooting the computes, should I do anything else to ensure that the patch is installed properly? With uname -a I can see that I have the correct kernel version; is that enough?

(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin.50.20 "uname -a"
Warning: Permanently added '192.0.50.20' (ECDSA) to the list of known hosts.
Linux computeovshwoffload-0 4.18.0-348.4.el8.x86_64 #1 SMP Mon Oct 25 15:08:07 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin.50.22 "uname -a"
Warning: Permanently added '192.0.50.22' (ECDSA) to the list of known hosts.
Linux computeovshwoffload-1 4.18.0-348.4.el8.x86_64 #1 SMP Mon Oct 25 15:08:07 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

(undercloud) [stack@undercloud-0 ~]$ ping -c 1 -w 1 10.35.141.51;sleep 12; ping -c 1 -w 1 10.35.141.51
PING 10.35.141.51 (10.35.141.51) 56(84) bytes of data.

--- 10.35.141.51 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

PING 10.35.141.51 (10.35.141.51) 56(84) bytes of data.
64 bytes from 10.35.141.51: icmp_seq=1 ttl=61 time=2.48 ms

--- 10.35.141.51 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.475/2.475/2.475/0.000 ms
(In reply to Miguel Angel Nieto from comment #70) > I tried with previous patch and it is not solving the problem either. Thanks. That's a very important piece of information. > > Apart from installing rpms and rebooting computes, should I do anything else > to ensure the that patch is installed properly? With uname -a I can see > that I have the correct kernel version, is it enough? It is enough, yes.
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=40903423
Test kernel of RHEL-8.4 with 929a2faddd55 ("net/mlx5e: Consider geneve_opts for encap contexts").

I think the build from comment 64 did not actually contain the fix, due to my mistake.
Hi Folks,

The fix that Maor suggested above actually solves a problem on the encap side, where different encap headers that have different geneve options were not being respected. Therefore the effects of this bug may be observed only on the receiver side, which does the matching and classification on the geneve options.

So a few questions:
1. Did you install and test the fix also on the client/traffic-sender side?
2. If what I mentioned is true, you should see this behavior also with HW offload turned off (as well as see the same geneve options for all traffic in tcpdump). Can you please confirm that you verified it works without HW offload?
3. We need to understand whether this issue is already resolved upstream or is something we should reproduce and debug properly in house. Can you please confirm for us?

Thanks,
Ariel
Hi,

To answer the questions:

1. Yes, I patched the overcloud image during deployment with the new kernel, so the kernel update was everywhere. The issue happens on every compute node that has 2 or more VMs attached to the same geneve network.
2. When I tested the kernel patch I only tested with offload enabled, but from previous tests I can confirm that the issue only happens with HW offload. There is no issue if offload is disabled.
3. I will try to get more information about this point.

Regards,
Miguel
(In reply to Miguel Angel Nieto from comment #74)
> 3. I will try to get more information about this point

You can use an ARK kernel for that, btw. It's the kernel-* packages at https://odcs.fedoraproject.org/composes/production/latest-Fedora-ELN/compose/BaseOS/x86_64/os/Packages/
They should be fresh enough for this test, and they install nicely on RHEL 8.
Thanks for the repo. I tried it today; I didn't have any issue updating the kernel packages, but the servers are not booting properly after the kernel update. Some services are not working properly and ssh is broken. I will need some more time to see what is happening.
Hi Folks, any update on the upstream testing here?
Sorry for the delay. I will test it between Thursday and Friday.
Hi,

I tested with the upstream kernel and did not hit the issue; I think it is working properly.

Linux computeovshwoffload-0 5.15.0-60.eln113.x86_64 #1 SMP Mon Nov 1 16:50:20 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

I pinged both VMs at the same time and got the flows below.

VMS:
(overcloud) [stack@undercloud-0 ~]$ openstack server list --a | egrep "141.58|141.55"
| e378517a-66de-4363-882c-6a7b11035f24 | tempest-TestNfvOffload-server-2033717592 | ACTIVE | mellanox-geneve-provider=20.20.220.180, 10.35.141.58; mellanox-vlan-provider=30.30.220.116 | rhel-guest-image-7-6-210-x86-64-qcow2 | |
| d075c648-197f-460a-a88f-5f19933447e3 | tempest-TestNfvOffload-server-1328589616 | ACTIVE | mellanox-geneve-provider=20.20.220.171, 10.35.141.55; mellanox-vlan-provider=30.30.220.162 | rhel-guest-image-7-6-210-x86-64-qcow2 | |

PORTS:
(overcloud) [stack@undercloud-0 ~]$ openstack port list | egrep "180|171"
| 89979eb3-b59d-4c6b-b81c-5264aecd60c8 | tempest-port-smoke-988384003 | fa:16:3e:03:8d:eb | ip_address='20.20.220.180', subnet_id='b25f99f4-5441-4df1-ab99-b3d1c5885042' | ACTIVE |
| 89d9df81-3bea-412d-a71b-663df1ed0ce7 | tempest-port-smoke-297925914 | fa:16:3e:de:26:4c | ip_address='20.20.220.171', subnet_id='b25f99f4-5441-4df1-ab99-b3d1c5885042' | ACTIVE |

VFS:
11: enp4s0f0: <BROADCAST,MULTICAST,PROMISC,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether 98:03:9b:9c:50:58 brd ff:ff:ff:ff:ff:ff
    vf 2     link/ether fa:16:3e:03:8d:eb brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:de:26:4c brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off

FLOWS:
ufid:951f5f75-93a2-482b-a090-9d28a84cf28e, 
skb_priority(0/0),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_9),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:de:26:4c,dst=fa:16:3e:96:d9:20),eth_type(0x0800),ipv4(src=20.20.220.128/255.255.255.192,dst=10.35.0.0/255.255.128.0,proto=1,tos=0/0x3,ttl=64,frag=no),icmp(type=0/0,code=0/0), packets:372, bytes:59520, used:0.850s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=10.10.121.103,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(csum|key))),set(eth(src=fa:16:3e:74:20:44,dst=9c:cc:83:58:1c:60)),set(ipv4(ttl=63)),genev_sys_6081 ufid:0951ac83-ff60-4464-a216-5f52fde3307f, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_9),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:de:26:4c,dst=fa:16:3e:96:d9:20),eth_type(0x0800),ipv4(src=20.20.220.128/255.255.255.192,dst=32.0.0.0/224.0.0.0,proto=17,tos=0/0x3,ttl=64,frag=no),udp(src=0/0,dst=0/0x800), packets:1, bytes:152, used:3.070s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=10.10.121.103,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(csum|key))),set(eth(src=fa:16:3e:74:20:44,dst=9c:cc:83:58:1c:60)),set(ipv4(ttl=63)),genev_sys_6081 ufid:aa2a6075-74e8-4022-b3d6-9d3192cc880d, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_2),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:03:8d:eb,dst=fa:16:3e:96:d9:20),eth_type(0x0800),ipv4(src=20.20.220.128/255.255.255.192,dst=10.35.0.0/255.255.128.0,proto=1,tos=0/0x3,ttl=64,frag=no),icmp(type=0/0,code=0/0), packets:379, bytes:60640, used:0.850s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=10.10.121.103,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(csum|key))),set(eth(src=fa:16:3e:74:20:44,dst=9c:cc:83:58:1c:60)),set(ipv4(ttl=63)),genev_sys_6081 
ufid:28fb9f44-4f87-4fe8-b2e6-6ee76847673a, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.103,dst=10.10.121.169,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x60005/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:96:d9:20,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=1,tos=0/0,ttl=0/0,frag=no),icmp(type=0/0,code=0/0), packets:379, bytes:37142, used:0.850s, offloaded:yes, dp:tc, actions:enp4s0f0_2 ufid:8bbf6429-398f-46d3-b719-5fbbf506d539, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.103,dst=10.10.121.169,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x60003/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:96:d9:20,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=1,tos=0/0,ttl=0/0,frag=no),icmp(type=0/0,code=0/0), packets:372, bytes:36456, used:0.850s, offloaded:yes, dp:tc, actions:enp4s0f0_9 ufid:d7a56785-6997-45ed-b162-ffe59dab9364, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.103,dst=10.10.121.169,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x60003/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:96:d9:20,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=17,tos=0/0,ttl=0/0,frag=no),udp(src=0/0,dst=32768/0x8000), packets:3, bytes:270, used:3.070s, offloaded:yes, dp:tc, actions:enp4s0f0_9 ufid:4e9f7708-d134-48fd-9eb0-ad9f646afb14, 
recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(enp6s0f0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=d0:07:ca:34:e9:17,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=124,pcp=0/0x0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no)), packets:0, bytes:0, used:never, dp:ovs, actions:userspace(pid=4183418373,slow_path(match)) ufid:e3a5a9ff-3f1d-4a5d-ac75-0a006da3b30e, recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x0,src=10.10.121.172,dst=10.10.121.169,ttl=0/0,flags(-df+csum+key)),in_port(genev_sys_6081),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=17,tos=0/0,ttl=0/0,frag=no),udp(src=0/0,dst=3784), packets:931, bytes:61446, used:0.923s, dp:ovs, actions:userspace(pid=3978135798,slow_path(bfd)) ufid:b652e835-b598-4b30-8be1-379a4b94df21, recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x0,src=10.10.121.131,dst=10.10.121.169,ttl=0/0,flags(-df+csum+key)),in_port(genev_sys_6081),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=17,tos=0/0,ttl=0/0,frag=no),udp(src=0/0,dst=3784), packets:931, bytes:61446, used:0.359s, dp:ovs, actions:userspace(pid=3978135798,slow_path(bfd)) ufid:6d0b12db-380d-4f64-81e1-bc1d6614025e, 
recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x0,src=10.10.121.103,dst=10.10.121.169,ttl=0/0,flags(-df+csum+key)),in_port(genev_sys_6081),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=17,tos=0/0,ttl=0/0,frag=no),udp(src=0/0,dst=3784), packets:938, bytes:61908, used:0.207s, dp:ovs, actions:userspace(pid=3978135798,slow_path(bfd)) ufid:3f0eb711-5f01-46ca-91ed-bf5cd2a0cb80, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(enp4s0f0_9),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=fa:16:3e:de:26:4c,dst=fa:16:3e:96:d9:20),eth_type(0x0806),arp(sip=20.20.220.171,tip=20.20.220.254,op=1/0xff,sha=fa:16:3e:de:26:4c,tha=00:00:00:00:00:00), packets:0, bytes:0, used:never, dp:ovs, actions:userspace(pid=2567040171,slow_path(action))
So, do we have any idea between which versions/commits we should bisect?
Marcelo, do you have any idea for comment #81?
According to comment #70, it DID NOT work with the 8.5 kernel 4.18.0-348.4.el8. That kernel has the driver rebased to v5.12 as per https://bugzilla.redhat.com/show_bug.cgi?id=1915308.

It also has tc rebased to "latest upstream" (fuzzy) by https://bugzilla.redhat.com/show_bug.cgi?id=1946986, which seems to be v5.13.

I don't see any net/openvswitch changes between 8.5 and current net-next, 89f971182417cb27abd82cfc48a7f36b99352ddc.

Comment #80 says it worked with v5.15.

With that, I'm thinking the haystack we're looking for this needle in is v5.12..v5.15.

And then, while checking the driver diff between 8.5 and 89f971182417cb27abd82cfc48a7f36b99352ddc, I noticed this commit:

$ git show 3442e0335e70f348728c17bca924ec507ad6358a
commit 3442e0335e70f348728c17bca924ec507ad6358a
Author: Yevgeny Kliteynik <kliteyn>
Date:   Sun Feb 7 04:27:48 2021 +0200

    net/mlx5: DR, Add support for matching on geneve TLV option

    Enable matching on tunnel geneve TLV option using the flex parser.

Well, that's precisely what is being done here. The commit has:

@@ -360,10 +365,14 @@ static int dr_matcher_set_ste_builders(struct mlx5dr_matcher *matcher,
 	if (dr_mask_is_tnl_vxlan_gpe(&mask, dmn))
 		mlx5dr_ste_build_tnl_vxlan_gpe(ste_ctx, &sb[idx++],
 					       &mask, inner, rx);
-	else if (dr_mask_is_tnl_geneve(&mask, dmn))
+	else if (dr_mask_is_tnl_geneve(&mask, dmn)) {
 		mlx5dr_ste_build_tnl_geneve(ste_ctx, &sb[idx++],
 					    &mask, inner, rx);
-
+		if (dr_mask_is_tnl_geneve_tlv_opt(&mask.misc3))
+			mlx5dr_ste_build_tnl_geneve_tlv_opt(ste_ctx, &sb[idx++],
+							    &mask, &dmn->info.caps,
+							    inner, rx);
+	}
 	if (DR_MASK_IS_ETH_L4_MISC_SET(mask.misc3, outer))
 		mlx5dr_ste_build_eth_l4_misc(ste_ctx, &sb[idx++],
 					     &mask, inner, rx);

This is too deep in the driver for me right now, but apparently up to this commit that part of the information was being ignored. This commit is on DR (direct routing, AKA software steering), which OSP is using.
Checking the patchset that introduced this commit, the cover letter mentions:
https://lore.kernel.org/netdev/20210420032018.58639-1-saeed%40kernel.org/T/

"""
3) Dynamic Flex parser support:
   Flex parser is a HW parser that can support protocols that are not
   natively supported by the HCA, such as Geneve (TLV options) and GTP-U.
   There are 8 such parsers, and each of them can be assigned to parse a
   specific set of protocols.

4) Enable matching on Geneve TLV options
"""

With the wording of #4, apparently that's the case.

The patch that we attempted earlier, 929a2faddd55 ("net/mlx5e: Consider geneve_opts for encap contexts"), is, AFAICT now, meant for tx, right? While 3442e0335e70 ("net/mlx5: DR, Add support for matching on geneve TLV option") is on the rx side, and we would need both. Note that the test here is failing by delivering the packets from the wire to the wrong VF, which would be 'rx' in my wording here.

Depending on Nvidia's review now, perhaps we can narrow that v5.12..v5.15 range down further.
Ariel, thoughts? Any other tests that we can do?
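To make the rx-side theory concrete: in the flow dump from comment 80, the two working RX rules are identical in every tunnel field (tun_id=0x3, src=10.10.121.103, dst=10.10.121.169, tp_dst=6081) and differ only in the geneve TLV option value, matched under mask 0x7fffffff. A tiny sketch of that comparison (values taken from the dump; this just illustrates the classification logic, not the driver code):

```shell
#!/bin/sh
# The only field distinguishing the two RX flows in comment 80 is the
# geneve TLV option value, matched under mask 0x7fffffff.
mask=$((0x7fffffff))
opt_a=$((0x60005))   # flow steered to enp4s0f0_2
opt_b=$((0x60003))   # flow steered to enp4s0f0_9

if [ $((opt_a & mask)) -ne $((opt_b & mask)) ]; then
    # With TLV-option matching (what 3442e0335e70 adds) the two keys differ,
    # so each VM's traffic reaches its own VF.
    echo "options distinguish the flows"
else
    # Without it, every other tunnel field is identical, so the HW collapses
    # both VMs' traffic onto whichever rule was offloaded first.
    echo "tunnel keys collide"
fi
```

This matches the observed failure mode: when the hardware ignores the option, the remaining tunnel fields are equal and one VM's FIP goes dark.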
(In reply to Marcelo Ricardo Leitner from comment #83) > $ git show 3442e0335e70f348728c17bca924ec507ad6358a > commit 3442e0335e70f348728c17bca924ec507ad6358a > Author: Yevgeny Kliteynik <kliteyn> > Date: Sun Feb 7 04:27:48 2021 +0200 > > net/mlx5: DR, Add support for matching on geneve TLV option > > Enable matching on tunnel geneve TLV option using the flex parser. This commit is probably slated for 8.6, via https://bugzilla.redhat.com/show_bug.cgi?id=1982191 . But Alaa/Amir will know better.
Not to be backported to 8.4; it will therefore be release-noted.
(In reply to Marcelo Ricardo Leitner from comment #83)
> [...]
> Depending on Nvidia's review now, perhaps we can narrow down that
> v5.12~v5.15 further.
> Ariel, thoughts? Any other test that we can do?

I think we have a bingo. Nice catch, Marcelo. This is indeed affecting matching on geneve headers on the RX path.

Looks like you have a valid test for the RX fix already. To validate the TX side we need to try sending traffic with different geneve options (but the same tunnel IPs) from the same host and see that the different flows indeed carry different options.
Thanks Ariel. With that, Amir, can we have 8.4.z test kernel with this fix/series also please? Thanks.
(In reply to Marcelo Ricardo Leitner from comment #87)
> Thanks Ariel.
> With that, Amir, can we have 8.4.z test kernel with this fix/series also
> please? Thanks.

Marcelo, we need to confirm that the repro you have is with SW steering, because the patch you pointed out is relevant only to that mode.

Ariel
Right. I thought I had asked folks that already, but if I did, I don't know where. :-} Yariv, Haresh, can you please confirm Ariel's question on comment #88? Thanks, Marcelo
Hi Ariel, Marcelo, Steering mode in OSP (16.1.3 onwards) is smfs. Thanks
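For reference, the active mode can be read back with the mlx5 devlink parameter `flow_steering_mode`. The sketch below parses a sample output (hypothetical values matching this deployment, including the example PCI address), since the live `devlink dev param show` query needs the actual NIC:

```shell
#!/bin/sh
# Hypothetical sample output of:
#   devlink dev param show pci/0000:04:00.0 name flow_steering_mode
# (PCI address is an example; run the real query on the compute node.)
sample='pci/0000:04:00.0:
  name flow_steering_mode type driver-specific
    values:
      cmode runtime value smfs'

# Extract the active mode; OSP 16.1.3+ is expected to report smfs.
mode=$(printf '%s\n' "$sample" | awk '/cmode/ {print $NF}')
echo "flow_steering_mode: $mode"
```

Running this against the sample prints `flow_steering_mode: smfs`, which is the value relevant to the DR patch discussed above.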
And the test was using 16.2, OK. Thanks Haresh. Ariel, should we assume that switching to smfs is unlikely to fail? If it did fail, the host would be using dmfs, but in a normal run I have never seen the switch to smfs fail. Asking because AFAIK OSP ignores the failure and continues with dmfs.
As long as it is set while in legacy mode (not switchdev) it is not likely to fail.
With all the above, my understanding is that we can safely assume the host was using smfs at that moment. Please speak if you (anyone) don't agree. :-)
Hi Miguel, per comment #48, this issue is only related to ml2/ovn (and thus geneve). Can you please update the bug summary and remove "ovs"? Thanks
Considering patch from comment #86 is already present in 9.0 beta and we agreed today to not backport this to rhel8 unless requested by a customer (well, or via a general driver update), we're good from the RHEL side on this issue.
Haresh, considering the above, what should we do with this BZ then?
*** Bug 2014183 has been marked as a duplicate of this bug. ***
Raising severity due to the AutomationBlocker keyword
This issue is not happening in 17.1. I have 2 vms in each compute and I can ping all of them (overcloud) [stack@undercloud-0 ~]$ openstack server list --all-projects --long +--------------------------------------+------------------------------------------+--------+------------+-------------+------------------------------------------------------+----------------------------------------------+--------------------------------------+--------------------+-------------------+-----------------------------------+------------+ | ID | Name | Status | Task State | Power State | Networks | Image Name | Image ID | Flavor | Availability Zone | Host | Properties | +--------------------------------------+------------------------------------------+--------+------------+-------------+------------------------------------------------------+----------------------------------------------+--------------------------------------+--------------------+-------------------+-----------------------------------+------------+ | 9c4b9a42-ad31-4e4c-af8b-cdf6f616ef84 | tempest-TestNfvOffload-server-1581528699 | ACTIVE | None | Running | mellanox-geneve-provider=10.46.228.40, 20.20.220.178 | rhel-guest-image-nfv-2-8.7-1660.x86_64.qcow2 | c4d2c2e7-536e-4aa8-b3b7-e8f0f6d0cc90 | nfv_qe_ag_flavor_1 | nova | computehwoffload-r740.localdomain | | | 384e3861-fe40-4b32-a3e5-5df6bba27d08 | tempest-TestNfvOffload-server-123555249 | ACTIVE | None | Running | mellanox-geneve-provider=10.46.228.39, 20.20.220.149 | rhel-guest-image-nfv-2-8.7-1660.x86_64.qcow2 | c4d2c2e7-536e-4aa8-b3b7-e8f0f6d0cc90 | nfv_qe_ag_flavor_0 | nova | computehwoffload-r730.localdomain | | | fdc3a9a1-1271-46db-b6e6-1e69adac4944 | tempest-TestNfvOffload-server-1714201295 | ACTIVE | None | Running | mellanox-geneve-provider=10.46.228.35, 20.20.220.118 | rhel-guest-image-nfv-2-8.7-1660.x86_64.qcow2 | c4d2c2e7-536e-4aa8-b3b7-e8f0f6d0cc90 | nfv_qe_ag_flavor_1 | nova | computehwoffload-r740.localdomain | | | cb50f5de-7a1c-4ead-a7cc-2b5d12cc5f03 | 
tempest-TestNfvOffload-server-514728000 | ACTIVE | None | Running | mellanox-geneve-provider=10.46.228.34, 20.20.220.140 | rhel-guest-image-nfv-2-8.7-1660.x86_64.qcow2 | c4d2c2e7-536e-4aa8-b3b7-e8f0f6d0cc90 | nfv_qe_ag_flavor_0 | nova | computehwoffload-r730.localdomain | |
+--------------------------------------+------------------------------------------+--------+------------+-------------+------------------------------------------------------+----------------------------------------------+--------------------------------------+--------------------+-------------------+-----------------------------------+------------+

(overcloud) [stack@undercloud-0 ~]$ ping -w 1 -c 1 10.46.228.40
PING 10.46.228.40 (10.46.228.40) 56(84) bytes of data.
64 bytes from 10.46.228.40: icmp_seq=1 ttl=61 time=7.34 ms

--- 10.46.228.40 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.343/7.343/7.343/0.000 ms

(overcloud) [stack@undercloud-0 ~]$ ping -w 1 -c 1 10.46.228.39
PING 10.46.228.39 (10.46.228.39) 56(84) bytes of data.
64 bytes from 10.46.228.39: icmp_seq=1 ttl=61 time=7.64 ms

--- 10.46.228.39 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.637/7.637/7.637/0.000 ms

(overcloud) [stack@undercloud-0 ~]$ ping -w 1 -c 1 10.46.228.35
PING 10.46.228.35 (10.46.228.35) 56(84) bytes of data.
64 bytes from 10.46.228.35: icmp_seq=1 ttl=61 time=5.51 ms

--- 10.46.228.35 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.510/5.510/5.510/0.000 ms

(overcloud) [stack@undercloud-0 ~]$ ping -w 1 -c 1 10.46.228.34
PING 10.46.228.34 (10.46.228.34) 56(84) bytes of data.
64 bytes from 10.46.228.34: icmp_seq=1 ttl=61 time=2.19 ms

--- 10.46.228.34 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.189/2.189/2.189/0.000 ms

(overcloud) [stack@undercloud-0 ~]$ cat core_puddle_version
RHOS-17.1-RHEL-9-20230613.n.1