Created attachment 1771853 [details]
python script multicast.py for sending/receiving multicast messages

+++ This bug was initially created as a clone of Bug #1949454 +++

Description of problem:
A VM on an external network subscribed to a multicast group receives 2 multicast packets when only one is sent by a sender VM. The issue is a regression; this did not happen in previous puddles. The problem does not happen with VMs connected to an internal network.

Note: same behavior on OSP16.1 and OSP16.2. The issue started to occur recently on both versions.

Version-Release number of selected component (if applicable):
RHOS-16.1-RHEL-8-20210413.n.0
python3-networking-ovn-7.3.1-1.20210409093428.4e24f4c.el8ost.noarch
ovn2.13-20.12.0-24.el8fdp.x86_64
openvswitch2.13-2.13.0-79.6.el8fdp.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a keypair and a security group allowing ssh, icmp, igmp and all udp.
2. Spawn 2 VMs on the external network; in my case the VMs were running on different compute nodes.
3. Install the attached multicast.py script on both VMs.
4. On both VMs start tcpdump as follows:
   sudo tcpdump -i any -vvneA -s0 -l igmp or port 5001
5. On one VM (the receiver) run the multicast.py script as follows:
   python3 multicast.py -r -g 225.0.0.100 -p 5001
6. On the second VM (the sender) run the script as follows:
   python3 multicast.py -s -g 225.0.0.100 -p 5001 -m qweqwe -c 1

Actual results:
2 copies of the sent multicast message reach the receiver.

Expected results:
1 copy of the sent multicast message reaches the receiver.

Additional info:

Capture on sender:
[cloud-user@vm1 ~]$ sudo tcpdump -i any -vvneA -s0 -l igmp or port 5001
05:43:18.802558 Out fa:16:3e:ad:7e:cf ethertype IPv4 (0x0800), length 50: (tos 0x0, ttl 32, id 51412, offset 0, flags [DF], proto UDP (17), length 34)
    10.0.0.241.53673 > 225.0.0.100.commplex-link: [bad udp cksum 0xec74 -> 0xdffb!] UDP, length 6
E.."..@. ...
......d.......qweqwe

Capture on receiver:
[cloud-user@vm2 ~]$ sudo tcpdump -i any -vvneA -s0 -l igmp or port 5001
05:43:18.781359  M fa:16:3e:ad:7e:cf ethertype IPv4 (0x0800), length 50: (tos 0x0, ttl 32, id 51412, offset 0, flags [DF], proto UDP (17), length 34)
    10.0.0.241.53673 > 225.0.0.100.commplex-link: [udp sum ok] UDP, length 6
E.."..@. ...
......d........qweqwe
05:43:18.781420  M fa:16:3e:ad:7e:cf ethertype IPv4 (0x0800), length 50: (tos 0x0, ttl 32, id 51412, offset 0, flags [DF], proto UDP (17), length 34)
    10.0.0.241.53673 > 225.0.0.100.commplex-link: [udp sum ok] UDP, length 6
E.."..@. ...
......d........qweqwe

When running the same scenario with VMs connected to the internal_A network everything works properly (only one multicast packet reaches the receiver).

Some OVN configs:

[heat-admin@controller-0 ~]$ ovn-nbctl list logical_switch
_uuid               : 219b8041-c1a9-4884-ab9e-ec3b39e29e2b
acls                : []
dns_records         : [28dfcc84-bf94-4f22-9e27-d897ceddfa2f]
external_ids        : {"neutron:mtu"="1500", "neutron:network_name"=nova, "neutron:revision_number"="3"}
forwarding_groups   : []
load_balancer       : []
name                : neutron-d1d14a67-6b57-415e-8503-d82f2b2960c4
other_config        : {mcast_flood_unregistered="false", mcast_snoop="true", vlan-passthru="false"}
ports               : [0e8a9359-29a8-498a-9c9f-7742a7c844eb, 237208ec-f516-4169-ab32-f126fce1b413, 37b15a0b-ead6-4a44-9374-2ffc6509020b, ad93b562-46de-4397-a443-c617ad4b2962, b0b3f6ae-35e2-4c73-a8b0-551ed38dcea0, ca9901bf-1afd-45e8-b54e-9ba033d07d70]
qos_rules           : []

_uuid               : 99f3af51-0f64-4ac9-b00f-e0287183aec8
acls                : []
dns_records         : [74cd5d2c-a1cf-44b8-8a60-e95926ae6aef]
external_ids        : {"neutron:mtu"="1442", "neutron:network_name"=internal_A, "neutron:revision_number"="2"}
forwarding_groups   : []
load_balancer       : []
name                : neutron-b91c4af4-5aa9-4adb-9209-24d1e6383fab
other_config        : {mcast_flood_unregistered="false", mcast_snoop="true", vlan-passthru="false"}
ports               : [172c73ff-9a01-44fe-b3b3-6a9b70907538, 24b7cd0b-ba7b-4475-84c9-20c407462e8a, 7f78ce3c-6dac-4ab8-811f-17d79234937b, b9a27e37-4826-42bf-98c0-0c4c72d23073, c175d285-79fc-4162-a859-f71f3d25f275]
qos_rules           : []

[heat-admin@controller-0 ~]$ ovn-nbctl show
switch 219b8041-c1a9-4884-ab9e-ec3b39e29e2b (neutron-d1d14a67-6b57-415e-8503-d82f2b2960c4) (aka nova)
    port b9576922-fce1-419f-bcf1-6b42809add3c
        addresses: ["fa:16:3e:d2:dd:68 10.0.0.226 2620:52:0:13b8::1000:96"]
    port e05a52ca-af40-48b9-9d37-b062b2acc389
        type: localport
        addresses: ["fa:16:3e:4f:5b:4f 10.0.0.151"]
    port provnet-fc147a22-9144-4988-a7de-c7d8a09c8269
        type: localnet
        addresses: ["unknown"]
    port 24dba14f-7bf6-41d3-acbc-5ddc4e0cb3b9
        addresses: ["fa:16:3e:9c:3a:66 10.0.0.160 2620:52:0:13b8::1000:2f"]
    port c232ab61-cb41-4dfe-8277-891c44980172
        addresses: ["fa:16:3e:ad:7e:cf 10.0.0.241 2620:52:0:13b8::1000:16"]
    port f8184ba3-0fb1-45f4-ac66-f038f954df21
        type: router
        router-port: lrp-f8184ba3-0fb1-45f4-ac66-f038f954df21
switch 99f3af51-0f64-4ac9-b00f-e0287183aec8 (neutron-b91c4af4-5aa9-4adb-9209-24d1e6383fab) (aka internal_A)
    port 7c77b212-56ad-457c-a78a-b5d2f587c868
        type: router
        router-port: lrp-7c77b212-56ad-457c-a78a-b5d2f587c868
    port 2853f87a-bad8-418b-ab3f-53eddcb255a5
        type: localport
        addresses: ["fa:16:3e:79:8f:32 192.168.1.2"]
    port 8586719a-2e17-4cfd-a947-3f26dc4acb0a
        addresses: ["fa:16:3e:c6:5a:ca 192.168.1.227"]
    port ff3f16f2-4c6e-41e9-81d3-0c2677bdd219
        addresses: ["fa:16:3e:7c:76:9b 192.168.1.118"]
    port 0eb12224-0c1a-4fe6-a6f7-93651b532b9d
        addresses: ["fa:16:3e:c3:0c:04 192.168.1.65"]
switch 415c772f-e8ff-4007-9e0b-02e04fda871d (neutron-7dd68d59-5042-4efc-815a-a6a7b73a7fdb) (aka heat_tempestconf_network)
    port 98b4694f-e265-4f16-84e3-d659405708e3
        type: localport
        addresses: ["fa:16:3e:c4:4f:74 192.168.199.2"]
router 475793f4-53f8-45c7-8b34-4105d74730b8 (neutron-a748c7d2-f546-43a7-8828-e20dd51ab3fb) (aka routerA)
    port lrp-7c77b212-56ad-457c-a78a-b5d2f587c868
        mac: "fa:16:3e:34:d4:62"
        networks: ["192.168.1.1/24"]
    port lrp-f8184ba3-0fb1-45f4-ac66-f038f954df21
        mac: "fa:16:3e:34:1d:ab"
        networks: ["10.0.0.200/24", "2620:52:0:13b8::1000:3d/64"]
        gateway chassis: [cb69ec80-d6d9-4697-9f5c-b89cce5b7510 a8537419-c77a-4bb3-9045-fdfa9dd34698 855269d0-8f37-43bc-9952-40ccbf3afcf4]
    nat 01d27a77-9d11-4988-ae56-cad845d3abfe
        external ip: "10.0.0.197"
        logical ip: "192.168.1.65"
        type: "dnat_and_snat"
    nat 2196c8b4-0c15-4617-997d-dc46ddc9100d
        external ip: "10.0.0.246"
        logical ip: "192.168.1.227"
        type: "dnat_and_snat"
    nat 61cc3a13-0a43-44ac-a251-e251329706c3
        external ip: "10.0.0.200"
        logical ip: "192.168.1.0/24"
        type: "snat"
    nat f28e1371-b7f5-46b5-b3ff-9f558a80a25e
        external ip: "10.0.0.214"
        logical ip: "192.168.1.118"
        type: "dnat_and_snat"
OSP16.2 versions of components, puddle RHOS-16.2-RHEL-8-20210409.n.0:
python3-networking-ovn-7.4.1-2.20210323164957.ad92505.el8ost.1.noarch
ovn2.13-20.12.0-24.el8fdp.x86_64
openvswitch2.13-2.13.0-79.6.el8fdp.x86_64
Note: I found that when the VMs are running on the same compute node there are no duplicated packets.
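A quick way to confirm the duplication from a saved receiver capture is to look for repeated IP IDs: both copies in the trace above carry the same IP ID (51412), so one sent datagram delivered twice shows up as a duplicate ID. The helper below is not part of the attached script; it is a small illustrative parser over tcpdump -vv output lines.

```python
import re

# tcpdump -vv prints the IP header summary including "id <n>,"; two
# deliveries of the same datagram reuse the same IP ID.
ID_RE = re.compile(r"\bid (\d+),")

def duplicated_ip_ids(capture_lines):
    """Return IP IDs that appear more than once in tcpdump -vv output."""
    seen = set()
    dups = []
    for line in capture_lines:
        match = ID_RE.search(line)
        if match is None:
            continue
        pkt_id = match.group(1)
        if pkt_id in seen and pkt_id not in dups:
            dups.append(pkt_id)
        seen.add(pkt_id)
    return dups
```

Running this over the receiver capture above reports id 51412 as duplicated, while the sender capture (one packet) reports nothing.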
The fix has been merged and the RPM can be found at https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1651174
Verified on RHOS-16.2-RHEL-8-20210707.n.0 that the issue does not occur.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2021:3483