Bug 1902075 - [OVN][Multicast] Packets flooding when mcast_relay is enabled and there are subscribed clients
Summary: [OVN][Multicast] Packets flooding when mcast_relay is enabled and there are subscribed clients
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.13
Version: FDP 20.D
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Dumitru Ceara
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks: 1901615
 
Reported: 2020-11-26 18:27 UTC by Roman Safronov
Modified: 2023-07-13 07:25 UTC
6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments
capture on one of the subscribed clients after single multicast packet sent (4.02 MB, application/vnd.tcpdump.pcap)
2020-11-26 18:29 UTC, Roman Safronov
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-961 0 None None None 2021-09-24 13:53:18 UTC

Description Roman Safronov 2020-11-26 18:27:43 UTC
Description of problem:
Found during testing OSP16.1 multicast RFE https://bugzilla.redhat.com/show_bug.cgi?id=1575512

When mcast_relay is enabled on an OVN router connecting 2 logical switches, sending a single multicast packet from a host connected to one of the switches causes each of the 3 subscribed hosts to receive 64K packets; see the attached pcap file captured on one of the subscribed hosts.

When there was only 1 subscribed client, it received 4 packets instead of 1.


Version-Release number of selected component (if applicable):
ovn2.13-20.09.0-2.el8fdp.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Configure two logical switches connected through a logical router. (switches: 'nova' and 'internal_A' and router 'routerA' in the output below)
2. Enable IGMP snooping on the switches.
3. Enable IGMP relay on the router.
4. Join a multicast group from a host on switch1 and send a single IP multicast packet from a host on switch2 (a minimal command sketch of steps 1-3 is included below).
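
A rough ovn-nbctl sketch of steps 1-3 (the switch/router/port names here are placeholders; in the OSP deployment the equivalent objects are created by Neutron rather than by hand):

$ ovn-nbctl ls-add sw1
$ ovn-nbctl ls-add sw2
$ ovn-nbctl lr-add lr0
$ ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:01:01 192.168.1.1/24
$ ovn-nbctl lsp-add sw1 sw1-lr0 -- lsp-set-type sw1-lr0 router -- lsp-set-addresses sw1-lr0 router -- lsp-set-options sw1-lr0 router-port=lr0-sw1
$ ovn-nbctl lrp-add lr0 lr0-sw2 00:00:00:00:02:01 192.168.2.1/24
$ ovn-nbctl lsp-add sw2 sw2-lr0 -- lsp-set-type sw2-lr0 router -- lsp-set-addresses sw2-lr0 router -- lsp-set-options sw2-lr0 router-port=lr0-sw2
# step 2: enable IGMP snooping on both switches
$ ovn-nbctl set Logical_Switch sw1 other_config:mcast_snoop=true other_config:mcast_flood_unregistered=false
$ ovn-nbctl set Logical_Switch sw2 other_config:mcast_snoop=true other_config:mcast_flood_unregistered=false
# step 3: enable IGMP relay on the router
$ ovn-nbctl set Logical_Router lr0 options:mcast_relay=true
# after the client joins, the learned groups can be checked in the southbound DB
$ ovn-sbctl list IGMP_Group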

Actual results:
After a single multicast packet was sent, multiple packets reached each subscribed host.

Expected results:
A single packet reaches each subscribed host.

Additional info:
[heat-admin@controller-0 ~]$ ovn-nbctl list logical_switch
_uuid               : ecff5e4c-f4ee-4387-99be-409a3e99f2b2
acls                : []
dns_records         : []
external_ids        : {"neutron:mtu"="1500", "neutron:network_name"=nova, "neutron:revision_number"="5"}
forwarding_groups   : []
load_balancer       : []
name                : neutron-9519fe52-4f56-4cad-8089-fa99449ceb31
other_config        : {mcast_flood_unregistered="false", mcast_snoop="true"}
ports               : [01926fb6-efb4-4575-bd65-e6361aed186c, 1ce2e362-b957-4132-88ad-3167acb0bf66, 3cb7e0e2-bc8c-4a8f-b57d-e743532825c1, 5d44aa2a-af47-41b8-94ad-1a0c20ff1fc5, 65ca1830-e73e-47d7-b60c-b1abb9cb472d, ab522168-2e12-4b90-9716-c2bd754d25c9]
qos_rules           : []

_uuid               : 3e27d5a5-5d2f-4ada-b579-520f905ace96
acls                : []
dns_records         : []
external_ids        : {"neutron:mtu"="1500", "neutron:network_name"=heat_tempestconf_network, "neutron:revision_number"="2"}
forwarding_groups   : []
load_balancer       : []
name                : neutron-9ef6f813-5abd-4f58-af8f-784cab84697b
other_config        : {mcast_flood_unregistered="false", mcast_snoop="true"}
ports               : [7dbb48e0-4de7-4938-b61b-c3c5693e01f9, f2eb35cd-17b7-4f12-b420-c69897671a2d]
qos_rules           : []

_uuid               : 38ec8aed-d91e-49c2-95a5-65cac67cd22f
acls                : []
dns_records         : []
external_ids        : {"neutron:mtu"="1500", "neutron:network_name"=internal_B, "neutron:revision_number"="1"}
forwarding_groups   : []
load_balancer       : []
name                : neutron-65029e63-7ef2-4309-af92-3e51b070f9bd
other_config        : {mcast_flood_unregistered="false", mcast_snoop="true"}
ports               : [519e3480-141f-4bd2-8b75-b3d66a040200, bf9d512b-81e9-4c0c-a131-d1cd5c1528e3]
qos_rules           : []

_uuid               : b824f0bf-9dea-4d41-93b9-b095418ef0eb
acls                : []
dns_records         : []
external_ids        : {"neutron:mtu"="1500", "neutron:network_name"=internal_A, "neutron:revision_number"="2"}
forwarding_groups   : []
load_balancer       : []
name                : neutron-e8dca045-ad5b-414a-8742-a5e093f5e892
other_config        : {mcast_flood_unregistered="false", mcast_snoop="true"}
ports               : [35590923-58a7-49c9-9aa4-7aaa0761e9e7, 41574191-1765-4637-9130-2b94f60e4553, 77814243-7943-45fc-806b-46b56285d094, 83e0ea9e-fd4c-4dfe-94a4-4c57a37fb557, c9245c86-06bc-4821-89f3-28b01f26d584]
qos_rules           : []
[heat-admin@controller-0 ~]$ 
[heat-admin@controller-0 ~]$ ovn-nbctl show
switch ecff5e4c-f4ee-4387-99be-409a3e99f2b2 (neutron-9519fe52-4f56-4cad-8089-fa99449ceb31) (aka nova)
    port provnet-6e8a612e-8990-44b9-8700-eb9940a57b09
        type: localnet
        addresses: ["unknown"]
    port c2095257-3491-45cc-a167-c2c83e3b3cfc
        type: router
        router-port: lrp-c2095257-3491-45cc-a167-c2c83e3b3cfc
    port 22c32e6d-bb68-4a53-888e-f25f0b4bebdf
        addresses: ["fa:16:3e:8c:78:c6 10.0.0.218 2620:52:0:13b8::1000:a0"]
    port 74e1ffbb-4cd8-4f93-9e40-95f3c0c529aa
        type: localport
        addresses: ["fa:16:3e:cc:b7:d3 10.0.0.210"]
    port 9de3025a-d6b1-4033-a82d-245613585926
        addresses: ["fa:16:3e:d8:87:d7 10.0.0.223 2620:52:0:13b8::1000:3c"]
    port 087edd3a-0877-4e46-b89f-6ac4043a51c0
        addresses: ["fa:16:3e:3c:ec:97 10.0.0.219 2620:52:0:13b8::1000:a5"]
switch 3e27d5a5-5d2f-4ada-b579-520f905ace96 (neutron-9ef6f813-5abd-4f58-af8f-784cab84697b) (aka heat_tempestconf_network)
    port provnet-eaf151e1-e819-4354-92e9-3aaa22509a37
        type: localnet
        tag: 1002
        addresses: ["unknown"]
    port 358bde14-7cf8-4bd7-a0a6-25a7c340e547
        type: localport
        addresses: ["fa:16:3e:ec:cd:a1 192.168.199.2"]
switch 38ec8aed-d91e-49c2-95a5-65cac67cd22f (neutron-65029e63-7ef2-4309-af92-3e51b070f9bd) (aka internal_B)
    port provnet-1535b3a4-787a-4c75-87f2-34df8b5c706f
        type: localnet
        tag: 1001
        addresses: ["unknown"]
    port ca2f7def-efeb-41e5-8203-371b8522790a
        type: localport
        addresses: ["fa:16:3e:a2:21:9d"]
switch b824f0bf-9dea-4d41-93b9-b095418ef0eb (neutron-e8dca045-ad5b-414a-8742-a5e093f5e892) (aka internal_A)
    port 93ade5e7-5e7f-40a5-959c-83355e4c2a52
        addresses: ["fa:16:3e:b2:71:b3 192.168.1.66"]
    port 088dc71d-d09b-42e6-bf47-2c3b4cf84113
        type: localport
        addresses: ["fa:16:3e:e3:c4:cb 192.168.1.2"]
    port 4b296eea-06fd-42cc-9d82-3f708aa032e7
        type: router
        router-port: lrp-4b296eea-06fd-42cc-9d82-3f708aa032e7
    port provnet-bfd9ba1b-0d28-43df-925a-217edb1f5202
        type: localnet
        tag: 1000
        addresses: ["unknown"]
    port 614a4041-f904-43d1-bb65-bf4c08c636f4
        addresses: ["fa:16:3e:8a:40:f9 192.168.1.127"]
router db4aba25-5d01-4a3c-9829-b727a9b4352f (neutron-049523e4-d16f-403e-b0a7-6079d05c893c) (aka routerA)
    port lrp-c2095257-3491-45cc-a167-c2c83e3b3cfc
        mac: "fa:16:3e:35:a8:5f"
        networks: ["10.0.0.247/24", "2620:52:0:13b8::1000:35/64"]
        gateway chassis: [6be6253f-152d-4671-8031-87c576375eee f2f58e16-c9d0-4d98-bfbb-01e4af03c8e9 a74bcb3b-8be7-47d8-aedd-a1b8e35b4394]
    port lrp-4b296eea-06fd-42cc-9d82-3f708aa032e7
        mac: "fa:16:3e:ee:8d:69"
        networks: ["192.168.1.1/24"]
    nat 15336509-e231-4f1d-8f55-8acd1cc2a7eb
        external ip: "10.0.0.213"
        logical ip: "192.168.1.66"
        type: "dnat_and_snat"
    nat 22355740-f350-4da8-8ce7-2571e780e13c
        external ip: "10.0.0.225"
        logical ip: "192.168.1.127"
        type: "dnat_and_snat"
    nat dfca1dea-375f-468a-99c9-66d8938e9698
        external ip: "10.0.0.247"
        logical ip: "192.168.1.0/24"
        type: "snat"

Comment 1 Roman Safronov 2020-11-26 18:29:50 UTC
Created attachment 1733871 [details]
capture on one of the subscribed clients after single multicast packet sent

Comment 4 Dumitru Ceara 2020-12-01 12:27:09 UTC
Hi Lucas,

Would it be an option for OpenStack to restrict the IP Multicast Relay flood tree by setting a new option on the logical router, e.g., mcast-relay-chassis=<chassis-name>?

Like this we could try to ensure in OVN that multicast traffic is routed only once, on the chassis matching mcast-relay-chassis.
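
For illustration, such a knob would presumably be set along these lines (the option name is only the proposal above, not something that exists in ovn2.13-20.09, and the router/chassis names are placeholders):

# hypothetical, proposed option -- not implemented
$ ovn-nbctl set Logical_Router <router> options:mcast-relay-chassis=<chassis-name>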

Thanks,
Dumitru

Comment 5 Lucas Alvares Gomes 2020-12-01 13:13:25 UTC
(In reply to Dumitru Ceara from comment #4)
> Hi Lucas,
> 
> Would it be an option for OpenStack to restrict the IP Multicast Relay flood
> tree by setting a new option on the logical router, e.g.,
> mcast-relay-chassis=<chassis-name>?
> 
> Like this we could try to ensure in OVN that multicast traffic is routed
> only once, on the chassis matching mcast-relay-chassis.
> 
> Thanks,
> Dumitru

Hi Dumitru,

From the OpenStack perspective it wouldn't be a problem to set that option. The given chassis would be a gateway chassis (one with the "enable-chassis-as-gw" option set in "ovn-cms-options")?

The only problem I can see with this approach is HA: what if that chassis goes away? Usually it's core OVN that monitors those things via BFD.

Another approach perhaps would be to point to an HA Chassis Group so that the Chassis with the highest priority would be the one doing the routing and, if it goes away, the next Chassis with the highest priority would be picked. Something like mcast-relay-chassis-group=<ha_chassis_group name>. What do you think?
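
For illustration, a sketch of how this variant could be wired up; the ha-chassis-group-add/ha-chassis-group-add-chassis commands exist today, while the mcast-relay-chassis-group option itself is only the proposal above, and all names are placeholders:

# existing commands: create an HA chassis group and add chassis with priorities
$ ovn-nbctl ha-chassis-group-add mcast-relay-group
$ ovn-nbctl ha-chassis-group-add-chassis mcast-relay-group chassis-1 30
$ ovn-nbctl ha-chassis-group-add-chassis mcast-relay-group chassis-2 20
# hypothetical, proposed option -- the highest-priority chassis that is up would do the relaying
$ ovn-nbctl set Logical_Router <router> options:mcast-relay-chassis-group=mcast-relay-group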

Comment 6 Dumitru Ceara 2020-12-01 14:00:31 UTC
(In reply to Lucas Alvares Gomes from comment #5)
[...]
> 
> Another approach perhaps would be to point to an HA Chassis Group so that
> the Chassis with the highest priority would be the one doing the routing
> and, if it goes away, the next Chassis with the highest priority would be
> picked. Something like mcast-relay-chassis-group=<ha_chassis_group name>.
> What do you think?

This is way better indeed.  I hadn't thought about HA.

I'll go ahead and try to implement this.

Thanks,
Dumitru

Comment 8 Mark Michelson 2021-09-24 13:52:07 UTC
Hi Dumitru, according to the latest comment you were going to try to implement the idea proposed by Lucas. Is this something you've already done? Or is this still in progress?

Comment 9 Dumitru Ceara 2021-09-24 14:18:35 UTC
(In reply to Mark Michelson from comment #8)
> Hi Dumitru, according to the latest comment you were going to try to
> implement the idea proposed by Lucas. Is this something you've already done?
> Or is this still in progress?

Hi Mark,

I didn't get a chance to try out the ideas discussed above.

Thanks,
Dumitru

