Bug 1957025 - [ovn] ARP broadcasts and duplicate mac address issues happen due to unexpected openflow rules before ovn-controller is fully up
Keywords:
Status: POST
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.13
Version: FDP 20.I
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Assignee: Numan Siddique
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-05-04 21:25 UTC by ffernand
Modified: 2021-05-13 20:10 UTC (History)
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Description ffernand 2021-05-04 21:25:40 UTC
Description of problem:

Not very long ago, in a cluster near us, a network adventure took place. While exercising an AZ failure, a node that had
OVN router ports scheduled on it was powered down. For the sake of this description, that node is called "networker-1". As expected,
the configured lrp-get-gateway-chassis took effect and the cr-lrp was properly rescheduled to one of the backup nodes: "controller-2".
All was good.

After a while, "networker-1" is powered back up BUT, due to a config issue, the network interface used to access the control
network does not come up, leaving ovn-controller on "networker-1" unable to reach the OVN SB DB:

  [root@networker-1 ~]# tail -F /var/log/containers/openvswitch/ovn-controller.log
  2021-05-04T16:34:35.285Z|00028|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
  2021-05-04T16:34:36.287Z|00029|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connection timed out
  2021-05-04T16:34:36.287Z|00030|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: waiting 2 seconds before reconnect
  2021-05-04T16:34:38.289Z|00031|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...

HOWEVER, the OpenFlow rules configured before the power-down remain in effect on "networker-1", and that causes lots
of trouble in the cluster when it comes to ownership of the distributed router ports. Duplicate MACs and ARP broadcast chaos
ensue.

Later on, the sys-admin hero manages to get the control plane back up on "networker-1", and ovn-controller finally syncs up with
the rest of the cluster. And peace is restored in the network land once again.
And the VMs lived happily ever after... the END... heh... the BZ! :)


Version-Release number of selected component (if applicable):
ovn2.13-20.12.0-97.el8fdp.x86_64
openvswitch2.13-2.13.0-98.el8fdp.x86_64

Not related, but this is a special build that includes hotfixes for
https://bugzilla.redhat.com/1939470 and https://bugzilla.redhat.com/1939469

How reproducible:
100%

Steps to Reproduce:
1. Schedule a router port to a node and power the node down.
2. Power the node back up, but do not allow it to connect to the SB DB (see additional info).
3. Ping the IP of the distributed router port and watch for ARPs for its MAC.

Actual results:
Stale OpenFlow rules for the distributed router ports on the powered-up node conflict head to head with the rules on the node where the router port was rescheduled after the original node went down.

Expected results:
Upon powering up, rules for distributed router ports should not be trusted unless the control plane is connected.


Additional info:
The control connection does not come up because OVS gives up waiting for it to become available. That issue will be tracked in a separate BZ.

May 04 15:09:58 networker-1.osp-002.prod.iad2.dc.redhat.com network[2382]: INFO      : [ipv6_wait_tentative] Waiting for interface vlan315 IPv6 address(es) t>
May 04 15:09:59 networker-1.osp-002.prod.iad2.dc.redhat.com network[2382]: [  OK  ]
May 04 15:09:59 networker-1.osp-002.prod.iad2.dc.redhat.com ovs-vsctl[5799]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-provid>
May 04 15:10:07 networker-1.osp-002.prod.iad2.dc.redhat.com systemd[1]: network.service: Start operation timed out. Terminating.
May 04 15:10:07 networker-1.osp-002.prod.iad2.dc.redhat.com systemd[1]: network.service: Failed with result 'timeout'.
May 04 15:10:07 networker-1.osp-002.prod.iad2.dc.redhat.com systemd[1]: Failed to start LSB: Bring up/down networking.

Comment 1 ffernand 2021-05-04 21:28:40 UTC
()[root@controller-0 /]# ovn-sbctl find port_binding logical_port="cr-lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d"
_uuid               : 42d7a1c3-8f7c-4a03-bedd-bda5b6a9b209
chassis             : cdda0ab3-ec8f-4932-b3db-a6c4c573d06d
datapath            : 46ef0b93-ca0b-488b-99be-34617fdcb9ca
encap               : []
external_ids        : {}
gateway_chassis     : []
ha_chassis_group    : 69af9edb-0e1f-4b1f-a982-abfb2b029bdd
logical_port        : cr-lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d
mac                 : ["fa:16:3e:61:a1:98 10.2.113.250/24 2620:52:4:230d::3d9/64"]
nat_addresses       : []
options             : {distributed-port=lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d}
parent_port         : []
tag                 : []
tunnel_key          : 3
type                : chassisredirect
up                  : true
virtual_parent      : []

2021-05-04T16:38:33.338Z|00034|binding|INFO|cr-lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d: Claiming fa:16:3e:61:a1:98 10.2.113.250/24 2620:52:4:230d::3d9/64
2021-05-04T16:38:33.338Z|00035|binding|INFO|Changing chassis for lport cr-lrp-8732c441-8155-4fe6-bf64-da68e871c211 from 466e1018-1a42-4d16-907a-d1fc62910ea3 to 651b4ea3-f4ea-42b3-bc11-1ec03154aebb.


()[root@controller-0 /]# ovn-nbctl lrp-get-gateway-chassis lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d
lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d_466e1018-1a42-4d16-907a-d1fc62910ea3     5  <-- controller-2
lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d_66758056-d693-4a33-89af-dd15d08b05fa     4
lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d_98cc5a8e-c116-4f82-8c67-b1242888cb09     3
lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d_651b4ea3-f4ea-42b3-bc11-1ec03154aebb     2  <-- networker-1
lrp-d796ca6d-dd92-49c2-9f4b-e94413f2fb8d_4f2021dd-ee45-4677-aa17-854aa0d228ca     1


()[root@controller-0 /]# ovn-sbctl --column _uuid,hostname list chassis 466e1018-1a42-4d16-907a-d1fc62910ea3
_uuid               : cdda0ab3-ec8f-4932-b3db-a6c4c573d06d
hostname            : controller-2.osp-002.prod.iad2.dc.redhat.com

()[root@controller-0 /]# ovn-sbctl --column _uuid,hostname list chassis 651b4ea3-f4ea-42b3-bc11-1ec03154aebb
_uuid               : 45ebb384-c348-425a-b3c4-9369e54e11a1
hostname            : networker-1.osp-002.prod.iad2.dc.redhat.com
()[root@controller-0 /]#

Comment 3 ffernand 2021-05-05 18:27:16 UTC
**Important** follow-up on this issue:

After consulting with one of my OVN gurus, I learned that the OpenFlow rules should not be present on the node after a power cycle.

That was definitely the case, BUT it turned out that the br-int bridge did not have fail-mode set to secure [0]. Because of that,
the default 'NORMAL' action rule was present, and that is what caused the "chaos": default learning-switch bridge operations were
being performed on all of its ports:

  [root@networker-0 openvswitch]# ovs-ofctl dump-flows br-int
   cookie=0x0, duration=1611.902s, table=0, n_packets=857734338, n_bytes=55326381337, priority=0 actions=NORMAL

Further inspection is needed to understand why/how the br-int bridge on some of the nodes was not set to fail-mode secure.

This workaround fixed the problem in the cluster:

[stack@director1 ~]$ for i in `OS_CLOUD=undercloud openstack server list| grep control | awk '{print $8}' | sed 's/ctlplane=//'`; do echo $i ; \
  ssh $i sudo ovs-vsctl set-fail-mode br-int secure ; done
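An idempotent version of that workaround can be sketched as follows. This is just a minimal illustration, not from the BZ: the helper name `ensure_secure_fail_mode` and the `OVS_VSCTL` override (which lets the real `ovs-vsctl` be swapped out) are our own; the underlying `ovs-vsctl get-fail-mode` / `set-fail-mode` subcommands are real.

```shell
# Hypothetical helper: force br-int to fail-mode "secure" only when it is not
# already set.  OVS_VSCTL defaults to the real ovs-vsctl binary but can be
# overridden (e.g. for dry runs).
ensure_secure_fail_mode() {
    local vsctl="${OVS_VSCTL:-ovs-vsctl}"
    local mode
    # get-fail-mode prints an empty line when no fail-mode is configured,
    # which means the bridge falls back to standalone (NORMAL-action) behavior.
    mode="$("$vsctl" get-fail-mode br-int 2>/dev/null)"
    if [ "$mode" != "secure" ]; then
        echo "br-int fail-mode is '${mode:-standalone (unset)}'; setting secure"
        "$vsctl" set-fail-mode br-int secure
    fi
}
```

Run on each chassis (or pushed over ssh as in the loop above), this leaves correctly configured nodes untouched and only rewrites the fail-mode where it has drifted.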




[0]: http://www.openvswitch.org/support/dist-docs/ovs-vsctl.8.txt (see "Controller Failure Settings")

Comment 4 Dan Williams 2021-05-05 20:31:18 UTC
If OVN creates br-int (default) then it sets fail-mode=secure:

static const struct ovsrec_bridge *
create_br_int(struct ovsdb_idl_txn *ovs_idl_txn,
              const struct ovsrec_open_vswitch_table *ovs_table)
{
    <snip>

    struct ovsrec_bridge *bridge;
    bridge = ovsrec_bridge_insert(ovs_idl_txn);
    ovsrec_bridge_set_name(bridge, bridge_name);
>>> ovsrec_bridge_set_fail_mode(bridge, "secure");
    ovsrec_bridge_set_ports(bridge, &port, 1);

but if that bridge already exists when ovn-controller starts, I don't think it will enforce fail-mode=secure on the existing bridge.

Is there any way that br-int would have been created by something (even ovn-controller) and then fail-mode got reset, then when ovn-controller restarts it doesn't touch the bridge?

Comment 5 ffernand 2021-05-06 11:45:11 UTC
(In reply to Dan Williams from comment #4)
> If OVN creates br-int (default) then it sets fail-mode=secure:
> <snip>
> but if that bridge already exists when ovn-controller starts, I don't think
> it will enforce fail-mode=secure on the existing bridge.
> 
> Is there any way that br-int would have been created by something (even
> ovn-controller) and then fail-mode got reset, then when ovn-controller
> restarts it doesn't touch the bridge?

Thanks, Dan. Just as OVN creates br-int with fail-mode secure when needed, I am thinking fail-mode should be
checked and set on an already-existing bridge as well. Is that an okay approach?

static const struct ovsrec_bridge *
process_br_int(struct ovsdb_idl_txn *ovs_idl_txn,
               const struct ovsrec_bridge_table *bridge_table,
               const struct ovsrec_open_vswitch_table *ovs_table)
{
    const struct ovsrec_bridge *br_int = get_br_int(bridge_table,
                                                    ovs_table);
    if (!br_int) {
        br_int = create_br_int(ovs_idl_txn, ovs_table);
    } else {
        /* br-int for OVN must always be used with fail-mode "secure". */
        ovsrec_bridge_set_fail_mode(br_int, "secure");
    }
    ...
    return br_int;
}

Comment 6 Dan Williams 2021-05-06 14:44:39 UTC
(In reply to ffernand from comment #5)
> Thanks, Dan. I still think that just as ovn creates br-int with secure fail
> mode when needed,
> I am thinking that that should be checked and set, regardless. Is that an
> okay approach?

I would agree; if OVN needs fail-mode=secure, then it should probably enforce that on the integration bridge.

Comment 7 Dan Williams 2021-05-06 14:45:36 UTC
(In reply to Dan Williams from comment #6)
> I would agree; if OVN needs fail-mode=secure, then it should probably
> enforce that on the integration bridge.

And perhaps warn if it isn't set, since this should be an exceptional condition that the admin should look into?

Comment 9 ffernand 2021-05-07 18:03:44 UTC
Changes v2 posted upstream at: https://patchwork.ozlabs.org/project/ovn/patch/20210507174947.1879798-1-flavio@flaviof.com/

Comment 11 ffernand 2021-05-13 20:10:11 UTC
Fix is merged upstream.
I'm assigning the BZ to Numan so he can help me create a downstream RPM for OVN 2.13 that includes the fix.

https://github.com/ovn-org/ovn/commit/9cc334bc1a036a93cc1a541513d48f4df6933e9b
https://github.com/ovn-org/ovn/commit/be65a461ce134c1e874b65add402f7c2744f29f5

