Bug 1695122

Summary: [OVN] mac-table-size of provider bridges (br-ex, br-isolated) was cleared after node reboot
Product: Red Hat OpenStack Reporter: Roman Safronov <rsafrono>
Component: os-net-config Assignee: Jakub Libosvar <jlibosva>
Status: CLOSED ERRATA QA Contact: Fiorella Yanac <fyanac>
Severity: high Docs Contact:
Priority: high    
Version: 13.0 (Queens) CC: apevec, bfournie, dsneddon, ekuris, eolivare, hbrock, jlibosva, jslagle, lhh, lmartins, majopela, mburns, scohen, skaplons
Target Milestone: beta Keywords: Triaged
Target Release: 17.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: os-net-config-14.2.1-0.20220626013437.755f151.el9ost Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-09-21 12:07:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Roman Safronov 2019-04-02 14:19:55 UTC
Description of problem:
A fix was introduced in https://bugzilla.redhat.com/show_bug.cgi?id=1592333 for the small MAC table size of OVN bridges, which can cause openvswitch to explode. However, when I tried to check this setting on my system, it was not there. My environment had already been through other tests, including node reboots, openvswitch restarts, restarts of OVN services, ovndb bundle restarts, etc.

Numan investigated the issue and found that the setting was applied but then cleared by some process; see the details below (in the Additional info section).

Version-Release number of selected component (if applicable):
Puddle (OSP13 Z5) 13.0-RHEL-7/2019-03-01.1

puppet-ovn-12.4.0-2.el7ost.noarch
openstack-tripleo-heat-templates-8.2.0-6.el7ost.noarch

openvswitch-2.9.0-97.el7fdp.x86_64
openvswitch-ovn-central-2.9.0-97.el7fdp.x86_64
openvswitch-ovn-common-2.9.0-97.el7fdp.x86_64
openstack-neutron-openvswitch-12.0.5-4.el7ost.noarch
python-openvswitch-2.9.0-97.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
openvswitch-ovn-host-2.9.0-97.el7fdp.x86_64

python-networking-ovn-4.0.3-3.el7ost.noarch


How reproducible:


Steps to Reproduce:
Still not sure what caused the problem, but the environment had been through some tests, including node reboots, openvswitch restarts, restarts of OVN services, ovndb bundle restarts, etc.
Run "ovs-vsctl list bridge" on all OVN nodes where ovn-controller is running

Actual results:
other_config setting for br-ex and br-isolated does not contain mac-table-size="50000"

Expected results:
other_config setting for br-ex and br-isolated contains mac-table-size="50000"

Additional info:

Some information from our discussion with Numan

 {"_date":1552325717046,"Bridge":{"6b916a05-0f93-4bb6-b354-81f5ab39d9cd":{"other_config":["map",[["hwaddr","52:54:00:e3:56:fb"],["mac-table-size","50000"]]]}},"_comment":"ovs-vsctl (invoked by /usr/bin/ruby): ovs-vsctl --timeout=5 set Bridge br-ex other-config:mac-table-size=50000","Open_vSwitch":{"c3849fb0-55e6-446a-9ee2-9a31b42c02fa":{"next_cfg":20}}}
OVSDB JSON 95 b064f820b504f6fd87a6847b3df2aef893e0a0a2
{"_date":1552325717046,"Open_vSwitch":{"c3849fb0-55e6-446a-9ee2-9a31b42c02fa":{"cur_cfg":20}}}
OVSDB JSON 361 60a8f3c8575fe4ee86f415e97dd68ba88785a5fc
{"_date":1552325717082,"Bridge":{"ae50fd2f-cae5-4a87-9397-bf946efb2323":{"other_config":["map",[["hwaddr","52:54:00:bd:1a:d1"],["mac-table-size","50000"]]]}},"_comment":"ovs-vsctl (invoked by /usr/bin/ruby): ovs-vsctl --timeout=5 set Bridge br-isolated other-config:mac-table-size=50000","Open_vSwitch":{"c3849fb0-55e6-446a-9ee2-9a31b42c02fa":{"next_cfg":21}}}


<numans> if you see here, puppet-ovn has invoked the command to set the mac-table-size
something is clearing it later...

<rsafrono> i did many tests on this system before, including killing openvswitch, restarting it, rebooting nodes etc.

<numans> looks like

{"Interface":{"7700da57-df24-4efa-8c39-c66e60ea495f":{"name":"br-ex","type":"internal"}},"Port":{"d481c43e-92fc-4a54-a50e-b207e010a341":{"name":"br-ex","interfaces":["uuid","7700da57-df24-4efa-8c39-c66e60ea495f"]}},"_date":1552574743312,"Bridge":{"f3ffbb95-e614-4c39-a21f-a029cc715228":{"name":"br-ex","ports":["uuid","d481c43e-92fc-4a54-a50e-b207e010a341"],"fail_mode":"standalone","other_config":["map",[["hwaddr","52:54:00:e3:56:fb"]]]}},"_comment":"ovs-vsctl (invoked by /bin/bash): ovs-vsctl -t 10 -- --may-exist add-br br-ex -- set bridge br-ex other-config:hwaddr=52:54:00:e3:56:fb -- set bridge br-ex fail_mode=standalone -- del-controller br-ex","Open_vSwitch":{"c3849fb0-55e6-446a-9ee2-9a31b42c02fa":{"bridges":["set",[["uuid","a7b182f4-b142-4bb3-a672-134949b08047"],["uuid","f3ffbb95-e614-4c39-a21f-a029cc715228"]]],"next_cfg":53}}}
OVSDB JSON 165 b8ca4191cb23e4ac23183c29962588d9b61a54d4
{"Interface":{"7700da57-df24-4efa-8c39-c66e60ea495f":{"ofport":65534}},"_date":1552574743331,"Open_vSwitch":{"c3849fb0-55e6-446a-9ee2-9a31b42c02fa":{"cur_cfg":53}}}
OVSDB JSON 107 433508e062ba43caf7dfa35b68c86c0813e9107d
{"Bridge":{"f3ffbb95-e614-4c39-a21f-a029cc715228":{"datapath_version":"<unknown>"}},"_date":1552574743331}
OVSDB JSON 934 11bccdce72c178a53771ef754735c20b4b5c2e2c
{"Port":{"d9705e30-1b40-4e53-ac70-61cbc261b6a3":{"name":"br-isolated","interfaces":["uuid","ea356f48-4741-47f3-9575-9ffbdfffba4e"]}},"Interface":{"ea356f48-4741-47f3-9575-9ffbdfffba4e":{"name":"br-isolated","type":"internal"}},"_date":1552574747565,"Bridge":{"e27da9c1-cb11-4375-b97e-76e3275b4e55":{"name":"br-isolated","ports":["uuid","d9705e30-1b40-4e53-ac70-61cbc261b6a3"],"fail_mode":"standalone","other_config":["map",[["hwaddr","52:54:00:bd:1a:d1"]]]}},"_comment":"ovs-vsctl (invoked by /bin/bash): ovs-vsctl -t 10 -- --may-exist add-br br-isolated -- set bridge br-isolated other-config:hwaddr=52:54:00:bd:1a:d1 -- set bridge br-isolated fail_mode=standalone -- del-controller br-isolated","Open_vSwitch":{"c3849fb0-55e6-446a-9ee2-9a31b42c02fa":{"bridges":["set",[["uuid","a7b182f4-b142-4bb3-a672-134949b08047"],["uuid","e27da9c1-cb11-4375-b97e-76e3275b4e55"],["uuid","f3ffbb95-e614-4c39-a21f-a029cc715228"]]],"next_cfg":54}}}
OVSDB JSON 165 3a1c51853466e976abd78e6fd602009cedcc93e6

if you see there, some other process is resetting the other-config

ovs-vsctl -t 10 -- --may-exist add-br br-isolated -- set bridge br-isolated other-config:hwaddr=52:54:00:bd:1a:d1 ...

the file is /etc/openvswitch/conf.db
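
The kind of inspection shown above can be approximated with a short script. The following is a minimal sketch, assuming the log is plain text in which "OVSDB JSON <length> <hash>" header lines alternate with one-line JSON records, as in the excerpts quoted here. The function name bridge_other_config_history is made up for this illustration and is not part of any OVS tooling.

```python
import json


def bridge_other_config_history(log_text):
    """Yield (comment, bridge_uuid, other_config) for every logged
    transaction that writes the other_config column of a Bridge row.

    Hypothetical helper for illustration only.
    """
    for line in log_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "OVSDB JSON <length> <hash>" headers.
        if not line or line.startswith("OVSDB JSON"):
            continue
        try:
            record = json.loads(line)
        except ValueError:
            continue  # not a complete JSON record, e.g. a wrapped line
        if not isinstance(record, dict):
            continue
        bridges = record.get("Bridge")
        if not isinstance(bridges, dict):
            continue
        for uuid, row in bridges.items():
            column = row.get("other_config") if isinstance(row, dict) else None
            # OVSDB serializes a map column as ["map", [[key, value], ...]].
            if isinstance(column, list) and len(column) == 2 and column[0] == "map":
                yield record.get("_comment", ""), uuid, dict(column[1])
```

Printing the comment next to the resulting dictionary for each yielded row shows which command last wrote other_config for a given bridge. Rows are keyed by Bridge UUID, so a bridge that is re-created by "add-br" (as in the records above) appears under a new UUID rather than as a change to the old row.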

Comment 1 Roman Safronov 2019-04-02 14:26:33 UTC
[heat-admin@controller-2 ~]$ sudo ovs-vsctl list bridge
_uuid               : 5a7a528c-9135-4469-92fc-0a11d29073d8
auto_attach         : []
controller          : []
datapath_id         : "00005254002bc9f9"
datapath_type       : ""
datapath_version    : "<unknown>"
external_ids        : {}
fail_mode           : standalone
flood_vlans         : []
flow_tables         : {}
ipfix               : []
mcast_snooping_enable: false
mirrors             : []
name                : br-isolated
netflow             : []
other_config        : {hwaddr="52:54:00:2b:c9:f9"}     <----HERE SHOULD BE mac-table-size="50000" SETTING
ports               : [3fd4752b-c133-4923-bdfd-606c2260d6d5, 769928fb-61f0-4575-adff-1cc91d75e572, a5ee9483-77c6-45b2-88b6-3d849e3ed09e, a8c37667-9891-4afd-9f3d-8f7db85f307a, aa76fb7e-8391-48a5-98ea-708f41953400, f720434f-c906-41e1-96d4-7843a29f003f]
protocols           : []
rstp_enable         : false
rstp_status         : {}
sflow               : []
status              : {}
stp_enable          : false

_uuid               : 87ca0291-e464-4b79-bc83-9adb875803d9
auto_attach         : []
controller          : []
datapath_id         : "0000525400cfca29"
datapath_type       : ""
datapath_version    : "<unknown>"
external_ids        : {}
fail_mode           : standalone
flood_vlans         : []
flow_tables         : {}
ipfix               : []
mcast_snooping_enable: false
mirrors             : []
name                : br-ex
netflow             : []
other_config        : {hwaddr="52:54:00:cf:ca:29"}         <----HERE SHOULD BE mac-table-size="50000" SETTING
ports               : [2cd930b8-51ca-4614-a790-9f514bd8a22d, 369661dc-2551-43de-9f1b-bbe14e3f411c, b0759e8c-7c04-44e7-9f57-d94861ed6f72]
protocols           : []
rstp_enable         : false
rstp_status         : {}
sflow               : []
status              : {}
stp_enable          : false
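
To run this check across many nodes, the output above can be parsed mechanically. A minimal sketch, assuming the "field : value" layout shown in this comment with records separated by blank lines; the helper name bridges_missing_key is hypothetical.

```python
import re


def bridges_missing_key(list_output, key="mac-table-size"):
    """Return the names of bridges whose other_config lacks `key`.

    `list_output` is the text printed by "ovs-vsctl list bridge".
    Hypothetical helper for illustration only.
    """
    missing = []
    # Records are separated by blank lines.
    for block in re.split(r"\n\s*\n", list_output.strip()):
        fields = {}
        for line in block.splitlines():
            name, sep, value = line.partition(":")
            if sep:
                fields[name.strip()] = value.strip()
        if "name" not in fields:
            continue
        # other_config is printed as {key="value", key2="value2"}.
        match = re.search(r"\{(.*?)\}", fields.get("other_config", ""))
        keys = re.findall(r"([\w-]+)=", match.group(1)) if match else []
        if key not in keys:
            missing.append(fields["name"].strip('"'))
    return missing
```

On a node, the input could come from subprocess.run(["ovs-vsctl", "list", "bridge"], capture_output=True, text=True, check=True).stdout. Note that this reports every bridge lacking the key, so bridges that are not provider bridges would need to be filtered out by name.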

Comment 2 Roman Safronov 2019-04-04 09:15:59 UTC
I was able to reproduce the issue on a fresh system. The setting disappeared after node reboot (just executed "sudo reboot").

Comment 3 Roman Safronov 2019-04-04 16:08:37 UTC
sos report http://rhos-release.virt.bos.redhat.com/log/bz1695122

Comment 4 Lucas Alvares Gomes 2019-04-09 14:36:28 UTC
Hi,

I had an environment handy so I took a look at this problem and I can confirm the problem is in os-net-config (so I'm changing the component).

The os-net-config application is responsible for creating the network-scripts files (ifcfg-*) for br-ex and br-isolated, and those files do not include these options. So after a reboot, the system configures the interfaces with only the options set in the network-scripts files.

Here's what is being generated by os-net-config (using br-ex as an example):

# This file is autogenerated by os-net-config
DEVICE=br-ex
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
IPADDR=10.0.0.106
NETMASK=255.255.255.0
OVS_EXTRA="set bridge br-ex other-config:hwaddr=52:54:00:d1:68:db -- set bridge br-ex fail_mode=standalone -- del-controller br-ex"

To confirm this was the problem, I did:

1. Manually edited the "/etc/sysconfig/network-scripts/ifcfg-br-ex" file and modified the OVS_EXTRA line as follows:

OVS_EXTRA="set bridge br-ex other-config:hwaddr=52:54:00:d1:68:db other-config:mac-table-size=50000 -- set bridge br-ex fail_mode=standalone -- del-controller br-ex"

2. Rebooted the node again.

And after the reboot I could confirm via "ovs-vsctl list bridge" command that the "mac-table-size" option was set in br-ex.
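
Whether a given ifcfg file would restore the option at boot can also be checked without rebooting. A minimal sketch, assuming the OVS_EXTRA value is a single quoted string whose commands are separated by " -- " as in the files shown above; both function names are hypothetical.

```python
import shlex


def ovs_extra_commands(ifcfg_text):
    """Return the ovs-vsctl commands listed in the OVS_EXTRA line."""
    for line in ifcfg_text.splitlines():
        line = line.strip()
        if line.startswith("OVS_EXTRA="):
            # shlex.split strips the shell quoting around the value.
            parts = shlex.split(line[len("OVS_EXTRA="):])
            value = parts[0] if parts else ""
            return [cmd.strip() for cmd in value.split(" -- ") if cmd.strip()]
    return []


def sets_other_config(ifcfg_text, bridge, key):
    """True if some "set bridge <bridge> ..." command sets other-config:<key>."""
    needle = "other-config:%s=" % key
    for cmd in ovs_extra_commands(ifcfg_text):
        words = cmd.split()
        if (len(words) > 3
                and words[0].lower() == "set"
                and words[1].lower() == "bridge"
                and words[2] == bridge
                and any(word.startswith(needle) for word in words[3:])):
            return True
    return False
```

Calling sets_other_config(text, "br-ex", "mac-table-size") on the autogenerated file returns False, and on the manually edited file it returns True.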

Comment 6 Dan Sneddon 2019-04-09 21:27:40 UTC
(In reply to Lucas Alvares Gomes from comment #4)

> 1. Manually edited the "/etc/sysconfig/network-scripts/ifcfg-br-ex" file and
> modified the OVS_EXTRA line as follow:
> 
> OVS_EXTRA="set bridge br-ex other-config:hwaddr=52:54:00:d1:68:db
> other-config:mac-table-size=50000 -- set bridge br-ex fail_mode=standalone
> -- del-controller br-ex"
> 
> 2. Rebooted the node again.
> 
> And after the reboot I could confirm via "ovs-vsctl list bridge" command
> that the "mac-table-size" option was set in br-ex.

There is no reason to hard-code entries in the ifcfg file, os-net-config can do that for you. There are two options that you can set on any OVS bridge. First is ovs_options, which allows you to add options to the ovs-vsctl add-port command (in column[:key]=value format):

- type: ovs_bridge
  name: bridge_name
  ovs_options: "other-config:mac-table-size=5000"

Or ovs_extra, which allows you to supply a list of full ovs-vsctl commands that will be separated by " -- " in the OVS_EXTRA field in the ifcfg file:

- type: ovs_bridge
  name: bridge_name
  ovs_extra:
    - "set bridge {name} other-config:mac-table-size=5000"
    - "set bridge {name} stp_enable=true"

(the {name} in the above will be replaced with the actual name of the bridge by os-net-config)

Adding that ovs_extra to the existing config for the br-ex bridge will result in the following in the ifcfg-br-ex:

OVS_EXTRA="set bridge br-ex other-config:mac-table-size=5000 -- set bridge br-ex stp_enable=true -- set bridge br-ex other-config:hwaddr=52:54:00:d1:68:db -- set bridge br-ex fail_mode=standalone -- del-controller br-ex"
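
The substitution and joining described above can be illustrated as follows. This is only a sketch that reproduces the line shown in this comment; it is not os-net-config's implementation, and the function name render_ovs_extra, its parameters, and the ordering (user-supplied commands first, then the commands os-net-config adds on its own) are assumptions inferred from that example.

```python
def render_ovs_extra(bridge_name, ovs_extra, default_commands=()):
    """Build an OVS_EXTRA line from a list of ovs_extra commands.

    Illustration only: {name} is replaced with the bridge name and all
    commands are joined with " -- ". `default_commands` stands in for
    the commands that appear in the generated file without being listed
    in ovs_extra (an assumption for this sketch).
    """
    commands = [cmd.replace("{name}", bridge_name) for cmd in ovs_extra]
    commands.extend(default_commands)
    return 'OVS_EXTRA="%s"' % " -- ".join(commands)


line = render_ovs_extra(
    "br-ex",
    [
        "set bridge {name} other-config:mac-table-size=5000",
        "set bridge {name} stp_enable=true",
    ],
    default_commands=[
        "set bridge br-ex other-config:hwaddr=52:54:00:d1:68:db",
        "set bridge br-ex fail_mode=standalone",
        "del-controller br-ex",
    ],
)
print(line)
```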


So the supported way to add these OVS commands is the following procedure:

1) Modify the NIC config templates on the roles to include the list of ovs_extra commands (or a single port command in ovs_options).

2) Temporarily set the following in an environment file that will be included in the stack update command:
parameter_defaults:
  NetworkDeploymentActions: ['CREATE', 'UPDATE']

3) Run the "openstack overcloud deploy" command to update the stack, using all the same templates as the original deployment, plus whichever environment file contains the NetworkDeploymentActions override.

This will cause each of the systems to update the networking. If downtime is not acceptable, then the ifcfg files may be updated by hand to include the exact syntax that os-net-config will lay down on disk. If there is no difference between the ifcfg file on disk and the one that os-net-config will write, then there is no disruption to the networking. Any differences will result in the bridge being restarted.
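
When updating ifcfg files by hand, it helps to confirm that the edited file matches the expected one exactly. A minimal sketch using Python's difflib, assuming both files are available as text; the helper name ifcfg_diff is hypothetical, and os-net-config's own comparison may work differently.

```python
import difflib


def ifcfg_diff(on_disk, generated):
    """Return unified-diff lines between two ifcfg files (empty if identical)."""
    return list(difflib.unified_diff(
        on_disk.splitlines(),
        generated.splitlines(),
        fromfile="on-disk",
        tofile="generated",
        lineterm="",
    ))
```

An empty result means the two texts are identical line for line. Any output lists the lines that still differ, which per the paragraph above would lead to the bridge being restarted.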

Another option is to manually make the changes to existing nodes, and then update the NIC config templates for newly deployed or redeployed nodes.

The reason we don't recommend making changes manually that are not reflected in the NIC config templates is that when a major upgrade occurs, the existing nodes will have their network configuration rewritten, overwriting any manual changes. Also, if a node is redeployed it will need the desired settings; updating the NIC config ensures that new or replaced nodes will be configured correctly.

Comment 7 Dan Sneddon 2019-04-09 21:32:09 UTC
(In reply to Dan Sneddon from comment #6)

> - type: ovs_bridge
>   name: bridge_name
>   ovs_extra:
>     - "set bridge {name} other-config:mac-table-size=5000"
>     - "set bridge {name} stp_enable=true"

Please note that the stp_enable=true in the above example is there only to show how multiple commands could be added to ovs_extra. We do not recommend enabling Spanning Tree Protocol on bridges, as this may delay processing of network traffic when the bridge is activated. If fault-tolerance is desired, we support using bonds instead of active/passive links based on STP. STP is disabled on bridges by default.

Comment 8 Dan Sneddon 2019-04-09 21:37:10 UTC
(In reply to Dan Sneddon from comment #7)
> (In reply to Dan Sneddon from comment #6)
> 
> > - type: ovs_bridge
> >   name: bridge_name
> >   ovs_extra:
> >     - "set bridge {name} other-config:mac-table-size=5000"
> >     - "set bridge {name} stp_enable=true"
> 
> Please note that the stp_enable=true in the above example is there only to
> show how multiple commands could be added to ovs_extra. We do not recommend
> enabling Spanning Tree Protocol on bridges, as this may delay processing of
> network traffic when the bridge is activated. If fault-tolerance is desired,
> we support using bonds instead of active/passive links based on STP. STP is
> disabled on bridges by default.

In order to get an accurate example of what the ifcfg file should look like for existing nodes, I recommend modifying the NIC config template and deploying an additional new node. You can copy the ifcfg file(s), and modify only the MAC address on a per-node basis when modifying existing nodes.

Comment 19 errata-xmlrpc 2022-09-21 12:07:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543