+++ This bug was initially created as a clone of Bug #1572698 +++

Description of problem:

Hi,

We need to restart neutron-openvswitch-agent after the network is reconfigured with os-net-config.

Additional info:

We just hit a major outage in a customer environment with OSP 8 due to some interesting behavior between tripleo and neutron-openvswitch-agent.

I just reproduced part of this issue in a lab in both OSP 8 and OSP 10:

a) Modify br-ex:
~~~
cat /etc/sysconfig/network-scripts/ifcfg-br-ex
# This file is autogenerated by os-net-config
DEVICE=br-ex
MTU=2000
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSBridge
OVS_EXTRA="set bridge br-ex other-config:hwaddr=52:54:00:94:27:2f -- set bridge br-ex fail_mode=standalone"
~~~

b) Start a stack update with `openstack overcloud deploy (...)`

c) Verify flows on br-ex after the stack update:
~~~
[root@overcloud-compute-0 ~]# ovs-ofctl dump-flows br-ex
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=57042.432s, table=0, n_packets=797291, n_bytes=102211740, idle_age=1, priority=0 actions=NORMAL
~~~

We obviously do not support the manipulation of ifcfg files outside the scope of Director. However, we should at least deal with this properly:

When ifcfg-<interface> files are manipulated outside the scope of Director, os-net-config will detect this and will reconfigure the network. As part of that, it will restart br-ex, which deletes the flows that were created by neutron-openvswitch-agent.

We can reproduce this without Director as well:

Note that os-net-config will normally *not* bring up/down the network, unless a change to files in /etc/sysconfig/network-scripts/ was detected!
~~~
[root@overcloud-compute-0 ~]# os-net-config -v -c /etc/os-net-config/config.json
[2018/04/26 11:18:11 PM] [INFO] Using config file at: /etc/os-net-config/config.json
[2018/04/26 11:18:11 PM] [INFO] Using mapping file at: /etc/os-net-config/mapping.yaml
[2018/04/26 11:18:11 PM] [INFO] Ifcfg net config provider created.
[2018/04/26 11:18:11 PM] [INFO] nic5 mapped to: eth4
[2018/04/26 11:18:11 PM] [INFO] nic4 mapped to: eth3
[2018/04/26 11:18:11 PM] [INFO] nic3 mapped to: eth2
[2018/04/26 11:18:11 PM] [INFO] nic2 mapped to: eth1
[2018/04/26 11:18:11 PM] [INFO] nic1 mapped to: eth0
[2018/04/26 11:18:11 PM] [INFO] adding interface: eth0
[2018/04/26 11:18:11 PM] [INFO] adding custom route for interface: eth0
[2018/04/26 11:18:11 PM] [INFO] adding bridge: br-ex
[2018/04/26 11:18:11 PM] [INFO] adding interface: eth1
[2018/04/26 11:18:11 PM] [INFO] adding vlan: vlan901
[2018/04/26 11:18:11 PM] [INFO] adding vlan: vlan903
[2018/04/26 11:18:11 PM] [INFO] adding vlan: vlan902
[2018/04/26 11:18:11 PM] [INFO] adding interface: eth2
[2018/04/26 11:18:11 PM] [INFO] adding interface: eth3
[2018/04/26 11:18:11 PM] [INFO] applying network configs...
[2018/04/26 11:18:11 PM] [INFO] No changes required for interface: eth3
[2018/04/26 11:18:11 PM] [INFO] No changes required for interface: eth2
[2018/04/26 11:18:11 PM] [INFO] No changes required for interface: eth1
[2018/04/26 11:18:11 PM] [INFO] No changes required for interface: eth0
[2018/04/26 11:18:11 PM] [INFO] No changes required for vlan interface: vlan903
[2018/04/26 11:18:11 PM] [INFO] No changes required for vlan interface: vlan902
[2018/04/26 11:18:11 PM] [INFO] No changes required for vlan interface: vlan901
[2018/04/26 11:18:11 PM] [INFO] No changes required for bridge: br-ex
[root@overcloud-compute-0 ~]#
~~~

Even if we manipulate interfaces live manually, os-net-config does not cause a restart:
~~~
[root@overcloud-compute-0 ~]# ip link set dev br-ex mtu 2000
[root@overcloud-compute-0 ~]# ip a a dev br-ex 192.168.123.5/24
[root@overcloud-compute-0 ~]# ip link set dev br-ex up
[root@overcloud-compute-0 ~]# ip link set dev vlan902 down
[root@overcloud-compute-0 ~]# os-net-config -v -c /etc/os-net-config/config.json
[2018/04/26 11:19:11 PM] [INFO] Using config file at: /etc/os-net-config/config.json
[2018/04/26 11:19:11 PM] [INFO] Using mapping file at: /etc/os-net-config/mapping.yaml
[2018/04/26 11:19:11 PM] [INFO] Ifcfg net config provider created.
[2018/04/26 11:19:11 PM] [INFO] nic5 mapped to: eth4
[2018/04/26 11:19:11 PM] [INFO] nic4 mapped to: eth3
[2018/04/26 11:19:11 PM] [INFO] nic3 mapped to: eth2
[2018/04/26 11:19:11 PM] [INFO] nic2 mapped to: eth1
[2018/04/26 11:19:11 PM] [INFO] nic1 mapped to: eth0
[2018/04/26 11:19:11 PM] [INFO] adding interface: eth0
[2018/04/26 11:19:11 PM] [INFO] adding custom route for interface: eth0
[2018/04/26 11:19:11 PM] [INFO] adding bridge: br-ex
[2018/04/26 11:19:11 PM] [INFO] adding interface: eth1
[2018/04/26 11:19:11 PM] [INFO] adding vlan: vlan901
[2018/04/26 11:19:11 PM] [INFO] adding vlan: vlan903
[2018/04/26 11:19:11 PM] [INFO] adding vlan: vlan902
[2018/04/26 11:19:11 PM] [INFO] adding interface: eth2
[2018/04/26 11:19:11 PM] [INFO] adding interface: eth3
[2018/04/26 11:19:11 PM] [INFO] applying network configs...
[2018/04/26 11:19:11 PM] [INFO] No changes required for interface: eth3
[2018/04/26 11:19:11 PM] [INFO] No changes required for interface: eth2
[2018/04/26 11:19:11 PM] [INFO] No changes required for interface: eth1
[2018/04/26 11:19:11 PM] [INFO] No changes required for interface: eth0
[2018/04/26 11:19:11 PM] [INFO] No changes required for vlan interface: vlan903
[2018/04/26 11:19:11 PM] [INFO] No changes required for vlan interface: vlan902
[2018/04/26 11:19:11 PM] [INFO] No changes required for vlan interface: vlan901
[2018/04/26 11:19:11 PM] [INFO] No changes required for bridge: br-ex
[root@overcloud-compute-0 ~]#
~~~

Now change a file in /etc/sysconfig/network-scripts, e.g. as follows:
~~~
[root@overcloud-compute-0 ~]# ovs-ofctl dump-flows br-ex
NXST_FLOW reply (xid=0x4):
 cookie=0xb37330a54aeb3705, duration=110643.987s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=4,in_port=5,dl_vlan=2 actions=mod_vlan_vid:906,NORMAL
 cookie=0xb37330a54aeb3705, duration=110651.569s, table=0, n_packets=3420, n_bytes=403312, idle_age=11, hard_age=65534, priority=2,in_port=5 actions=drop
 cookie=0xb37330a54aeb3705, duration=110651.578s, table=0, n_packets=2139296, n_bytes=349532217, idle_age=0, hard_age=65534, priority=0 actions=NORMAL
# note, I'm pushing a custom MTU line in ifcfg-br-ex to trigger the network restart
[root@overcloud-compute-0 ~]# cat !$
cat /etc/sysconfig/network-scripts/ifcfg-br-ex
# This file is autogenerated by os-net-config
DEVICE=br-ex
MTU=2000
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSBridge
OVS_EXTRA="set bridge br-ex other-config:hwaddr=52:54:00:94:27:2f -- set bridge br-ex fail_mode=standalone"
[root@overcloud-compute-0 ~]# os-net-config -v -c /etc/os-net-config/config.json
[2018/04/26 11:15:17 PM] [INFO] Using config file at: /etc/os-net-config/config.json
[2018/04/26 11:15:17 PM] [INFO] Using mapping file at: /etc/os-net-config/mapping.yaml
[2018/04/26 11:15:17 PM] [INFO] Ifcfg net config provider created.
[2018/04/26 11:15:17 PM] [INFO] nic5 mapped to: eth4
[2018/04/26 11:15:17 PM] [INFO] nic4 mapped to: eth3
[2018/04/26 11:15:17 PM] [INFO] nic3 mapped to: eth2
[2018/04/26 11:15:17 PM] [INFO] nic2 mapped to: eth1
[2018/04/26 11:15:17 PM] [INFO] nic1 mapped to: eth0
[2018/04/26 11:15:17 PM] [INFO] adding interface: eth0
[2018/04/26 11:15:17 PM] [INFO] adding custom route for interface: eth0
[2018/04/26 11:15:17 PM] [INFO] adding bridge: br-ex
[2018/04/26 11:15:17 PM] [INFO] adding interface: eth1
[2018/04/26 11:15:17 PM] [INFO] adding vlan: vlan901
[2018/04/26 11:15:17 PM] [INFO] adding vlan: vlan903
[2018/04/26 11:15:17 PM] [INFO] adding vlan: vlan902
[2018/04/26 11:15:17 PM] [INFO] adding interface: eth2
[2018/04/26 11:15:17 PM] [INFO] adding interface: eth3
[2018/04/26 11:15:17 PM] [INFO] applying network configs...
[2018/04/26 11:15:17 PM] [INFO] No changes required for interface: eth3
[2018/04/26 11:15:17 PM] [INFO] No changes required for interface: eth2
[2018/04/26 11:15:17 PM] [INFO] No changes required for interface: eth1
[2018/04/26 11:15:17 PM] [INFO] No changes required for interface: eth0
[2018/04/26 11:15:17 PM] [INFO] No changes required for vlan interface: vlan903
[2018/04/26 11:15:17 PM] [INFO] No changes required for vlan interface: vlan902
[2018/04/26 11:15:17 PM] [INFO] No changes required for vlan interface: vlan901
[2018/04/26 11:15:17 PM] [INFO] running ifdown on interface: vlan903
[2018/04/26 11:15:18 PM] [INFO] running ifdown on interface: vlan902
[2018/04/26 11:15:19 PM] [INFO] running ifdown on interface: vlan901
[2018/04/26 11:15:19 PM] [INFO] running ifdown on interface: eth1
[2018/04/26 11:15:20 PM] [INFO] running ifdown on bridge: br-ex
[2018/04/26 11:15:20 PM] [INFO] Writing config /etc/sysconfig/network-scripts/route6-br-ex
[2018/04/26 11:15:20 PM] [INFO] Writing config /etc/sysconfig/network-scripts/ifcfg-br-ex
[2018/04/26 11:15:20 PM] [INFO] Writing config /etc/sysconfig/network-scripts/route-br-ex
[2018/04/26 11:15:20 PM] [INFO] running ifup on bridge: br-ex
[2018/04/26 11:15:22 PM] [INFO] running ifup on interface: vlan903
[2018/04/26 11:15:27 PM] [INFO] running ifup on interface: vlan902
[2018/04/26 11:15:32 PM] [INFO] running ifup on interface: vlan901
[2018/04/26 11:15:37 PM] [INFO] running ifup on interface: eth1
[root@overcloud-compute-0 ~]#
[root@overcloud-compute-0 ~]#
[root@overcloud-compute-0 ~]#
[root@overcloud-compute-0 ~]# ovs-ofctl dump-flows br-ex
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=21.750s, table=0, n_packets=610, n_bytes=71190, idle_age=0, priority=0 actions=NORMAL
[root@overcloud-compute-0 ~]#
~~~

O.k., so what's the consequence of this?

a) VLAN connections will be broken due to the loss of the internal patch cable to br-int:
~~~
[root@overcloud-compute-0 ~]# ovs-vsctl show
3b8f43d7-01f5-4bb7-8155-2fde36264c5d
    Bridge br-ex
        Port "vlan902"
            tag: 902
            Interface "vlan902"
                type: internal
        Port br-ex
            Interface br-ex
                type: internal
        Port "vlan903"
            tag: 903
            Interface "vlan903"
                type: internal
        Port "eth1"
            Interface "eth1"
        Port "vlan901"
            tag: 901
            Interface "vlan901"
                type: internal
    Bridge br-int
~~~

b) Flows are switched from the neutron-openvswitch-agent flows to the default "NORMAL" flow with cookie=0x0.

This has the potential to be a time bomb. Under certain circumstances, neutron-openvswitch-agent will clean up stale flows (flows that have the wrong cookie). This cleanup can happen months (!) later, see BZ https://bugzilla.redhat.com/1571647
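The symptom shown in this report can be checked mechanically. The following is a minimal Python sketch, not part of os-net-config or neutron: it parses text captured from `ovs-ofctl dump-flows br-ex` and from `os-net-config -v`, and the function names `flow_cookies`, `agent_flows_missing` and `restarted_bridges` are hypothetical. It assumes, as in the output above, that flows installed by neutron-openvswitch-agent carry a non-zero cookie and that a bridge restart is logged as "running ifdown on bridge: <name>".

```python
import re
from typing import List

# Cookie field as printed by `ovs-ofctl dump-flows`, e.g. "cookie=0xb37330a54aeb3705".
COOKIE_RE = re.compile(r"cookie=(0x[0-9a-fA-F]+)")
# os-net-config log entry for a bridge restart, e.g. "running ifdown on bridge: br-ex".
IFDOWN_BRIDGE_RE = re.compile(r"running ifdown on bridge: (\S+)")


def flow_cookies(dump_flows_output: str) -> List[int]:
    """Return the cookie of every flow found in the captured dump-flows text."""
    return [int(m.group(1), 16) for m in COOKIE_RE.finditer(dump_flows_output)]


def agent_flows_missing(dump_flows_output: str) -> bool:
    """True if no flow has a non-zero cookie.

    Assumption: agent-installed flows carry a non-zero cookie, so a table
    with only cookie=0x0 entries (or no entries) means they are gone.
    """
    return all(cookie == 0 for cookie in flow_cookies(dump_flows_output))


def restarted_bridges(os_net_config_log: str) -> List[str]:
    """Return the bridges that os-net-config logged an ifdown for."""
    return IFDOWN_BRIDGE_RE.findall(os_net_config_log)
```

If `restarted_bridges` returns a non-empty list, or `agent_flows_missing` is true for a bridge the agent manages, that would be the point at which a restart of neutron-openvswitch-agent, as requested in this report, is needed.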
*** This bug has been marked as a duplicate of bug 1571647 ***