Description of problem:
ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --fake-iface add-bond br-bond bond1 em1 em2 bond_mode=balance-slb lacp=active other-config:lacp-fallback-ab=true
ovs|00002|vsctl|ERR|cannot create a port named bond1 because a port named bond1 already exists on bridge br-bond

Version-Release number of selected component (if applicable):
7.1

How reproducible:
Not always

Steps to Reproduce:
1. Configure network-environment.yaml with the following:
   BondInterfaceOvsOptions: "bond_mode=balance-slb lacp=active other-config:lacp-fallback-ab=true"
2. Deploy the overcloud.
3. On the nodes (ceph, controller, or compute), run 'journalctl -u os-collect-config' and search for bond1.

Actual results:
The error shown in the description above.

Expected results:
No failure.

Additional info:
Here is more of the error produced:

Dec 1 14:23:51 localhost os-collect-config: [2015/12/01 02:23:51 PM] [INFO] running ifup on interface: vlan1100
Dec 1 14:23:51 localhost ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-port br-bond vlan1100 tag=1100 -- set Interface vlan1100 type=internal
Dec 1 14:23:52 localhost os-collect-config: [2015/12/01 02:23:52 PM] [INFO] running ifup on interface: em2
Dec 1 14:23:52 localhost os-collect-config: [2015/12/01 02:23:52 PM] [INFO] running ifup on interface: em1
Dec 1 14:23:52 localhost os-collect-config: [2015/12/01 02:23:52 PM] [INFO] running ifup on interface: bond1
Dec 1 14:23:53 localhost ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --fake-iface add-bond br-bond bond1 em1 em2 bond_mode=balance-slb lacp=active other-config:lacp-fallback-ab=true
Dec 1 14:23:53 localhost ovs-vsctl: ovs|00002|vsctl|ERR|cannot create a port named bond1 because a port named bond1 already exists on bridge br-bond
Dec 1 14:23:53 localhost os-collect-config: [2015/12/01 02:23:53 PM] [INFO] Running ovs-appctl bond/set-active-slave ('bond1', 'em1')
Dec 1 14:23:53 localhost os-collect-config: Traceback (most recent call last):
Dec 1 14:23:53 localhost os-collect-config: File "/usr/bin/os-net-config", line 10, in <module>
Dec 1 14:23:53 localhost os-collect-config: sys.exit(main())
Dec 1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/cli.py", line 187, in main
Dec 1 14:23:53 localhost os-collect-config: activate=not opts.no_activate)
Dec 1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/impl_ifcfg.py", line 318, in apply
Dec 1 14:23:53 localhost os-collect-config: self.bond_primary_ifaces[bond])
Dec 1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/__init__.py", line 146, in ovs_appctl
Dec 1 14:23:53 localhost os-collect-config: self.execute(msg, '/bin/ovs-appctl', action, *parameters)
Dec 1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/__init__.py", line 108, in execute
Dec 1 14:23:53 localhost os-collect-config: processutils.execute(cmd, *args, **kwargs)
Dec 1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 266, in execute
Dec 1 14:23:53 localhost os-collect-config: cmd=sanitized_cmd)
Dec 1 14:23:53 localhost os-collect-config: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Dec 1 14:23:53 localhost os-collect-config: Command: /bin/ovs-appctl bond/set-active-slave bond1 em1
Dec 1 14:23:53 localhost os-collect-config: Exit code: 2
Dec 1 14:23:53 localhost os-collect-config: Stdout: u''
Dec 1 14:23:53 localhost os-collect-config: Stderr: u'cannot make disabled slave active\novs-appctl: ovs-vswitchd: server returned an error\n'
Dec 1 14:23:53 localhost os-collect-config: + RETVAL=1
Dec 1 14:23:53 localhost os-collect-config: + [[ 1 == 2 ]]
Dec 1 14:23:53 localhost os-collect-config: + [[ 1 != 0 ]]
Dec 1 14:23:53 localhost os-collect-config: + echo 'ERROR: os-net-config configuration failed.'
Dec 1 14:23:53 localhost os-collect-config: ERROR: os-net-config configuration failed.
Dec 1 14:23:53 localhost os-collect-config: + exit 1
Dec 1 14:23:53 localhost os-collect-config: [2015-12-01 14:23:53,857] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 1]
Dec 1 14:23:53 localhost os-collect-config: [2015-12-01 14:23:53,857] (os-refresh-config) [ERROR] Aborting...
Dec 1 14:23:53 localhost os-collect-config: 2015-12-01 14:23:53.861 12236 ERROR os-collect-config [-] Command failed, will not cache new data. Command 'os-refresh-config' returned non-zero exit status 1
Dec 1 14:23:53 localhost os-collect-config: 2015-12-01 14:23:53.861 12236 WARNING os-collect-config [-] Sleeping 30.00 seconds before re-exec.
A suggested solution would be to add the --may-exist option to the ovs-vsctl add-bond command.
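To make the suggestion concrete, here is a rough sketch of how the add-bond argument list could be assembled with --may-exist included. The helper name build_add_bond_cmd and its parameters are hypothetical and are not taken from os-net-config; the bridge, bond, and interface names are the ones from the log above.

```python
def build_add_bond_cmd(bridge, bond, members, options=(), may_exist=True, timeout=10):
    """Return an ovs-vsctl argument list for add-bond.

    Hypothetical helper for illustration only; this is not os-net-config code.
    """
    cmd = ["ovs-vsctl", "-t", str(timeout), "--"]
    if may_exist:
        # Ask ovs-vsctl not to treat an already existing port as an error.
        cmd.append("--may-exist")
    cmd += ["--fake-iface", "add-bond", bridge, bond]
    cmd += list(members)   # physical interfaces to bond
    cmd += list(options)   # column[:key]=value settings such as bond_mode=...
    return cmd


if __name__ == "__main__":
    print(" ".join(build_add_bond_cmd(
        "br-bond", "bond1", ["em1", "em2"], ["bond_mode=balance-slb"])))
```

As far as I can tell from the ovs-vsctl documentation, --may-exist only turns add-bond into a no-op when the existing port already bonds exactly the listed interfaces, so this would hide the "already exists" error but may not address the later bond/set-active-slave failure seen in the traceback.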
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
I suspect a configuration error here. I notice a problem with the BondInterfaceOvsOptions string:

BondInterfaceOvsOptions: "bond_mode=balance-slb lacp=active other-config:lacp-fallback-ab=true"

The lacp=active and other-config:lacp-fallback-ab=true options only apply to balance-tcp bonds, which should not be used due to a packet loss bug in OVS. If balance-slb mode is used, remove the rest of the BondInterfaceOvsOptions string.
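For anyone who wants to check their own options string against this advice, a small sketch follows. It only encodes the claim made in this comment (LACP settings belong with balance-tcp); the function names are hypothetical and nothing here queries OVS itself.

```python
# Option keys this comment says only make sense with balance-tcp bonds.
LACP_KEYS = ("lacp", "other-config:lacp-fallback-ab")


def parse_bond_options(options):
    """Split a space-separated 'key=value ...' string into a dict."""
    parsed = {}
    for item in options.split():
        key, _, value = item.partition("=")
        parsed[key] = value
    return parsed


def lacp_options_to_drop(options):
    """Return the LACP-related keys to remove, per the advice in this comment."""
    parsed = parse_bond_options(options)
    if parsed.get("bond_mode") == "balance-tcp":
        return []
    return [key for key in LACP_KEYS if key in parsed]


if __name__ == "__main__":
    reported = ("bond_mode=balance-slb lacp=active "
                "other-config:lacp-fallback-ab=true")
    print(lacp_options_to_drop(reported))
```

For the string in this report, the check flags both LACP settings, which leaves bond_mode=balance-slb as the only remaining option.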