Bug 1287774

Summary: ovs-vsctl -t 10 -- --fake-iface add-bond br-bond called twice
Product: Red Hat OpenStack Reporter: Dan Yocum <dyocum>
Component: rhosp-director Assignee: Hugh Brock <hbrock>
Status: CLOSED NOTABUG QA Contact: Shai Revivo <srevivo>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 7.0 (Kilo) CC: dsneddon, jcoufal, mburns, rhel-osp-director-maint
Target Milestone: ---   
Target Release: 10.0 (Newton)   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2016-10-14 18:49:05 UTC Type: Bug

Description Dan Yocum 2015-12-02 16:06:09 UTC
Description of problem:

ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --fake-iface add-bond br-bond bond1 em1 em2 bond_mode=balance-slb lacp=active other-config:lacp-fallback-ab=true
ovs|00002|vsctl|ERR|cannot create a port named bond1 because a port named bond1 already exists on bridge br-bond

Version-Release number of selected component (if applicable):

7.1

How reproducible:

Not always

Steps to Reproduce:
1. Configure network-environment.yaml with the following:

  BondInterfaceOvsOptions: "bond_mode=balance-slb lacp=active other-config:lacp-fallback-ab=true"

2. Deploy the overcloud.

3. On any node (ceph, controller, or compute), run 'journalctl -u os-collect-config' and search for bond1.
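
The search in step 3 can be sketched as a small Python 3 filter over saved journal text. This is only an illustration: the function name find_vsctl_errors is hypothetical, and the match pattern assumes the "|vsctl|ERR|" log format shown in this report.

```python
def find_vsctl_errors(journal_text, port="bond1"):
    """Return journal lines where ovs-vsctl logged an ERR that mentions the port.

    Assumes the 'ovs|NNNNN|vsctl|ERR|...' line format seen in this report.
    """
    hits = []
    for line in journal_text.splitlines():
        if "|vsctl|ERR|" in line and port in line:
            hits.append(line)
    return hits
```

One way to use it is to save the output of 'journalctl -u os-collect-config --no-pager' to a file, read the file into a string, and pass that string to find_vsctl_errors().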


Actual results:

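Same as the log output in the description of the problem: ovs-vsctl reports that a port named bond1 already exists on bridge br-bond.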

Expected results:

No failure.

Additional info:

Comment 2 Dan Yocum 2015-12-02 16:28:09 UTC
Here is more of the error produced:

Dec  1 14:23:51 localhost os-collect-config: [2015/12/01 02:23:51 PM] [INFO] running ifup on interface: vlan1100
Dec  1 14:23:51 localhost ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-port br-bond vlan1100 tag=1100 -- set Interface vlan1100 type=internal
Dec  1 14:23:52 localhost os-collect-config: [2015/12/01 02:23:52 PM] [INFO] running ifup on interface: em2
Dec  1 14:23:52 localhost os-collect-config: [2015/12/01 02:23:52 PM] [INFO] running ifup on interface: em1
Dec  1 14:23:52 localhost os-collect-config: [2015/12/01 02:23:52 PM] [INFO] running ifup on interface: bond1
Dec  1 14:23:53 localhost ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --fake-iface add-bond br-bond bond1 em1 em2 bond_mode=balance-slb lacp=active other-config:lacp-fallback-ab=true
Dec  1 14:23:53 localhost ovs-vsctl: ovs|00002|vsctl|ERR|cannot create a port named bond1 because a port named bond1 already exists on bridge br-bond
Dec  1 14:23:53 localhost os-collect-config: [2015/12/01 02:23:53 PM] [INFO] Running ovs-appctl bond/set-active-slave ('bond1', 'em1')
Dec  1 14:23:53 localhost os-collect-config: Traceback (most recent call last):
Dec  1 14:23:53 localhost os-collect-config: File "/usr/bin/os-net-config", line 10, in <module>
Dec  1 14:23:53 localhost os-collect-config: sys.exit(main())
Dec  1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/cli.py", line 187, in main
Dec  1 14:23:53 localhost os-collect-config: activate=not opts.no_activate)
Dec  1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/impl_ifcfg.py", line 318, in apply
Dec  1 14:23:53 localhost os-collect-config: self.bond_primary_ifaces[bond])
Dec  1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/__init__.py", line 146, in ovs_appctl
Dec  1 14:23:53 localhost os-collect-config: self.execute(msg, '/bin/ovs-appctl', action, *parameters)
Dec  1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/__init__.py", line 108, in execute
Dec  1 14:23:53 localhost os-collect-config: processutils.execute(cmd, *args, **kwargs)
Dec  1 14:23:53 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 266, in execute
Dec  1 14:23:53 localhost os-collect-config: cmd=sanitized_cmd)
Dec  1 14:23:53 localhost os-collect-config: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Dec  1 14:23:53 localhost os-collect-config: Command: /bin/ovs-appctl bond/set-active-slave bond1 em1
Dec  1 14:23:53 localhost os-collect-config: Exit code: 2
Dec  1 14:23:53 localhost os-collect-config: Stdout: u''
Dec  1 14:23:53 localhost os-collect-config: Stderr: u'cannot make disabled slave active\novs-appctl: ovs-vswitchd: server returned an error\n'
Dec  1 14:23:53 localhost os-collect-config: + RETVAL=1
Dec  1 14:23:53 localhost os-collect-config: + [[ 1 == 2 ]]
Dec  1 14:23:53 localhost os-collect-config: + [[ 1 != 0 ]]
Dec  1 14:23:53 localhost os-collect-config: + echo 'ERROR: os-net-config configuration failed.'
Dec  1 14:23:53 localhost os-collect-config: ERROR: os-net-config configuration failed.
Dec  1 14:23:53 localhost os-collect-config: + exit 1
Dec  1 14:23:53 localhost os-collect-config: [2015-12-01 14:23:53,857] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 1]
Dec  1 14:23:53 localhost os-collect-config: [2015-12-01 14:23:53,857] (os-refresh-config) [ERROR] Aborting...
Dec  1 14:23:53 localhost os-collect-config: 2015-12-01 14:23:53.861 12236 ERROR os-collect-config [-] Command failed, will not cache new data. Command 'os-refresh-config' returned non-zero exit status 1
Dec  1 14:23:53 localhost os-collect-config: 2015-12-01 14:23:53.861 12236 WARNING os-collect-config [-] Sleeping 30.00 seconds before re-exec.

Comment 3 Dan Yocum 2015-12-02 16:29:23 UTC
A suggested solution would be to add the --may-exist option to the ovs-vsctl add-bond command.
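
As a rough sketch of that suggestion, the argument list could be assembled as below. The helper add_bond_command is hypothetical and is not taken from os-net-config or the ifup scripts; it only shows where --may-exist would sit relative to the arguments already visible in the log. My understanding of the ovs-vsctl manual is that --may-exist makes add-bond a no-op only when the existing port bonds exactly the same interfaces, so treat that as an assumption to verify.

```python
def add_bond_command(bridge, bond, members, options=(), may_exist=True, timeout=10):
    """Build an ovs-vsctl add-bond argument list (hypothetical helper).

    With may_exist=True the --may-exist flag is placed before --fake-iface,
    so a repeated call for an existing bond is not meant to fail.
    """
    cmd = ["ovs-vsctl", "-t", str(timeout), "--"]
    if may_exist:
        cmd.append("--may-exist")
    cmd += ["--fake-iface", "add-bond", bridge, bond]
    cmd += list(members)   # member interfaces, e.g. em1 em2
    cmd += list(options)   # column settings, e.g. bond_mode=balance-slb
    return cmd
```

For the bond in this report, add_bond_command("br-bond", "bond1", ["em1", "em2"], ["bond_mode=balance-slb"]) gives the same command as in the log with --may-exist added after the "--" separator.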

Comment 5 Mike Burns 2016-04-07 21:00:12 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 7 Dan Sneddon 2016-10-14 18:49:05 UTC
I suspect a configuration error here. I notice a problem with the BondInterfaceOvsOptions string:

BondInterfaceOvsOptions: "bond_mode=balance-slb lacp=active other-config:lacp-fallback-ab=true"

The lacp=active and other-config:lacp-fallback-ab=true only apply to balance-tcp bonds, which should not be used due to a packet loss bug in OVS.

If balance-slb mode is used, remove the rest of the BondInterfaceOvsOptions.
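
A minimal sketch of that cleanup, encoding only the rule as stated in this comment (the LACP settings are dropped unless bond_mode is balance-tcp). The function names and the LACP_KEYS tuple are hypothetical, and this is not a check that OVS or os-net-config performs itself.

```python
# Options that, per this comment, only apply to balance-tcp bonds.
LACP_KEYS = ("lacp", "other-config:lacp-fallback-ab")


def parse_ovs_options(options):
    """Split a 'key=value key2=value2' string into a dict."""
    return dict(item.split("=", 1) for item in options.split())


def lacp_options_to_remove(options):
    """List the LACP settings that should be removed for this bond mode."""
    parsed = parse_ovs_options(options)
    if parsed.get("bond_mode") == "balance-tcp":
        return []
    return ["%s=%s" % (key, parsed[key]) for key in LACP_KEYS if key in parsed]


def cleaned_options(options):
    """Return the options string without the LACP settings flagged above."""
    drop = set(lacp_options_to_remove(options))
    return " ".join(item for item in options.split() if item not in drop)
```

Applied to the string from this report, cleaned_options() leaves only "bond_mode=balance-slb", which is the value BondInterfaceOvsOptions would then carry in network-environment.yaml.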