Bug 1393644

Summary: balance-tcp not accepted as valid for BondInterfaceOvsOptions
Product: Red Hat OpenStack
Reporter: August Simonelli <asimonel>
Component: openstack-tripleo-heat-templates
Assignee: Jiri Stransky <jstransk>
Status: CLOSED NOTABUG
QA Contact: Arik Chernetsky <achernet>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: amuller, bperkins, djuran, dsneddon, jcoufal, kbasil, mburns, mcornea, pbandark, rhel-osp-director-maint, sputhenp
Target Milestone: ---
Flags: jcoufal: needinfo? (kbasil)
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1469461
Environment:
Last Closed: 2017-03-23 17:30:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1469461

Description August Simonelli 2016-11-10 05:02:04 UTC
Description of problem:

When the following is set in network-environment.yaml:

BondInterfaceOvsOptions: "bond_mode=balance-tcp lacp=active other_config:lacp-time=fast other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100"

the deployment fails with a validation error that rejects balance-tcp.
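
For readers unfamiliar with the option string, the following is a minimal sketch that splits the BondInterfaceOvsOptions value from this report into key/value pairs, so the bond mode and the LACP settings are easier to see. The helper name parse_ovs_options is hypothetical and is not part of tripleo-heat-templates or Open vSwitch; it only assumes that options are whitespace-separated key=value tokens, as in the string above.

```python
def parse_ovs_options(options):
    """Split a whitespace-separated 'key=value' option string into a dict.

    Hypothetical helper for illustration only; not an API of any
    OpenStack or Open vSwitch component.
    """
    parsed = {}
    for token in options.split():
        # partition() keeps everything after the first '=' as the value.
        key, _, value = token.partition("=")
        parsed[key] = value
    return parsed


# The exact value reported in this bug.
OPTIONS = (
    "bond_mode=balance-tcp lacp=active other_config:lacp-time=fast "
    "other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100"
)

if __name__ == "__main__":
    for key, value in parse_ovs_options(OPTIONS).items():
        print(f"{key} = {value}")
```

The bond_mode entry is the one the validation error complains about; the remaining entries are LACP and link-monitoring settings.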

Version-Release number of selected component (if applicable):
python-tripleoclient-5.3.0-4.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-1.2.el7ost.noarch

How reproducible:
Set bond_mode=balance-tcp in BondInterfaceOvsOptions.

Steps to Reproduce:
1. Use bond_mode=balance-tcp in BondInterfaceOvsOptions.
2. Run a deployment.

Actual results:
The deployment fails with the following output:
Removing the current plan files
Uploading new plan files
Started Mistral Workflow. Execution ID: 0aef7d6d-f6a8-4b8e-8623-f5d8787045e0
Plan updated
Deploying templates in the directory /tmp/tripleoclient-hl_1b2/tripleo-heat-templates
Started Mistral Workflow. Execution ID: 71b30b15-ebab-48c1-a749-cf8a84be2884
{u'execution': {u'id': u'71b30b15-ebab-48c1-a749-cf8a84be2884',
                u'input': {u'container': u'overcloud',
                           u'queue_name': u'8cdbf972-330c-4c1c-a001-501847562731',
                           u'timeout': 90},
                u'name': u'tripleo.deployment.v1.deploy_plan',
                u'params': {},
                u'spec': {u'input': [u'container',
                                     {u'timeout': 240},
                                     {u'queue_name': u'tripleo'}],
                          u'name': u'deploy_plan',
                          u'tasks': {u'copy_validation_ssh_keys': {u'name': u'copy_validation_ssh_keys',
                                                                   u'on-complete': u'send_message',
                                                                   u'type': u'direct',
                                                                   u'version': u'2.0',
                                                                   u'workflow': u'tripleo.validations.v1.copy_ssh_key'},
                                     u'deploy': {u'action': u'tripleo.deployment.deploy timeout=<% $.timeout %> container=<% $.container %>',
                                                 u'name': u'deploy',
                                                 u'on-error': u'set_deployment_failed',
                                                 u'on-success': u'test_validations_enabled',
                                                 u'type': u'direct',
                                                 u'version': u'2.0'},
                                     u'send_message': {u'action': u'zaqar.queue_post',
                                                       u'input': {u'messages': {u'body': {u'payload': {u'execution': u'<% execution() %>',
                                                                                                       u'message': u"<% $.get('message', '') %>",
                                                                                                       u'status': u"<% $.get('status', 'SUCCESS') %>"},
                                                                                          u'type': u'tripleo.deployment.v1.deploy_plan'}},
                                                                  u'queue_name': u'<% $.queue_name %>'},
                                                       u'name': u'send_message',
                                                       u'retry': u'count=5 delay=1',
                                                       u'type': u'direct',
                                                       u'version': u'2.0'},
                                     u'set_deployment_failed': {u'name': u'set_deployment_failed',
                                                                u'on-success': u'send_message',
                                                                u'publish': {u'message': u'<% task(deploy).result %>',
                                                                             u'status': u'FAILED'},
                                                                u'type': u'direct',
                                                                u'version': u'2.0'},
                                     u'test_validations_enabled': {u'action': u'tripleo.validations.enabled',
                                                                   u'name': u'test_validations_enabled',
                                                                   u'on-error': u'send_message',
                                                                   u'on-success': u'copy_validation_ssh_keys',
                                                                   u'type': u'direct',
                                                                   u'version': u'2.0'}},
                          u'version': u'2.0'}},
 u'message': u"Failed to run action [action_ex_id=9e77f84b-edbd-42f6-9811-928a89a60052, action_cls='<class 'mistral.actions.action_factory.DeployStackAction'>', attributes='{}', params='{u'container': u'overcloud', u'timeout': 90}']\n ERROR: Failed to validate: Failed to validate: resources[0]: Failed to validate: resources.NetworkConfig: Parameter 'BondInterfaceOvsOptions' is invalid: Invalid default bond_mode=balance-tcp lacp=active other_config:lacp-time=fast other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100 (The balance-tcp bond mode is known to cause packet loss and\nshould not be used in BondInterfaceOvsOptions.\n)",
 u'status': u'FAILED'}


Expected results:

OSP 10 should support this bonding mode for DPDK/OVS.

Additional info:

Comment 1 August Simonelli 2016-11-10 21:37:49 UTC
This is the patch that added the constraint:

https://review.openstack.org/#/c/355073/
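
To illustrate how a constraint of this kind produces the error above, here is a minimal sketch. It assumes the constraint is a regular-expression allowed_pattern with a negative lookahead on "balance.tcp"; the exact pattern is not quoted in this report, so ASSUMED_ALLOWED_PATTERN and is_allowed are illustrative names, not the code from the patch.

```python
import re

# Assumption: the parameter is checked against a pattern that rejects any
# value containing "balance", one arbitrary character, then "tcp".
ASSUMED_ALLOWED_PATTERN = re.compile(r"^((?!balance.tcp).)*$")


def is_allowed(value):
    """Return True if the value passes the assumed pattern check."""
    return ASSUMED_ALLOWED_PATTERN.match(value) is not None


if __name__ == "__main__":
    candidates = [
        "bond_mode=active-backup",
        "bond_mode=balance-slb lacp=active",
        "bond_mode=balance-tcp lacp=active other_config:lacp-time=fast",
    ]
    for value in candidates:
        verdict = "accepted" if is_allowed(value) else "rejected"
        print(f"{verdict}: {value}")
```

Under this assumption the check is purely textual: any option string mentioning balance-tcp is rejected at validation time, before anything is deployed, regardless of the other options in the string.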

Comment 2 Dan Sneddon 2016-11-17 22:57:14 UTC
The bug with LACP+OVS is applicable to OSP10 + RHEL 7.3. The fix for OVS is in a newer kernel which will be available with RHEL 7.4.

See https://bugzilla.redhat.com/show_bug.cgi?id=1388592 for more info.

My suggestion is that we wait until the point release of OSP 10 and RHEL 7.4 to undo this constraint.

My own feeling is that we should close this bug as either NOTABUG or DEFERRED, and then create a new bug for tracking, OR we can retarget this bug to be fixed in OSP 10 point release or OSP 11.

Comment 3 Sadique Puthen 2016-11-18 10:40:55 UTC
(In reply to Dan Sneddon from comment #2)
> The bug with LACP+OVS is applicable to OSP10 + RHEL 7.3. The fix for OVS is
> in a newer kernel which will be available with RHEL 7.4.
> 
> See https://bugzilla.redhat.com/show_bug.cgi?id=1388592 for more info.

This bug says this can be backported to 7.3.z. Do we still need to wait till 7.4 here? Don't we need to remove this constraint when the 7.3.z kernel is out?

> 
> My suggestion is that we wait until the point release of OSP 10 and RHEL 7.4
> to undo this constraint.
> 
> My own feeling is that we should close this bug as either NOTABUG or
> DEFERRED, and then create a new bug for tracking, OR we can retarget this
> bug to be fixed in OSP 10 point release or OSP 11.

Comment 7 Assaf Muller 2016-12-14 17:36:08 UTC
> My own feeling is that we should close this bug as either NOTABUG [...]

Agreed, this is by design.

Comment 8 August Simonelli 2016-12-14 22:02:16 UTC
Yes, this makes sense. Thanks.