Bug 1431498

Summary: Mistral Workflow for ease of deployment
Product: Red Hat OpenStack Reporter: atelang <atelang>
Component: openstack-tripleo-common Assignee: Jaganathan Palanisamy <jpalanis>
Status: CLOSED ERRATA QA Contact: Yariv <yrachman>
Severity: medium Docs Contact:
Priority: high    
Version: 12.0 (Pike) CC: achernet, atelang, dbecker, fbaudin, jpalanis, jraju, mburns, morazi, rhel-osp-director-maint, sasha, sclewis, skramaja, slinaber, supadhya, tvignaud, vchundur, yrachman, zgreenbe
Target Milestone: rc Keywords: FutureFeature, Triaged
Target Release: 12.0 (Pike)   
Hardware: Unspecified   
OS: Unspecified   
URL: https://review.openstack.org/#/c/423304/
Whiteboard:
Fixed In Version: openstack-tripleo-common-7.4.1-0.20170807001945.8c46306.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1599843 (view as bug list) Environment:
Last Closed: 2017-12-13 21:13:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1516911, 1518019    
Bug Blocks: 1432986, 1442136, 1469768, 1599843    

Description atelang 2017-03-13 02:10:01 UTC
Description of problem:
From a partner's point of view, having a 'reference workflow' in place for deployment makes a lot of sense: it takes away much of the pressure to get the documentation right, and it helps operators deploy easily.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info: Upstream Blueprint https://review.openstack.org/#/c/423304/
Details: Add the ability to save a set of Overcloud tunings in a Performance
Profile in Heat and provide a list of predefined tunings to help
operators easily deploy an Overcloud that is already tuned for a
particular type of workload, i.e. "tuned for TripleO".

Comment 1 Vijay Chundury 2017-03-13 09:11:31 UTC
Just for tracking purposes: this feature is blocked until the review https://review.openstack.org/#/c/424729/ is merged upstream.

The workflow architecture is almost finalised, and more details on the execution path will be added.

Comment 6 Ziv Greenberg 2017-10-18 09:53:51 UTC
Based on the handover document, it is not clear how to enable the Derive Parameters collection on the undercloud node.

https://docs.google.com/document/d/1oyBi7PE_Y15CZJnWBYR8cq0pB1lmlXEzpCNaO3_DQ14/edit#

I have added the "inspection_extras = true" parameter to the undercloud.conf file.
Are there any additional prerequisites for collecting the extra parameters?

The output of the command does not match the expected results in the handover doc above:

(undercloud) [stack@undercloud-0 ospd-12-dpdk]$ openstack overcloud deploy --templates -e /home/stack/ospd-12-dpdk/network-environment.yaml --update-plan-only -p /home/stack/plan-environment-derived-params.yaml
Started Mistral Workflow tripleo.validations.v1.check_pre_deployment_validations. Execution ID: 3df7ae34-8218-4f21-9d0a-fe303f80fde4
Waiting for messages on queue '1027b2d9-0ce5-4f04-b705-fe3ca83f8609' with no timeout.
Removing the current plan files
Uploading new plan files
Started Mistral Workflow tripleo.plan_management.v1.update_deployment_plan. Execution ID: 3907e85d-d445-452e-870f-c84da2f24049
Plan updated.
Processing templates in the directory /tmp/tripleoclient-Zl8Wnc/tripleo-heat-templates
Invoking workflow (tripleo.derive_params.v1.derive_parameters) specified in plan-environment file
Started Mistral Workflow tripleo.derive_params.v1.derive_parameters. Execution ID: e33d81f1-fabb-4de2-b1ce-57ba386161cf
Started Mistral Workflow tripleo.plan_management.v1.get_deprecated_parameters. Execution ID: 4f525fe3-8140-4ecb-998e-1f31a3b4cdf6
WARNING: Following parameters are deprecated and still defined. Deprecated parameters will be removed soon!
  OvercloudControlFlavor

Comment 7 Jaganathan Palanisamy 2017-10-23 07:00:44 UTC
inspection_extras is true by default, and the numa-topology collector is configured by default in /httpboot/inspector.ipxe (ipa-inspection-collectors=default,extra-hardware,numa-topology,logs).

No additional prerequisites are needed to get numa-topology introspection data.

Can you please share the introspection data for the compute DPDK node?
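
As a quick check of the collector list described in this comment, the sketch below parses a kernel argument string and confirms that numa-topology is present. The helper name parse_collectors is hypothetical, and the sample string is copied from this comment; reading the actual /httpboot/inspector.ipxe file is deliberately left out.

```python
def parse_collectors(kernel_args):
    """Return the collector names from an ipa-inspection-collectors= argument.

    parse_collectors is a hypothetical helper used for illustration only.
    """
    for token in kernel_args.split():
        key, sep, value = token.partition("=")
        if key == "ipa-inspection-collectors" and sep:
            # Drop empty entries that a trailing comma would produce.
            return [name for name in value.split(",") if name]
    return []


# Sample value quoted in the comment above.
sample = "ipa-inspection-collectors=default,extra-hardware,numa-topology,logs"
collectors = parse_collectors(sample)
print("numa-topology" in collectors)
```

If the check prints False on a given undercloud, the collector list in the iPXE file would be the first thing to review.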

Comment 8 Jaganathan Palanisamy 2017-10-23 11:02:56 UTC
(undercloud) [stack@undercloud-0 ospd-12-sriov-dpdk-heterogeneous-cluster]$ openstack overcloud deploy --templates --update-plan-only \
-r /home/stack/ospd-12-sriov-dpdk-heterogeneous-cluster/roles_data.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/ospd-12-sriov-dpdk-heterogeneous-cluster/docker-images.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
-e /home/stack/ospd-12-sriov-dpdk-heterogeneous-cluster/network-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-sriov.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml \
-p /home/stack/plan-environment-derived-params.yaml


Started Mistral Workflow tripleo.validations.v1.check_pre_deployment_validations. Execution ID: 073c57da-1614-45d2-967c-b334cdbd2324
Waiting for messages on queue '86e00b13-f246-4142-a29f-7beea29aea7a' with no timeout.
Removing the current plan files
Uploading new plan files
Started Mistral Workflow tripleo.plan_management.v1.update_deployment_plan. Execution ID: b2e825dc-839b-4ca5-b2d3-5fadf3991474
Plan updated.
Processing templates in the directory /tmp/tripleoclient-BaCv5t/tripleo-heat-templates
Invoking workflow (tripleo.derive_params.v1.derive_parameters) specified in plan-environment file
Started Mistral Workflow tripleo.derive_params.v1.derive_parameters. Execution ID: b953ff2d-651d-470b-97d3-a8a8d95a2178
Workflow execution is failed: [{u'status': u'SUCCESS', u'message': u'', u'role_name': u'Controller'}, {u'status': u'SUCCESS', u'message': u'', u'role_name': u'ComputeSriov'}, {u'status': u'FAILED', u'message': u"Unable to determine matching node for profile 'computeovsdpdk'", u'role_name': u'ComputeOvsDpdk'}]

The message "Unable to determine matching node for profile 'computeovsdpdk'" appears because there are currently no available nodes for the computeovsdpdk profile in your environment:
mistral run-action ironic.node_list '{"maintenance": "false", "provision_state": "available"}'
{"result": []}

We need to try again once the provision state of the nodes has been updated in your environment.
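
The per-role result list printed by the failed workflow can be filtered down to the roles that need attention. This is a minimal sketch, assuming the list is available as the text printed in the log above (it uses Python 2 style u'' literals, which ast.literal_eval also accepts on Python 3). The helper failed_roles is hypothetical and is not part of tripleo-common.

```python
import ast

# Result list copied from the "Workflow execution is failed" line above.
RESULT = """[{u'status': u'SUCCESS', u'message': u'', u'role_name': u'Controller'}, {u'status': u'SUCCESS', u'message': u'', u'role_name': u'ComputeSriov'}, {u'status': u'FAILED', u'message': u"Unable to determine matching node for profile 'computeovsdpdk'", u'role_name': u'ComputeOvsDpdk'}]"""


def failed_roles(result_text):
    """Map each non-successful role name to its message (hypothetical helper)."""
    results = ast.literal_eval(result_text)
    return {
        entry["role_name"]: entry["message"]
        for entry in results
        if entry["status"] != "SUCCESS"
    }


for role, message in failed_roles(RESULT).items():
    print("%s: %s" % (role, message))
```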

Comment 9 Ziv Greenberg 2017-10-23 12:39:07 UTC
As agreed, the environment has been reinstalled. The overcloud nodes have finished introspection successfully.

The system is ready for your debugging.

Comment 10 Saravanan KR 2017-10-23 13:20:26 UTC
Here is the introspection data of the computeovsdpdk node -
 http://chunk.io/krsacme/563675e464fe40aeaf7e22b5b5c26c2e

The derive parameters workflow is failing with the error below:
  {
    "role_name": "ComputeOvsDpdk",
    "status": "FAILED",
    "message": {
      "status": "FAILED",
      "message": "Unable to determine NUMA node for DPDK NIC: ens2f0",
      "result": "None",
    ....

It is related to the error addressed in https://review.openstack.org/#/c/511411/. This patch needs to be applied before trying again, as mentioned in the handoff doc. The script to initiate derive params is at
/home/stack/ospd-12-sriov-dpdk-heterogeneous-cluster/overcloud_deploy_derive.sh
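
To see whether introspection data contains a NUMA mapping for a DPDK NIC, a lookup along the following lines may help. This is a sketch under assumptions: it assumes the introspection data holds a numa_topology section with a nics list of entries carrying name and numa_node keys, and the sample data is made up for illustration (it is not the data from the link above). find_nic_numa_node is a hypothetical helper, not the workflow's actual code.

```python
def find_nic_numa_node(introspection_data, nic_name):
    """Return the NUMA node of nic_name, or None if it is not listed.

    Assumes a layout of {"numa_topology": {"nics": [{"name": ..., "numa_node": ...}]}}.
    """
    nics = introspection_data.get("numa_topology", {}).get("nics", [])
    for nic in nics:
        if nic.get("name") == nic_name:
            return nic.get("numa_node")
    return None


# Made-up sample: the NIC named in the error is absent from the nics list.
sample_data = {"numa_topology": {"nics": [{"name": "ens1f0", "numa_node": 0}]}}

node = find_nic_numa_node(sample_data, "ens2f0")
if node is None:
    print("Unable to determine NUMA node for DPDK NIC: ens2f0")
```

With data shaped like the sample, the lookup returns None for ens2f0, which mirrors the failure message reported by the workflow.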

Comment 11 Yariv 2017-10-23 13:47:16 UTC
(In reply to Saravanan KR from comment #10)
> Here is the introspection of computeovsdpdk node -
>  http://chunk.io/krsacme/563675e464fe40aeaf7e22b5b5c26c2e
> 
> Deriving parameter workflow is failing with below error:
>   {
>     "role_name": "ComputeOvsDpdk",
>     "status": "FAILED",
>     "message": {
>       "status": "FAILED",
>       "message": "Unable to determine NUMA node for DPDK NIC: ens2f0",
>       "result": "None",
>     ....
> 
> It is related to the error https://review.openstack.org/#/c/511411/. Need to
> apply this patch and try again as mentioned in the handoff doc. The script
> to initiate derive params is at
> /home/stack/ospd-12-sriov-dpdk-heterogeneous-cluster/overcloud_deploy_derive.sh

It is fixed in
openstack-tripleo-common-7.4.1-0.20170807001945.8c46306.el7ost

The latest deployment has
openstack-tripleo-common-7.6.3-0.20171010234828.el7ost.noarch.rpm

It seems that there is a packaging problem.
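
The comparison behind this observation can be sketched as follows: the deployed package version is newer than the "Fixed In Version", yet the error still occurs. This is a simplified illustration that compares only the dotted upstream version. It is not RPM's own version comparison algorithm, and package_version is a hypothetical helper.

```python
import re


def package_version(nvr):
    """Extract the dotted version from a name-version-release string as a tuple."""
    match = re.search(r"-(\d+(?:\.\d+)+)-", nvr)
    if match is None:
        raise ValueError("no version found in %r" % nvr)
    return tuple(int(part) for part in match.group(1).split("."))


# Package strings copied from this comment.
fixed_in = package_version(
    "openstack-tripleo-common-7.4.1-0.20170807001945.8c46306.el7ost")
deployed = package_version(
    "openstack-tripleo-common-7.6.3-0.20171010234828.el7ost.noarch.rpm")

# True means the deployed build is at least as new as the build marked as fixed.
print(deployed >= fixed_in)
```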

Comment 12 Jaganathan Palanisamy 2017-10-24 06:16:12 UTC
BZ for the "Unable to determine NUMA node" issue: https://bugzilla.redhat.com/show_bug.cgi?id=1486620

Comment 16 Yariv 2017-11-28 19:55:25 UTC
Verified with KnownIssue
BZ https://bugzilla.redhat.com/show_bug.cgi?id=1516911 with exception of HP DL 360/380 HW

Comment 19 errata-xmlrpc 2017-12-13 21:13:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462