Created attachment 1358249
compute node introspection details

Description of problem:
While running the Mistral workflows for the DPDK procedure, I encountered the following issue with the OvsDpdkMemoryChannels parameter:

[stack@undercloud-0 ospd-12-dpdk]$ openstack overcloud deploy --templates --update-plan-only \
    -p plan-environment-derived-params.yaml \
    -r /home/stack/ospd-12-dpdk/roles_data.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
    -e /home/stack/ospd-12-dpdk/network-environment.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml
Started Mistral Workflow tripleo.validations.v1.check_pre_deployment_validations. Execution ID: e975bcdf-324d-4cb3-9684-9e1f20c621de
Waiting for messages on queue '68288b5b-2459-41e2-9da9-478627b16e8d' with no timeout.
Removing the current plan files
Uploading new plan files
Started Mistral Workflow tripleo.plan_management.v1.update_deployment_plan. Execution ID: af723afc-77d8-47c6-bacd-6e121e9314fb
Plan updated.
Processing templates in the directory /tmp/tripleoclient-1Ayzjn/tripleo-heat-templates
Invoking workflow (tripleo.derive_params.v1.derive_parameters) specified in plan-environment file
Started Mistral Workflow tripleo.derive_params.v1.derive_parameters.
Execution ID: 35826bb7-1c4c-4776-8a48-864eb08f1345 Workflow execution is failed: [{u'status': u'SUCCESS', u'message': u'', u'role_name': u'Controller'}, {u'status': u'FAILED', u'message': {u'status': u'FAILED', u'sock_mem': u'2048,1024', u'mem_channel': 0, u'network_configs': [{u'type': u'interface', u'defroute': False, u'name': u'nic1', u'use_dhcp': False}, {u'routes': [{u'ip_netmask': u'169.254.169.254/32', u'next_hop': u'192.0.20.1'}, {u'default': True, u'next_hop': u'192.0.20.1'}], u'use_dhcp': False, u'type': u'interface', u'addresses': [{u'ip_netmask': u'/24'}], u'name': u'nic2'}, {u'dns_servers': [u'10.35.28.28', u'8.8.8.8'], u'name': u'bond_api', u'bonding_options': u'mode=active-backup', u'members': [{u'type': u'interface', u'name': u'nic3', u'primary': True}], u'use_dhcp': False, u'type': u'linux_bond'}, {u'device': u'bond_api', u'type': u'vlan', u'addresses': [{u'ip_netmask': u''}], u'vlan_id': 525}, {u'device': u'bond_api', u'type': u'vlan', u'addresses': [{u'ip_netmask': u''}], u'vlan_id': 526}, {u'device': u'bond_api', u'type': u'vlan', u'addresses': [{u'ip_netmask': u''}], u'vlan_id': 527}, {u'use_dhcp': False, u'type': u'ovs_user_bridge', u'name': u'br-link', u'members': [{u'type': u'ovs_dpdk_port', u'name': u'dpdk0', u'members': [{u'type': u'interface', u'name': u'nic4'}]}]}], u'memory_slot_info': [u'PROC 1 DIMM 9', u'PROC 1 DIMM 10', u'PROC 1 DIMM 3', u'PROC 1 DIMM 4', u'PROC 1 DIMM 1', u'PROC 1 DIMM 2', u'PROC 1 DIMM 7', u'PROC 1 DIMM 8', u'PROC 1 DIMM 5', u'PROC 1 DIMM 6', u'PROC 2 DIMM 10', u'PROC 2 DIMM 9', u'PROC 2 DIMM 12', u'PROC 2 DIMM 11', u'PROC 1 DIMM 11', u'PROC 1 DIMM 12', u'PROC 2 DIMM 1', u'PROC 2 DIMM 2', u'PROC 2 DIMM 3', u'PROC 2 DIMM 4', u'PROC 2 DIMM 5', u'PROC 2 DIMM 6', u'PROC 2 DIMM 7', u'PROC 2 DIMM 8'], u'updated_mem_slot_info': [u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC ', u'PROC 
', u'PROC ', u'PROC ', u'PROC ', u'PROC '], u'host_cpus': u'0,8,16,24', u'pmd_cpus': u'1,7,10,17,23,26', u'num_cores_per_numa_nodes': [2, 1], u'result': None, u'dpdk_nics_numa_info': [{u'numa_node': 0, u'name': u'ens2f0', u'mtu': 1500}], u'numa_nodes': [0, 1], u'message': u'Unable to determine OvsDpdkMemoryChannels parameter', u'dpdk_nics_numa_nodes': [0]}, u'role_name': u'ComputeOvsDpdk'}]

Version-Release number of selected component (if applicable):
openstack-tripleo-common-7.6.3-3.el7ost.noarch

Steps to Reproduce:
1. Deploy the RHOS 12 undercloud.
2. Introspect the overcloud nodes.
3. Run the Mistral workflow for DPDK.

Actual results:
The workflow is unable to determine the DIMM format correctly.

Additional info:
The compute node introspection details are attached.
The memory channels parameter cannot be derived reliably from the introspection memory bank data in the Mistral derive-parameters workflow, because the format of the memory slot names is not consistent across environments. In most cases memory channels can default to 4, and the user can consult the hardware specification for the environment and update the memory channels parameter for that environment.

Different formats seen so far: P1-DIMMA1, A1, PROC 1 DIMM 1.
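To illustrate why the slot name format matters, here is a minimal Python sketch. It is not the tripleo-common implementation; the function name, the regular expression and the rule "count distinct channel letters per processor" are assumptions made for illustration only. It handles the P1-DIMMA1 and A1 styles and falls back to 4 for anything it cannot parse, such as PROC 1 DIMM 1, which carries no channel letter at all.

```python
import re

# Fallback value taken from the comment above.
DEFAULT_MEMORY_CHANNELS = 4

# Hypothetical pattern: an optional "P<n>-DIMM" prefix, then a channel
# letter and a slot number, e.g. "P1-DIMMA1" or "A1".
_CHANNEL_SLOT = re.compile(
    r"^(?:P(?P<proc>\d+)-DIMM)?(?P<channel>[A-Z])(?P<slot>\d+)$"
)


def derive_memory_channels(slot_names, default=DEFAULT_MEMORY_CHANNELS):
    """Guess the memory channel count from DIMM slot names.

    Assumption: the letter in the slot name identifies the channel, so the
    count is the largest number of distinct letters seen on one processor.
    Names without a processor prefix are grouped under a placeholder "0".
    """
    channels_per_proc = {}
    for name in slot_names:
        match = _CHANNEL_SLOT.match(name.strip())
        if match is None:
            # Unknown format (e.g. "PROC 1 DIMM 1"): do not guess.
            return default
        proc = match.group("proc") or "0"
        channels_per_proc.setdefault(proc, set()).add(match.group("channel"))
    if not channels_per_proc:
        return default
    return max(len(letters) for letters in channels_per_proc.values())
```

With the slot names from this report (PROC 1 DIMM 9 and so on) the sketch returns the default, so the value still has to be checked against the hardware specification.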
LP: https://bugs.launchpad.net/tripleo/+bug/1734814

Added the patches below. Once merged upstream, they will be back-ported to stable/pike.
https://review.openstack.org/#/c/523315/ (tripleo-common)
https://review.openstack.org/#/c/523358/ (tripleo-heat-templates)
The patches below are merged upstream.
https://review.openstack.org/#/c/523315/ (tripleo-common)
https://review.openstack.org/#/c/523358/ (tripleo-heat-templates)

Backports to upstream stable/pike:
https://review.openstack.org/#/c/525476/ (tripleo-common) - Merged upstream
https://review.openstack.org/#/c/527875/ (tripleo-heat-templates) - In progress
Backported and merged in upstream stable/pike:
https://review.openstack.org/#/c/525476/ (tripleo-common)
https://review.openstack.org/#/c/527875/ (tripleo-heat-templates)
This BZ is a result of the bug that was reported during the OSP12 test cycle:
https://bugzilla.redhat.com/show_bug.cgi?id=1431498#c16

I believe this should be part of the OSP 13 testing. From the development side, this was the only bug that got tested. If it is pushed to OSP 14, it would help to hear from Franck whether he perceives any risk or blockers to making it part of the roadmap. Need Franck's opinion on this.

Regards
Vijay.
Verified with Puddle 2018-08-08.2
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0045