Bug 1384845

Summary: [RFE] tuned and tuned NFV profile installation and configuration, on the overcloud
Product: Red Hat OpenStack Reporter: Franck Baudin <fbaudin>
Component: openstack-tripleo-puppet-elements Assignee: RHOS Maint <rhos-maint>
Status: CLOSED ERRATA QA Contact: Ofer Blaut <oblaut>
Severity: medium Docs Contact:
Priority: high    
Version: 10.0 (Newton) CC: atelang, dnavale, edannon, fbaudin, hbrock, jjung, jslagle, lbopf, mbabushk, mburns, nlevinki, oblaut, rhel-osp-director-maint, sclewis, sgordon, skramaja, tvignaud, vchundur, wchadwic, yrachman, zgreenbe
Target Milestone: Upstream M1 Keywords: FutureFeature, InstallerIntegration, Triaged
Target Release: 12.0 (Pike)   
Hardware: x86_64   
OS: Linux   
URL: https://blueprints.launchpad.net/tripleo/+spec/tuned-nfv-dpdk
Whiteboard: upstream_milestone_pike-1 upstream_definition_approved upstream_status_implemented
Fixed In Version: openstack-tripleo-puppet-elements-7.0.0-0.20170614005502.9285877.el7ost Doc Type: Known Issue
Doc Text:
When an overcloud image is shipped with a 'tuned' version lower than 2.7.1-4, manually update the 'tuned' package in the overcloud image. If the 'tuned' version is 2.7.1-4 or higher, provide the list of cores to 'tuned' and activate the profile, for example:
# echo "isolated_cores=2,4,6,8,10,12,14,18,20,22,24,26,28,30" >> /etc/tuned/cpu-partitioning-variables.conf
# tuned-adm profile cpu-partitioning
This is a known issue until the 'tuned' packages are available in the CentOS repositories.
Story Points: ---
Clone Of:
: 1441923 (view as bug list) Environment:
Last Closed: 2017-12-13 20:46:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1235009, 1341176, 1414580, 1441923, 1442136, 1445709, 1445858, 1469555    
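
The Doc Text above describes a version check followed by two commands. A minimal sketch of that sequence, run on a node as root, could look like the following. The helper names (version_ge, apply_cpu_partitioning) are hypothetical, and the comparison relies on GNU sort's version ordering, which only approximates RPM's own version comparison.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative and not part of tuned or TripleO.

MIN_TUNED="2.7.1-4"   # minimum 'tuned' version named in the Doc Text

# Succeeds when version $1 is greater than or equal to version $2,
# using the version ordering of GNU sort (-V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Appends the isolated core list and activates the cpu-partitioning profile.
# $1: comma-separated core list, for example "2,4,6,8".
apply_cpu_partitioning() {
    installed="$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' tuned)" || return 1
    if ! version_ge "$installed" "$MIN_TUNED"; then
        echo "tuned $installed is older than $MIN_TUNED; update the package first" >&2
        return 1
    fi
    echo "isolated_cores=$1" >> /etc/tuned/cpu-partitioning-variables.conf
    tuned-adm profile cpu-partitioning
}
```

Calling apply_cpu_partitioning "2,4,6,8" would then refuse to proceed on an older 'tuned' package instead of writing a configuration that the package cannot use.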

Description Franck Baudin 2016-10-14 09:10:24 UTC
Description of problem:

For NFV compute composable roles, we need to install and enable the tuned package and the NFV profile in order to configure CPUAffinity and IRQ repinning on the hosts. As of RHOSP10, this is done via first-boot scripts, but it would be more appropriate to do it via puppet manifests during deployment.

Comment 1 Yariv 2016-11-24 13:55:27 UTC
https://blueprints.launchpad.net/tripleo/+spec/tuned-nfv-dpdk

Is it only for DPDK?

Comment 5 Vijay Chundury 2016-12-06 09:58:10 UTC
(In reply to Yariv from comment #1)
> https://blueprints.launchpad.net/tripleo/+spec/tuned-nfv-dpdk
> 
> Is it only for DPDK?

Tuned is definitely needed to provide affinity and isolation for host cores and DPDK PMD (guest) cores respectively, for the OVS-DPDK use case.

In the case of SR-IOV, I would think the remapping of system interrupts to the host CPUs would be needed.

Regards
Vijay.

Comment 13 Saravanan KR 2017-01-09 06:36:24 UTC
Working on upstream blueprint - https://blueprints.launchpad.net/tripleo/+spec/tuned-nfv-dpdk.

Review - https://review.openstack.org/#/c/411797/

Comment 14 atelang 2017-01-09 12:04:31 UTC
Patches upstream. Owner: skramaja

Comment 15 Saravanan KR 2017-01-10 13:33:59 UTC
tripleo-puppet-elements review, https://review.openstack.org/#/c/418348/

Comment 17 Stephen Gordon 2017-01-17 20:15:46 UTC
Can you give a bit more context as to what is actually being done here? Is this profile being enabled for *all* Compute nodes, or only those specific subsets (Dpdk, SR-IOV, in the future real-time) that require it?

Comment 18 Saravanan KR 2017-01-18 05:33:25 UTC
(In reply to Stephen Gordon from comment #17)
> Can you give a bit more context as to what is actually being done here? Is
> this profile being enabled for *all* Compute nodes, or only those specific
> subsets (Dpdk, SR-IOV, in the future real-time) that require it?

By default, it is disabled. And it is independent of a feature. If a particular deployment needs tune-d support for a role, the corresponding role has to add the parameter and environment file. It is applied to all the nodes in the given role.

Assuming a cluster has multiple roles, like
* Normal Compute (Compute)
* Compute with DPDK (ComputeOvsDpdk)
* Compute with SR-IOV (ComputeSriov)

then, in order to enable the tune-d profile in ComputeOvsDpdk and ComputeSriov, the following parameters have to be given:

  ComputeOvsDpdkTunedProfileName: "cpu-partitioning"
  ComputeSriovTunedProfileName: "cpu-partitioning"
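
As a rough illustration of the role-specific parameters above, the sketch below writes them into a Heat environment file. The file name and the placement under parameter_defaults are assumptions and are not confirmed in this bug. The plain Compute role is left out, so it keeps the default (disabled) behaviour.

```shell
#!/bin/sh
# Sketch only: the file name and the parameter_defaults placement are assumptions.
ENV_FILE="${ENV_FILE:-./tuned-profile-env.yaml}"

# Enable the cpu-partitioning profile for the two NFV roles only.
cat > "$ENV_FILE" <<'EOF'
parameter_defaults:
  ComputeOvsDpdkTunedProfileName: "cpu-partitioning"
  ComputeSriovTunedProfileName: "cpu-partitioning"
EOF
```

Such a file would typically be passed to the deployment with an additional -e argument to openstack overcloud deploy, alongside the environment file mentioned in the comment.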

Comment 20 Stephen Gordon 2017-02-02 21:30:26 UTC
(In reply to Saravanan KR from comment #18)
> (In reply to Stephen Gordon from comment #17)
> > Can you give a bit more context as to what is actually being done here? Is
> > this profile being enabled for *all* Compute nodes, or only those specific
> > subsets (Dpdk, SR-IOV, in the future real-time) that require it?
> 
> By default, it is disabled. And it is independent of a feature. If a
> particular deployment needs tune-d support for a role, the corresponding
> role has to add the parameter and environment file. It is applied to all the
> nodes in the given role.

I think ideally it would actually always be enabled, as even in the "normal" compute case there is a tuned profile for that (something like virtual-host IIRC).

> Assuming a cluster has multiple roles, like
> * Normal Compute (Compute)
> * Compute with DPDK (ComputeOvsDpdk)
> * Compute with SR-IOV (ComputeSriov)
> 
> then, in order to enable the tune-d profile in ComputeOvsDpdk and
> ComputeSriov, following parameter has to be given:
> 
>   ComputeOvsDpdkTunedProfileName: "cpu-partitioning"
>   ComputeSriovTunedProfileName: "cpu-partitioning"

When it comes time to do real-time, we may have to perform some extra analysis on how the usage of the cpu-partitioning profile intersects with the real-time one, which contains a superset of the same options, as we may want to combine them on the same host.

Comment 21 Vijay Chundury 2017-03-02 11:27:38 UTC
We are waiting for CentOS to package tuned-profiles-cpu-partitioning.
Once this is done we have a review pending that needs to be pushed upstream.

Comment 23 Saravanan KR 2017-03-15 13:07:48 UTC
Karanbir's input: tuned-profiles-cpu-partitioning has been added to the "rt" repo. It may be possible to include this repo file when building DIB elements, with "includepkgs" set only to the specific package. We need to analyze whether this is possible with the DIB elements configuration.

Comment 24 Saravanan KR 2017-03-16 18:01:21 UTC
I have enabled the CentOS repo with includepkgs set only for the tuned profile and updated the upstream review.
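
For illustration, a yum repo definition restricted with includepkgs might look like the sketch below. The repo id, file path, and baseurl are placeholders and not the values used in the actual review. Only the includepkgs line reflects what is described above.

```shell
#!/bin/sh
# Sketch only: repo id, file path, and baseurl are placeholders.
REPO_FILE="${REPO_FILE:-./centos-rt-tuned.repo}"

# includepkgs limits what yum may install from this repo to the listed package.
cat > "$REPO_FILE" <<'EOF'
[centos-rt-tuned]
name=CentOS rt repo (restricted to the tuned cpu-partitioning profile)
baseurl=http://mirror.example.org/centos/7/rt/x86_64/
enabled=1
# Add the gpgcheck and gpgkey settings appropriate to the real repo.
includepkgs=tuned-profiles-cpu-partitioning
EOF
```

With this restriction, other packages in the "rt" repo stay invisible to yum during the image build, so enabling the repo does not pull in unrelated real-time packages.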

Comment 27 Eyal Dannon 2017-07-19 08:35:22 UTC
As far as I can see in /usr/share/openstack-tripleo-heat-templates/extraconfig/pre_network/host_config_and_reboot.role.j2.yaml,
the role takes _HOST_CPUS_LIST_: {get_param: {{role}}HostCpusList} as the parameter for tuned (isolated_cores).
We moved to HostIsolatedCoreList as the parameter for isolated cores.
Shouldn't the role use that parameter instead?

Thanks

Comment 28 Saravanan KR 2017-07-19 08:45:48 UTC
(In reply to Eyal Dannon from comment #27)
> As far as I see at
> /usr/share/openstack-tripleo-heat-templates/extraconfig/pre_network/
> host_config_and_reboot.role.j2.yaml 
This file has been deprecated. 

> The role takes _HOST_CPUS_LIST_: {get_param: {{role}}HostCpusList} as
> parameter for tuned (isolated_cores).
> we moved to HostIsolatedCoreList as parameter for isolated cores.
> shouldn't it use it as parameter?
> 
> Thanks
Refer to https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/pre_network/host_config_and_reboot.yaml for the updated file.

Use the environment file for enabling it.
https://github.com/openstack/tripleo-heat-templates/blob/master/environments/host-config-and-reboot.j2.yaml

I will provide the document with the updated changes for enabling it once it has been tested (there is an in-progress patch for it).

Comment 30 Ziv Greenberg 2017-10-25 12:23:10 UTC
Has been verified.

Thanks,
Ziv

Comment 33 errata-xmlrpc 2017-12-13 20:46:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462