Bug 1925203

Summary: [RFE] [OCPonRHV] - High Performance Mode in OCP on RHV - huge pages, CPU and Numa pinning configuration
Product: OpenShift Container Platform Reporter: michal <mgold>
Component: Installer Assignee: Liran Rotenberg <lrotenbe>
Installer sub component: OpenShift on RHV QA Contact: Guy <gafik>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: high CC: emarcus, gzaidman, hpopal, jpasztor, lleistne, lrotenbe, mburman, mkalinin, mstaeble
Version: 4.7   
Target Milestone: ---   
Target Release: 4.9.0   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: Enhancement
Doc Text:
The `autoPinningPolicy` and/or `hugepages` properties can now be provided in the oVirt installer config. They apply to the control plane nodes and/or compute nodes. `autoPinningPolicy` automatically sets the CPU and NUMA pinning. `hugepages` sets a custom property on the nodes in oVirt, telling them to use the hypervisor's hugepages.
Story Points: ---
Clone Of:
: 1988794 Environment:
Last Closed: 2021-10-18 17:29:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1948963    
Bug Blocks: 1988794    

Description michal 2021-02-04 15:15:23 UTC
Description of problem:
We support setting huge pages and CPU/NUMA pinning through the machine spec, but not through the installer for the masters.
These features are most relevant to the masters anyway, so we really should add support in the installer; currently this is a day 2 operation.

We should add a field to the installer machine spec for huge pages that acts like BZ#1948963[1], allowing only the values 2048 or 1048576, and pass that field to the Terraform call for the masters.

We should add a field to the installer machine spec for CPUs/NUMAs that acts like BZ#1948963[1], allowing Disabled, Existing or Adjust, and pass that field to the Terraform call for the masters.

Both advanced features should be disabled by default.

How reproducible:

Steps to Reproduce:
Deploy OCP with an install config that contains a machine set with huge pages or CPU/NUMA pinning (or both).

Actual results:
The installer fails because it does not recognize these fields.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1948963

Comment 2 Gal Zaidman 2021-03-30 14:14:04 UTC
Due to capacity constraints, we will be revisiting this bug in the upcoming sprint.

Comment 3 Gal Zaidman 2021-04-14 10:14:32 UTC
*** Bug 1925201 has been marked as a duplicate of this bug. ***

Comment 4 Matthew Staebler 2021-04-29 02:59:26 UTC
Is this really a bug? Or is this a new feature? The presence of "RFE" in the title of the BZ seems to support this being a new feature. Can this wait to be added as a new feature in 4.9?

Comment 11 Liran Rotenberg 2021-07-11 07:59:17 UTC
The yaml should look like this (copying from the PR):
- name: worker
      autoPinningPolicy: resize_and_pin
      hugepages: 2048
  replicas: 2
  name: master
      autoPinningPolicy: resize_and_pin
      hugepages: 2048
  replicas: 3

Note that for hugepages, enough hugepages must be available on your hypervisor for the nodes to start.
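
For context, the fragment in Comment 11 appears to belong inside the machine pool sections of install-config.yaml. A minimal sketch of how it might look in full, assuming the usual layout in which per-pool oVirt settings sit under platform.ovirt within compute and controlPlane (those parent keys are not shown in the comment and are an assumption here):

```yaml
# Sketch only: the compute/controlPlane and platform.ovirt parent keys are
# assumed from the usual install-config.yaml layout; they do not appear in
# the comment above.
compute:
- name: worker
  platform:
    ovirt:
      autoPinningPolicy: resize_and_pin  # value taken from the comment
      hugepages: 2048                    # description allows 2048 or 1048576
  replicas: 2
controlPlane:
  name: master
  platform:
    ovirt:
      autoPinningPolicy: resize_and_pin
      hugepages: 2048
  replicas: 3
```

Per the description, both features are meant to stay disabled unless set, so leaving the two fields out of a pool should keep that pool's nodes in the default configuration.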

Comment 16 Guy 2021-07-20 12:20:06 UTC
Verified on OpenShift cluster version 4.9.0-0.nightly-2021-07-17-212317 and RHV engine ovirt-engine-4.4.8-0.19.el8ev.noarch

Comment 20 errata-xmlrpc 2021-10-18 17:29:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.