Bug 1925203 - [RFE] [OCPonRHV] - High Performance Mode in OCP on RHV - huge pages, CPU and Numa pinning configuration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.9.0
Assignee: Liran Rotenberg
QA Contact: Guy
URL:
Whiteboard:
Duplicates: 1925201
Depends On: 1948963
Blocks: 1988794
Reported: 2021-02-04 15:15 UTC by michal
Modified: 2021-10-18 17:29 UTC (History)
CC List: 9 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
It is now possible to set the `autoPinningPolicy` and/or `hugepages` properties in the oVirt installer config. They can be applied to the control plane nodes and/or the compute nodes. `autoPinningPolicy` automatically sets the CPU and NUMA pinning. `hugepages` sets a custom property on the nodes in oVirt, telling these nodes to use the hugepages of the hypervisor.
Clone Of:
Clones: 1988794
Environment:
Last Closed: 2021-10-18 17:29:20 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 4873 | 0 | None | open | Bug 1925203: add auto pin and hugepages support | 2021-04-26 08:45:22 UTC
Red Hat Product Errata | RHSA-2021:3759 | 0 | None | None | None | 2021-10-18 17:29:39 UTC

Description michal 2021-02-04 15:15:23 UTC
Description of problem:
We support setting huge pages and CPU/NUMA pinning through the machine spec, but not through the installer on the masters.
These features are most relevant to the masters anyway, so the installer should support them. Currently this is a day 2 operation.

HUGE PAGES: 
We should add a field to the installer machine spec for huge pages that acts like BZ#1948963[1], allowing 2048 or 1048576 values only, and add that field to the terraform call to the masters.

CPUs/NUMAs:
We should add a field to the installer machine spec for CPUs/NUMAs that acts like BZ#1948963[1], allowing Disabled, Existing or Adjust, and add that field to the terraform call to the masters.

Both advanced features should be disabled by default.
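
The validation rules requested above could be sketched roughly as follows. This is only a Python illustration of the intended behavior, not the installer's actual code; the function and constant names are hypothetical, and the allowed values are taken directly from this description.

```python
from typing import List, Optional

# Allowed values as listed in this description (hypothetical constant names).
ALLOWED_HUGEPAGES = {2048, 1048576}
ALLOWED_PINNING_POLICIES = {"Disabled", "Existing", "Adjust"}


def validate_machine_pool(hugepages: Optional[int] = None,
                          pinning_policy: Optional[str] = None) -> List[str]:
    """Return a list of validation errors; an empty list means valid.

    None means the field was omitted, so the feature stays disabled,
    matching the "disabled by default" requirement.
    """
    errors = []
    if hugepages is not None and hugepages not in ALLOWED_HUGEPAGES:
        errors.append(
            f"hugepages: unsupported value {hugepages}, "
            f"expected one of {sorted(ALLOWED_HUGEPAGES)}"
        )
    if pinning_policy is not None and pinning_policy not in ALLOWED_PINNING_POLICIES:
        errors.append(
            f"pinning policy: unsupported value {pinning_policy!r}, "
            f"expected one of {sorted(ALLOWED_PINNING_POLICIES)}"
        )
    return errors
```

With both fields omitted the function returns no errors, so an install config that does not mention either feature stays valid.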

How reproducible:
always

Steps to Reproduce:
Deploy OCP with an install config that contains a machine set with huge pages or CPU/NUMA pinning (or both).


Actual results:
The installer fails because it does not recognize these fields.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1948963

Comment 2 Gal Zaidman 2021-03-30 14:14:04 UTC
Due to capacity constraints, we will be revisiting this bug in the upcoming sprint.

Comment 3 Gal Zaidman 2021-04-14 10:14:32 UTC
*** Bug 1925201 has been marked as a duplicate of this bug. ***

Comment 4 Matthew Staebler 2021-04-29 02:59:26 UTC
Is this really a bug? Or is this a new feature? The presence of "RFE" in the title of the BZ seems to support this being a new feature. Can this wait to be added as a new feature in 4.9?

Comment 11 Liran Rotenberg 2021-07-11 07:59:17 UTC
The yaml should look like this (copying from the PR):
...
compute:
- name: worker
  platform:
    ovirt:
      autoPinningPolicy: resize_and_pin
      hugepages: 2048
...
  replicas: 2
controlPlane:
...
  name: master
  platform:
    ovirt:
      autoPinningPolicy: resize_and_pin
      hugepages: 2048
...
  replicas: 3
...

Note that for hugepages, you need to have enough hugepages available on your hypervisor for the nodes to start.
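
As a rough aid for that capacity check, here is a minimal Python sketch. It assumes the `hugepages` value is a page size in KiB (so 2048 means 2 MiB pages) and that the whole VM memory is backed by hugepages; the 16384 MiB node size in the usage note is only an illustrative figure, and the helper names are hypothetical. The parser reads `/proc/meminfo`-style text, which reports only the host's default hugepage size.

```python
def hugepages_needed(vm_memory_mib: int, hugepage_size_kib: int) -> int:
    """Number of hugepages needed to back a VM of the given memory size."""
    vm_memory_kib = vm_memory_mib * 1024
    # Integer ceiling division: a partially used page still counts as a page.
    return (vm_memory_kib + hugepage_size_kib - 1) // hugepage_size_kib


def parse_hugepage_info(meminfo_text: str) -> dict:
    """Extract HugePages_Free and Hugepagesize (kB) from /proc/meminfo text."""
    wanted = ("HugePages_Free", "Hugepagesize")
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted and rest.strip():
            info[key] = int(rest.split()[0])
    return info


def host_can_start(vm_memory_mib: int, vm_count: int, meminfo_text: str) -> bool:
    """True if the free hugepages on the host cover vm_count identical VMs."""
    info = parse_hugepage_info(meminfo_text)
    needed = vm_count * hugepages_needed(vm_memory_mib, info["Hugepagesize"])
    return info["HugePages_Free"] >= needed
```

For example, under these assumptions a 16384 MiB node with `hugepages: 2048` needs 8192 free 2 MiB pages on the hypervisor that runs it, so three such masters on one host would need 24576.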

Comment 16 Guy 2021-07-20 12:20:06 UTC
Verified on openshift cluster version 4.9.0-0.nightly-2021-07-17-212317 and RHV engine ovirt-engine-4.4.8-0.19.el8ev.noarch

Comment 20 errata-xmlrpc 2021-10-18 17:29:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

