Bug 1833440

Summary: NTO cannot remove kernel command line parameters shipped in parent profiles.
Product: OpenShift Container Platform
Component: Node Tuning Operator
Version: 4.5
Target Release: 4.5.0
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Jiří Mencák <jmencak>
Assignee: Jiří Mencák <jmencak>
QA Contact: Simon <skordas>
CC: sejug, yquinn
Doc Type: No Doc Update
Type: Bug
Last Closed: 2020-07-13 17:36:23 UTC

Description Jiří Mencák 2020-05-08 16:41:41 UTC
Description of problem:
When creating child profiles for the Node Tuning Operator (NTO), it is not possible to remove kernel command line parameters that are defined in parent profiles.

Version-Release number of selected component (if applicable):
All

How reproducible:
Always

Steps to Reproduce:
$ oc label node "worker-node" node-role.kubernetes.io/worker-rt=

$ oc create -f- <<EOF
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: openshift-realtime
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Custom OpenShift realtime profile
      include=openshift-node,realtime
      [variables]
      # isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7
      isolated_cores=1
      [bootloader]
      cmdline_openshift_realtime=-intel_pstate=disable
    name: openshift-realtime

  recommend:
  - machineConfigLabels:
      machineconfiguration.openshift.io/role: "worker-rt"
    priority: 30
    profile: openshift-realtime
EOF

$ oc create -f- <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-rt
  labels:
    worker-rt: ""
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,worker-rt]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-rt: ""
EOF

$ oc logs "NTO-operator-pod"
I0508 15:46:28.543553       1 main.go:24] Go Version: go1.13.4
I0508 15:46:28.543643       1 main.go:25] Go OS/Arch: linux/amd64
I0508 15:46:28.543653       1 main.go:26] node-tuning Version: v4.5.0-202005061719-0-g32eea34-dirty
I0508 15:46:28.546784       1 controller.go:780] trying to become a leader
I0508 15:46:28.570352       1 controller.go:785] became a leader
I0508 15:46:28.576367       1 controller.go:792] starting Tuned controller
I0508 15:46:28.676653       1 controller.go:850] started events processor/controller
I0508 15:47:39.803593       1 controller.go:406] updated Tuned rendered
I0508 15:47:39.817353       1 controller.go:517] updated profile ip-10-0-137-57.eu-west-1.compute.internal [openshift-realtime]
I0508 15:47:41.404917       1 controller.go:544] created MachineConfig 50-nto-worker-rt with kernelArguments: [skew_tick=1 isolcpus=1 intel_pstate=disable nosoftlockup tsc=nowatchdog]

Actual results:
intel_pstate=disable has not been removed: it is still listed in the kernelArguments of the created MachineConfig 50-nto-worker-rt.

Expected results:
intel_pstate=disable has been removed.
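
To make the expected result concrete, here is a minimal bash sketch of the removal semantics this report asks for. It is not tuned's implementation. The helper name remove_inherited_args is hypothetical, and the sketch assumes that a child value starting with "-" means "remove the listed arguments from the inherited ones", which is the only case the reproducer exercises.

```shell
#!/usr/bin/env bash
# Illustration only: models the outcome expected in this report,
# not the actual tuned bootloader plugin.

# remove_inherited_args INHERITED CHILD   (hypothetical helper)
#   INHERITED: space-separated kernel arguments coming from parent profiles
#   CHILD:     value of the child's cmdline option, e.g. "-intel_pstate=disable"
remove_inherited_args() {
  local inherited="$1" child="$2"
  local -a have=() drop=() kept=()
  local arg tok skip

  # Assumption: only a value that starts with "-" is a removal request;
  # any other value leaves the inherited arguments untouched in this sketch.
  case "$child" in
    -*) child="${child#-}" ;;
    *)  printf '%s\n' "$inherited"; return 0 ;;
  esac

  read -r -a have <<< "$inherited"
  read -r -a drop <<< "$child"
  for arg in "${have[@]}"; do
    skip=0
    for tok in "${drop[@]}"; do
      if [ "$tok" = "$arg" ]; then skip=1; fi
    done
    if [ "$skip" -eq 0 ]; then kept+=("$arg"); fi
  done
  printf '%s\n' "${kept[*]}"
}

# Inherited arguments taken from the operator log above.
remove_inherited_args \
  "skew_tick=1 isolcpus=1 intel_pstate=disable nosoftlockup tsc=nowatchdog" \
  "-intel_pstate=disable"
# prints: skew_tick=1 isolcpus=1 nosoftlockup tsc=nowatchdog
```

How tuned treats values with several tokens or other prefixes is not shown in this report, so the sketch does not try to model it.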

Additional info:
This is a tuned bug which was fixed upstream:
https://github.com/redhat-performance/tuned/pull/265

Comment 1 Yanir Quinn 2020-05-10 14:21:24 UTC
> Additional info:
> This is a tuned bug which was fixed upstream:
> https://github.com/redhat-performance/tuned/pull/265

I would consider this fixed only if every parent profile uses a unique cmdline option name, as introduced in https://github.com/redhat-performance/tuned/pull/265.
(They are probably already aligned, but this should be part of the criteria for VERIFIED.)
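
One possible way to check the uniqueness criterion is sketched below. The helper duplicate_cmdline_names is hypothetical, and the file paths in the usage comment are placeholders, not locations confirmed by this report. The check is purely textual: it looks for option names beginning with "cmdline" anywhere in each file and does not parse sections or follow include= chains.

```shell
#!/usr/bin/env bash
# duplicate_cmdline_names FILE...   (hypothetical helper)
# Prints every option name starting with "cmdline" that is defined in more
# than one of the given profile files. Prints nothing if all names are unique.
duplicate_cmdline_names() {
  local f
  for f in "$@"; do
    # Extract names such as "cmdline" or "cmdline_realtime" and
    # de-duplicate them within each file before comparing across files.
    grep -oE '^[[:space:]]*cmdline[A-Za-z0-9_]*[[:space:]]*=' "$f" \
      | tr -d ' \t=' | sort -u
  done | sort | uniq -d
}

# Usage (paths are placeholders):
#   duplicate_cmdline_names parent-a/tuned.conf parent-b/tuned.conf
```

Empty output would suggest that a child profile can address each parent's parameters by a distinct name.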

Comment 4 Simon 2020-05-14 19:54:25 UTC
Retest positive:

# oc get clusterversions.config.openshift.io
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-05-14-021132   True        False         145m    Cluster version is 4.5.0-0.nightly-2020-05-14-021132

# node=$(oc get nodes --no-headers | grep worker | cut -d ' ' -f 1 | head -1) && echo $node
skorda-z5h4h-w-a-xnlfc.c.openshift-qe.internal

# pod=$(oc get pods -n openshift-cluster-node-tuning-operator -o wide --no-headers | grep $node | cut -d' ' -f1) && echo $pod
tuned-4jcj4

# oc label node $node node-role.kubernetes.io/worker-rt=
node/skorda-z5h4h-w-a-xnlfc.c.openshift-qe.internal labeled

# oc get nodes
NAME                                             STATUS   ROLES              AGE     VERSION
skorda-z5h4h-m-0.c.openshift-qe.internal         Ready    master             5h56m   v1.18.2
skorda-z5h4h-m-1.c.openshift-qe.internal         Ready    master             5h56m   v1.18.2
skorda-z5h4h-m-2.c.openshift-qe.internal         Ready    master             5h56m   v1.18.2
skorda-z5h4h-w-a-xnlfc.c.openshift-qe.internal   Ready    worker,worker-rt   5h44m   v1.18.2
skorda-z5h4h-w-b-d5x2q.c.openshift-qe.internal   Ready    worker             5h45m   v1.18.2
skorda-z5h4h-w-c-x2vkf.c.openshift-qe.internal   Ready    worker             5h44m   v1.18.2

After creating the Tuned and MachineConfigPool objects from the reproduction steps:

# oc logs $pod -n openshift-cluster-node-tuning-operator
2020-05-14 19:50:46,077 INFO     tuned.daemon.daemon: static tuning from profile 'openshift-realtime' applied
I0514 19:50:46.198499    2829 tuned.go:602] updated Profile skorda-z5h4h-w-a-xnlfc.c.openshift-qe.internal with bootcmdline: skew_tick=1 isolcpus=1 nosoftlockup tsc=nowatchdog
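
The comparison between the failing and the passing log line can also be scripted. The sketch below is a small bash helper (has_kernel_arg is a hypothetical name) that reports whether a given argument appears as a whole token in a log line. It is only a text check against the log formats shown in this report.

```shell
#!/usr/bin/env bash
# has_kernel_arg LINE ARG   (hypothetical helper)
# Returns 0 if ARG appears as a whole token in LINE, 1 otherwise.
has_kernel_arg() {
  local line="$1" arg="$2" tok
  local -a toks=()
  # Turn brackets into spaces so "[a b c]" lists split like plain lists.
  line="${line//\[/ }"
  line="${line//\]/ }"
  read -r -a toks <<< "$line"
  for tok in "${toks[@]}"; do
    if [ "$tok" = "$arg" ]; then return 0; fi
  done
  return 1
}

# Log lines copied from this report.
before='created MachineConfig 50-nto-worker-rt with kernelArguments: [skew_tick=1 isolcpus=1 intel_pstate=disable nosoftlockup tsc=nowatchdog]'
after='updated Profile skorda-z5h4h-w-a-xnlfc.c.openshift-qe.internal with bootcmdline: skew_tick=1 isolcpus=1 nosoftlockup tsc=nowatchdog'

if has_kernel_arg "$before" intel_pstate=disable; then echo "before: present"; fi
if ! has_kernel_arg "$after" intel_pstate=disable; then echo "after: removed"; fi
```

On a live cluster the line could be taken from the output of the oc logs command shown above instead of a hard-coded string.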

Comment 5 errata-xmlrpc 2020-07-13 17:36:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409