Description of problem:

Compared to the previous version (4.2), it takes a long time to restore the default tuned settings.

|           | ver. 4.2 | ver. 4.3 |
| --------- | -------- | -------- |
| new tuned | 31s      | 49s      |
| restore   | 8s       | 464s     |

Restoring is ~60 times slower!

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-11-02-092336

How reproducible:
100%. The best restore time I observed was 86s.

Steps to Reproduce:
I was using two terminal windows.

1. Terminal 1 - create the test pod:

```bash
oc new-project my-logging-project
oc create -f https://raw.githubusercontent.com/hongkailiu/svt-case-doc/master/files/pod_test.yaml
```

2. Terminal 2 - find out which node will be tuned and debug that node:

```bash
oc get pod web -A -o wide    # to get the node
oc debug node/<my_node>
# inside the debug pod:
chroot /host
# loop until NTO changes the value, printing the elapsed seconds
i=0; while [[ "$(sysctl kernel.pid_max | cut -d ' ' -f 3)" != "131074" ]]; do sysctl kernel.pid_max; sleep 1; i=$((i+1)); echo "time: $i"; done
```

3. Terminal 1 - label the pod and create the new Tuned resource (a sketch of what this manifest might contain is included after the expected results below), then check the elapsed time in Terminal 2:

```bash
oc label pod web -n my-logging-project tuned.openshift.io/elasticsearch=
oc create -f https://raw.githubusercontent.com/openshift/svt/master/openshift_tooling/node_tuning_operator/content/tuned-kernel-pid_max.yml
```

4. Record how long it takes for the node to be tuned by the new Tuned resource.

5. Repeat the measurement, this time deleting the Tuned resource:

6. Terminal 2 - watch the node until the default value is restored:

```bash
i=0; while [[ "$(sysctl kernel.pid_max | cut -d ' ' -f 3)" != "4194304" ]]; do sysctl kernel.pid_max; sleep 1; i=$((i+1)); echo "time: $i"; done
```

7. Terminal 1 - delete the Tuned resource, then check the elapsed time in Terminal 2:

```bash
oc delete tuned max-pid-test -n openshift-cluster-node-tuning-operator
```

8. Record how long it takes for the default value to be restored on the node.

Actual results:
Restoring the default values takes far longer than in previous versions.

Expected results:
A restore time similar to previous versions.
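For reference, the Tuned resource created in step 3 raises kernel.pid_max on nodes that run a pod carrying the tuned.openshift.io/elasticsearch label. The manifest behind the URL in step 3 is not reproduced in this report; the following is only a minimal sketch of what such a resource could look like, assuming the tuned.openshift.io/v1 API, the name max-pid-test (matching the delete command in step 7), and the value 131074 that the watch loop in step 2 waits for:

```bash
# Sketch only: the profile body and recommend rule below are assumptions,
# not a copy of the tuned-kernel-pid_max.yml referenced in step 3.
oc create -f - <<'EOF'
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: max-pid-test
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - name: max-pid-test
    data: |
      [main]
      summary=Raise kernel.pid_max for the test pod
      include=openshift-node
      [sysctl]
      kernel.pid_max=131074
  recommend:
  - match:
    - label: tuned.openshift.io/elasticsearch
      type: pod
    priority: 20
    profile: max-pid-test
EOF
```

Deleting this resource (step 7) makes the operator fall back to the default profile again; that fallback is the restore path whose duration regressed from ~8s to ~464s.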
RETEST POSITIVE:

```bash
Cluster version is 4.4.0-0.nightly-2019-12-13-082744

oc get pods -n openshift-cluster-node-tuning-operator
NAME                                            READY   STATUS    RESTARTS   AGE
cluster-node-tuning-operator-7666899684-vljzz   1/1     Running   0          73m
tuned-5lpp7                                     1/1     Running   0          73m
tuned-c9t9n                                     1/1     Running   0          69m
tuned-cvmbn                                     1/1     Running   0          69m
tuned-l4qk2                                     1/1     Running   0          73m
tuned-qvfd2                                     1/1     Running   0          73m
tuned-tvpzc                                     1/1     Running   0          68m

oc rsh -n openshift-cluster-node-tuning-operator cluster-node-tuning-operator-7666899684-vljzz
sh-4.2$ cluster-node-tuning-operator --version
I1213 18:55:48.964995      73 main.go:22] Go Version: go1.12.12
I1213 18:55:48.965124      73 main.go:23] Go OS/Arch: linux/amd64
I1213 18:55:48.965142      73 main.go:24] node-tuning Version: v4.4.0-201912110523-0-g4db2d1c-dirty
```

Applying the new tuned value now takes ~1-2 seconds, and restoring the default takes ~3-5 seconds. WOW!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0581