Bug 2018053
| Summary: | NTO does not restart TuneD daemon when profile application is taking too long | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jiří Mencák <jmencak> | |
| Component: | Node Tuning Operator | Assignee: | Jiří Mencák <jmencak> | |
| Status: | CLOSED ERRATA | QA Contact: | liqcui | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 4.10 | CC: | aos-bugs, dagray, openshift-bugzilla-robot, skordas | |
| Target Milestone: | --- | |||
| Target Release: | 4.8.z | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | 2017488 | |||
| : | 2020518 (view as bug list) | Environment: | ||
| Last Closed: | 2021-11-16 21:22:58 UTC | Type: | --- | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 2017488 | |||
| Bug Blocks: | 2020518 | |||
Description
Jiří Mencák
2021-10-28 05:50:25 UTC
Fixed in 4.8.0-0.nightly-2021-11-03-171325 and above.
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.8.0-0.nightly-2021-11-03-171325 True False 7h37m Cluster version is 4.8.0-0.nightly-2021-11-03-171325
$ oc project openshift-cluster-node-tuning-operator
$ oc get po -o wide|grep worker-a
tuned-d6s6j 1/1 Running 0 7h51m 10.0.128.3 jmencak-hcp9p-worker-a-7rzvf.c.openshift-gce-devel.internal <none> <none>
$ oc label no jmencak-hcp9p-worker-a-7rzvf.c.openshift-gce-devel.internal profile=
$ cat stuck.yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: openshift-profile-stuck
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=OpenShift profile stuck
      [variables]
      v=${f:exec:sleep:72}
    name: openshift-profile-stuck
  recommend:
  - match:
    - label: profile
    priority: 20
    profile: openshift-profile-stuck
$ oc create -f stuck.yaml
$ oc logs -f tuned-d6s6j | tail -n17
I1104 15:45:46.249348 2398 tuned.go:542] reloading tuned...
I1104 15:45:46.249354 2398 tuned.go:545] sending HUP to PID 3628
2021-11-04 15:45:46,249 INFO tuned.daemon.daemon: stopping tuning
2021-11-04 15:45:46,266 INFO tuned.daemon.daemon: terminating Tuned, rolling back all changes
2021-11-04 15:45:46,313 INFO tuned.daemon.daemon: Running in automatic mode, checking what profile is recommended for your configuration.
2021-11-04 15:45:46,314 INFO tuned.daemon.daemon: Using 'openshift-profile-stuck' profile
2021-11-04 15:45:46,314 INFO tuned.profiles.loader: loading profile: openshift-profile-stuck
E1104 15:46:46.249925 2398 tuned.go:1128] timeout (60) to apply TuneD profile; restarting TuneD daemon
E1104 15:46:56.252435 2398 tuned.go:479] error waiting for tuned: signal: killed
I1104 15:46:56.252578 2398 tuned.go:429] starting tuned...
I1104 15:46:56.268933 2398 tuned.go:917] updated Profile jmencak-hcp9p-worker-a-7rzvf.c.openshift-gce-devel.internal stalld=<nil>, bootcmdline:
I1104 15:46:56.269286 2398 tuned.go:416] written "/etc/tuned/recommend.d/50-openshift.conf" to set Tuned profile openshift-profile-stuck
2021-11-04 15:46:56,371 INFO tuned.daemon.application: dynamic tuning is globally disabled
2021-11-04 15:46:56,377 INFO tuned.daemon.daemon: using sleep interval of 1 second(s)
2021-11-04 15:46:56,377 INFO tuned.daemon.daemon: Running in automatic mode, checking what profile is recommended for your configuration.
2021-11-04 15:46:56,378 INFO tuned.daemon.daemon: Using 'openshift-profile-stuck' profile
2021-11-04 15:46:56,379 INFO tuned.profiles.loader: loading profile: openshift-profile-stuck
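The timing in the log above can be checked by subtracting the operator's timestamps. The following is a minimal sketch, assuming the klog-style prefix seen in the excerpt (severity letter plus date, then time of day) and that all lines fall on the same day; the helper name `klog_time` is hypothetical.

```python
from datetime import datetime

def klog_time(line: str) -> datetime:
    """Parse the time of day from a klog-style line such as
    'E1104 15:46:46.249925 2398 tuned.go:1128] ...'.
    Only the second field (HH:MM:SS.ffffff) is used, so all lines
    being compared are assumed to come from the same day."""
    time_field = line.split()[1]
    return datetime.strptime(time_field, "%H:%M:%S.%f")

# Three operator lines copied from the log excerpt in this report.
hup = klog_time("I1104 15:45:46.249354 2398 tuned.go:545] sending HUP to PID 3628")
timeout = klog_time(
    "E1104 15:46:46.249925 2398 tuned.go:1128] "
    "timeout (60) to apply TuneD profile; restarting TuneD daemon"
)
restart = klog_time("I1104 15:46:56.252578 2398 tuned.go:429] starting tuned...")

print(round((timeout - hup).total_seconds()))      # 60: HUP to timeout
print(round((restart - timeout).total_seconds()))  # 10: timeout to "starting tuned..."
```

The first difference matches the 60-second timeout reported in the log; about 10 more seconds pass before the daemon is reported killed and started again.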
In 4.8, exponential backoff was not implemented. Instead, profile application times out after 60 seconds, the TuneD daemon is restarted, and the profile is applied again.
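The behaviour described here, a fixed timeout followed by a restart with no backoff, can be sketched roughly as follows. This is an illustration only and not the operator's code (which lives in `tuned.go`); the function `apply_with_timeout`, its parameters, and the child processes used as stand-ins for a slow or fast profile are all assumptions.

```python
import subprocess
import sys

def apply_with_timeout(cmd, timeout_s, max_attempts):
    """Run cmd; if it does not finish within timeout_s, kill it and retry.

    Returns the 1-based number of the attempt that finished in time, or
    None if every attempt timed out. There is deliberately no delay
    between attempts, i.e. no exponential backoff.
    """
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.Popen(cmd)
        try:
            proc.wait(timeout=timeout_s)
            return attempt
        except subprocess.TimeoutExpired:
            proc.kill()   # stop the stuck process
            proc.wait()   # reap it before starting the next attempt
    return None

# Hypothetical stand-ins: one command that outlasts the timeout, one that does not.
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
fast = [sys.executable, "-c", "pass"]

print(apply_with_timeout(slow, timeout_s=0.2, max_attempts=2))  # None
print(apply_with_timeout(fast, timeout_s=10, max_attempts=2))   # 1
```

By analogy, a profile that sleeps for 72 seconds would be expected to exceed a 60-second timeout on every attempt, which is consistent with the log showing the same profile being loaded again after the restart.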
QE, please acknowledge the fix.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.20 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:4574