Bug 2017427
Summary: | NTO does not restart TuneD daemon when profile application is taking too long | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jiří Mencák <jmencak> |
Component: | Node Tuning Operator | Assignee: | Jiří Mencák <jmencak> |
Status: | CLOSED ERRATA | QA Contact: | liqcui |
Severity: | high | Docs Contact: | |
Priority: | high | | |
Version: | 4.10 | CC: | aos-bugs, dagray, liqcui |
Target Milestone: | --- | | |
Target Release: | 4.10.0 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 2029436 | Environment: | |
Last Closed: | 2022-03-10 16:22:07 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 2017488, 2029436 | | |
Description
Jiří Mencák
2021-10-26 13:54:01 UTC
Fixed in 4.10.0-0.nightly-2021-10-27-230233 and above.

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-10-27-230233   True        False         20m     Cluster version is 4.10.0-0.nightly-2021-10-27-230233

$ oc get no
NAME                                                          STATUS   ROLES    AGE   VERSION
jmencak-7r5lg-master-0.c.openshift-gce-devel.internal         Ready    master   35m   v1.22.1+674f31e
jmencak-7r5lg-master-1.c.openshift-gce-devel.internal         Ready    master   35m   v1.22.1+674f31e
jmencak-7r5lg-master-2.c.openshift-gce-devel.internal         Ready    master   35m   v1.22.1+674f31e
jmencak-7r5lg-worker-a-tlq29.c.openshift-gce-devel.internal   Ready    worker   27m   v1.22.1+674f31e
jmencak-7r5lg-worker-b-dd727.c.openshift-gce-devel.internal   Ready    worker   27m   v1.22.1+674f31e

$ oc label no jmencak-7r5lg-worker-a-tlq29.c.openshift-gce-devel.internal profile=
node/jmencak-7r5lg-worker-a-tlq29.c.openshift-gce-devel.internal labeled

$ oc get po -o wide|grep jmencak-7r5lg-worker-a-tlq29.c.openshift-gce-devel.internal
tuned-pnl8x   1/1   Running   0   28m   10.0.128.2   jmencak-7r5lg-worker-a-tlq29.c.openshift-gce-devel.internal   <none>   <none>

$ cat stuck.yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: openshift-profile-stuck
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=OpenShift profile stuck
      [variables]
      v=${f:exec:sleep:72}
    name: openshift-profile-stuck
  recommend:
  - match:
    - label: profile
    priority: 20
    profile: openshift-profile-stuck

$ oc create -f stuck.yaml

$ oc logs tuned-pnl8x | tail -n 28
I1028 06:37:13.201963 2182 tuned.go:1229] previous application of TuneD profile failed; change detected, scheduling full restart in 1s
2021-10-28 06:37:13,299 INFO tuned.daemon.application: dynamic tuning is globally disabled
2021-10-28 06:37:13,303 INFO tuned.daemon.daemon: using sleep interval of 1 second(s)
2021-10-28 06:37:13,304 INFO tuned.daemon.daemon: Running in automatic mode, checking what profile is recommended for your configuration.
2021-10-28 06:37:13,304 INFO tuned.daemon.daemon: Using 'openshift-profile-stuck' profile
2021-10-28 06:37:13,305 INFO tuned.profiles.loader: loading profile: openshift-profile-stuck
E1028 06:37:14.202848 2182 tuned.go:1211] timeout (60) to apply TuneD profile; restarting TuneD daemon
E1028 06:37:14.205003 2182 tuned.go:508] error waiting for tuned: signal: terminated
I1028 06:37:14.205213 2182 tuned.go:441] starting tuned...
2021-10-28 06:37:14,327 INFO tuned.daemon.application: dynamic tuning is globally disabled
2021-10-28 06:37:14,332 INFO tuned.daemon.daemon: using sleep interval of 1 second(s)
2021-10-28 06:37:14,333 INFO tuned.daemon.daemon: Running in automatic mode, checking what profile is recommended for your configuration.
2021-10-28 06:37:14,333 INFO tuned.daemon.daemon: Using 'openshift-profile-stuck' profile
2021-10-28 06:37:14,334 INFO tuned.profiles.loader: loading profile: openshift-profile-stuck
E1028 06:38:14.205888 2182 tuned.go:1211] timeout (120) to apply TuneD profile; restarting TuneD daemon
E1028 06:38:14.207821 2182 tuned.go:508] error waiting for tuned: signal: terminated
I1028 06:38:14.208077 2182 tuned.go:441] starting tuned...
2021-10-28 06:38:14,339 INFO tuned.daemon.application: dynamic tuning is globally disabled
2021-10-28 06:38:14,343 INFO tuned.daemon.daemon: using sleep interval of 1 second(s)
2021-10-28 06:38:14,344 INFO tuned.daemon.daemon: Running in automatic mode, checking what profile is recommended for your configuration.
2021-10-28 06:38:14,344 INFO tuned.daemon.daemon: Using 'openshift-profile-stuck' profile
2021-10-28 06:38:14,345 INFO tuned.profiles.loader: loading profile: openshift-profile-stuck
2021-10-28 06:39:26,351 INFO tuned.daemon.controller: starting controller
2021-10-28 06:39:26,351 INFO tuned.daemon.daemon: starting tuning
2021-10-28 06:39:26,352 INFO tuned.daemon.daemon: static tuning from profile 'openshift-profile-stuck' applied
I1028 06:39:26.365143 2182 tuned.go:995] updated Profile jmencak-7r5lg-worker-a-tlq29.c.openshift-gce-devel.internal stalld=<nil>, bootcmdline:
I1028 06:39:26.365402 2182 tuned.go:428] written "/etc/tuned/recommend.d/50-openshift.conf" to set TuneD profile openshift-profile-stuck
I1028 06:39:26.476307 2182 tuned.go:719] active and recommended profile (openshift-profile-stuck) match; profile change will not trigger profile reload

$ oc get profile
NAME                                                          TUNED                     APPLIED   DEGRADED   AGE
jmencak-7r5lg-master-0.c.openshift-gce-devel.internal         openshift-control-plane   True      False      48m
jmencak-7r5lg-master-1.c.openshift-gce-devel.internal         openshift-control-plane   True      False      48m
jmencak-7r5lg-master-2.c.openshift-gce-devel.internal         openshift-control-plane   True      False      48m
jmencak-7r5lg-worker-a-tlq29.c.openshift-gce-devel.internal   openshift-profile-stuck   True      False      42m
jmencak-7r5lg-worker-b-dd727.c.openshift-gce-devel.internal   openshift-node            True      False      42m

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
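When repeating this verification, it can help to pull the timeout-triggered restarts out of the `oc logs` output from the verification comment above, rather than reading it by eye. The following is a minimal sketch only: the function names and the abbreviated `SAMPLE_LOG` are hypothetical, and the patterns assume the exact message wording and timestamp layout seen in this bug's log, which may differ in other NTO or TuneD versions.

```python
import re
from datetime import datetime

# Assumption: klog error line format as it appears in this bug's log.
TIMEOUT_RE = re.compile(
    r"^E\d{4} (?P<time>\d{2}:\d{2}:\d{2})\.\d+\s+\d+ tuned\.go:\d+\] "
    r"timeout \((?P<secs>\d+)\) to apply TuneD profile; restarting TuneD daemon"
)
# Assumption: TuneD daemon lines start with a 23-character timestamp.
TUNED_TS = "%Y-%m-%d %H:%M:%S,%f"


def timeout_restarts(log_text):
    """Return (time, reported_timeout) for each timeout-triggered restart."""
    events = []
    for line in log_text.splitlines():
        m = TIMEOUT_RE.match(line.strip())
        if m:
            events.append((m.group("time"), int(m.group("secs"))))
    return events


def final_apply_seconds(log_text):
    """Seconds between the last 'loading profile' line and the
    'static tuning ... applied' line, or None if either is missing."""
    load = applied = None
    for line in log_text.splitlines():
        line = line.strip()
        if "tuned.profiles.loader: loading profile:" in line:
            load = datetime.strptime(line[:23], TUNED_TS)
        elif "static tuning from profile" in line and load is not None:
            applied = datetime.strptime(line[:23], TUNED_TS)
    if load is None or applied is None:
        return None
    return (applied - load).total_seconds()


# Abbreviated excerpt of the log in this report (lines copied verbatim).
SAMPLE_LOG = """\
2021-10-28 06:37:13,305 INFO tuned.profiles.loader: loading profile: openshift-profile-stuck
E1028 06:37:14.202848 2182 tuned.go:1211] timeout (60) to apply TuneD profile; restarting TuneD daemon
I1028 06:37:14.205213 2182 tuned.go:441] starting tuned...
2021-10-28 06:37:14,334 INFO tuned.profiles.loader: loading profile: openshift-profile-stuck
E1028 06:38:14.205888 2182 tuned.go:1211] timeout (120) to apply TuneD profile; restarting TuneD daemon
I1028 06:38:14.208077 2182 tuned.go:441] starting tuned...
2021-10-28 06:38:14,345 INFO tuned.profiles.loader: loading profile: openshift-profile-stuck
2021-10-28 06:39:26,352 INFO tuned.daemon.daemon: static tuning from profile 'openshift-profile-stuck' applied
"""

if __name__ == "__main__":
    print(timeout_restarts(SAMPLE_LOG))     # [('06:37:14', 60), ('06:38:14', 120)]
    print(final_apply_seconds(SAMPLE_LOG))  # 72.007
```

On this excerpt the last attempt takes about 72 seconds from profile load to application, which is consistent with the `sleep:72` in `stuck.yaml`. The sketch only reports what the log says; it makes no claim about how the operand computes the value printed in `timeout (...)`.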