Bug 1892459
Summary: | NTO-shipped stalld needs to use FIFO for boosting. | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Jiří Mencák <jmencak>
Component: | Node Tuning Operator | Assignee: | Jiří Mencák <jmencak>
Status: | CLOSED ERRATA | QA Contact: | Simon <skordas>
Severity: | high | CC: | sejug, skordas
Priority: | high | Clone Of: | 1892457
Version: | 4.6 | Bug Depends On: | 1892457
Target Release: | 4.6.z | Last Closed: | 2020-11-16 14:37:43 UTC
Hardware: | Unspecified | OS: | Unspecified
Description (Jiří Mencák, 2020-10-28 20:14:20 UTC)
$ oc get clusterversions.config.openshift.io
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-11-05-134712   True        False         4h37m   Cluster version is 4.6.0-0.nightly-2020-11-05-134712

$ oc project openshift-cluster-node-tuning-operator
Now using project "openshift-cluster-node-tuning-operator" on server "https://api.skordas511a.qe.devcluster.openshift.com:6443".

$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-154-69.us-east-2.compute.internal    Ready    worker   4h58m   v1.19.0+9f84db3
ip-10-0-159-2.us-east-2.compute.internal     Ready    master   5h4m    v1.19.0+9f84db3
ip-10-0-181-86.us-east-2.compute.internal    Ready    master   5h4m    v1.19.0+9f84db3
ip-10-0-182-196.us-east-2.compute.internal   Ready    worker   4h53m   v1.19.0+9f84db3
ip-10-0-202-160.us-east-2.compute.internal   Ready    worker   5h      v1.19.0+9f84db3
ip-10-0-205-149.us-east-2.compute.internal   Ready    master   5h4m    v1.19.0+9f84db3

$ node=ip-10-0-154-69.us-east-2.compute.internal
$ echo $node
ip-10-0-154-69.us-east-2.compute.internal

$ oc label node $node node-role.kubernetes.io/worker-rt=
node/ip-10-0-154-69.us-east-2.compute.internal labeled

$ oc create -f- <<EOF
> apiVersion: machineconfiguration.openshift.io/v1
> kind: MachineConfigPool
> metadata:
>   name: worker-rt
>   labels:
>     worker-rt: ""
> spec:
>   machineConfigSelector:
>     matchExpressions:
>     - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,worker-rt]}
>   nodeSelector:
>     matchLabels:
>       node-role.kubernetes.io/worker-rt: ""
> EOF
machineconfigpool.machineconfiguration.openshift.io/worker-rt created

$ oc create -f- <<EOF
> apiVersion: tuned.openshift.io/v1
> kind: Tuned
> metadata:
>   name: openshift-realtime
>   namespace: openshift-cluster-node-tuning-operator
> spec:
>   profile:
>   - data: |
>       [main]
>       summary=Custom OpenShift realtime profile
>       include=openshift-node,realtime
>       [variables]
>       # isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7
>       isolated_cores=1
>       #isolate_managed_irq=Y
>       not_isolated_cores_expanded=${f:cpulist_invert:${isolated_cores_expanded}}
>       [bootloader]
>       cmdline_ocp_realtime=+systemd.cpu_affinity=${not_isolated_cores_expanded}
>       [service]
>       service.stalld=start,enable
>     name: openshift-realtime
>
>   recommend:
>   - machineConfigLabels:
>       machineconfiguration.openshift.io/role: "worker-rt"
>     priority: 20
>     profile: openshift-realtime
> EOF
tuned.tuned.openshift.io/openshift-realtime created
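A convenient way to follow the rollout while the worker-rt pool applies the new rendered config and reboots the node (suggested commands, not part of the recorded session; resource names match the objects created above):

$ oc get mcp worker-rt --watch                                                  # wait for UPDATED=True
$ oc get profiles.tuned.openshift.io -n openshift-cluster-node-tuning-operator  # per-node Tuned profile assignment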
$ oc get nodes
NAME                                         STATUS                     ROLES              AGE     VERSION
ip-10-0-154-69.us-east-2.compute.internal    Ready,SchedulingDisabled   worker,worker-rt   5h      v1.19.0+9f84db3
ip-10-0-159-2.us-east-2.compute.internal     Ready                      master             5h5m    v1.19.0+9f84db3
ip-10-0-181-86.us-east-2.compute.internal    Ready                      master             5h5m    v1.19.0+9f84db3
ip-10-0-182-196.us-east-2.compute.internal   Ready                      worker             4h55m   v1.19.0+9f84db3
ip-10-0-202-160.us-east-2.compute.internal   Ready                      worker             5h1m    v1.19.0+9f84db3
ip-10-0-205-149.us-east-2.compute.internal   Ready                      master             5h5m    v1.19.0+9f84db3

$ oc get mcp
NAME        CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master      rendered-master-f578e5fdbe575539b1e14c36f757e432   True      False      False      3              3                   3                     0                      5h5m
worker      rendered-worker-7da090ee233da82a7c774564fa964a72   True      False      False      2              2                   2                     0                      5h5m
worker-rt                                                      False     True       False      1              0

$ oc get nodes && oc get mcp
NAME                                         STATUS   ROLES              AGE     VERSION
ip-10-0-154-69.us-east-2.compute.internal    Ready    worker,worker-rt   5h11m   v1.19.0+9f84db3
ip-10-0-159-2.us-east-2.compute.internal     Ready    master             5h17m   v1.19.0+9f84db3
ip-10-0-181-86.us-east-2.compute.internal    Ready    master             5h17m   v1.19.0+9f84db3
ip-10-0-182-196.us-east-2.compute.internal   Ready    worker             5h6m    v1.19.0+9f84db3
ip-10-0-202-160.us-east-2.compute.internal   Ready    worker             5h13m   v1.19.0+9f84db3
ip-10-0-205-149.us-east-2.compute.internal   Ready    master             5h17m   v1.19.0+9f84db3
NAME        CONFIG                                                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master      rendered-master-f578e5fdbe575539b1e14c36f757e432      True      False      False      3              3                   3                     0                      5h16m
worker      rendered-worker-7da090ee233da82a7c774564fa964a72      True      False      False      2              2                   2                     0                      5h16m
worker-rt   rendered-worker-rt-6e1c08ca08fdfaccf8e7995f6899680b   True      False      False      1              1                   1                     0                      12m

$ oc debug node/$node
Starting pod/ip-10-0-154-69us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.154.69
If you don't see a command prompt, try pressing enter.

sh-4.4# ps auxww | grep stalld
root   3472  0.4  0.0  7440  2616 ?      Ss  19:40  0:02 /usr/local/bin/stalld -p 1000000000 -r 10000 -d 3 -t 20 --log_syslog --log_kmsg --foreground --pidfile /run/stalld.pid
root  10359  0.0  0.0  9180   980 pts/0  S+  19:49  0:00 grep stalld

sh-4.4# grep ExecStart /host/etc/systemd/system/stalld.service
ExecStart=/usr/bin/chrt -f 10 /usr/local/bin/stalld $CLIST $AGGR $BP $BR $BD $THRESH $LOGGING $FG $PF

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.4 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4987
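The ExecStart line above is the behavior this bug asked for: the NTO-shipped stalld unit starts the daemon via chrt -f 10, i.e. under the SCHED_FIFO policy at priority 10, rather than as a regular SCHED_OTHER process. For anyone re-verifying on their own node, the running daemon can also be checked directly from the debug shell (suggested command, not part of the recorded session; the PID is taken from the ps output above and will differ per node):

sh-4.4# chrt -p 3472
# expected: scheduling policy SCHED_FIFO, scheduling priority 10,
# provided stalld was started from the unit file shown above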