Bug 1900196

Summary: stalld is not restarted after crash
Product: OpenShift Container Platform
Component: Node Tuning Operator
Reporter: Martin Sivák <msivak>
Assignee: Jiří Mencák <jmencak>
QA Contact: Simon <skordas>
Status: CLOSED ERRATA
Severity: high
Priority: high
Version: 4.6.z
CC: sejug
Target Release: 4.7.0
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2021-02-24 15:35:07 UTC
Type: Bug
Bug Blocks: 1900261

Description Martin Sivák 2020-11-21 10:21:14 UTC
Description of problem:

The systemd unit file shipped with the Node Tuning Operator (NTO) does not configure stalld to be restarted if it exits.

This is a problem when the node becomes overloaded or stalld crashes: the user starts seeing NMIs and there is no automatic recovery. The node has to be rebooted, or the user must run "systemctl start stalld" on the node by hand.
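For illustration only (not part of NTO or systemd), the desired supervision semantics can be sketched in a few lines of Python; the supervise helper below is hypothetical:

```python
import subprocess

def supervise(cmd, restarts=2):
    """Respawn `cmd` every time it exits, `restarts` times in total.

    A toy model of systemd's Restart=always policy: any exit, clean
    or crashed, triggers a respawn. Returns the PID of each
    incarnation so the restarts are observable.
    """
    pids = []
    for _ in range(restarts + 1):
        proc = subprocess.Popen(cmd)
        pids.append(proc.pid)
        proc.terminate()  # simulate the daemon dying
        proc.wait()
    return pids

# Every "kill" is followed by a fresh process with a new PID, which
# is exactly what the verification transcript checks for stalld.
print(supervise(["sleep", "60"]))
```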

Version-Release number of selected component (if applicable):

OCP 4.6.3

How reproducible:

kill stalld process, observe it stays down

Expected results:

systemd always restarts stalld, except when it is stopped intentionally by the user.
Use the Restart=always unit option to request this behavior.
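The requested policy could be expressed in the unit file (or a drop-in) roughly as follows; this is a minimal sketch, and the drop-in path and RestartSec value are illustrative, not what NTO actually ships:

```ini
# /etc/systemd/system/stalld.service.d/10-restart.conf (illustrative path)
[Service]
# Respawn stalld on any exit, clean or crashed.
# An explicit "systemctl stop stalld" still keeps it stopped.
Restart=always
# Optional: pause briefly before respawning to avoid a tight crash loop.
RestartSec=5
```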

Comment 2 Simon 2020-11-24 18:54:55 UTC
$ oc get clusterversions.config.openshift.io
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-11-24-113830   True        False         75m     Cluster version is 4.7.0-0.nightly-2020-11-24-113830
$ oc project openshift-cluster-node-tuning-operator
Now using project "openshift-cluster-node-tuning-operator" on server "https://api.skordas2411.qe.devcluster.openshift.com:6443".

$ oc get nodes
NAME                                         STATUS   ROLES    AGE    VERSION
ip-10-0-149-21.us-east-2.compute.internal    Ready    master   101m   v1.19.2+13d6aa9
ip-10-0-159-175.us-east-2.compute.internal   Ready    worker   94m    v1.19.2+13d6aa9
ip-10-0-170-112.us-east-2.compute.internal   Ready    master   101m   v1.19.2+13d6aa9
ip-10-0-177-142.us-east-2.compute.internal   Ready    worker   92m    v1.19.2+13d6aa9
ip-10-0-210-52.us-east-2.compute.internal    Ready    master   101m   v1.19.2+13d6aa9
ip-10-0-223-201.us-east-2.compute.internal   Ready    worker   93m    v1.19.2+13d6aa9

$ # Using worker node
$ node=ip-10-0-159-175.us-east-2.compute.internal
$ echo $node

$ oc label node $node node-role.kubernetes.io/worker-rt=
node/ip-10-0-159-175.us-east-2.compute.internal labeled

$ oc create -f- <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-rt
  labels:
    worker-rt: ""
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,worker-rt]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-rt: ""
EOF
machineconfigpool.machineconfiguration.openshift.io/worker-rt created

$ oc create -f- <<EOF
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: openshift-realtime
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Custom OpenShift realtime profile
      include=openshift-node,realtime
      [variables]
      # isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7
      isolated_cores=1
    name: openshift-realtime
  recommend:
  - machineConfigLabels:
      machineconfiguration.openshift.io/role: "worker-rt"
    priority: 20
    profile: openshift-realtime
EOF
tuned.tuned.openshift.io/openshift-realtime created

$ oc get nodes
NAME                                         STATUS   ROLES              AGE    VERSION
ip-10-0-149-21.us-east-2.compute.internal    Ready    master             115m   v1.19.2+13d6aa9
ip-10-0-159-175.us-east-2.compute.internal   Ready    worker,worker-rt   107m   v1.19.2+13d6aa9
ip-10-0-170-112.us-east-2.compute.internal   Ready    master             114m   v1.19.2+13d6aa9
ip-10-0-177-142.us-east-2.compute.internal   Ready    worker             106m   v1.19.2+13d6aa9
ip-10-0-210-52.us-east-2.compute.internal    Ready    master             114m   v1.19.2+13d6aa9
ip-10-0-223-201.us-east-2.compute.internal   Ready    worker             106m   v1.19.2+13d6aa9
$ oc get mcp
NAME        CONFIG                                                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master      rendered-master-7fc779fc3075d82c9dc6e66f4a7da331      True      False      False      3              3                   3                     0                      114m
worker      rendered-worker-db825c5f533a49125e760e8a24e1be69      True      False      False      2              2                   2                     0                      114m
worker-rt   rendered-worker-rt-ba47b802db57ed1656d6aa35b68f6aee   True      False      False      1              1                   1                     0                      10m

$ oc debug node/$node
Starting pod/ip-10-0-159-175us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP:
If you don't see a command prompt, try pressing enter.
sh-4.4# ps auxww | grep stalld
root        3425  0.5  0.0   8140  2596 ?        Ss   18:37   0:02 /usr/local/bin/stalld -p 1000000000 -r 10000 -d 3 -t 20 --log_syslog --log_kmsg --foreground --pidfile /run/stal
root        8765  0.0  0.0   9184  1080 pts/0    S+   18:46   0:00 grep stalld
sh-4.4# kill 3425
sh-4.4# ps auxww | grep stalld
root       10691  0.7  0.0   7568  2348 ?        Ss   18:49   0:00 /usr/local/bin/stalld -p 1000000000 -r 10000 -d 3 -t 20 --log_syslog --log_kmsg --foreground --pidfile /run/stalld.pid
root       10765  0.0  0.0   9184   976 pts/0    S+   18:49   0:00 grep stalld
sh-4.4# kill 10691
sh-4.4# ps auxww | grep stalld
root       11127  1.0  0.0   7260  2396 ?        Ss   18:50   0:00 /usr/local/bin/stalld -p 1000000000 -r 10000 -d 3 -t 20 --log_syslog --log_kmsg --foreground --pidfile /run/stalld.pid
root       11148  0.0  0.0   9184  1092 pts/0    S+   18:50   0:00 grep stalld
sh-4.4# ps auxww | grep stalld
root       11127  0.7  0.0   7568  2396 ?        Ss   18:50   0:00 /usr/local/bin/stalld -p 1000000000 -r 10000 -d 3 -t 20 --log_syslog --log_kmsg --foreground --pidfile /run/stalld.pid
root       11167  0.0  0.0   9184  1036 pts/0    S+   18:50   0:00 grep stalld
sh-4.4# exit

Removing debug pod ...
$ # ^^ stalld comes back with a new PID after each kill, so systemd restarted it

Comment 5 errata-xmlrpc 2021-02-24 15:35:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.