Bug 1885864
| Summary: | Stalld service crashed under the worker node | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Artyom <alukiano> | |
| Component: | Node Tuning Operator | Assignee: | Jiří Mencák <jmencak> | |
| Status: | CLOSED ERRATA | QA Contact: | Simon <skordas> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.6 | CC: | sejug, skordas | |
| Target Milestone: | --- | |||
| Target Release: | 4.7.0 | |||
| Hardware: | All | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | No Doc Update | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1886511 | Environment: | ||
| Last Closed: | 2021-02-24 15:23:22 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1886511 | |||
Description
Artyom
2020-10-07 08:22:34 UTC

On pod:

$ oc project openshift-cluster-node-tuning-operator
Now using project "openshift-cluster-node-tuning-operator" on server "https://api.skordas1015.qe.devcluster.openshift.com:6443".
$ oc get pods
NAME                                           READY   STATUS    RESTARTS   AGE
cluster-node-tuning-operator-67dbdbf885-lmd7m  1/1     Running   0          4h45m
tuned-7pzdp                                    1/1     Running   0          5h9m
tuned-8qttl                                    1/1     Running   0          5h9m
tuned-jpb24                                    1/1     Running   0          5h9m
tuned-mkl86                                    1/1     Running   0          5h2m
tuned-ntz7f                                    1/1     Running   0          5h1m
tuned-td9rs                                    1/1     Running   0          5h
$ oc rsh tuned-7pzdp
sh-4.4# stalld -h 2>&1|grep force_fifo
 -F/--force_fifo: use SCHED_FIFO for boosting

On host node with enabled stalld:

$ oc get pods -n openshift-cluster-node-tuning-operator -o wide
NAME                                           READY   STATUS    RESTARTS   AGE     IP             NODE                                         NOMINATED NODE   READINESS GATES
cluster-node-tuning-operator-67dbdbf885-lmd7m  1/1     Running   0          4h47m   10.130.0.15    ip-10-0-139-57.us-east-2.compute.internal    <none>           <none>
tuned-7pzdp                                    1/1     Running   0          5h10m   10.0.201.78    ip-10-0-201-78.us-east-2.compute.internal    <none>           <none>
tuned-8qttl                                    1/1     Running   0          5h10m   10.0.181.44    ip-10-0-181-44.us-east-2.compute.internal    <none>           <none>
tuned-jpb24                                    1/1     Running   0          5h10m   10.0.139.57    ip-10-0-139-57.us-east-2.compute.internal    <none>           <none>
tuned-mkl86                                    1/1     Running   0          5h4m    10.0.199.83    ip-10-0-199-83.us-east-2.compute.internal    <none>           <none>
tuned-ntz7f                                    1/1     Running   0          5h2m    10.0.173.192   ip-10-0-173-192.us-east-2.compute.internal   <none>           <none>
tuned-td9rs                                    1/1     Running   0          5h2m    10.0.159.212   ip-10-0-159-212.us-east-2.compute.internal   <none>           <none>
$ oc debug node/ip-10-0-201-78.us-east-2.compute.internal
Starting pod/ip-10-0-201-78us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.201.78
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# ps -ef | grep stalld
root 298592 297867 0 19:07 pts/0 00:00:00 stalld -p 1000000000 -r 10000 -d 3 -t 30 --log_syslog --log_kmsg --foreground --pidfile /run/stalld.pid

clusterversion 4.7.0-0.nightly-2020-10-15-051208

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
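
The host-side check in the transcript above (`ps -ef | grep stalld`) could be scripted roughly as follows. This is only a sketch, not part of the original verification: the function name `stalld_running` is hypothetical, and it assumes the default `ps -ef` column layout, where the command name is the eighth field.

```shell
# Hypothetical helper (not from the bug report): reads `ps -ef` output on
# stdin and exits 0 only if some process has stalld as its command.
# Assumes the default `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD.
stalld_running() {
  # Compare the command field itself so that a "grep stalld" process,
  # whose command is grep, is not counted as a match.
  awk '$8 == "stalld" || $8 ~ /\/stalld$/ { found = 1 } END { exit !found }'
}

# Possible usage on a cluster (node name is a placeholder):
#   oc debug node/<node-name> -- chroot /host ps -ef | stalld_running \
#     && echo "stalld is running" || echo "stalld is NOT running"
```

Matching on the command field rather than grepping the whole line avoids a false positive from the `grep stalld` process itself.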