+++ This bug was initially created as a clone of Bug #1974277 +++

Description of problem:
When the tuned net plugin is set to scan devices and apply net queue configuration, it can throw an unhandled exception if a device reports a combined channel value of n/a.

Version-Release number of selected component (if applicable):
OCP 4.8 and OCP 4.9

How reproducible:
Always

Steps to Reproduce:
1. See https://bugzilla.redhat.com/show_bug.cgi?id=1974071#c0

Additional info:
https://bugzilla.redhat.com/show_bug.cgi?id=1974071 # original BZ
https://github.com/redhat-performance/tuned/pull/360 # upstream fix for tuned
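The failure described above comes down to converting a channel value that may be the literal string n/a. The following is a minimal Python sketch of tolerant parsing, assuming the usual `ethtool -l` text layout. It is not the code from the tuned fix; the function name `parse_ethtool_channels` and the choice to map n/a to `None` are assumptions made for illustration.

```python
def parse_ethtool_channels(output):
    """Parse `ethtool -l` text into {"max": {...}, "current": {...}}.

    Hypothetical helper: values that are not plain integers
    (for example "n/a") are stored as None instead of raising.
    """
    sections = {"max": {}, "current": {}}
    section = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Pre-set maximums"):
            section = "max"
        elif line.startswith("Current hardware settings"):
            section = "current"
        elif section and ":" in line:
            key, _, value = line.partition(":")
            value = value.strip()
            # Only convert when the value is numeric; "n/a" becomes None.
            sections[section][key.strip().lower()] = (
                int(value) if value.isdigit() else None
            )
    return sections


# Sample text in the `ethtool -l` format, with n/a for unsupported channels.
SAMPLE = """Channel parameters for ens5:
Pre-set maximums:
RX:             n/a
TX:             n/a
Other:          n/a
Combined:       2
Current hardware settings:
RX:             n/a
TX:             n/a
Other:          n/a
Combined:       1
"""

channels = parse_ethtool_channels(SAMPLE)
print(channels["current"]["combined"])
print(channels["max"]["rx"])
```

A caller can then skip any channel whose value is `None` rather than attempting arithmetic or comparison on it.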
Clusterversion: 4.8.0-0.nightly-2021-06-23-232238

$ nto=openshift-cluster-node-tuning-operator
$ oc project $nto
Now using project "openshift-cluster-node-tuning-operator" on server "https://api.skordas0624.qe.devcluster.openshift.com:6443".

$ oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-148-91.us-east-2.compute.internal    Ready    worker   51m   v1.21.0-rc.0+766a5fe
ip-10-0-153-206.us-east-2.compute.internal   Ready    master   56m   v1.21.0-rc.0+766a5fe
ip-10-0-166-117.us-east-2.compute.internal   Ready    master   56m   v1.21.0-rc.0+766a5fe
ip-10-0-166-134.us-east-2.compute.internal   Ready    worker   49m   v1.21.0-rc.0+766a5fe
ip-10-0-207-169.us-east-2.compute.internal   Ready    worker   49m   v1.21.0-rc.0+766a5fe
ip-10-0-221-168.us-east-2.compute.internal   Ready    master   55m   v1.21.0-rc.0+766a5fe

$ node=ip-10-0-148-91.us-east-2.compute.internal
$ oc get pods -o wide | grep $node
tuned-zd4q4   1/1   Running   0   50m   10.0.148.91   ip-10-0-148-91.us-east-2.compute.internal   <none>   <none>

$ pod=tuned-zd4q4
$ oc label pod $pod tuned.openshift.io/elasticsearch=
pod/tuned-zd4q4 labeled

$ oc get pods --show-labels
NAME                                            READY   STATUS    RESTARTS   AGE   LABELS
cluster-node-tuning-operator-5957c5df4f-fktgv   1/1     Running   1          65m   name=cluster-node-tuning-operator,pod-template-hash=5957c5df4f
tuned-28bzx                                     1/1     Running   0          50m   controller-revision-hash=649d574bbd,openshift-app=tuned,pod-template-generation=1
tuned-fm8tn                                     1/1     Running   0          50m   controller-revision-hash=649d574bbd,openshift-app=tuned,pod-template-generation=1
tuned-hlxg9                                     1/1     Running   0          54m   controller-revision-hash=649d574bbd,openshift-app=tuned,pod-template-generation=1
tuned-rfmhw                                     1/1     Running   0          54m   controller-revision-hash=649d574bbd,openshift-app=tuned,pod-template-generation=1
tuned-zd4q4                                     1/1     Running   0          51m   controller-revision-hash=649d574bbd,openshift-app=tuned,pod-template-generation=1,tuned.openshift.io/elasticsearch=
tuned-zhjgz                                     1/1     Running   0          54m   controller-revision-hash=649d574bbd,openshift-app=tuned,pod-template-generation=1

$ oc create -f- <<EOF
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: net-plugin
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Test BZ 1974718
      include=openshift-control-plane
      [net]
      channels=combined 1
    name: testnetplugin
  recommend:
  - match:
    - label: tuned.openshift.io/elasticsearch
      type: pod
    priority: 5
    profile: testnetplugin
EOF
tuned.tuned.openshift.io/net-plugin created

$ oc get tuned
NAME         AGE
default      55m
net-plugin   10s
rendered     55m

$ oc get profiles
NAME                                         TUNED                     APPLIED   DEGRADED   AGE
ip-10-0-148-91.us-east-2.compute.internal    testnetplugin             True      False      52m
ip-10-0-153-206.us-east-2.compute.internal   openshift-control-plane   True      False      55m
ip-10-0-166-117.us-east-2.compute.internal   openshift-control-plane   True      False      55m
ip-10-0-166-134.us-east-2.compute.internal   openshift-node            True      False      50m
ip-10-0-207-169.us-east-2.compute.internal   openshift-node            True      False      50m
ip-10-0-221-168.us-east-2.compute.internal   openshift-control-plane   True      False      55m

$ oc logs $pod
[...]
I0624 12:43:43.905653    3046 tuned.go:312] extracting Tuned profiles
I0624 12:43:44.036297    3046 tuned.go:346] recommended Tuned profile openshift-node content unchanged
I0624 12:43:44.049606    3046 tuned.go:390] written "/etc/tuned/recommend.d/50-openshift.conf" to set Tuned profile testnetplugin
I0624 12:43:44.708138    3046 tuned.go:644] active profile (openshift-node) != recommended profile (testnetplugin)
I0624 12:43:44.708182    3046 tuned.go:499] reloading tuned...
I0624 12:43:44.708188    3046 tuned.go:502] sending HUP to PID 4522
2021-06-24 12:43:44,708 INFO     tuned.daemon.daemon: stopping tuning
2021-06-24 12:43:44,731 INFO     tuned.daemon.daemon: terminating Tuned, rolling back all changes
2021-06-24 12:43:44,745 INFO     tuned.daemon.daemon: Running in automatic mode, checking what profile is recommended for your configuration.
2021-06-24 12:43:44,747 INFO     tuned.daemon.daemon: Using 'testnetplugin' profile
2021-06-24 12:43:44,748 INFO     tuned.profiles.loader: loading profile: testnetplugin
2021-06-24 12:43:44,817 INFO     tuned.daemon.daemon: starting tuning
2021-06-24 12:43:44,819 INFO     tuned.plugins.base: instance cpu: assigning devices cpu0, cpu1
2021-06-24 12:43:44,820 INFO     tuned.plugins.plugin_cpu: We are running on an x86 GenuineIntel platform
2021-06-24 12:43:44,823 WARNING  tuned.plugins.plugin_cpu: your CPU doesn't support MSR_IA32_ENERGY_PERF_BIAS, ignoring CPU energy performance bias
2021-06-24 12:43:44,826 WARNING  tuned.plugins.base: instance disk: no matching devices available
2021-06-24 12:43:44,830 INFO     tuned.plugins.base: instance net: assigning devices ens5
2021-06-24 12:43:44,834 INFO     tuned.plugins.plugin_sysctl: reapplying system sysctl
2021-06-24 12:43:44,859 INFO     tuned.daemon.daemon: static tuning from profile 'testnetplugin' applied
I0624 12:43:45.585700    3046 tuned.go:390] written "/etc/tuned/recommend.d/50-openshift.conf" to set Tuned profile testnetplugin
I0624 12:43:45.585931    3046 tuned.go:842] updated Profile ip-10-0-148-91.us-east-2.compute.internal stalld=<nil>, bootcmdline:
I0624 12:43:45.710848    3046 tuned.go:655] active and recommended profile (testnetplugin) match; profile change will not trigger profile reload

$ oc debug node/$node
Starting pod/ip-10-0-148-91us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.148.91
If you don't see a command prompt, try pressing enter.
sh-4.4# find /sys/class/net -type l -not -lname *virtual* -printf '%f\n'
ens5
sh-4.4# ethtool -l ens5
Channel parameters for ens5:
Pre-set maximums:
RX:             n/a
TX:             n/a
Other:          n/a
Combined:       2
Current hardware settings:
RX:             n/a
TX:             n/a
Other:          n/a
Combined:       1
sh-4.4# exit
exit

Removing debug pod ...
$ oc delete tuned net-plugin
tuned.tuned.openshift.io "net-plugin" deleted

$ oc get profiles
NAME                                         TUNED                     APPLIED   DEGRADED   AGE
ip-10-0-148-91.us-east-2.compute.internal    openshift-node            True      False      55m
ip-10-0-153-206.us-east-2.compute.internal   openshift-control-plane   True      False      58m
ip-10-0-166-117.us-east-2.compute.internal   openshift-control-plane   True      False      58m
ip-10-0-166-134.us-east-2.compute.internal   openshift-node            True      False      53m
ip-10-0-207-169.us-east-2.compute.internal   openshift-node            True      False      53m
ip-10-0-221-168.us-east-2.compute.internal   openshift-control-plane   True      False      58m

$ oc debug node/$node
Starting pod/ip-10-0-148-91us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.148.91
If you don't see a command prompt, try pressing enter.
sh-4.4# find /sys/class/net -type l -not -lname *virtual* -printf '%f\n'
ens5
sh-4.4# ethtool -l ens5
Channel parameters for ens5:
Pre-set maximums:
RX:             n/a
TX:             n/a
Other:          n/a
Combined:       2
Current hardware settings:
RX:             n/a
TX:             n/a
Other:          n/a
Combined:       2
sh-4.4# exit
exit

Removing debug pod ...
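The profile checks in this verification (the expected profile is applied and the node is not degraded) could be scripted. Below is a rough Python sketch that parses the whitespace-separated text of `oc get profiles`. The helper names `parse_profiles` and `profile_ok` are hypothetical, and the sketch assumes the five-column layout shown in this comment (NAME, TUNED, APPLIED, DEGRADED, AGE) with no spaces inside any field.

```python
def parse_profiles(table):
    """Turn `oc get profiles` text into {node_name: {column: value}}."""
    lines = [line for line in table.splitlines() if line.strip()]
    header = lines[0].split()
    rows = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        rows[row["NAME"]] = row
    return rows


def profile_ok(rows, node, expected):
    """True when `node` has `expected` applied and is not degraded."""
    row = rows.get(node)
    return (
        bool(row)
        and row["TUNED"] == expected
        and row["APPLIED"] == "True"
        and row["DEGRADED"] == "False"
    )


# Two rows copied from the output captured while net-plugin was active.
SAMPLE = """NAME TUNED APPLIED DEGRADED AGE
ip-10-0-148-91.us-east-2.compute.internal testnetplugin True False 52m
ip-10-0-153-206.us-east-2.compute.internal openshift-control-plane True False 55m
"""

rows = parse_profiles(SAMPLE)
node = "ip-10-0-148-91.us-east-2.compute.internal"
print(profile_ok(rows, node, "testnetplugin"))
print(profile_ok(rows, node, "openshift-node"))
```

The same check with `openshift-node` as the expected profile would apply after the custom Tuned object is deleted.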
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438