Description of problem:
I use oc debug node/<nodename> frequently to debug issues on nodes, since our managed service does not allow ssh access by default. Recently I had a daemonset that set some sysctl values, and I was using oc debug to verify that the values were set. However, the sysctl values always showed the defaults. It turns out that the pod created for the oc debug node session does not have the hostIPC property set to true. The pod sets hostNetwork and hostPID but not hostIPC.

Version-Release number of selected component (if applicable):
4.10

How reproducible:
Set a sysctl value such as kernel.sem in the host IPC namespace via ssh, the node tuning operator, or a daemonset, then run oc debug node from a client machine and look at the sysctl value that was previously set.

Actual results:
The sysctl value is the default value, not the value that was set.

Expected results:
oc debug should show the host's IPC sysctl values.

Additional info:
Run another pod with hostIPC: true and exec into it. Its sysctl values show the updated setting.
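Background for the report above: kernel.sem and the other System V IPC limits (kernel.msg*, kernel.shm*) are tracked per IPC namespace, so a pod without hostIPC reads its own namespace's values rather than the node's. A minimal sketch of how one might confirm which namespace a shell is in follows. The helper name same_ns is hypothetical, and the example call assumes a privileged pod with hostPID: true, where PID 1 is the node's init process.

```shell
#!/bin/sh
# Hypothetical helper: compare two namespace links such as
# /proc/self/ns/ipc and /proc/1/ns/ipc.
# Prints "same" when both links name the same namespace, "different" otherwise.
# Returns 2 if either link cannot be read.
same_ns() {
    a=$(readlink "$1") || return 2
    b=$(readlink "$2") || return 2
    if [ "$a" = "$b" ]; then
        echo same
    else
        echo different
    fi
}

# Example (assumes hostPID: true and enough privileges to read /proc/1/ns):
#   same_ns /proc/self/ns/ipc /proc/1/ns/ipc
```

If the call prints "different", the shell is not in the node's IPC namespace, and IPC sysctls such as kernel.sem read there are not the node's values.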
Hi Todd,

is the hostIPC property not being set to true causing any issues or problems?

> Recently I had a daemonset that set some sysctl values and was using oc debug to verify the values were set. However, when looking at the sysctl values they were always set to the default values. The issue turns out that the pod created for the oc debug node session did not have the hostIPC property set to true. The pod does set hostNetwork and hostPID but not hostIPC.

Would you please elaborate on the connection between the daemonset and the sysctl values in the oc debug pod? In other words, is the expectation that the oc debug pod should somehow mirror the sysctl values set by the daemonset? Do you have an example you could share, e.g. printing the sysctl values after they are set through ssh, the node tuning operator, or a daemonset, and then showing how they change after running oc debug?
Hi Jan,

Thanks for looking into this. The expectation is that when you oc debug node you see the node's sysctl settings (not a mirror of the settings, and not the pod's default values). For example, I've set the kernel.sem sysctl property using the node tuning operator.

Node Tuning Operator:

$ oc -n openshift-cluster-node-tuning-operator get tuned rendered -o yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  creationTimestamp: "2022-06-30T16:22:37Z"
  generation: 7
  name: rendered
  namespace: openshift-cluster-node-tuning-operator
  ownerReferences:
  - apiVersion: tuned.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: Tuned
    name: default
    uid: 29efad9b-a901-4217-aa41-971d0389e5eb
  resourceVersion: "10461034"
  uid: fdde3b58-d84c-42c8-9607-ab37286b4f85
spec:
  profile:
  - data: "[main]\nsummary=Custom OpenShift node profile for cp4d workloads\ninclude=openshift-control-plane\n[sysctl]\nkernel.sem=\"250 1024000 300 32768\" \nkernel.msgmax=\"65536\"\nkernel.msgmnb=\"65536\"\nkernel.msgmni=\"32768\"\nkernel.shmmni=\"32768\"\nvm.max_map_count=320000\nkernel.shmall=\"33554432\"\nkernel.shmmax=\"68719476736\" \ \n"
    name: openshift-node-cp4d
  recommend: []
status: {}

--------------------------------
ssh to host:

$ ssh toddjohn.128.53
[toddjohn@kube-causc5cw09qpg1kojphg-roks410-default-00000127 ~]$ sysctl kernel.sem
kernel.sem = 250 1024000 300 32768

You can see the value is what was set by node tuning.

--------------------------------
oc debug:

$ oc debug node/10.241.128.53
Starting pod/1024112853-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.241.128.53
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.2# sysctl kernel.sem
kernel.sem = 250 32000 32 128

You can see the value is the "default".
-------------------------------
Pod with hostIPC: true

[roks-4.10]:~$ cat ubuntu-pod-hostipc.yaml
kind: Pod
apiVersion: v1
metadata:
  name: tej-ubuntu-hostipc
spec:
  containers:
  - name: tej-ubuntu-hostipc
    image: us.icr.io/toddjohn/ubuntu-tools
    command: ["/bin/bash", "-ec", "while :; do echo '.'; sleep 5 ; done"]
  restartPolicy: Never
  hostIPC: true

$ oc exec -it tej-ubuntu-hostipc -- sysctl kernel.sem
kernel.sem = 250 1024000 300 32768

The value is what is set by node tuning.

--------------------------------
Pod with hostIPC: false

$ cat ubuntu-pod-no-hostipc.yaml
kind: Pod
apiVersion: v1
metadata:
  name: tej-ubuntu-no-hostipc
spec:
  containers:
  - name: tej-ubuntu-no-hostipc
    image: us.icr.io/toddjohn/ubuntu-tools
    command: ["/bin/bash", "-ec", "while :; do echo '.'; sleep 5 ; done"]
  restartPolicy: Never
  hostIPC: false

$ oc exec -it tej-ubuntu-no-hostipc -- sysctl kernel.sem
kernel.sem = 250 32000 32 128

-------------------------------
I think oc debug node should show the node's values, not the pod defaults (which is what happens when hostIPC is false). Failing that, oc debug node should at least offer an option that sets hostIPC to true in the debug pod so you can see the node's sysctl values.
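To check which of the two cases a given pod falls into, the manifest can be inspected for the hostIPC field. The following is a rough sketch, not part of oc: the function name host_ipc_requested is hypothetical, and it does a plain text match on the YAML rather than parsing it.

```shell
#!/bin/sh
# Hypothetical helper: read a pod manifest (YAML) on stdin and report whether
# it requests the host IPC namespace. Only a literal "hostIPC: true" line is
# recognised; this is a text match, not a YAML parser.
host_ipc_requested() {
    if grep -Eq '^[[:space:]]*hostIPC:[[:space:]]*true[[:space:]]*$'; then
        echo "hostIPC: true"
    else
        echo "hostIPC: false or unset"
    fi
}

# Example, using a pod name from this comment:
#   oc get pod tej-ubuntu-hostipc -o yaml | host_ipc_requested
```

Run against the oc debug pod from this report, it should report that hostIPC is false or unset, matching the default kernel.sem values seen there.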
$ oc version --client
Client Version: 4.12.0-0.nightly-2022-08-12-053438
Kustomize Version: v4.5.4

$ oc get tuned ips -o yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  creationTimestamp: "2022-08-16T06:47:51Z"
  generation: 1
  name: ips
  namespace: openshift-cluster-node-tuning-operator
  resourceVersion: "122759"
  uid: 8106c15c-31cf-407b-b2db-7a9ff5c0e0ba
spec:
  profile:
  - data: |
      [main]
      summary=A custom OpenShift IPS host profile
      [sysctl]
      kernel.msgmni=4096
      kernel.pid_max=1048575
      kernel.shmmax=180000000
      kernel.sem="128 1048576 32 32768"
      net.core.rmem_default=>33554431
      net.core.rmem_max=>33554431
      fs.file-max=>240000
      vm.dirty_background_ratio=64
      vm.dirty_ratio=72
    name: ips-host
  recommend:
  - match:
    - label: tuned
      value: ips
    priority: 20
    profile: ips-host

$ oc get profile yinzhou16-w8mkn-worker-1-hjchn
NAME                             TUNED      APPLIED   DEGRADED   AGE
yinzhou16-w8mkn-worker-1-hjchn   ips-host   True      False      4h55m

$ oc debug node/yinzhou16-w8mkn-worker-1-hjchn
Starting pod/yinzhou16-w8mkn-worker-1-hjchn-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.242.1.5
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# sysctl kernel.sem
kernel.sem = 128 1048576 32 32768
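When comparing the kernel.sem outputs in this thread, it can help to label the four numbers. Per the Linux proc documentation they are SEMMSL, SEMMNS, SEMOPM and SEMMNI, in that order. A small sketch, with the hypothetical helper name label_sem, that labels a line of sysctl output:

```shell
#!/bin/sh
# Hypothetical helper: label the four fields of a "kernel.sem = A B C D"
# line read from stdin (the format printed by "sysctl kernel.sem").
label_sem() {
    # Drop everything up to and including "=", then split on whitespace.
    sed 's/^[^=]*=//' | {
        read -r semmsl semmns semopm semmni
        echo "SEMMSL=$semmsl SEMMNS=$semmns SEMOPM=$semopm SEMMNI=$semmni"
    }
}

# Example:
#   sysctl kernel.sem | label_sem
```

Comparing the labelled output from the debug pod with the values in the tuned profile makes it easier to see which limits match.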
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399