Description of Problem:
The must-gather pod doesn't run on a master node.

Version-Release number of selected component (if applicable):
$ oc version
Client Version: 4.6.0-0.nightly-2020-09-24-015627
Server Version: 4.6.0-0.nightly-2020-09-23-022756

How Reproducible:
Always

Steps to Reproduce:
oc adm must-gather -h

Actual Results:
Options:
    --dest-dir='': Set a specific directory on the local machine to write gathered data to.
    --image=[]: Specify a must-gather plugin image to run. If not specified, OpenShift's default must-gather image will be used.
    --image-stream=[]: Specify an image stream (namespace/name:tag) containing a must-gather plugin image to run.
    --node-name='': Set a specific node to use - by default a random master will be used
    --source-dir='/must-gather/': Set the specific directory on the pod copy the gathered data from.
    --timeout=600: The length of time to gather data, in seconds. Defaults to 10 minutes.

Per the help text, if --node-name is not specified, the must-gather pod should run on a master node. In practice it does not:

$ oc get pod must-gather-l4hkr -n openshift-must-gather-n8mss -o json | jq .spec
{
  "containers": [
    {
<--skip-->
  "nodeName": "ip-10-0-164-189.us-east-2.compute.internal",
  "nodeSelector": {
    "kubernetes.io/os": "linux"
  },
  "preemptionPolicy": "PreemptLowerPriority",
  "priority": 0,
  "restartPolicy": "Never",
  "schedulerName": "default-scheduler",
  "securityContext": {},
  "serviceAccount": "default",
  "serviceAccountName": "default",
  "terminationGracePeriodSeconds": 0,
  "tolerations": [
    {
      "operator": "Exists"
    }
  ],
<--skip-->
}

$ oc get nodes ip-10-0-164-189.us-east-2.compute.internal
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-164-189.us-east-2.compute.internal   Ready    worker   3h36m   v1.19.0+8a39924

Expected Results:
Ensure the must-gather pod runs on a master node, or update the help text.
Checking master HEAD and the git history, it does not appear that there was ever a mechanism to steer the must-gather pod(s) toward master nodes by default.
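While no such default exists, the behaviour described in the help text can be approximated by picking a master explicitly and passing it through --node-name. This is only a minimal sketch: it assumes master nodes carry the node-role.kubernetes.io/master label, and the function name must_gather_on_master is hypothetical, not part of oc.

```shell
#!/bin/sh
# Hypothetical helper: run must-gather pinned to the first master node found.
# Assumes masters are labelled node-role.kubernetes.io/master.
must_gather_on_master() {
    # Name of the first node that carries the master role label.
    master=$(oc get nodes -l node-role.kubernetes.io/master \
        -o jsonpath='{.items[0].metadata.name}') || return 1
    if [ -z "$master" ]; then
        echo "no master node found" >&2
        return 1
    fi
    # Extra arguments (for example --dest-dir) are passed through unchanged.
    oc adm must-gather --node-name="$master" "$@"
}
```

Calling must_gather_on_master with no arguments runs a default gather on that node.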
Verified with the payload below: must-gather runs on a master node when no --node-name is specified.

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc version
Client Version: 4.6.0-0.nightly-2020-10-01-070841
Server Version: 4.6.0-0.nightly-2020-10-01-041253
Kubernetes Version: v1.19.0+beb741b

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ oc get nodes | grep master
ip-10-0-148-147.us-east-2.compute.internal   Ready   master   114m   v1.19.0+beb741b
ip-10-0-179-122.us-east-2.compute.internal   Ready   master   114m   v1.19.0+beb741b
ip-10-0-198-134.us-east-2.compute.internal   Ready   master   114m   v1.19.0+beb741b

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc adm must-gather

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ oc get pod must-gather-kfjc5 -n openshift-must-gather-827br -o json | jq .spec
{
  "containers": [
    {
      "command": [
        "/bin/bash",
        "-c",
        "/usr/bin/gather; sync"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "gather",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-z9th7",
          "readOnly": true
        }
      ]
    },
    {
      "command": [
        "/bin/bash",
        "-c",
        "trap : TERM INT; sleep infinity & wait"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "copy",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-z9th7",
          "readOnly": true
        }
      ]
    }
  ],
  "dnsPolicy": "ClusterFirst",
  "enableServiceLinks": true,
  "imagePullSecrets": [
    {
      "name": "default-dockercfg-lpfsl"
    }
  ],
  "nodeName": "ip-10-0-148-147.us-east-2.compute.internal",
  "nodeSelector": {
    "kubernetes.io/os": "linux",
    "node-role.kubernetes.io/master": ""
  },
  "preemptionPolicy": "PreemptLowerPriority",
  "priority": 0,
  "restartPolicy": "Never",
  "schedulerName": "default-scheduler",
  "securityContext": {},
  "serviceAccount": "default",
  "serviceAccountName": "default",
  "terminationGracePeriodSeconds": 0,
  "tolerations": [
    {
      "operator": "Exists"
    }
  ],
  "volumes": [
    {
      "emptyDir": {},
      "name": "must-gather-output"
    },
    {
      "name": "default-token-z9th7",
      "secret": {
        "defaultMode": 420,
        "secretName": "default-token-z9th7"
      }
    }
  ]
}

When --node-name is specified, the pod runs on that node.

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc get nodes | grep worker
ip-10-0-132-124.us-east-2.compute.internal   Ready   worker   116m   v1.19.0+beb741b
ip-10-0-181-128.us-east-2.compute.internal   Ready   worker   116m   v1.19.0+beb741b
ip-10-0-196-150.us-east-2.compute.internal   Ready   worker   116m   v1.19.0+beb741b

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc adm must-gather --node-name=ip-10-0-132-124.us-east-2.compute.internal

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ oc get pod must-gather-hnttn -n openshift-must-gather-wblh7 -o json | jq .spec
{
  "containers": [
    {
      "command": [
        "/bin/bash",
        "-c",
        "/usr/bin/gather; sync"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "gather",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-dkq6b",
          "readOnly": true
        }
      ]
    },
    {
      "command": [
        "/bin/bash",
        "-c",
        "trap : TERM INT; sleep infinity & wait"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "copy",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-dkq6b",
          "readOnly": true
        }
      ]
    }
  ],
  "dnsPolicy": "ClusterFirst",
  "enableServiceLinks": true,
  "imagePullSecrets": [
    {
      "name": "default-dockercfg-wpd4q"
    }
  ],
  "nodeName": "ip-10-0-132-124.us-east-2.compute.internal",
  "nodeSelector": {
    "kubernetes.io/os": "linux"
  },
  "preemptionPolicy": "PreemptLowerPriority",
  "priority": 0,
  "restartPolicy": "Never",
  "schedulerName": "default-scheduler",
  "securityContext": {},
  "serviceAccount": "default",
  "serviceAccountName": "default",
  "terminationGracePeriodSeconds": 0,
  "tolerations": [
    {
      "operator": "Exists"
    }
  ],
  "volumes": [
    {
      "emptyDir": {},
      "name": "must-gather-output"
    },
    {
      "name": "default-token-dkq6b",
      "secret": {
        "defaultMode": 420,
        "secretName": "default-token-dkq6b"
      }
    }
  ]
}

When --node-name is set to one of the master nodes, the pod is scheduled on that node.

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc get nodes | grep master
ip-10-0-148-147.us-east-2.compute.internal   Ready   master   130m   v1.19.0+beb741b
ip-10-0-179-122.us-east-2.compute.internal   Ready   master   130m   v1.19.0+beb741b
ip-10-0-198-134.us-east-2.compute.internal   Ready   master   130m   v1.19.0+beb741b

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc adm must-gather --node-name=ip-10-0-179-122.us-east-2.compute.internal

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ oc get pod must-gather-75gng -n openshift-must-gather-lxc44 -o json | jq .spec
{
  "containers": [
    {
      "command": [
        "/bin/bash",
        "-c",
        "/usr/bin/gather; sync"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "gather",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-7rtm9",
          "readOnly": true
        }
      ]
    },
    {
      "command": [
        "/bin/bash",
        "-c",
        "trap : TERM INT; sleep infinity & wait"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "copy",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-7rtm9",
          "readOnly": true
        }
      ]
    }
  ],
  "dnsPolicy": "ClusterFirst",
  "enableServiceLinks": true,
  "imagePullSecrets": [
    {
      "name": "default-dockercfg-ld87g"
    }
  ],
  "nodeName": "ip-10-0-179-122.us-east-2.compute.internal",
  "nodeSelector": {
    "kubernetes.io/os": "linux"
  },
  "preemptionPolicy": "PreemptLowerPriority",
  "priority": 0,
  "restartPolicy": "Never",
  "schedulerName": "default-scheduler",
  "securityContext": {},
  "serviceAccount": "default",
  "serviceAccountName": "default",
  "terminationGracePeriodSeconds": 0,
  "tolerations": [
    {
      "operator": "Exists"
    }
  ],
  "volumes": [
    {
      "emptyDir": {},
      "name": "must-gather-output"
    },
    {
      "name": "default-token-7rtm9",
      "secret": {
        "defaultMode": 420,
        "secretName": "default-token-7rtm9"
      }
    }
  ]
}

Based on the above, moving the bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196
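For anyone re-checking the fix on their own cluster, the placement check from the verification above (compare the pod's nodeName against the list of master nodes) can be scripted. This is a rough sketch, assuming masters carry the node-role.kubernetes.io/master label seen in the nodeSelector; the function name pod_on_master is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether a pod was scheduled on a master node.
# Usage: pod_on_master <pod-name> <namespace>
pod_on_master() {
    pod="$1"
    ns="$2"
    # Node the scheduler actually placed the pod on.
    node=$(oc get pod "$pod" -n "$ns" -o jsonpath='{.spec.nodeName}') || return 2
    # "-o name" prints one "node/<name>" line per master; match the whole line literally.
    if oc get nodes -l node-role.kubernetes.io/master -o name | grep -qxF "node/$node"; then
        echo "$pod is on master node $node"
    else
        echo "$pod is on non-master node $node"
        return 1
    fi
}
```

For example, pod_on_master must-gather-kfjc5 openshift-must-gather-827br would be run while the must-gather pod still exists, since the temporary namespace is removed once gathering completes.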