Bug 1882304 - Must gather pod doesn't run on the master node
Summary: Must gather pod doesn't run on the master node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: oc
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.6.0
Assignee: Jan Chaloupka
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-24 09:48 UTC by Qin Ping
Modified: 2020-10-27 16:45 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:45:08 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift oc pull 595 | 0 | None | closed | bug 1882304: oc adm must-gather: have must-gather pods run on master nodes if --node-name is not specified | 2021-02-18 03:02:13 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:45:22 UTC

Description Qin Ping 2020-09-24 09:48:49 UTC
Description of Problem:
Must gather pod doesn't run on the master node

Version-Release number of selected component (if applicable):
$ oc version
Client Version: 4.6.0-0.nightly-2020-09-24-015627
Server Version: 4.6.0-0.nightly-2020-09-23-022756


How Reproducible:
Always


Steps to Reproduce:
oc adm must-gather -h

Actual Results:
Options:
      --dest-dir='': Set a specific directory on the local machine to write gathered data to.
      --image=[]: Specify a must-gather plugin image to run. If not specified, OpenShift's default must-gather image
will be used.
      --image-stream=[]: Specify an image stream (namespace/name:tag) containing a must-gather plugin image to run.
      --node-name='': Set a specific node to use - by default a random master will be used
      --source-dir='/must-gather/': Set the specific directory on the pod copy the gathered data from.
      --timeout=600: The length of time to gather data, in seconds. Defaults to 10 minutes.

Per the help text, when --node-name is not specified the must-gather pod should run on a random master node. In practice it is scheduled on a worker node:

$ oc get pod must-gather-l4hkr -n openshift-must-gather-n8mss -o json | jq .spec
{
  "containers": [
    {
  <--skip-->
  "nodeName": "ip-10-0-164-189.us-east-2.compute.internal",
  "nodeSelector": {
    "kubernetes.io/os": "linux"
  },
  "preemptionPolicy": "PreemptLowerPriority",
  "priority": 0,
  "restartPolicy": "Never",
  "schedulerName": "default-scheduler",
  "securityContext": {},
  "serviceAccount": "default",
  "serviceAccountName": "default",
  "terminationGracePeriodSeconds": 0,
  "tolerations": [
    {
      "operator": "Exists"
    }
  ],
  <--skip-->
}

$ oc get nodes ip-10-0-164-189.us-east-2.compute.internal
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-164-189.us-east-2.compute.internal   Ready    worker   3h36m   v1.19.0+8a39924
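
The manual check above (read nodeName from the pod spec, then look up that node's role) can be sketched in Python. This is only an illustrative helper, not part of oc: the function name node_roles is hypothetical, and it assumes the default column layout of `oc get nodes` shown in this report.

```python
def node_roles(nodes_output: str) -> dict:
    """Map node name -> set of roles, parsed from default `oc get nodes` text output."""
    roles = {}
    for line in nodes_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything too short to carry a ROLES column.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        # ROLES is the third column; multiple roles are comma-separated.
        roles[fields[0]] = set(fields[2].split(","))
    return roles


# Values copied from the report above.
NODES = """\
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-164-189.us-east-2.compute.internal   Ready    worker   3h36m   v1.19.0+8a39924
"""
pod_spec = {"nodeName": "ip-10-0-164-189.us-east-2.compute.internal"}

on_master = "master" in node_roles(NODES).get(pod_spec["nodeName"], set())
print(on_master)  # False: the pod landed on a worker node
```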

Expected Results:
Either the must-gather pod runs on a master node by default, or the help text is updated to match the actual behavior.

Comment 2 Jan Chaloupka 2020-09-30 10:17:23 UTC
Having checked master HEAD and the git history, it does not seem there was ever a mechanism to steer the must-gather pod(s) toward master nodes by default.
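
For illustration only, the default placement that the help text promises could look like the following Python sketch. It is not the code in oc (which is written in Go), and placement_for is a hypothetical name. It assumes the master role is expressed through the node-role.kubernetes.io/master label with an empty value, and that an explicit node name should suppress the role selector.

```python
from typing import Optional

MASTER_LABEL = "node-role.kubernetes.io/master"


def placement_for(node_name: Optional[str]) -> dict:
    """Return the scheduling-related pod spec fields for a must-gather pod (sketch)."""
    spec = {
        "nodeSelector": {"kubernetes.io/os": "linux"},
        # A toleration with only operator=Exists tolerates every taint.
        "tolerations": [{"operator": "Exists"}],
    }
    if node_name:
        # Explicit node: pin the pod there and add no role selector.
        spec["nodeName"] = node_name
    else:
        # No node given: restrict scheduling to nodes carrying the master label.
        spec["nodeSelector"][MASTER_LABEL] = ""
    return spec
```

The catch-all toleration matters for the default case: master nodes are normally tainted against regular workloads, so a node selector alone would leave the pod unschedulable.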

Comment 4 RamaKasturi 2020-10-01 13:23:12 UTC
Verified with the payload below: must-gather runs on a master node when --node-name is not specified.

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc version
Client Version: 4.6.0-0.nightly-2020-10-01-070841
Server Version: 4.6.0-0.nightly-2020-10-01-041253
Kubernetes Version: v1.19.0+beb741b

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ oc get nodes | grep master
ip-10-0-148-147.us-east-2.compute.internal   Ready    master   114m   v1.19.0+beb741b
ip-10-0-179-122.us-east-2.compute.internal   Ready    master   114m   v1.19.0+beb741b
ip-10-0-198-134.us-east-2.compute.internal   Ready    master   114m   v1.19.0+beb741b
[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc adm must-gather

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ oc get pod must-gather-kfjc5 -n openshift-must-gather-827br -o json | jq .spec
{
  "containers": [
    {
      "command": [
        "/bin/bash",
        "-c",
        "/usr/bin/gather; sync"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "gather",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-z9th7",
          "readOnly": true
        }
      ]
    },
    {
      "command": [
        "/bin/bash",
        "-c",
        "trap : TERM INT; sleep infinity & wait"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "copy",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-z9th7",
          "readOnly": true
        }
      ]
    }
  ],
  "dnsPolicy": "ClusterFirst",
  "enableServiceLinks": true,
  "imagePullSecrets": [
    {
      "name": "default-dockercfg-lpfsl"
    }
  ],
  "nodeName": "ip-10-0-148-147.us-east-2.compute.internal",
  "nodeSelector": {
    "kubernetes.io/os": "linux",
    "node-role.kubernetes.io/master": ""
  },
  "preemptionPolicy": "PreemptLowerPriority",
  "priority": 0,
  "restartPolicy": "Never",
  "schedulerName": "default-scheduler",
  "securityContext": {},
  "serviceAccount": "default",
  "serviceAccountName": "default",
  "terminationGracePeriodSeconds": 0,
  "tolerations": [
    {
      "operator": "Exists"
    }
  ],
  "volumes": [
    {
      "emptyDir": {},
      "name": "must-gather-output"
    },
    {
      "name": "default-token-z9th7",
      "secret": {
        "defaultMode": 420,
        "secretName": "default-token-z9th7"
      }
    }
  ]
}

When --node-name is specified, the pod runs on that node.

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc get nodes | grep worker
ip-10-0-132-124.us-east-2.compute.internal   Ready    worker   116m   v1.19.0+beb741b
ip-10-0-181-128.us-east-2.compute.internal   Ready    worker   116m   v1.19.0+beb741b
ip-10-0-196-150.us-east-2.compute.internal   Ready    worker   116m   v1.19.0+beb741b

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc adm must-gather --node-name=ip-10-0-132-124.us-east-2.compute.internal

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ oc get pod must-gather-hnttn -n openshift-must-gather-wblh7 -o json | jq .spec
{
  "containers": [
    {
      "command": [
        "/bin/bash",
        "-c",
        "/usr/bin/gather; sync"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "gather",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-dkq6b",
          "readOnly": true
        }
      ]
    },
    {
      "command": [
        "/bin/bash",
        "-c",
        "trap : TERM INT; sleep infinity & wait"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "copy",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-dkq6b",
          "readOnly": true
        }
      ]
    }
  ],
  "dnsPolicy": "ClusterFirst",
  "enableServiceLinks": true,
  "imagePullSecrets": [
    {
      "name": "default-dockercfg-wpd4q"
    }
  ],
  "nodeName": "ip-10-0-132-124.us-east-2.compute.internal",
  "nodeSelector": {
    "kubernetes.io/os": "linux"
  },
  "preemptionPolicy": "PreemptLowerPriority",
  "priority": 0,
  "restartPolicy": "Never",
  "schedulerName": "default-scheduler",
  "securityContext": {},
  "serviceAccount": "default",
  "serviceAccountName": "default",
  "terminationGracePeriodSeconds": 0,
  "tolerations": [
    {
      "operator": "Exists"
    }
  ],
  "volumes": [
    {
      "emptyDir": {},
      "name": "must-gather-output"
    },
    {
      "name": "default-token-dkq6b",
      "secret": {
        "defaultMode": 420,
        "secretName": "default-token-dkq6b"
      }
    }
  ]
}

When --node-name is set to one of the master nodes, the pod is scheduled on that node.

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc get nodes | grep master
ip-10-0-148-147.us-east-2.compute.internal   Ready    master   130m   v1.19.0+beb741b
ip-10-0-179-122.us-east-2.compute.internal   Ready    master   130m   v1.19.0+beb741b
ip-10-0-198-134.us-east-2.compute.internal   Ready    master   130m   v1.19.0+beb741b
[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ ./oc adm must-gather --node-name=ip-10-0-179-122.us-east-2.compute.internal

[knarra@knarra openshift-client-linux-4.6.0-0.nightly-2020-10-01-070841]$ oc get pod must-gather-75gng -n openshift-must-gather-lxc44 -o json | jq .spec
{
  "containers": [
    {
      "command": [
        "/bin/bash",
        "-c",
        "/usr/bin/gather; sync"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "gather",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-7rtm9",
          "readOnly": true
        }
      ]
    },
    {
      "command": [
        "/bin/bash",
        "-c",
        "trap : TERM INT; sleep infinity & wait"
      ],
      "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3fd026964eb4eb754fc6c28e241bc54f61f71c084424549488b34ecb7c86ba7f",
      "imagePullPolicy": "IfNotPresent",
      "name": "copy",
      "resources": {},
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "volumeMounts": [
        {
          "mountPath": "/must-gather",
          "name": "must-gather-output"
        },
        {
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
          "name": "default-token-7rtm9",
          "readOnly": true
        }
      ]
    }
  ],
  "dnsPolicy": "ClusterFirst",
  "enableServiceLinks": true,
  "imagePullSecrets": [
    {
      "name": "default-dockercfg-ld87g"
    }
  ],
  "nodeName": "ip-10-0-179-122.us-east-2.compute.internal",
  "nodeSelector": {
    "kubernetes.io/os": "linux"
  },
  "preemptionPolicy": "PreemptLowerPriority",
  "priority": 0,
  "restartPolicy": "Never",
  "schedulerName": "default-scheduler",
  "securityContext": {},
  "serviceAccount": "default",
  "serviceAccountName": "default",
  "terminationGracePeriodSeconds": 0,
  "tolerations": [
    {
      "operator": "Exists"
    }
  ],
  "volumes": [
    {
      "emptyDir": {},
      "name": "must-gather-output"
    },
    {
      "name": "default-token-7rtm9",
      "secret": {
        "defaultMode": 420,
        "secretName": "default-token-7rtm9"
      }
    }
  ]
}
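
The three pod specs in this comment follow one rule: the master node selector appears only when --node-name is omitted. A minimal Python sketch of that check is below. The helper name has_master_selector is hypothetical, and the specs are trimmed copies of the jq output above that keep only the scheduling fields.

```python
MASTER_LABEL = "node-role.kubernetes.io/master"


def has_master_selector(pod_spec: dict) -> bool:
    """True if the pod spec restricts scheduling to nodes with the master label."""
    return MASTER_LABEL in pod_spec.get("nodeSelector", {})


# (description, trimmed pod spec, expected selector presence)
cases = [
    ("no --node-name",
     {"nodeName": "ip-10-0-148-147.us-east-2.compute.internal",
      "nodeSelector": {"kubernetes.io/os": "linux", MASTER_LABEL: ""}},
     True),
    ("--node-name set to a worker",
     {"nodeName": "ip-10-0-132-124.us-east-2.compute.internal",
      "nodeSelector": {"kubernetes.io/os": "linux"}},
     False),
    ("--node-name set to a master",
     {"nodeName": "ip-10-0-179-122.us-east-2.compute.internal",
      "nodeSelector": {"kubernetes.io/os": "linux"}},
     False),
]

for description, spec, expected in cases:
    assert has_master_selector(spec) == expected, description
```

The same function could be applied to a full spec by loading the JSON printed by `oc get pod ... -o json | jq .spec` with json.loads.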


Based on the above, moving the bug to the VERIFIED state.

Comment 7 errata-xmlrpc 2020-10-27 16:45:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

