Bug 2117400

Summary: Must Gather, "must-gather" pod serviceaccount "default" not found
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Oded <oviner>
Component: must-gather
Assignee: yati padia <ypadia>
Status: CLOSED WORKSFORME
QA Contact: Prasad Desala <tdesala>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.11
CC: muagarwa, ocs-bugs, odf-bz-bot, ypadia
Target Milestone: ---
Keywords: Automation
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-10 11:41:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Oded 2022-08-10 22:35:34 UTC
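Description of problem (please be as detailed as possible and provide log
snippets):
"oc adm must-gather" fails after an OCP upgrade: the "must-gather" pod is rejected because serviceaccount "default" is not found in the temporary must-gather namespace.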

Version of all relevant components (if applicable):
OCP Version: 4.12.0-0.nightly-2022-07-27-133042
ODF Version: 4.11.0-127
Provider: AWS

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
Links:
https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/4901/parameters/

http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-118ai3c33-uo/j-118ai3c33-uo_20220728T044943/logs/ocs-ci-logs-1659011848/tests/manage/z_cluster/test_must_gather.py/TestMustGather/test_must_gather-CEPH/logs

Test Process:
1. Deploy OCP 4.11 + ODF 4.11.
2. Upgrade OCP [4.11 -> 4.12].
3. Run the must-gather (MG) command:
oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.11 --dest-dir=/tmp/tmp8p5ygm7g_ocs_logs/ocs_must_gather
4. The "must-gather" pod is not created because serviceaccount "default" is not found:
Error running must-gather collection:
    pods "must-gather-" is forbidden: error looking up service account openshift-must-gather-tjwj8/default: serviceaccount "default" not found
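
When triaging a failure like the one in step 4, it can help to pull the namespace and service account name out of the error text, for example to follow up with `oc get serviceaccount default -n <namespace>` while the temporary namespace still exists. The following is a minimal sketch, not part of ocs-ci; the function name `parse_missing_service_account` is hypothetical, and the pattern assumes the exact error wording quoted in this report, which may differ in other oc or Kubernetes versions.

```python
import re
from typing import Optional, Tuple

# Pattern for the admission error quoted in this report. The wording is an
# assumption taken from the log text; other versions may phrase it differently.
SA_LOOKUP_RE = re.compile(
    r"error looking up service account (?P<namespace>[^/\s]+)/(?P<name>[^:\s]+): "
    r'serviceaccount "[^"]+" not found'
)


def parse_missing_service_account(output: str) -> Optional[Tuple[str, str]]:
    """Return (namespace, service_account) if the error is present, else None."""
    match = SA_LOOKUP_RE.search(output)
    if match is None:
        return None
    return match.group("namespace"), match.group("name")


# Error line copied from the report.
sample = (
    'pods "must-gather-" is forbidden: error looking up service account '
    'openshift-must-gather-tjwj8/default: serviceaccount "default" not found'
)

print(parse_missing_service_account(sample))
# → ('openshift-must-gather-tjwj8', 'default')
```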

2022-07-28 22:59:48,457 - MainThread - ERROR - ocs_ci.ocs.utils.run_must_gather.906 - Failed during must gather logs! Error: Error during execution of command: oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.11 --dest-dir=/tmp/tmp8p5ygm7g_ocs_logs/ocs_must_gather.
Error is 

Error running must-gather collection:
    pods "must-gather-" is forbidden: error looking up service account openshift-must-gather-tjwj8/default: serviceaccount "default" not found

Falling back to `oc adm inspect clusteroperators.v1.config.openshift.io` to collect basic cluster information.
error running backup collection: errors occurred while gathering data:
    [skipping gathering namespaces/openshift-config due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-config-managed due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-authentication due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-authentication-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-ingress due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-oauth-apiserver due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-machine-api due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-cloud-controller-manager-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-cloud-controller-manager due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-cloud-credential-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-config-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering oauthclients.oauth.openshift.io/console due to error: the server doesn't have a resource type "oauthclients", skipping gathering namespaces/openshift-console-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-console due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-cluster-storage-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering 
namespaces/openshift-dns-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-dns due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-etcd-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-etcd due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-image-registry due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-ingress-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-ingress-canary due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-insights due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering secrets/support due to error: secrets "support" not found, skipping gathering namespaces/openshift-kube-apiserver-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-kube-apiserver due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-kube-controller-manager due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-kube-controller-manager-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/kube-system due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-kube-scheduler due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-kube-scheduler-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping 
gathering namespaces/openshift-kube-storage-version-migrator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-kube-storage-version-migrator-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-cluster-machine-approver due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-machine-config-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-kni-infra due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-openstack-infra due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-ovirt-infra due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-vsphere-infra due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-nutanix-infra due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-marketplace due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-monitoring due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-user-workload-monitoring due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-multus due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-sdn due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-host-network due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering 
namespaces/openshift-network-diagnostics due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-network-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-cloud-network-config-controller due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-cluster-node-tuning-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-apiserver-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-apiserver due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering endpoints/host-etcd-2 due to error: endpoints "host-etcd-2" not found, skipping gathering namespaces/openshift-controller-manager-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-controller-manager due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-cluster-samples-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering templates.template.openshift.io due to error: the server doesn't have a resource type "templates", skipping gathering namespaces/openshift-operator-lifecycle-manager due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-service-ca-operator due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-service-ca due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering namespaces/openshift-cluster-csi-drivers due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering sharedconfigmaps.sharedresource.openshift.io 
due to error: the server doesn't have a resource type "sharedconfigmaps", skipping gathering sharedsecrets.sharedresource.openshift.io due to error: the server doesn't have a resource type "sharedsecrets"]
Error from server (Forbidden): pods "must-gather-" is forbidden: error looking up service account openshift-must-gather-tjwj8/default: serviceaccount "default" not found
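
The backup collection output above is long but repetitive. A rough sketch such as the following can group the "skipping gathering" entries by error reason so the distinct failures stand out. The helper name `summarize_skips` is hypothetical, and the parsing assumes the comma-separated format seen in this log, including line wraps that fall in the middle of an entry.

```python
import re
from collections import Counter

# One entry looks like:
#   skipping gathering <target> due to error: <reason>
# Entries are separated by ", skipping gathering " and the list ends with "]".
SKIP_RE = re.compile(
    r"skipping gathering (?P<target>\S+) due to error: "
    r"(?P<reason>.+?)(?=, skipping gathering |\]|$)"
)


def summarize_skips(log_text: str) -> Counter:
    """Count skipped targets per error reason."""
    # Collapse line wraps so entries split across lines are matched as one.
    normalized = " ".join(log_text.split())
    return Counter(m.group("reason") for m in SKIP_RE.finditer(normalized))


# Shortened excerpt in the same format as the log above.
excerpt = """[skipping gathering namespaces/openshift-config due to error: the server doesn't have a resource type "egressfirewalls", skipping gathering secrets/support due to error: secrets "support" not found, skipping
gathering namespaces/openshift-dns due to error: the server doesn't have a resource type "egressfirewalls"]"""

for reason, count in summarize_skips(excerpt).most_common():
    print(count, reason)
```

Run against the full log, this kind of summary makes it easier to see that nearly all entries share one reason (the missing "egressfirewalls" resource type), separate from the service account error that stopped the main collection.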
Must-Gather Output: 


Actual results:


Expected results:


Additional info:

Comment 2 Mudit Agarwal 2022-10-31 10:19:04 UTC
Oded, is this reproducible with the latest build?

Comment 4 yati padia 2022-11-08 04:55:40 UTC
I have asked Oded to reproduce the issue so that I can debug the cluster. Once that is done, I will provide the required ack; for now, I am removing the needinfo request on me.

Comment 5 Oded 2022-11-10 11:41:28 UTC
The bug was not reproduced.

Setup:
Provider: VMware
ODF Version: 4.11.3-5
OCP Version: 4.12.0-ec.5

Test Process:
1. Deploy ODF 4.11 + OCP 4.11.
2. Upgrade OCP [4.11 -> 4.12]:
$  oc patch clusterversions/version -p '{"spec":{"channel":"stable-4.12"}}' --type=merge
clusterversion.config.openshift.io/version patched

[odedviner@fedora auth]$  oc adm upgrade --to-image=quay.io/openshift-release-dev/ocp-release:4.12.0-ec.5-x86_64 --allow-explicit-upgrade --force
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Requesting update to release image quay.io/openshift-release-dev/ocp-release:4.12.0-ec.5-x86_64
3. Run the must-gather (MG) command:
oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.11

4. Check the relevant files in the must-gather directory (see the test):
https://github.com/red-hat-storage/ocs-ci/blob/master/tests/manage/z_cluster/test_must_gather.py
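
The file check in step 4 can be approximated with a short script. This is only an illustrative sketch and not the logic of the linked ocs-ci test: the helper `missing_files` and the file names `ceph_status` and `ceph_health_detail` are hypothetical placeholders. It searches recursively because the collected files sit in subdirectories of the destination directory.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_files(mg_dir: str, expected_names: Iterable[str]) -> List[str]:
    """Return the expected file names that are absent or empty under mg_dir."""
    root = Path(mg_dir)
    missing = []
    for name in expected_names:
        # Accept the file anywhere below the must-gather directory,
        # as long as it is a regular file with non-zero size.
        found = any(
            p.is_file() and p.stat().st_size > 0 for p in root.rglob(name)
        )
        if not found:
            missing.append(name)
    return missing


# Demo with a throwaway directory; the names are placeholders, not the
# real list of files the ocs-ci test validates.
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "image-dir" / "ceph"
    nested.mkdir(parents=True)
    (nested / "ceph_status").write_text("placeholder\n")
    demo_missing = missing_files(tmp, ["ceph_status", "ceph_health_detail"])

print(demo_missing)
# → ['ceph_health_detail']
```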