Description of problem: the cloud-credential-operator pod is restarting continuously with an error.

~~~
$ omg get pods
NAME                                         READY   STATUS    RESTARTS   AGE
cloud-credential-operator-54bd6754d9-k7gzz   2/2     Running   5          49m
~~~

~~~
containerID: cri-o://4eb38ca0eeb4df340bb008a4b4c5c24bc2a71b0be4166b28468650ea934acd61
exitCode: 2
finishedAt: '2021-08-10T12:43:07Z'
message: |
  aws.(*ReconcileCloudCredSecret).Reconcile(0xc0006d81b0, 0xc004fc2510, 0xb, 0xc004fc24f0, 0x9, 0xc010bf2500, 0x0, 0x0, 0x0)
  	/go/src/github.com/openshift/cloud-credential-operator/pkg/operator/secretannotator/aws/reconciler.go:172 +0x5d7
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000ab4990, 0x208a620, 0xc010bf2380, 0x0)
  	/go/src/github.com/openshift/cloud-credential-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235 +0x2a9
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000ab4990, 0x203000)
  	/go/src/github.com/openshift/cloud-credential-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:209 +0xb0
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc000ab4990)
  	/go/src/github.com/openshift/cloud-credential-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:188 +0x2b
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000eb31b0)
  	/go/src/github.com/openshift/cloud-credential-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000eb31b0, 0x2606fa0, 0xc010f9e0c0, 0x1, 0xc0004fdd40)
  	/go/src/github.com/openshift/cloud-credential-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xad
  k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000eb31b0, 0x3b9aca00, 0x0, 0xc00101c401, 0xc0004fdd40)
  	/go/src/github.com/openshift/cloud-credential-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
  k8s.io/apimachinery/pkg/util/wait.Until(0xc000eb31b0, 0x3b9aca00, 0xc0004fdd40)
  	/go/src/github.com/openshift/cloud-credential-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
  created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
  	/go/src/github.com/openshift/cloud-credential-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:170 +0x3fa
reason: Error
startedAt: '2021-08-10T12:34:16Z'
~~~
Hi Mani,

I checked the CCO log from the must-gather you attached and see the information below: CCO also observed an AWS root secret in the kube-system namespace. The customer's cluster is installed on OpenStack, so the AWS secret is not required, and it is what triggers the panic. As a workaround, could you remove that AWS root secret (`$ oc delete secret aws-creds -n kube-system`)? I tested it, and that resolves the panic.

################
2021-08-10T12:43:07.086582668Z time="2021-08-10T12:43:07Z" level=info msg="observed admin cloud credential secret event" namespace=kube-system secret=openstack-credentials
2021-08-10T12:43:07.086582668Z time="2021-08-10T12:43:07Z" level=info msg="requeueing all CredentialsRequests"
2021-08-10T12:43:07.086582668Z time="2021-08-10T12:43:07Z" level=info msg="observed admin cloud credential secret event" namespace=kube-system secret=aws-creds
2021-08-10T12:43:07.086582668Z time="2021-08-10T12:43:07Z" level=info msg="requeueing all CredentialsRequests"
################
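For reference, the workaround can be applied and checked as follows. This is a sketch: the namespace and container name for the operator (`openshift-cloud-credential-operator`, `cloud-credential-operator`) are the usual defaults and should be confirmed on the affected cluster.

```shell
# Remove the stray AWS root secret that CCO panics on; an OpenStack
# cluster only needs kube-system/openstack-credentials.
oc delete secret aws-creds -n kube-system

# Confirm the operator pod settles and the RESTARTS count stops growing.
oc get pods -n openshift-cloud-credential-operator

# Tail the operator log to make sure the panic is gone.
oc logs -n openshift-cloud-credential-operator \
  deploy/cloud-credential-operator -c cloud-credential-operator --tail=20
```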
@wang Lin I suggested the workaround to the customer and it worked. Does this affect all RHOCP 4.6 OpenStack clusters?
I launched a regular installation on OpenStack using the same version as the customer; it does not create such an AWS root secret in the kube-system namespace by default, so I am not sure why the customer's cluster had this secret. To reproduce the issue, I created an AWS secret manually in kube-system and then hit the same panic as the customer.
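The manual reproduction step can be sketched like this. The key names mirror a real `aws-creds` root secret; the values are deliberately fake, since the panic is triggered by the secret's presence, not by valid credentials.

```shell
# Plant a dummy AWS root secret on a test OpenStack cluster.
oc create secret generic aws-creds -n kube-system \
  --from-literal=aws_access_key_id=AKIAFAKEFAKEFAKEFAKE \
  --from-literal=aws_secret_access_key=fakefakefakefakefakefakefakefake

# Watch CCO start crash-looping with the panic from the description.
oc get pods -n openshift-cloud-credential-operator -w
```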
The verification steps are pasted on the PR: https://github.com/openshift/cloud-credential-operator/pull/399#issuecomment-940787856. Moving this one to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056