Description of problem:
Running the regenerate-certificates command on a master failed with errors:

E1220 02:19:38.188639 1 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1.ConfigMap: illegal base64 data at input byte 3
E1220 02:19:38.390169 1 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1.Secret: illegal base64 data at input byte 3

Version-Release number of selected component (if applicable):
Payload: 4.3.0-0.nightly-2019-12-13-180405

How reproducible:
Sometimes

Steps to Reproduce:
1. Follow the doc https://docs.openshift.com/container-platform/4.2/backup_and_restore/disaster_recovery/scenario-3-expired-certs.html to do certificate recovery.

Actual results:
1. Running the regenerate-certificates command on a master failed:
[root@control-plane-0 ~]# podman run -it --network=host -v /etc/kubernetes/:/etc/kubernetes/:Z --entrypoint=/usr/bin/cluster-kube-apiserver-operator "${KAO_IMAGE}" regenerate-certificates
I1220 02:11:21.185177 1 certrotationcontroller.go:492] Waiting for CertRotation
E1220 02:11:21.210381 1 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1.Secret: illegal base64 data at input byte 3
E1220 02:11:21.210392 1 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1.ConfigMap: illegal base64 data at input byte 3
...many repetitions of the E1220 lines above, without stopping...

Expected results:
1. Should succeed.

Additional info:
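For context on the error text, "illegal base64 data at input byte 3" is consistent with an apiserver trying to base64-decode etcd values that still carry the at-rest encryption prefix. Assuming the prefix has the Kubernetes form k8s:enc:aescbc:v1:<key>:<ciphertext>, byte 3 of that string is ":", which is not a valid base64 character. A minimal sketch of the decode failure (the literal value here is illustrative, not taken from the cluster):

```shell
# Hypothetical illustration: the aescbc encryption prefix is not valid
# base64, so decoding fails at the first ":" (input byte 3).
printf 'k8s:enc:aescbc:v1:key1:ciphertext' | base64 -d >/dev/null 2>&1 \
  && echo "decoded" || echo "decode failed"
```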
[root@control-plane-0 ~]# oc adm must-gather
[must-gather ] OUT the server is currently unable to handle the request (get imagestreams.image.openshift.io must-gather)
[must-gather ] OUT
[must-gather ] OUT Using must-gather plugin-in image: quay.io/openshift/origin-must-gather:latest
[must-gather ] OUT namespace/openshift-must-gather-56qwg created
[must-gather ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-6qml5 created
[must-gather ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-6qml5 deleted
[must-gather ] OUT namespace/openshift-must-gather-56qwg deleted
Error from server (Forbidden): pods "must-gather-" is forbidden: error looking up service account openshift-must-gather-56qwg/default: serviceaccount "default" not found
Created attachment 1646739 [details] inspect result
The root cause of the issue was that the recovery API didn't know how to decrypt the encrypted content from the DB. Please validate the fix on an encrypted cluster.
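A sketch of how to put a cluster into the encrypted state for this validation, using the standard OCP 4.x etcd-encryption procedure from the product docs (resource names and the jsonpath condition check are assumptions based on those docs; run these before breaking the certs):

```shell
# Enable aescbc encryption of etcd resources (secrets, configmaps, etc.)
oc patch apiserver cluster --type merge \
  -p '{"spec":{"encryption":{"type":"aescbc"}}}'

# Encryption migration takes a while; afterwards, confirm the
# Encrypted condition reports completion
oc get openshiftapiserver \
  -o=jsonpath='{range .items[0].status.conditions[?(@.type=="Encrypted")]}{.reason}{"\n"}{.message}{"\n"}{end}'
```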
The 4.4 certs disaster recovery bug 1771410 was verified with successful auto recovery, and that process covers this bug's issue, so moving directly to VERIFIED.
(In reply to Lukasz Szaszkiewicz from comment #3)
> please validate the fix on an encrypted cluster.

Ah, didn't notice this. Will try on an etcd-encrypted cluster later.
Installed an IPI-on-AWS env with 4.4.0-0.nightly-2020-03-18-102708 and enabled etcd encryption. Then broke the cluster per the Google document for bug 1771410, waited for the certs to expire, and restarted the masters. The cluster came back well: the control plane certs recovered automatically, `oc get po/co/no` and other basic oc operations (new-project, new-app, rsh, etc.) had no problems, and the logs of the four kas containers showed no abnormality. In short, the bug's issue is no longer seen.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0581