Description of problem:
While upgrading the cluster from 4.5.6 to 4.5.13, the following secrets were removed and not recreated, causing instability in the cluster as multiple cluster operators are in a Degraded state:
After the upgrade completed, the following errors were observed in the operators:
message: 'RevisionControllerDegraded: secrets "etcd-client" not found'
message: 'Progressing: Unable to apply resources: unable to apply objects: failed to update object *v1.Secret, Namespace=openshift-image-registry, Name=installation-pull-secrets: Secret "installation-pull-secrets" is invalid: data[.dockerconfigjson]: Required
message: 'Failed to resync 4.5.13 because: timed out waiting for the condition during waitForControllerConfigToBeCompleted: controllerconfig is not completed: ControllerConfig has not completed: completed(false) running(false) failing(true)'
Checking the logs, we could not find why the secrets were deleted.
Compared with the working cluster, most of the deleted secrets appear to be managed by "cluster-bootstrap".
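For anyone repeating this comparison, here is a minimal sketch of how it could be scripted. It assumes the secret lists from both clusters were saved with `oc get secrets -n openshift-config -o json`, and that the manager name is read from `metadata.managedFields`. The function names `secrets_by_name` and `missing_secrets` are hypothetical helpers for illustration, not part of any OpenShift tooling.

```python
import json


def secrets_by_name(secret_list_json: str) -> dict:
    """Map each secret name to the set of manager names in its managedFields.

    Assumes the input is the JSON List object printed by
    `oc get secrets -n <namespace> -o json`. Secrets without
    managedFields map to an empty set.
    """
    doc = json.loads(secret_list_json)
    result = {}
    for item in doc.get("items", []):
        meta = item.get("metadata", {})
        managers = {
            field.get("manager", "")
            for field in meta.get("managedFields", [])
        }
        result[meta.get("name", "")] = managers
    return result


def missing_secrets(working_json: str, broken_json: str) -> dict:
    """Return secrets present in the working cluster but absent in the broken one.

    The value for each missing secret is the sorted list of managers
    recorded for it on the working cluster.
    """
    working = secrets_by_name(working_json)
    broken = secrets_by_name(broken_json)
    missing = sorted(working.keys() - broken.keys())
    return {name: sorted(working[name]) for name in missing}
```

Calling `missing_secrets(working_json, broken_json)` with the two saved outputs gives the missing secret names together with their managers, which makes it easy to see how many of them are attributed to cluster-bootstrap. It only compares names, so a secret that exists but has empty data would not be reported.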
What we are looking to know:
- Why were the secrets deleted?
- As they are managed by cluster-bootstrap, what is the process to recreate all the lost secrets?
- Is there an operator that should be responsible for managing and creating the secrets?
Steps to Reproduce:
1. Delete the secret pull-secret or any other secret in the openshift-config project.
The secrets were lost during the upgrade and were not recreated automatically.
The secrets should be recreated automatically.
- attaching the must-gather
I created the
Is there any update on this?
Is there any way we can re-create the secrets created by openshift-installer by extracting the information from the cluster in its current state?
Let me know if any additional information is needed from Support or from the customer cluster; I will get it for the investigation.
Closing this BZ. Created an RFE for providing a way to recover when the signers are missing.