Bug 1889003 - few secrets got deleted while upgrading cluster from 4.5.6 to 4.5.13
Summary: few secrets got deleted while upgrading cluster from 4.5.6 to 4.5.13
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Suresh Kolichala
QA Contact: ge liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-10-16 18:33 UTC by Sudarshan Chaudhari
Modified: 2021-04-15 01:48 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-04-15 01:48:13 UTC
Target Upstream Version:



Description Sudarshan Chaudhari 2020-10-16 18:33:43 UTC
Description of problem:
While upgrading the cluster from 4.5.6 to 4.5.13, the following secrets were removed and not recreated. This is causing instability in the cluster, with multiple cluster operators in a Degraded state:
- etcd-client
- etcd-metric-client
- etcd-metric-signer
- etcd-signer
- pull-secret
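
To confirm which of the secrets listed above are actually missing, a check along the following lines may help. This is a minimal sketch, not an official tool: the function names are hypothetical, and the `openshift-config` namespace is an assumption taken from the reproduction step below. It relies only on `oc get secrets -n <namespace> -o json` and on the standard `items[].metadata.name` layout of the returned list.

```python
import json
import subprocess

# Secret names copied from the list in this report.
EXPECTED_SECRETS = [
    "etcd-client",
    "etcd-metric-client",
    "etcd-metric-signer",
    "etcd-signer",
    "pull-secret",
]

# Assumption: the secrets live in the openshift-config project,
# as suggested by the reproduction step in this report.
NAMESPACE = "openshift-config"


def missing_secrets(secret_list, expected=EXPECTED_SECRETS):
    """Return the expected secret names that are absent from a parsed
    `oc get secrets -o json` result (a dict with an "items" list)."""
    present = {item["metadata"]["name"] for item in secret_list.get("items", [])}
    return [name for name in expected if name not in present]


def fetch_secret_list(namespace=NAMESPACE):
    """Run `oc get secrets` and parse its JSON output.
    Requires an `oc` binary that is already logged in to the cluster."""
    completed = subprocess.run(
        ["oc", "get", "secrets", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)


def main():
    """Print one line per expected secret that is missing."""
    for name in missing_secrets(fetch_secret_list()):
        print(f"missing: {name}")
```

Calling `main()` on a host with cluster access prints one line per missing secret. `missing_secrets` can also be run against a saved JSON dump, for example one extracted from a must-gather.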


After the upgrade completed, the following errors were observed in the operators:


kube-apiserver:
~~~
message: 'RevisionControllerDegraded: secrets "etcd-client" not found'
~~~

image-registry:
~~~
message: 'Progressing: Unable to apply resources: unable to apply objects: failed to update object *v1.Secret, Namespace=openshift-image-registry, Name=installation-pull-secrets: Secret "installation-pull-secrets" is invalid: data[.dockerconfigjson]: Required
~~~

machine-config:
~~~
message: 'Failed to resync 4.5.13 because: timed out waiting for the condition during waitForControllerConfigToBeCompleted: controllerconfig is not completed: ControllerConfig has not completed: completed(false) running(false) failing(true)'
~~~
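
The operator messages above were collected by hand. A rough sketch of gathering them programmatically is shown below, assuming the JSON comes from `oc get clusteroperators -o json`. The helper name is hypothetical. It uses only the standard ClusterOperator `status.conditions` fields (`type`, `status`, `message`).

```python
def operators_with_condition(co_list, condition_type="Degraded"):
    """Map operator name -> condition message for every ClusterOperator
    whose condition of the given type currently has status "True".

    `co_list` is the parsed output of `oc get clusteroperators -o json`.
    """
    result = {}
    for item in co_list.get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        for cond in conditions:
            if cond.get("type") == condition_type and cond.get("status") == "True":
                result[item["metadata"]["name"]] = cond.get("message", "")
    return result
```

Passing `condition_type="Progressing"` instead returns operators that are still progressing, which may be useful because not every message quoted above necessarily comes from a Degraded condition.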

We checked the logs but could not determine why the secrets were deleted.

Compared with a working cluster, most of the deleted secrets appear to be managed by "cluster-bootstrap".

What we would like to know:
- Why were the secrets deleted?
- Since they are managed by cluster-bootstrap, what is the process to recreate all the lost secrets?
- Is there any operator that should be responsible for managing and creating these secrets?


How reproducible:
Always

Steps to Reproduce:
1. Delete the pull-secret secret, or any other secret, in the openshift-config project.

Actual results:
The secrets were lost during the upgrade and were not recreated automatically.

Expected results:
The secrets should be recreated automatically.

Additional info:
- attaching the must-gather


Additional info:
I created the

Comment 2 Sudarshan Chaudhari 2020-11-06 01:17:37 UTC
Hello @Suresh

Is there any update on this?

Is there any way to recreate the secrets originally created by openshift-installer by extracting the required information from the cluster in its current state?

Let me know if any additional information is needed from Support or from the customer's cluster, and I will collect it for the investigation.

Comment 16 Suresh Kolichala 2021-04-15 01:48:13 UTC
Closing this BZ. I have created an RFE to provide a way to recover when the signers are missing:

https://issues.redhat.com/browse/RFE-1790

