Bug 1850687
| Summary: | [DR]etcd fail to back after restore from automated-cluster-backups | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Neelesh Agrawal <nagrawal> | ||||
| Component: | Node | Assignee: | Seth Jennings <sjenning> | ||||
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Sunil Choudhary <schoudha> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 4.5 | CC: | aos-bugs, geliu, jokerman, nagrawal, rphillips, sbatsche, schoudha, skolicha, sttts | ||||
| Target Milestone: | --- | Flags: | zyu: needinfo- | ||||
| Target Release: | 4.5.z | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | 1848939 | Environment: | |||||
| Last Closed: | 2020-08-10 20:46:24 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | 1848939 | ||||||
| Bug Blocks: | |||||||
| Attachments: | |||||||
|
Comment 19
Suresh Kolichala
2020-06-26 16:35:13 UTC
Created attachment 1698970
kubelet log for ip-10-0-156-170.us-east-2.compute.internal
On the first node, where the etcd pod deletion succeeded, the etcd and apiserver pods reappear after the old database is restored. However, the kubelet on that node completed the pod deletion before the restore and no longer has any knowledge of the etcd pod. One solution is to reconcile the pod manager on the node with the current API server so that the phantom etcd pod is removed. I am going over this fix and evaluating how best to present it (the current formulation depends on #89155). Ryan is on leave |
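
To illustrate the reconciliation idea described in the comment, here is a minimal Python sketch of only the detection step: finding pods that the API server lists for a node but that the node's kubelet no longer tracks. It is not the kubelet's actual code and not the fix referenced as #89155. `PodRef`, `find_phantom_pods`, and the pod names and UIDs are hypothetical, and matching on UID is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodRef:
    """Hypothetical minimal pod identity used only for this sketch."""
    namespace: str
    name: str
    uid: str


def find_phantom_pods(api_pods, kubelet_pods):
    """Return pods the API server lists for the node that the kubelet does not track.

    Assumption: a pod is identified by its UID. After a restore, the API
    server may list a pod whose deletion the kubelet already completed.
    """
    known_uids = {pod.uid for pod in kubelet_pods}
    phantoms = [pod for pod in api_pods if pod.uid not in known_uids]
    # Sort for a stable, readable result.
    return sorted(phantoms, key=lambda pod: (pod.namespace, pod.name))


# Illustrative data only: the restored database brings back an etcd pod
# that the kubelet already deleted before the restore.
api_pods = [
    PodRef("openshift-etcd", "etcd-node-a", "uid-1"),
    PodRef("openshift-kube-apiserver", "kube-apiserver-node-a", "uid-2"),
]
kubelet_pods = [
    PodRef("openshift-kube-apiserver", "kube-apiserver-node-a", "uid-2"),
]

phantoms = find_phantom_pods(api_pods, kubelet_pods)
```

In this example `phantoms` holds only the etcd pod entry. What to do with such entries (for instance, finishing the deletion on the API server side) is left open here, since the comment does not specify it.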