Bug 2003775
Summary: | etcd pod on CrashLoopBackOff after master replacement procedure |
---|---|
Product: | OpenShift Container Platform |
Component: | Etcd |
Reporter: | rlobillo |
Assignee: | Nobody <nobody> |
QA Contact: | ge liu <geliu> |
CC: | alray, htariq, mcornea, yprokule |
Status: | CLOSED ERRATA |
Severity: | high |
Priority: | high |
Version: | 4.9 |
Target Release: | 4.10.0 |
Hardware: | Unspecified |
OS: | Unspecified |
Type: | Bug |
Doc Type: | No Doc Update |
Last Closed: | 2022-03-10 16:10:01 UTC |
Bug Blocks: | 2016174 |
Description
rlobillo
2021-09-13 16:54:11 UTC
Can you please verify each step you performed against a link to the steps you followed? For example, are you sure that you stopped etcd by moving etcd-pod.yaml out of /etc/kubernetes/manifests? Did you then remove the data directory of the failed member (`rm -rf /var/lib/etcd`), remove the etcd member (`etcdctl member remove $ID`), and after that force a new rollout?

`oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "single-master-recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge`

Finally, I assume master-0 was the member you replaced?

I see the ansible logs now; reviewing.

I believe this is an upstream bug related to new logic around the handling of membership data [1], [2].

[1] https://github.com/etcd-io/etcd/issues/13196
[2] https://github.com/etcd-io/etcd/pull/13348

Thanks Sam. Removing NEEDINFO flag.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
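
For reference, the recovery sequence asked about in the first comment can be written out as a dry-run sketch. It only prints the four commands in order and does not touch a cluster. Several names here are assumptions and do not come from this report: `MEMBER_ID` is a placeholder for the ID of the failed member, `MANIFEST_BACKUP_DIR` is a hypothetical destination (the report does not say where the manifest was moved), and the timestamp uses a portable `date -u` format in place of the `--rfc-3339=ns` form used in the original command.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the recovery commands from the comment thread
# instead of executing them.
set -euo pipefail

# Hypothetical placeholders, not taken from the bug report.
MEMBER_ID="${MEMBER_ID:-<member-id>}"
MANIFEST_BACKUP_DIR="${MANIFEST_BACKUP_DIR:-/tmp/etcd-manifest-backup}"

# Build the merge patch. The timestamp makes the reason string unique,
# so each invocation requests a fresh redeployment.
build_redeploy_patch() {
  printf '{"spec": {"forceRedeploymentReason": "single-master-recovery-%s"}}' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

# Print the steps in the order given in the comment.
print_recovery_steps() {
  # 1. Stop etcd on the node by moving its static pod manifest away.
  echo "mv /etc/kubernetes/manifests/etcd-pod.yaml ${MANIFEST_BACKUP_DIR}/"
  # 2. Remove the data directory of the failed member.
  echo "rm -rf /var/lib/etcd"
  # 3. Remove the failed member from the etcd cluster.
  echo "etcdctl member remove ${MEMBER_ID}"
  # 4. Force a new rollout of etcd.
  echo "oc patch etcd cluster -p='$(build_redeploy_patch)' --type=merge"
}

print_recovery_steps
```

Printing rather than executing keeps the sketch safe to run anywhere; comparing its output against the commands actually run during the replacement is one way to answer the "verify each step" request.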