Bug 1808073 - etcd binary is not archiving data directory if unstarted and data dir exists [NEEDINFO]
Summary: etcd binary is not archiving data directory if unstarted and data dir exists
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.5.0
Assignee: Alay Patel
QA Contact: ge liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-27 18:27 UTC by Alay Patel
Modified: 2020-07-13 17:22 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:21:52 UTC
Target Upstream Version:
scuppett: needinfo? (alpatel)




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift etcd pull 34 | 0 | None | closed | Bug 1808073: fix archive member name, unmask error | 2020-10-18 12:57:24 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:22:13 UTC

Description Alay Patel 2020-02-27 18:27:43 UTC
Description of problem:

The etcd pod has discovery logic that archives the data directory if the directory exists and the member is in the unstarted state. This avoids conflicting cluster IDs. Currently, the discovery logic does not archive the directory correctly.
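
As an illustration of what "archiving the data directory" can mean, here is a minimal Go sketch that renames a directory to a sibling path and returns any error to the caller. It is not the code from the openshift/etcd repository: the function names `archivePath` and `archiveDataDir`, the `-removed-archive-` infix, and the stamp format are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// archivePath returns the sibling path that dataDir would be moved to.
// The "-removed-archive-" infix is a hypothetical naming choice.
func archivePath(dataDir, stamp string) string {
	clean := filepath.Clean(dataDir)
	return filepath.Join(filepath.Dir(clean), filepath.Base(clean)+"-removed-archive-"+stamp)
}

// archiveDataDir renames dataDir out of the way and returns the new path.
// A rename failure is wrapped and returned rather than being swallowed.
func archiveDataDir(dataDir, stamp string) (string, error) {
	target := archivePath(dataDir, stamp)
	if err := os.Rename(dataDir, target); err != nil {
		return "", fmt.Errorf("archiving %s to %s: %w", dataDir, target, err)
	}
	return target, nil
}

func main() {
	// Work in a throwaway directory so the demo never touches real etcd data.
	root, err := os.MkdirTemp("", "etcd-archive-demo")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(root)

	member := filepath.Join(root, "member")
	if err := os.Mkdir(member, 0o700); err != nil {
		fmt.Println("setup failed:", err)
		return
	}

	archived, err := archiveDataDir(member, "20200227-182743")
	if err != nil {
		fmt.Println("archive failed:", err)
		return
	}
	fmt.Println("archived to", archived)
}
```

Keeping the path computation separate from the rename makes the archive name easy to check on its own, which is relevant here because the linked pull request title mentions both the archive member name and an error that was being masked.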

Comment 2 Xingxing Xia 2020-03-23 10:56:02 UTC
Tried to verify from the QE angle in a 4.5.0-0.nightly-2020-03-22-224936 environment after reading the code.
First, on one master, run:
[root@xxia03-ff29h-m-0 manifests]# mkdir /tmp/xxia; mv etcd-pod.yaml /tmp/xxia/manifests-etcd-pod.yaml
Then:
$ oc delete po etcd-xxia03-ff29h-m-0.c.openshift-qe.internal -n openshift-etcd
$ oc rsh -c etcd -n openshift-etcd <another etcd pod>:
sh-4.2# cp -r /var/lib/etcd/member/ /tmp/xxia/member
sh-4.2# export ETCDCTL_API=3 ETCDCTL_CACERT=/etc/kubernetes/static-pod-resources/configmaps/etcd-serving-ca/ca-bundle.crt ETCDCTL_CERT=`find /etc/kubernetes/static-pod-resources -name "*etcd-peer*.crt" | head -n 1` ETCDCTL_KEY=`find /etc/kubernetes/static-pod-resources -name "*etcd*peer*.key" | head -n 1`
sh-4.2# discover-etcd-initial-cluster --cacert $ETCDCTL_CACERT --cert $ETCDCTL_CERT --key $ETCDCTL_KEY --endpoints localhost:2379 --data-dir /tmp/xxia/member --target-peer-url-host 10.0.0.5 --target-name xxia03-ff29h-m-0.c.openshift-qe.internal

I could not work out how to drive execution into the branch `case targetMember != nil && len(targetMember.Name) == 0 && memberDirExists:`, so I could not verify it directly. Thus, per the verification in https://github.com/openshift/etcd/pull/34#issuecomment-592097774, moving to VERIFIED.
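
To make the three conditions in that `case` explicit, the following Go sketch isolates the predicate and evaluates it for each input combination. Only the boolean expression comes from the comment above; the `member` type and the `shouldArchive` helper are hypothetical stand-ins, not the types used by `discover-etcd-initial-cluster`.

```go
package main

import "fmt"

// member is a hypothetical stand-in for a cluster member list entry.
// An empty Name models a member that has been added but has not started.
type member struct {
	Name string
}

// shouldArchive mirrors the quoted case expression: the target must be in
// the member list, must be unstarted, and must have a member directory.
func shouldArchive(targetMember *member, memberDirExists bool) bool {
	return targetMember != nil && len(targetMember.Name) == 0 && memberDirExists
}

func main() {
	cases := []struct {
		desc   string
		target *member
		dirOK  bool
	}{
		{"not in member list, dir exists", nil, true},
		{"unstarted, dir exists", &member{}, true},
		{"unstarted, no dir", &member{}, false},
		{"started, dir exists", &member{Name: "m-0"}, true},
	}
	for _, c := range cases {
		fmt.Printf("%-32s archive=%v\n", c.desc, shouldArchive(c.target, c.dirOK))
	}
}
```

Under these assumptions, only the combination of an unstarted member that is already in the member list and a leftover member directory selects the archive branch.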

Comment 5 errata-xmlrpc 2020-07-13 17:21:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

