Bug 1776811 - [MSTR-485] Cluster is abnormal after etcd backup/restore when the backup is conducted while etcd encryption migration is in progress
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.3.z
Assignee: Andrew McDermott
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On: 1775057 1776797
Blocks:
 
Reported: 2019-11-26 12:30 UTC by Stefan Schimanski
Modified: 2020-05-12 16:17 UTC
CC: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1775057
Environment:
Last Closed: 2020-05-12 16:17:35 UTC
Target Upstream Version:




Links
GitHub openshift/library-go pull 606 (closed): Bug 1776811: encryption: keep last read key after migration for easier backup/restore (last updated 2020-10-06 15:08:44 UTC)

Comment 4 Xingxing Xia 2019-12-02 11:20:05 UTC
Hi Stefan and Lukasz, I saw https://github.com/openshift/enhancements/pull/131/files#diff-29a58870b4078595bb0b7d5a2a3bee18R279 :
"encryption-config ... mounted via host mount as ... in the kube-apiserver pod"
"A restore must put ... the backup in place ... before starting up kube-apiserver"

I have a question about the restore. For etcd restore, there is a doc: https://docs.openshift.com/container-platform/4.2/backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.html ; for encryption-config restore, what are the right steps to do the host mount into the static pods? Should we modify /etc/kubernetes/manifests/kube-apiserver-pod.yaml on each master? Or modify /etc/kubernetes/static-pod-resources/kube-apiserver-pod-$LATEST_REVISION/kube-apiserver-pod.yaml? Or something else? Thanks.
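The comment above asks where among the static-pod revision directories the host-mounted encryption-config should be restored. As a hedged illustration only (the directory layout is taken from the paths named in the comment, and `$LATEST_REVISION` is resolved here by an assumed naming convention, not a documented restore procedure), locating the newest kube-apiserver revision directory on a master could look like:

```shell
# Sketch: find the newest kube-apiserver static-pod revision directory.
# The base path comes from the comment above; the script is guarded so it
# is a harmless no-op on machines that are not an OpenShift master.
DIR=/etc/kubernetes/static-pod-resources
if [ -d "$DIR" ]; then
  # Revision dirs are assumed to be named kube-apiserver-pod-<N>;
  # sort -V orders the numeric suffixes correctly.
  REV=$(ls -d "$DIR"/kube-apiserver-pod-* 2>/dev/null | sort -V | tail -n 1)
  echo "latest revision dir: $REV"
else
  echo "$DIR not present; not a master host"
fi
```

This only answers the "which revision directory" half of the question; whether the restored file belongs there or under /etc/kubernetes/manifests is exactly what the comment is asking.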

Comment 30 Xingxing Xia 2020-01-07 15:46:13 UTC
Tried a 4.3.0-0.nightly-2020-01-06-185654 env twice; one run did not hit the above issue, the other did. For the run that hit it, I restarted the pods with: oc delete po router-default-6b44978bc4-mrslh router-default-6b44978bc4-z6st7 -n openshift-ingress . After waiting several minutes, the issue was gone.
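The workaround in this comment deletes the router pods by their generated names, which differ per cluster. A minimal sketch that selects them by label instead (the label selector is an assumption based on the standard default IngressController deployment, not stated in this bug):

```shell
# Sketch of the restart workaround: delete the openshift-ingress router
# pods by label and wait for the replacements to become Ready.
# The label selector is assumed from the default IngressController deployment.
NS=openshift-ingress
SELECTOR=ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
if command -v oc >/dev/null 2>&1; then
  oc delete pods -n "$NS" -l "$SELECTOR"
  # The router Deployment recreates the pods; wait up to 5 minutes.
  oc wait pods -n "$NS" -l "$SELECTOR" --for=condition=Ready --timeout=300s
else
  echo "oc not found; run from a host with cluster access"
fi
```

Selecting by label avoids copy-pasting pod hash suffixes and restarts all router replicas in one command.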

Comment 32 ge liu 2020-01-08 10:34:15 UTC
Sam, I filed a doc bug to track this workaround: https://bugzilla.redhat.com/show_bug.cgi?id=1788895

Comment 35 Ben Bennett 2020-05-12 16:17:35 UTC
Closing this for now because the documentation is an appropriate fix: https://bugzilla.redhat.com/show_bug.cgi?id=1788895


We can consider a backport if we work out what, if anything, we can do to fix it on the master bug.

