Bug 1813743 - etcd:[DR] should backup and restore all static pods
Summary: etcd:[DR] should backup and restore all static pods
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.5.0
Assignee: Sam Batschelet
QA Contact: ge liu
URL:
Whiteboard:
Duplicates: 1812275
Depends On:
Blocks: 1813744 1829452
 
Reported: 2020-03-16 00:27 UTC by Sam Batschelet
Modified: 2023-10-06 19:25 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1813744 1829452
Environment:
Last Closed: 2020-07-13 17:20:18 UTC
Target Upstream Version:
Embargoed:


Attachments
Replace master field with encryption (33.74 KB, text/plain)
2020-07-06 11:51 UTC, Masaki Hatada


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2020:2409  0        None      None    None     2020-07-13 17:20:41 UTC

Description Sam Batschelet 2020-03-16 00:27:12 UTC
Description of problem: Currently the DR scripts only back up etcd and kube-apiserver. Restore starts the etcd restore pod but none of the other static pods.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results: Only etcd starts on the node after restore.


Expected results: all static pods start based on the revision they were backed up to.


Additional info:

Comment 1 Suresh Kolichala 2020-03-16 13:27:53 UTC
*** Bug 1812275 has been marked as a duplicate of this bug. ***

Comment 9 Maria Alonso 2020-06-10 11:35:35 UTC
Hi,

Any update about this?

Regards.

Comment 13 Masaki Hatada 2020-06-22 11:17:04 UTC
Dear Red Hat,

We have two questions.

* Starting with OCP 4.4, some oc patch commands were added to the recovery steps in the manual to force redeployment of the control plane components.

    Restoring to a previous cluster state
    https://docs.openshift.com/container-platform/4.4/backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.html#dr-restoring-cluster-state

  Is the update just for this bugzilla?

* Even with these OCP 4.4 steps, we fail to recover etcd if etcd has been encrypted.
  As we reported in Case 02610044, to restore the encryption key we have to perform the following steps after running /usr/local/bin/cluster-restore.sh.

    $ tar xvf static_kuberesources_<date>.tar.gz
    $ sudo cp -p static-pod-resources/kube-apiserver-pod-26/kube-apiserver-pod.yaml /etc/kubernetes/manifests/
    $ sudo cp -pr static-pod-resources/kube-apiserver-pod-26 /etc/kubernetes/static-pod-resources/
    $ systemctl restart kubelet

  Could Red Hat describe the above steps in the manual?
  Or is Red Hat planning to describe a better way in the manual?

Best Regards,
Masaki Hatada
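
The manual steps in this comment hardcode the revision directory kube-apiserver-pod-26. The following is a minimal sketch, not part of the DR scripts, of how that directory could be located in an extracted backup instead of typing the revision by hand. The function name latest_revision_dir is hypothetical, and the sketch assumes that the highest-numbered kube-apiserver-pod-N directory under the extracted static-pod-resources directory is the one to restore, which should be verified against the actual backup contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cluster-restore.sh): print the
# kube-apiserver-pod-N directory with the highest numeric N under $1.
# Assumption: the highest revision in the backup is the one to restore.
latest_revision_dir() {
  local root="$1" best="" best_n=-1 d n
  for d in "$root"/kube-apiserver-pod-*; do
    [ -d "$d" ] || continue          # skip the unexpanded glob and non-directories
    n="${d##*-}"                     # text after the last hyphen, e.g. "26"
    case "$n" in
      ''|*[!0-9]*) continue ;;       # ignore suffixes that are not plain numbers
    esac
    if [ "$n" -gt "$best_n" ]; then
      best_n="$n"
      best="$d"
    fi
  done
  # Print nothing (and return non-zero) when no revision directory exists.
  [ -n "$best" ] && printf '%s\n' "$best"
}
```

For example, running latest_revision_dir static-pod-resources after extracting the tarball would print the directory to use in place of the hardcoded static-pod-resources/kube-apiserver-pod-26 path in the cp commands above.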

Comment 14 Masaki Furuta ( RH ) 2020-07-01 07:51:22 UTC
(In reply to Masaki Hatada from comment #13)

Dear Sam Batschelet (and Ge Liu),

Would you please take a look at comment 13 by Hatada-san and consider including it?
If you find any apparent problem, would you please respond to Hatada-san?

I am grateful for your help and support.

Thank you,

BR,
Masaki

Comment 15 Suresh Kolichala 2020-07-01 12:26:53 UTC
@Masaki Hatada,

I am surprised you needed to do that. The restore process does copy the manifest from the tar file. The only extra step you are doing is to restart kubelet. Do you have a cluster without these extra steps?

Thanks,
Suresh.

Comment 16 Masaki Hatada 2020-07-02 01:18:58 UTC
Dear Suresh,

Sorry, I might be wrong.
We had to recover static-pod-resources manually in OCP 4.3. But in OCP 4.4, the restore script does indeed seem to restore static-pod-resources automatically.

We will retest it and let you know the result later.

Best Regards,
Masaki Hatada

Comment 17 Masaki Hatada 2020-07-06 11:51:21 UTC
Created attachment 1700017
Replace master field with encryption

We confirmed that the steps in the OCP 4.4 manual can restore etcd even if it was encrypted.
So we think all problems are now resolved.

Comment 18 Maria Alonso 2020-07-10 12:40:42 UTC
Hi,

Do you know if this will be backported to 4.3?

Comment 20 errata-xmlrpc 2020-07-13 17:20:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

