Bug 1419670
| Summary: | [DOCS] Incorrect etcd backup and restore procedure | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jaspreet Kaur <jkaur> |
| Component: | Documentation | Assignee: | Ashley Hardin <ahardin> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Anping Li <anli> |
| Severity: | high | Docs Contact: | Vikram Goyal <vigoyal> |
| Priority: | high | ||
| Version: | 3.3.0 | CC: | anli, aos-bugs, bfallonf, jkaur, jokerman, mmccomas, rhowe, sttts, tstclair |
| Target Milestone: | --- | Flags: | sttts: needinfo+ |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-03-27 15:52:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1405338 | ||
Description
Jaspreet Kaur
2017-02-06 17:31:09 UTC
This seems like two separate issues:
1. Update the docs on disaster recovery.
2. Determine why your etcd instance is not starting.
To #1, iirc a snapshot is saved before an upgrade.

*** Bug 1421072 has been marked as a duplicate of this bug. ***

Updated pull request: https://github.com/openshift/openshift-docs/pull/3827

(In reply to Anping Li from comment #36)
> For
> https://github.com/ahardin-rh/openshift-docs/blob/5cfb6dc2ee7fcb1d15007ad85afb2998f81e6cdf/admin_guide/backup_restore.adoc#cluster-backup
> Should note 'cp "$ETCD_DATA_DIR"/member/snap/db "$HOTDIR"/member/snap/db'
> when use etcd 3.0.15. It was talked at comment 2,10,15.
> For
> https://github.com/ahardin-rh/openshift-docs/blob/5cfb6dc2ee7fcb1d15007ad85afb2998f81e6cdf/admin_guide/backup_restore.adoc#cluster-restore
> No step force-new-cluster and restart etcd service. without them, for
> single-member etcd clusters, we also need to see
> https://docs.openshift.com/container-platform/3.4/install_config/downgrade.html#downgrading-restoring-embedded-etcd

There is a note: 'This restore operation only works for single-member etcd clusters. For multiple-member etcd clusters, see Restoring etcd.' In fact, the restore operation that follows is not complete: the force-new-cluster and restart etcd service steps are missing. Either change the note or copy the force-new-cluster and restart etcd service steps here.

> 4.c)
> mkdir $PREFIX before run openssl

Without the $PREFIX directory, the following command will fail.

> 4.e) cp ca.crt ${PREFIX} -> cp ca/ca.crt ${PREFIX}

This step is not necessary; drop it.

1. https://github.com/ahardin-rh/openshift-docs/blob/240abad8bc6109fc349c6f5b76521e144f08119a/admin_guide/backup_restore.adoc#cluster-backup

# tar cf /tmp/certs-and-keys-$(hostname).tar *.key *.crt' \
    master.proxy-client.crt \
    master.proxy-client.key \
    proxyca.crt \
    proxyca.key \
    master.server.crt \
    master.server.key \
    ca.crt \
    ca.key \
    master.etcd-client.crt \
    master.etcd-client.key \
    master.etcd-ca.crt

Should be

# tar cf /tmp/certs-and-keys-$(hostname).tar *.key *.crt

2. https://github.com/ahardin-rh/openshift-docs/blob/240abad8bc6109fc349c6f5b76521e144f08119a/admin_guide/backup_restore.adoc#cluster-restore-for-single-member-etcd-clusters

A step similar to step 4 of https://github.com/ahardin-rh/openshift-docs/blob/240abad8bc6109fc349c6f5b76521e144f08119a/admin_guide/backup_restore.adoc#external-etcd needs to be added. For example:

Verify the etcd service started correctly, then re-edit the /usr/lib/systemd/system/etcd.service file and remove the --force-new-cluster option:

# sed -i '/ExecStart/s/ --force-new-cluster//' /usr/lib/systemd/system/etcd.service
# cat /usr/lib/systemd/system/etcd.service | grep ExecStart
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd"

Then restart the etcd service:

# systemctl daemon-reload
# systemctl start etcd

3. The other part looks good.

https://github.com/ahardin-rh/openshift-docs/blob/240abad8bc6109fc349c6f5b76521e144f08119a/admin_guide/backup_restore.adoc#cluster-backup

tar cf /tmp/certs-and-keys-$(hostname).tar *.key *.crt \
> master.proxy-client.crt \
> master.proxy-client.key \
> proxyca.crt \
> proxyca.key \
> master.server.crt \
> master.server.key \
> ca.crt \
> ca.key \
> master.etcd-client.crt \
> master.etcd-client.key \
> master.etcd-ca.crt
tar: proxyca.crt: Cannot stat: No such file or directory
tar: proxyca.key: Cannot stat: No such file or directory

1) The names of the crt and key files can vary. For example, a custom configuration can specify different names. That is why I suggested using the command 'tar cf /tmp/certs-and-keys-$(hostname).tar *.key *.crt'.

It looks good to me.

Commits pushed to master at https://github.com/openshift/openshift-docs

https://github.com/openshift/openshift-docs/commit/b38042de02d9780842dce95cfa0ef45d53b58bc6
Bug 1419670, Update backup and restore procedure

https://github.com/openshift/openshift-docs/commit/be0f62d5b5e30b5a56a061382cec07cba1909f94
Merge pull request #3827 from ahardin-rh/etcd-backup-restore

Bug 1419670, Update backup and restore procedure
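
For readers who want to try the two command changes discussed in this thread without touching a live master, here is a minimal sketch. It wraps the glob-based certificate backup and the --force-new-cluster cleanup in two small functions. The function names (backup_certs, strip_force_new_cluster) and their arguments are hypothetical and are not part of the documented procedure. The sketch assumes GNU tar and GNU sed, as shipped on RHEL.

```shell
#!/bin/sh
# Sketch only: function names and arguments are illustrative assumptions,
# not part of the OpenShift documentation.

# Archive every *.key and *.crt file in a directory, whatever the files are
# named, instead of listing fixed names that may not exist on every master.
#   $1 = directory holding the certificates and keys
#   $2 = absolute path of the tar archive to create
backup_certs() {
    cert_dir=$1
    archive=$2
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$cert_dir" || exit 1
        # If no file matches a glob, tar reports an error and returns non-zero.
        tar cf "$archive" *.key *.crt
    )
}

# Remove the --force-new-cluster option from the ExecStart line of a
# systemd unit file, using the same sed expression quoted in the thread.
#   $1 = path to the unit file (use a scratch copy when experimenting)
strip_force_new_cluster() {
    unit_file=$1
    # In-place edit; "sed -i" without a suffix assumes GNU sed.
    sed -i '/ExecStart/s/ --force-new-cluster//' "$unit_file"
}

# Example invocations (commented out; CERT_DIR is a hypothetical variable):
#   backup_certs "$CERT_DIR" "/tmp/certs-and-keys-$(hostname).tar"
#   strip_force_new_cluster /usr/lib/systemd/system/etcd.service
```

After editing the real unit file, the thread follows up with `systemctl daemon-reload` and `systemctl start etcd`. Those commands are left out of the functions so the sketch can be exercised safely against scratch files.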