
Bug 1529522

Summary: [DOCS] etcd backup procedure missing info to backup CA
Product: OpenShift Container Platform
Reporter: daniel <dmoessne>
Component: Documentation
Assignee: Kathryn Alexander <kalexand>
Status: CLOSED CURRENTRELEASE
QA Contact: ge liu <geliu>
Severity: low
Docs Contact: Vikram Goyal <vigoyal>
Priority: medium
Version: 3.5.0
CC: acomabon, adellape, aos-bugs, dmoessne, jokerman, mmccomas
Target Milestone: ---
Target Release: 3.11.0
Hardware: Unspecified
OS: Unspecified
URL: http://docs.openshift.com/container-platform/3.5/admin_guide/backup_restore.html#cluster-backup
Whiteboard: 3.10-release-plan
Last Closed: 2018-09-12 17:56:38 UTC
Type: Bug

Description daniel 2017-12-28 14:02:07 UTC
Document URL: 
- https://docs.openshift.com/container-platform/3.5/admin_guide/backup_restore.html#cluster-backup
- https://docs.openshift.com/container-platform/3.6/admin_guide/backup_restore.html#etcd-backup
- https://docs.openshift.com/container-platform/3.7/admin_guide/backup_restore.html#etcd-backup

Section Number and Name: 
1) Cluster Backup -> Etcd Backup
2) Adding New etcd Hosts -> Generate the required certificates for the new host. On a surviving etcd host:
    Make a backup of the /etc/etcd/ca/ directory.


Describe the issue: 
ad1) 
I think we should also mention here that /etc/etcd/ca/ must be backed up on the first master, because it is the only host that holds the CA; on the other masters this directory is empty. Without the CA it is not possible to add new nodes or to recover a master. Because the CA exists only on the first master, having a backup of it is especially important. Although the CA and certificates can be regenerated, doing so causes a (short) outage, which is bad for heavily used clusters and could be avoided if a backup of the CA is available.

ad2)
Add that this needs to be done on the (first) master, as only this one holds the CA needed to (re)generate certificates for reinstalled or new masters. On other masters this directory is empty and therefore not of much use.

Suggestions for improvement: 
ad1) 
As outlined, mention that the CA directory (/etc/etcd/ca/) needs to be backed up as well, on the master where the installer generated the certificates. This prevents customers from blindly running the backup only to find that it does not contain the CA, especially as we state that running the procedure on one master is sufficient.

ad2) 
Mention that this backup needs to be run on the master that actually has the CA in the directory.
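
To illustrate the check the docs could call out, here is a minimal sketch (not the documented procedure) of a backup helper that refuses to archive an empty or missing CA directory, so a run on a master without the CA is noticed instead of silently producing a useless backup. The function name, the archive name etcd-ca-backup.tar.gz, and the destination directory are hypothetical; only the /etc/etcd/ca/ path comes from this report.

```python
import tarfile
from pathlib import Path
from typing import Optional


def backup_ca_dir(ca_dir: str, dest_dir: str) -> Optional[Path]:
    """Archive ca_dir into dest_dir; return None if ca_dir is missing or empty."""
    src = Path(ca_dir)
    # On masters that do not hold the CA the directory is empty (or absent),
    # so there is nothing useful to archive there.
    if not src.is_dir() or not any(src.iterdir()):
        return None
    # Hypothetical archive name, chosen for this sketch only.
    out = Path(dest_dir) / "etcd-ca-backup.tar.gz"
    with tarfile.open(out, "w:gz") as tar:
        # Store the files under the directory's own name, e.g. "ca/...".
        tar.add(src, arcname=src.name)
    return out
```

Called as backup_ca_dir("/etc/etcd/ca", "/some/backup/location"), a return value of None would indicate that this host is not the one holding the CA and the backup should be taken elsewhere.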

Additional information: 
- Overall, if I recall correctly, the redeploy playbook then places the CA on all nodes. I think this is the right way, and I will open a BZ against the installer to do that during install as well.

Comment 4 Kathryn Alexander 2018-08-22 19:43:56 UTC
As part of the etcd backup process in version 3.7 and later, you back up the entire /etc/etcd directory for all nodes: https://docs.openshift.com/container-platform/3.7/day_two_guide/environment_backup.html#backing-up-etcd_environment-backup

I'm not clear on whether you need to back up the etcd CA for embedded etcd, external etcd, or both. (If it's for external etcd, you wouldn't necessarily do it on a master.)

@Alejandro, can you give me more information about this one?

Comment 7 Kathryn Alexander 2018-08-28 19:37:05 UTC
PR's here: https://github.com/openshift/openshift-docs/pull/11805

@Johnny, will you PTAL?

Comment 9 ge liu 2018-09-03 02:37:24 UTC
@kalexand-rh, LGTM, but instructions for restoring the CA should be added to 'etcd restore', right? Please confirm this with the dev or the bug reporter. Thanks.

Comment 11 ge liu 2018-09-11 06:50:13 UTC
Regarding comment 9, I filed a new bug to track it: https://bugzilla.redhat.com/show_bug.cgi?id=1627638

Closing this bug now.

Comment 12 Kathryn Alexander 2018-09-12 15:40:16 UTC
I've merged the change and am waiting for it to go live. Thanks!