Bug 1772580

Summary: Redeploy-certificates do not set proper bootstrap.kubeconfig on masters
Product: OpenShift Container Platform Reporter: Pablo Alonso Rodriguez <palonsor>
Component: Installer Assignee: Russell Teague <rteague>
Installer sub component: openshift-ansible QA Contact: Gaoyun Pei <gpei>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: adahiya, bleanhar
Version: 3.11.0   
Target Milestone: ---   
Target Release: 3.11.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: During master certificate redeployment, the master/admin.kubeconfig is updated, but the node/bootstrap.kubeconfig is not.
Consequence: When a master is bootstrapped again in the future for other reasons, the node is not able to connect to the API due to the outdated kubeconfig.
Fix: Update the node/bootstrap.kubeconfig on masters when recreating the master/admin.kubeconfig.
Result: The node service on masters will be able to bootstrap and access the API when re-bootstrapping.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-22 11:02:22 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Pablo Alonso Rodriguez 2019-11-14 16:43:09 UTC
Description of problem:

If I run /usr/share/ansible/openshift-ansible/playbooks/redeploy-certificates.yml or /usr/share/ansible/openshift-ansible/playbooks/openshift-master/redeploy-certificates.yml, the old /etc/origin/node/bootstrap.kubeconfig is kept, although it should be replaced with the contents of the new /etc/origin/master/admin.kubeconfig. Without a proper kubeconfig, issues can occur if the master is ever re-bootstrapped.
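
A quick way to confirm the stale file on an affected master, together with the manual workaround, assuming the default 3.11 paths (illustrative commands, not from the original report; ownership and mode follow what the fixed task applies, per the verification log later in this bug in comment 13):

# Non-empty diff output after a certificate redeploy indicates the bug
diff /etc/origin/node/bootstrap.kubeconfig /etc/origin/master/admin.kubeconfig

# Manual workaround: copy the regenerated admin kubeconfig over the bootstrap one
cp /etc/origin/master/admin.kubeconfig /etc/origin/node/bootstrap.kubeconfig
chown root:root /etc/origin/node/bootstrap.kubeconfig
chmod 0600 /etc/origin/node/bootstrap.kubeconfig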

Version-Release number of the following components:

rpm -q openshift-ansible
openshift-ansible-3.11.153-2.git.0.ee699b5.el7.noarch

rpm -q ansible
ansible-2.6.19-1.el7ae.noarch

ansible --version

ansible 2.6.19
  config file = /home/XXXX/ansible.cfg
  configured module search path = [u'/home/XXXX/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Jun 11 2019, 14:33:56) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]


How reproducible:

Always

Steps to Reproduce:
1. Run /usr/share/ansible/openshift-ansible/playbooks/redeploy-certificates.yml or /usr/share/ansible/openshift-ansible/playbooks/openshift-master/redeploy-certificates.yml
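
For reference, a typical invocation of the first playbook, assuming the cluster inventory file lives at /etc/ansible/hosts (adjust the inventory path to your environment):

ansible-playbook -i /etc/ansible/hosts /usr/share/ansible/openshift-ansible/playbooks/redeploy-certificates.yml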

Actual results:

Old /etc/origin/node/bootstrap.kubeconfig is kept

Expected results:

/etc/origin/node/bootstrap.kubeconfig is replaced with the contents of the new /etc/origin/master/admin.kubeconfig

Additional info:

Comment 1 Russell Teague 2019-11-14 18:54:53 UTC
The bootstrap.kubeconfig is only used during the initial install of a node to join it to the cluster.  After it is part of the cluster, the bootstrap.kubeconfig is no longer needed.  Can you describe a scenario that entails "re-bootstrapping"?

Comment 2 Pablo Alonso Rodriguez 2019-11-15 08:27:31 UTC
Well, to be honest, the only times I have seen a need for that are during CA redeployments or when facing specific and rare failure scenarios. For the CA redeployment it would definitely be needed, so please allow me some time to test that scenario specifically. As for the other case, I understand it is not a common situation (although just copying a single file should not be complicated either).

I will try to test the CA redeployment scenario and I will get back to you (keeping needinfo on myself until I can do the test).

Comment 3 Pablo Alonso Rodriguez 2019-11-15 16:04:55 UTC
If the CA is redeployed, the bootstrap kubeconfig on masters is set to the same contents as on the nodes. Is that expected? Should it then be set to the same as on the nodes even during a normal certificate redeployment?

Comment 13 Gaoyun Pei 2020-09-25 10:52:25 UTC
Verified with the latest release-3.11 branch:
# git describe
openshift-ansible-3.11.295-1-2-g4454dbf

/etc/origin/node/bootstrap.kubeconfig was updated during the master certificate redeployment:

09-25 18:27:13  TASK [openshift_master_certificates : Update the default bootstrap kubeconfig for masters] ***
09-25 18:27:14  changed: [ci-vm-10-0-151-246.hosted.upshift.rdu2.redhat.com] => (item=/etc/origin/node/bootstrap.kubeconfig) => {"ansible_loop_var": "item", "changed": true, "checksum": "6abd9c5075ac4229e490c7ed0c8fda6d04f08d9e", "dest": "/etc/origin/node/bootstrap.kubeconfig", "gid": 0, "group": "root", "item": "/etc/origin/node/bootstrap.kubeconfig", "md5sum": "eae3022fb009ba7cd4790bd5a8fdacee", "mode": "0600", "owner": "root", "secontext": "system_u:object_r:etc_t:s0", "size": 7836, "src": "/etc/origin/master/admin.kubeconfig", "state": "file", "uid": 0}

Steps:
1. Redeploy openshift CA against a 3.11 cluster using playbook playbooks/openshift-master/redeploy-openshift-ca.yml 
2. Redeploy master certificates using playbook playbooks/openshift-master/redeploy-certificates.yml
3. Check the /etc/origin/node/bootstrap.kubeconfig file; it is the same as /etc/origin/master/admin.kubeconfig, and the cluster.certificate-authority-data field was updated with the new CA.
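
A simple way to spot-check this on a master, assuming the default paths (illustrative command, not from the original verification; the "checksum" field in the task output above is a SHA-1 of the file contents, so both hashes should also match it):

sha1sum /etc/origin/node/bootstrap.kubeconfig /etc/origin/master/admin.kubeconfig

Matching hashes confirm the two files are identical.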


The latest openshift-ansible rpm package is still openshift-ansible-3.11.295-1.git.0.7cf87c6.el7.noarch.rpm, which doesn't include the proposed PR; I will move this bug to verified once a new rpm is built.

Comment 14 Gaoyun Pei 2020-09-26 04:09:14 UTC
Moving the bug to verified with openshift-ansible-3.11.296-1.git.0.4ba1e83.el7.noarch.rpm, which includes the proposed PR.

Comment 17 errata-xmlrpc 2020-10-22 11:02:22 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 3.11.306 bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4170