Bug 1460969

Summary:	[3.5] Redeploy CA will try to restart services when certs are expired, causing failure.
Product:	OpenShift Container Platform	Reporter:	Gaoyun Pei <gpei>
Component:	Installer	Assignee:	Andrew Butcher <abutcher>
Status:	CLOSED ERRATA	QA Contact:	Gaoyun Pei <gpei>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	3.5.1	CC:	abutcher, aos-bugs, jokerman, mmccomas, rhowe, smunilla
Target Milestone:	---
Target Release:	3.5.z
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	The OpenShift CA redeployment playbook (playbooks/byo/openshift-cluster/redeploy-openshift-ca.yml) would fail to restart services if certificates were previously expired. Service restarts are now skipped within the OpenShift CA redeployment playbook when expired certificates are detected. Expired cluster certificates may be replaced with the certificate redeployment playbook (playbooks/byo/openshift-cluster/redeploy-certificates.yml) once the OpenShift CA certificate has been replaced via the OpenShift CA redeployment playbook.	Story Points:	---
Clone Of:	1452367	Environment:
Last Closed:	2017-06-29 13:33:14 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1452367
Bug Blocks:

Comment 1 Scott Dodson 2017-06-14 01:52:34 UTC

https://github.com/openshift/openshift-ansible/pull/4428

Comment 3 Gaoyun Pei 2017-06-15 09:19:54 UTC

Tested with openshift-ansible-3.5.82-1.git.0.e3e25f6.el7.noarch; the redeploy CA playbook failed as follows:

PLAY [Validate configuration for rolling restart] ******************************

TASK [setup] *******************************************************************
fatal: [ec2-54-174-42-35.compute-1.amazonaws.com]: FAILED! => {
    "failed": true
}

MSG:

The conditional check '('expired' not in hostvars | oo_select_keys(groups['oo_masters_to_config']) | oo_collect('check_results.check_results.ocp_certs') | oo_collect('health', {'path':hostvars[groups.oo_first_master.0].openshift.common.config_base ~ "/master/master.server.crt"})) and ('expired' not in hostvars | oo_select_keys(groups['oo_masters_to_config']) | oo_collect('check_results.check_results.ocp_certs') | oo_collect('health', {'path':hostvars[groups.oo_first_master.0].openshift.common.config_base ~ "/master/ca-bundle.crt"}))' failed. The error was: 'list' object has no attribute 'get'
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-openshift-ca.retry
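
The "'list' object has no attribute 'get'" message is consistent with a filter that expects a flat list of dicts being handed a list of lists (one inner list of certificates per master). The sketch below is a simplified Python stand-in written only to reproduce that class of error; it is not the openshift-ansible filter code. The names `collect` and `flatten`, the sample data, and the `/etc/origin` paths are all hypothetical.

```python
def collect(data, attribute, filters=None):
    """Simplified, hypothetical stand-in for a collect-style filter.

    Returns `attribute` from every dict in `data`. When `filters` is given,
    only dicts whose keys match every filter value are kept.
    """
    if filters is None:
        return [d[attribute] for d in data]
    return [d[attribute] for d in data
            if all(d.get(k) == v for k, v in filters.items())]


def flatten(nested):
    """Flatten one level of nesting: [[a, b], [c]] becomes [a, b, c]."""
    return [item for sub in nested for item in sub]


# Hypothetical per-host results: each host reports a *list* of certificates.
hosts = [
    {"ocp_certs": [
        {"path": "/etc/origin/master/master.server.crt", "health": "expired"},
        {"path": "/etc/origin/master/ca-bundle.crt", "health": "ok"},
    ]},
    {"ocp_certs": [
        {"path": "/etc/origin/master/master.server.crt", "health": "ok"},
    ]},
]

per_host = collect(hosts, "ocp_certs")  # list of lists, one inner list per host
wanted = {"path": "/etc/origin/master/master.server.crt"}

try:
    # Each item is a list, not a dict, so the .get() call in the filter fails.
    collect(per_host, "health", wanted)
    error = None
except AttributeError as exc:
    error = str(exc)

# Flattening first gives the filter the list of dicts it expects.
health = collect(flatten(per_host), "health", wanted)
restart_ok = "expired" not in health
```

Under these assumptions, the filtered lookup only works once the per-host lists are merged into a single list, and `restart_ok` is false whenever any matching certificate reports `expired`.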

Comment 5 Andrew Butcher 2017-06-20 20:16:56 UTC

https://github.com/openshift/openshift-ansible/pull/4461

Comment 7 Gaoyun Pei 2017-06-27 09:26:41 UTC

Verified this bug with openshift-ansible-3.5.88-1.git.0.9901d92.el7.noarch.

With the OpenShift certs expired, redeploy the OpenShift CA cert:
ansible-playbook -i host /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-openshift-ca.yml
The redeploy OpenShift CA playbook updates the OpenShift CA cert and skips restarting the master/node services because an expired cert was detected.

Redeploy the etcd CA cert:
ansible-playbook -i host /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-etcd-ca.yml
The redeploy etcd CA playbook updates the etcd CA cert and skips restarting the etcd/master services because an expired cert was detected.

Redeploy the OpenShift certs next:
ansible-playbook -i host /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-certificates.yml
This playbook generates new certs and restarts the etcd/master/docker/node services.

Afterwards, all the certs had been replaced with new ones and the OCP environment worked again.
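
The order of the three playbook runs in this comment matters: both CA playbooks run before the certificate redeployment. A minimal Python sketch of that sequence follows, assuming the playbook paths and the inventory file name `host` quoted above. The helper names `build_commands` and `run_all` are hypothetical, and the sketch uses only the `ansible-playbook -i` invocation shown in this comment.

```python
import subprocess

# Playbook directory as quoted in this comment.
PLAYBOOK_DIR = "/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster"

# Order matters: both CA playbooks run before the certificate redeploy.
PLAYBOOKS = [
    "redeploy-openshift-ca.yml",
    "redeploy-etcd-ca.yml",
    "redeploy-certificates.yml",
]


def build_commands(inventory):
    """Return the ansible-playbook command lines in the order they should run."""
    return [
        ["ansible-playbook", "-i", inventory, f"{PLAYBOOK_DIR}/{name}"]
        for name in PLAYBOOKS
    ]


def run_all(inventory, runner=subprocess.run):
    """Run each playbook in turn; check=True stops at the first failure."""
    for cmd in build_commands(inventory):
        runner(cmd, check=True)
```

Calling `run_all("host")` would run the three playbooks in sequence and raise `subprocess.CalledProcessError` if any of them exits with a non-zero status, so a failed CA redeploy is not followed by a certificate redeploy.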

Comment 9 errata-xmlrpc 2017-06-29 13:33:14 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1666