Bug 1537726 - redeploy_node_certificates.yaml restarts docker daemon
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Hardware: Unspecified Unspecified
Priority: unspecified Severity: high
Target Milestone: ---
Target Release: 3.9.0
Assigned To: Scott Dodson
QA Contact: Gaoyun Pei
Depends On:
Blocks: 1542162
Reported: 2018-01-23 13:34 EST by Borja
Modified: 2018-03-28 10:22 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The docker daemon was incorrectly restarted when redeploying node certificates. A docker restart is only necessary when deploying a new CA, so it can safely be skipped, which ensures that running pods are not restarted when node certificates are updated.
Story Points: ---
Clone Of:
Clones: 1542162
Environment:
Last Closed: 2018-03-28 10:21:18 EDT
Type: Bug




External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2018:0489 | None | None | None | 2018-03-28 10:22 EDT

Description Borja 2018-01-23 13:34:44 EST
Description of problem:
When running the playbook redeploy-node-certificates.yml, every node in the cluster has its certificates regenerated and receives a new, valid kubeconfig.

Once that task completes, the playbook restarts both docker and atomic-openshift-node. The docker restart causes unnecessary downtime on the nodes, since atomic-openshift-node is the only component that loads the kubeconfig.

The docker restart happens here: https://github.com/openshift/openshift-ansible/blob/9a405010c5a656f89866906d29866ba98493e91b/playbooks/openshift-node/private/restart.yml#L10

Since this task appears to be called from several playbooks, it would help to add a check or flag so that the docker restart is not triggered when the task is called from redeploy_certs.
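
A minimal sketch of such a flag, assuming the restart play is pulled in with import_playbook. The variable name node_restart_docker_required and the nodes host group are hypothetical, chosen only for illustration, and this is not necessarily how the project implements it:

```yaml
# restart.yml (sketch): restart docker only when the caller asks for it.
# "node_restart_docker_required" is a hypothetical variable name.
- name: Restart nodes
  hosts: nodes                      # assumed inventory group
  tasks:
    - name: Restart docker
      service:
        name: docker
        state: restarted
      # Defaults to true so that other callers keep the current behaviour.
      when: node_restart_docker_required | default(true) | bool

    - name: Restart atomic-openshift-node
      service:
        name: atomic-openshift-node
        state: restarted
---
# Certificate redeployment playbook (sketch, separate file):
# opt out of the docker restart when importing the restart play.
- import_playbook: restart.yml
  vars:
    node_restart_docker_required: false
```

Defaulting the flag to true means only the certificate redeployment path changes, and every other playbook that calls the restart play behaves as before.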

Actual results:
The docker daemon is restarted by the playbook redeploy-node-certificates.yml.

Expected results:
The docker daemon is not restarted. Restarting atomic-openshift-node alone should be enough to load the new kubeconfig and certificates.
With this change, the playbook can run without downtime.
Comment 1 Scott Dodson 2018-01-23 13:52:16 EST
Andrew,

Do you agree there's no need to restart docker when re-deploying node certificates? Seems like this would only be necessary when a new CA is generated so that docker picks up that change.
Comment 2 Andrew Butcher 2018-01-23 14:03:15 EST
Scott,

I agree. Restarting docker should be skipped when we have not replaced the CA certificate.
Comment 4 Scott Dodson 2018-01-24 12:47:13 EST
Master PR https://github.com/openshift/openshift-ansible/pull/6855
Comment 6 Gaoyun Pei 2018-02-05 01:02:04 EST
Verified this bug with openshift-ansible-3.9.0-0.36.0.git.0.da68f13.el7.noarch.

Ran the node certificate redeployment playbook; docker was not restarted during the redeployment.
ansible-playbook -i host /usr/share/ansible/openshift-ansible/playbooks/openshift-node/redeploy-certificates.yml

PLAY [Restart nodes] ********************************************************************************************************************************************************

TASK [Gathering Facts] ******************************************************************************************************************************************************
ok: [ec2-34-207-99-213.compute-1.amazonaws.com]

TASK [Restart docker] *******************************************************************************************************************************************************
skipping: [ec2-34-207-99-213.compute-1.amazonaws.com] => {"changed": false, "skip_reason": "Conditional result was False"}
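
Besides reading the play output, the result could also be checked on the nodes themselves. The following is only a sketch: the wrapper playbook and the nodes host group are assumptions, and it compares the ActiveEnterTimestamp that systemd reports for docker before and after the redeployment:

```yaml
# Hypothetical wrapper playbook; the "nodes" group is an assumed inventory group.
- name: Record when docker last became active
  hosts: nodes
  gather_facts: false
  tasks:
    - name: Read docker start timestamp before redeploying certificates
      command: systemctl show docker --property=ActiveEnterTimestamp
      register: docker_started_before
      changed_when: false

- import_playbook: /usr/share/ansible/openshift-ansible/playbooks/openshift-node/redeploy-certificates.yml

- name: Confirm docker was not restarted
  hosts: nodes
  gather_facts: false
  tasks:
    - name: Read docker start timestamp after redeploying certificates
      command: systemctl show docker --property=ActiveEnterTimestamp
      register: docker_started_after
      changed_when: false

    - name: Fail if the timestamp changed
      assert:
        that:
          - docker_started_before.stdout == docker_started_after.stdout
        msg: "docker was restarted during node certificate redeployment"
```

An unchanged timestamp indicates that the docker service kept running through the whole redeployment on that node.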
Comment 9 errata-xmlrpc 2018-03-28 10:21:18 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
