Bug 1469358 - node_certificates failed when the master sequence changed in inventory file
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assigned To: Andrew Butcher
QA Contact: Johnny Liu
Depends On:
Reported: 2017-07-11 01:47 EDT by Anping Li
Modified: 2018-05-23 10:20 EDT
Last Closed: 2018-04-10 10:13:24 EDT
Type: Bug

External Trackers:
Tracker ID | Priority | Status | Summary | Last Updated
Red Hat Bugzilla 1529532 | None | NEW | [RFE] etcd ca should be available on all nodes after installation | 2018-06-18 09:11 EDT

Description Anping Li 2017-07-11 01:47:13 EDT
Description of problem:
/etc/origin/master/ca.serial.txt exists only on the original first master. When the first master is broken, or the order of the masters is changed in the inventory file, node_certificates may fail because /etc/origin/master/ca.serial.txt does not exist on the other masters.
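
One way to confirm this condition on a given master is to check which signer files are present before running the playbook. The following is a minimal Python sketch, not part of openshift-ansible. The function name `missing_signer_files` is hypothetical, and the assumption that `ca.crt` and `ca.key` sit alongside `ca.serial.txt` is mine; only `ca.serial.txt` is named in this report.

```python
import os

# Assumption: these are the signer files a master needs in order to sign
# node client configs. Only ca.serial.txt is named in this bug report.
SIGNER_FILES = ("ca.crt", "ca.key", "ca.serial.txt")


def missing_signer_files(master_dir="/etc/origin/master"):
    """Return the signer files that are absent or empty under master_dir."""
    missing = []
    for name in SIGNER_FILES:
        path = os.path.join(master_dir, name)
        # Treat a missing or zero-length file as unusable.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Run on each master; an empty list means all signer files were found.
    print(missing_signer_files())
```

Run on a master that was never the first master, this would be expected to list `ca.serial.txt` if the problem described here applies.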

Version-Release number of the following components:

How reproducible:

Steps to Reproduce:
1. Install an HA OCP 3.6 cluster.
2. Change the order of the masters in the inventory file.
3. Redeploy the node certificates or scale up a node:
   ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-node-certificates.yml
   ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-node/scaleup.yml

Actual results:
TASK [openshift_node_certificates : Generate the node client config] ***********
failed: [openshift-210.lab.eng.nay.redhat.com -> openshift-182.lab.eng.nay.redhat.com] (item=openshift-210.lab.eng.nay.redhat.com) => {
    "changed": true, 
    "cmd": [
    "delta": "0:00:00.244486", 
    "end": "2017-07-10 22:59:01.502011", 
    "failed": true, 
    "item": "openshift-210.lab.eng.nay.redhat.com", 
    "rc": 1, 
    "start": "2017-07-10 22:59:01.257525", 
    "warnings": []


error: --signer-serial, "/etc/origin/master/ca.serial.txt" must be a valid file
See 'oc adm create-api-client-config -h' for help and examples.

Expected results:

Additional info:
/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-certificates.yml works well.
Comment 1 Scott Dodson 2017-07-11 09:08:47 EDT
Andrew, Tim,

Do you think we should replicate the CA data to the other masters for disaster recovery?
Comment 2 Andrew Butcher 2017-07-13 14:34:42 EDT
@Scott, yep we should absolutely be syncing the serial file after we sign any certificates within openshift_master_certificates, openshift_node_certificates and likely openshift_hosted roles.
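
The syncing idea in this comment can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: local directories stand in for the masters' /etc/origin/master directories, the function name `sync_serial_file` is hypothetical, and this is not the implementation used in openshift-ansible or in any pull request for this bug.

```python
import os
import shutil


def sync_serial_file(source_dir, target_dirs, name="ca.serial.txt"):
    """Copy the serial file from source_dir into each directory in target_dirs.

    Returns the list of destination paths that were written.
    """
    src = os.path.join(source_dir, name)
    if not os.path.isfile(src):
        # Nothing to sync if the signing master has no serial file.
        raise FileNotFoundError(src)
    written = []
    for target in target_dirs:
        dst = os.path.join(target, name)
        # copy2 copies the content and preserves mode and timestamps.
        shutil.copy2(src, dst)
        written.append(dst)
    return written
```

In a real cluster the copy would run across hosts after each signing step, so that every master holds the latest serial file and any of them could act as the signer.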
Comment 3 Tim Bielawa 2017-08-14 15:45:54 EDT
Beginning work on this here: https://github.com/openshift/openshift-ansible/pull/5085
