Bug 1469358 - node_certificates failed when the master sequence changed in inventory file
Status: CLOSED WONTFIX
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assigned To: Andrew Butcher
QA Contact: Johnny Liu
Docs Contact:
Depends On:
Blocks:
Reported: 2017-07-11 01:47 EDT by Anping Li
Modified: 2018-05-23 10:20 EDT

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-10 10:13:24 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers:
Tracker: Red Hat Bugzilla
ID: 1529532
Priority: None
Status: NEW
Summary: [RFE] etcd ca should be available on all nodes after installation
Last Updated: 2018-06-18 09:11 EDT

Description Anping Li 2017-07-11 01:47:13 EDT
Description of problem:
/etc/origin/master/ca.serial.txt exists only on the first master. When the first master is broken, or the order of the masters is changed in the inventory file, node_certificates may fail because /etc/origin/master/ca.serial.txt does not exist on the other masters.
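
The failure condition can be modelled in a few lines. This is only a sketch: it assumes the playbook delegates certificate signing to whichever master is listed first in the inventory, and the host names and helper functions are hypothetical, not part of openshift-ansible.

```python
def signing_master(masters):
    """Return the host assumed to sign node certificates: the first master listed."""
    if not masters:
        raise ValueError("inventory lists no masters")
    return masters[0]


def can_sign_node_certs(masters, hosts_with_serial):
    """True if the assumed signing host holds ca.serial.txt."""
    return signing_master(masters) in hosts_with_serial


# Hypothetical inventory: only the original first master has the serial file.
original = ["master-a", "master-b", "master-c"]
reordered = ["master-b", "master-a", "master-c"]
hosts_with_serial = {"master-a"}

print(can_sign_node_certs(original, hosts_with_serial))
print(can_sign_node_certs(reordered, hosts_with_serial))
```

With the original order the signing host has the serial file; after reordering it does not, which matches the failure reported here.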

Version-Release number of the following components:
openshift-ansible-3.6.140

How reproducible:
always 

Steps to Reproduce:
1. Install an HA OCP 3.6 cluster.
2. Change the order of the masters in the inventory file.
3. Redeploy the node certificates or scale up a node:
   ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-node-certificates.yml
   ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-node/scaleup.yml


Actual results:
TASK [openshift_node_certificates : Generate the node client config] ***********
failed: [openshift-210.lab.eng.nay.redhat.com -> openshift-182.lab.eng.nay.redhat.com] (item=openshift-210.lab.eng.nay.redhat.com) => {
    "changed": true, 
    "cmd": [
        "/usr/local/bin/oc", 
        "adm", 
        "create-api-client-config", 
        "--certificate-authority=/etc/origin/master/ca.crt", 
        "--client-dir=/etc/origin/generated-configs/node-openshift-210.lab.eng.nay.redhat.com", 
        "--groups=system:nodes", 
        "--master=https://openshift-220.lab.eng.nay.redhat.com:8443", 
        "--signer-cert=/etc/origin/master/ca.crt", 
        "--signer-key=/etc/origin/master/ca.key", 
        "--signer-serial=/etc/origin/master/ca.serial.txt", 
        "--user=system:node:openshift-210.lab.eng.nay.redhat.com", 
        "--expire-days=730"
    ], 
    "delta": "0:00:00.244486", 
    "end": "2017-07-10 22:59:01.502011", 
    "failed": true, 
    "item": "openshift-210.lab.eng.nay.redhat.com", 
    "rc": 1, 
    "start": "2017-07-10 22:59:01.257525", 
    "warnings": []
}

STDERR:

error: --signer-serial, "/etc/origin/master/ca.serial.txt" must be a valid file
See 'oc adm create-api-client-config -h' for help and examples.
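
A quick pre-check on the delegated master could confirm that the signer files named in the command above are present before the playbook runs. The sketch below is an illustration only: the helper name is hypothetical, and the file list is taken from the --signer-cert, --signer-key and --signer-serial arguments in the failed task.

```python
from pathlib import Path

# File names taken from the --signer-* arguments of the failed command.
SIGNER_FILES = ("ca.crt", "ca.key", "ca.serial.txt")


def missing_signer_files(master_dir):
    """Return the signer files that are absent from master_dir."""
    base = Path(master_dir)
    return [name for name in SIGNER_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_signer_files("/etc/origin/master")
    if missing:
        print("missing signer files:", ", ".join(missing))
    else:
        print("all signer files present")
```

Run on each master, a check like this would show which hosts are able to sign node certificates.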

Expected results:


Additional info:
/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-certificates.yml works well.
Comment 1 Scott Dodson 2017-07-11 09:08:47 EDT
Andrew, Tim,

Do you think we should replicate the CA data to the other masters for disaster recovery?
Comment 2 Andrew Butcher 2017-07-13 14:34:42 EDT
@Scott, yep we should absolutely be syncing the serial file after we sign any certificates within openshift_master_certificates, openshift_node_certificates and likely openshift_hosted roles.
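
The syncing idea can be sketched as follows. This is not the implementation from openshift-ansible: local directories stand in for the /etc/origin/master directory on each master, and the function name is hypothetical. A real fix would live in the Ansible roles named above.

```python
import shutil
from pathlib import Path

SERIAL_NAME = "ca.serial.txt"


def sync_serial(source_dir, target_dirs):
    """Copy the serial file from source_dir into each target directory.

    Returns the paths that were created or updated. Directories here are
    stand-ins for the master config directory on each host.
    """
    source = Path(source_dir) / SERIAL_NAME
    if not source.is_file():
        raise FileNotFoundError(f"{source} does not exist")
    updated = []
    for target_dir in target_dirs:
        target = Path(target_dir) / SERIAL_NAME
        # Copy only when the target is missing or out of date.
        if not target.is_file() or target.read_bytes() != source.read_bytes():
            shutil.copy2(source, target)
            updated.append(str(target))
    return updated
```

Calling the function after every signing operation would keep the serial number consistent across masters, so any of them could sign certificates if the inventory order changes.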
Comment 3 Tim Bielawa 2017-08-14 15:45:54 EDT
Beginning work on this here: https://github.com/openshift/openshift-ansible/pull/5085
