Bug 1469358 - node_certificates failed when the master sequence changed in inventory file
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assignee: Andrew Butcher
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-07-11 05:47 UTC by Anping Li
Modified: 2019-04-17 17:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-10 14:13:24 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1529532 | 0 | low | CLOSED | [RFE] etcd ca should be available on all nodes after installation | 2022-03-13 14:36:40 UTC

Description Anping Li 2017-07-11 05:47:13 UTC
Description of problem:
/etc/origin/master/ca.serial.txt exists only on the original first master. If the first master is broken, or the order of the masters is changed in the inventory file, the openshift_node_certificates role may fail because /etc/origin/master/ca.serial.txt is missing on the other masters.

Version-Release number of the following components:
openshift-ansible-3.6.140

How reproducible:
always 

Steps to Reproduce:
1. Install an HA OCP 3.6 cluster.
2. Change the order of the masters in the inventory file.
3. Redeploy the node certificates or scale up a node:
   ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-node-certificates.yml
   ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-node/scaleup.yml


Actual results:
TASK [openshift_node_certificates : Generate the node client config] ***********
failed: [openshift-210.lab.eng.nay.redhat.com -> openshift-182.lab.eng.nay.redhat.com] (item=openshift-210.lab.eng.nay.redhat.com) => {
    "changed": true, 
    "cmd": [
        "/usr/local/bin/oc", 
        "adm", 
        "create-api-client-config", 
        "--certificate-authority=/etc/origin/master/ca.crt", 
        "--client-dir=/etc/origin/generated-configs/node-openshift-210.lab.eng.nay.redhat.com", 
        "--groups=system:nodes", 
        "--master=https://openshift-220.lab.eng.nay.redhat.com:8443", 
        "--signer-cert=/etc/origin/master/ca.crt", 
        "--signer-key=/etc/origin/master/ca.key", 
        "--signer-serial=/etc/origin/master/ca.serial.txt", 
        "--user=system:node:openshift-210.lab.eng.nay.redhat.com", 
        "--expire-days=730"
    ], 
    "delta": "0:00:00.244486", 
    "end": "2017-07-10 22:59:01.502011", 
    "failed": true, 
    "item": "openshift-210.lab.eng.nay.redhat.com", 
    "rc": 1, 
    "start": "2017-07-10 22:59:01.257525", 
    "warnings": []
}

STDERR:

error: --signer-serial, "/etc/origin/master/ca.serial.txt" must be a valid file
See 'oc adm create-api-client-config -h' for help and examples.

Expected results:


Additional info:
/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-certificates.yml runs successfully.
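
The failure in the actual results can be detected before the node playbooks are run. The following is a minimal sketch, not part of openshift-ansible: the function name check_signer_files is hypothetical, and the three file names are taken from the --signer-cert, --signer-key and --signer-serial arguments of the failing command. It assumes a POSIX shell on the master that Ansible delegates to.

```shell
#!/bin/sh
# Hypothetical helper (not part of openshift-ansible): report which CA
# signer files are missing from a master config directory.
# Argument 1: directory to check (defaults to /etc/origin/master).
# Returns 0 if all three files exist, 1 otherwise.
check_signer_files() {
    dir="${1:-/etc/origin/master}"
    missing=0
    for f in ca.crt ca.key ca.serial.txt; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_signer_files with no argument on each master should print one line per absent file. A non-zero exit status on the master listed first in the inventory would suggest that the node certificate playbooks will hit the error shown above.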

Comment 1 Scott Dodson 2017-07-11 13:08:47 UTC
Andrew, Tim,

Do you think we should replicate the CA data to the other masters for disaster recovery?

Comment 2 Andrew Butcher 2017-07-13 18:34:42 UTC
@Scott, yep we should absolutely be syncing the serial file after we sign any certificates within openshift_master_certificates, openshift_node_certificates and likely openshift_hosted roles.

Comment 3 Tim Bielawa 2017-08-14 19:45:54 UTC
Beginning work on this here: https://github.com/openshift/openshift-ansible/pull/5085

