Description of problem:
Install failed due to Approve bootstrap nodes timeout. Is timeout 60 too short?
https://github.com/openshift/openshift-ansible/blob/master/playbooks/openshift-node/private/join.yml#L29

Version-Release number of the following components:
openshift-ansible-3.10.0-0.36.0

How reproducible:
Sometimes

Steps to Reproduce:
1. Install OCP 3.10

Actual results:
Install failed.

TASK [Approve bootstrap nodes] *************************************************
Monday 07 May 2018 22:03:37 -0400 (0:00:00.054) 0:11:56.686 ************
fatal: [qe-wmengcrio4-master-etcd-1.0507-o68.qe.rhcloud.com]: FAILED! => {"changed": true, "failed": true, "finished": false, "msg": "Timed out accepting certificate signing requests. Failing as requested.", "nodes": [{"accepted": true, "csrs": {"csr-c4dbd": {"apiVersion": "certificates.k8s.io/v1beta1", "kind": "CertificateSigningRequest", "metadata": {"creationTimestamp": "2018-05-08T02:02:37Z", "generateName": "csr-", "name": "csr-c4dbd", "namespace": "", "resourceVersion": "522", "selfLink": "/apis/certificates.k8s.io/v1beta1/certificatesigningrequests/csr-c4dbd", "uid": "e1ad1550-5263-11e8-a773-42010af00002"}, "spec": {"groups": ["system:masters", "system:cluster-admins", "system:authenticated"], "request":
<---snipped--->

TASK [Report approval errors] **************************************************
Monday 07 May 2018 22:04:57 -0400 (0:00:00.753) 0:13:16.684 ************
fatal: [qe-wmengcrio4-master-etcd-1.0507-o68.qe.rhcloud.com]: FAILED! => {"changed": false, "failed": true, "msg": "Node approval failed"}

All csrs approved.

[root@qe-wmengcrio4-master-etcd-1 ~]# oc get csr
NAME                                                   AGE   REQUESTOR                                                 CONDITION
csr-4ncp2                                              17m   system:node:qe-wmengcrio4-node-registry-router-2          Approved,Issued
csr-c4dbd                                              18m   system:admin                                              Approved,Issued
csr-f8jzn                                              17m   system:node:qe-wmengcrio4-master-etcd-1                   Approved,Issued
csr-fdcts                                              17m   system:node:qe-wmengcrio4-glusterfs-node-3                Approved,Issued
csr-svgzt                                              18m   system:admin                                              Approved,Issued
csr-w5n89                                              16m   system:node:qe-wmengcrio4-glusterfs-node-1                Approved,Issued
node-csr-RFT_-eZU1T2IgMJ6Gz75MFp8Mg0ewwPnq-eeDEIimG8   17m   system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
node-csr-eKmKiTQKE7SG2AG4yZXzHPWSGnlNOm8w-yfdFRMLaOw   17m   system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
node-csr-gVKh4T1wNpSLX_furDmY1JxEvb3ku7T0kDVhEQud-xo   17m   system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
[root@qe-wmengcrio4-master-etcd-1 ~]#

Expected results:
Install succeeds
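For anyone triaging a similar failure, the manual `oc get csr` check above can be scripted. The following is only a rough sketch, not what openshift-ansible does: it assumes `oc` is on the PATH and logged in with cluster-admin rights, and the helper names `pending_csrs` and `fetch_csrs` are made up for illustration. It reads the JSON form of the CSR list and reports any request that has neither an Approved nor a Denied condition.

```python
import json
import subprocess


def pending_csrs(csr_list):
    """Return the names of CSRs with no Approved or Denied condition.

    `csr_list` is the parsed output of `oc get csr -o json`
    (a List object with an "items" array).
    """
    pending = []
    for item in csr_list.get("items", []):
        # A CSR that nobody has acted on yet has an empty status.
        conditions = item.get("status", {}).get("conditions") or []
        types = {c.get("type") for c in conditions}
        if not types & {"Approved", "Denied"}:
            pending.append(item["metadata"]["name"])
    return pending


def fetch_csrs():
    """Hypothetical helper: assumes `oc` is installed and authenticated."""
    out = subprocess.check_output(["oc", "get", "csr", "-o", "json"])
    return json.loads(out)


if __name__ == "__main__":
    names = pending_csrs(fetch_csrs())
    if names:
        print("Pending CSRs:", ", ".join(names))
        # Each could then be approved with: oc adm certificate approve <name>
    else:
        print("No pending CSRs")
```

An empty result here while the installer still reports a timeout would match what was observed in this report: the requests end up approved, yet the task fails.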
This occurs consistently when scaling up nodes. Raising the timeout to 600s (https://github.com/openshift/openshift-ansible/blob/master/playbooks/openshift-node/private/join.yml#L29) does not help. Adding test blocker.
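Raising the timeout only helps if approval is merely slow. As a generic illustration (a minimal sketch, not the installer's actual logic; `wait_until` and its parameters are hypothetical names), a deadline-based poll fails in the same way for any timeout value when the awaited condition never becomes true:

```python
import time


def wait_until(check, timeout, interval=5.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() until it returns True or `timeout` seconds elapse.

    Returns True on success and False on timeout. `clock` and `sleep`
    are injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated clock: a condition that never holds fails at 60s and at 600s alike.
    now = [0.0]
    fake_clock = lambda: now[0]
    fake_sleep = lambda seconds: now.__setitem__(0, now[0] + seconds)
    for timeout in (60, 600):
        now[0] = 0.0
        ok = wait_until(lambda: False, timeout, 5.0, fake_clock, fake_sleep)
        print(timeout, ok)
```

If a longer timeout makes no difference, the thing being waited on is probably blocked rather than slow, so the check itself is the place to look.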
Please let me know what other information is needed.
Let's re-open this if it can be reproduced without the load balancer misconfiguration.
I have not hit this issue recently with the latest build, v3.10.0-0.53.0. No need to try older versions.
Closing this as a duplicate of bug 1628964, which is being used to track the backport of CSR approval fixes from 3.11 to 3.10.z.

*** This bug has been marked as a duplicate of bug 1628964 ***