Bug 1623204
| Summary: | [3.10] Installation stuck at TASK [Approve node certificates when bootstrapping] | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Scott Dodson <sdodson> |
| Component: | Installer | Assignee: | Michael Gugino <mgugino> |
| Status: | CLOSED ERRATA | QA Contact: | Johnny Liu <jialiu> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.10.0 | CC: | aos-bugs, jialiu, jokerman, juzhao, mark.vinkx, mgoldman, mgugino, mmccomas, scortopa, wabouham, wmeng |
| Target Milestone: | --- | Keywords: | Regression, TestBlocker |
| Target Release: | 3.10.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1622945 | Environment: | |
| Last Closed: | 2018-09-04 07:10:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1622945, 1623248, 1625817 | ||
| Bug Blocks: | |||
Comment 1
Scott Dodson
2018-08-28 20:16:27 UTC
Verified this bug with openshift-ansible-3.10.41-1.git.0.fd15dd7.el7.noarch; the result is PASS.
Test environment: 3 masters + 2 infra nodes + 2 compute nodes
TASK [Dump the bootstrap hostnames] ********************************************
Thursday 30 August 2018 10:41:10 +0800 (0:00:00.492) 0:18:10.996 *******
ok: [qe-jialiu310z-master-etcd-1.0830-tir.qe.rhcloud.com] => {
    "msg": [
        "qe-jialiu310z-master-etcd-1",
        "qe-jialiu310z-master-etcd-2",
        "qe-jialiu310z-master-etcd-3",
        "qe-jialiu310z-node-infra-1",
        "qe-jialiu310z-node-infra-2",
        "qe-jialiu310z-node-1",
        "qe-jialiu310z-node-2"
    ]
}
TASK [Approve node certificates when bootstrapping] ****************************
Thursday 30 August 2018 10:41:11 +0800 (0:00:00.051) 0:18:11.047 *******
FAILED - RETRYING: Approve node certificates when bootstrapping (30 retries left).
FAILED - RETRYING: Approve node certificates when bootstrapping (29 retries left).
changed: [qe-jialiu310z-master-etcd-1.0830-tir.qe.rhcloud.com] => {"attempts": 3, "changed": true, "client_approve_results": ["certificatesigningrequest.certificates.k8s.io \"node-csr-AqJ8l3NzCYjVlx7Cr389kIx-DHdMvOmUO0tySt2Hy4k\" approved\n", "certificatesigningrequest.certificates.k8s.io \"node-csr-WxMB-aiwjgta9QJFSLzf4xuWnGuoTYtizzSe99yK4uw\" approved\n", "certificatesigningrequest.certificates.k8s.io \"node-csr-wx62rG28vKVtf-PcteavQlW8hVXr6zMEW70vuNRfUzA\" approved\n", "certificatesigningrequest.certificates.k8s.io \"node-csr-kZL50Fxsyh2eeXSKyn14LqidM9krUXsIhRZncUcETLM\" approved\n"], "failed": false, "rc": 0, "server_approve_results": ["certificatesigningrequest.certificates.k8s.io \"csr-tw9nx\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-5txzl\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-nr74w\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-qbjv9\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-nt66h\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-pvmtv\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-d6s9x\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-g4d85\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-4lstl\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-4l4hg\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-nn4mj\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-l8ngc\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-5xwds\" approved\n"]}
[root@qe-jialiu310z-master-etcd-1 ~]# oc get node
NAME                          STATUS   ROLES     AGE   VERSION
qe-jialiu310z-master-etcd-1   Ready    master    12m   v1.10.0+b81c8f8
qe-jialiu310z-master-etcd-2   Ready    master    12m   v1.10.0+b81c8f8
qe-jialiu310z-master-etcd-3   Ready    master    12m   v1.10.0+b81c8f8
qe-jialiu310z-node-1          Ready    compute   8m    v1.10.0+b81c8f8
qe-jialiu310z-node-2          Ready    compute   8m    v1.10.0+b81c8f8
qe-jialiu310z-node-infra-1    Ready    infra     8m    v1.10.0+b81c8f8
qe-jialiu310z-node-infra-2    Ready    infra     8m    v1.10.0+b81c8f8
[root@qe-jialiu310z-master-etcd-1 ~]# oc get csr
NAME                                                   AGE   REQUESTOR                                                 CONDITION
csr-4l4hg                                              12m   system:admin                                              Approved,Issued
csr-4lstl                                              12m   system:admin                                              Approved,Issued
csr-5txzl                                              8m    system:node:qe-jialiu310z-node-1                          Approved,Issued
csr-5xwds                                              8m    system:node:qe-jialiu310z-node-infra-1                    Approved,Issued
csr-b6pcz                                              12m   system:admin                                              Approved,Issued
csr-d6s9x                                              9m    system:node:qe-jialiu310z-master-etcd-2                   Approved,Issued
csr-g4d85                                              8m    system:node:qe-jialiu310z-master-etcd-3                   Approved,Issued
csr-l8ngc                                              8m    system:node:qe-jialiu310z-node-infra-2                    Approved,Issued
csr-nn4mj                                              9m    system:node:qe-jialiu310z-master-etcd-3                   Approved,Issued
csr-nr74w                                              8m    system:node:qe-jialiu310z-master-etcd-1                   Approved,Issued
csr-nt66h                                              9m    system:node:qe-jialiu310z-master-etcd-1                   Approved,Issued
csr-pvmtv                                              12m   system:admin                                              Approved,Issued
csr-qbjv9                                              8m    system:node:qe-jialiu310z-master-etcd-2                   Approved,Issued
csr-sfxk8                                              12m   system:admin                                              Approved,Issued
csr-tw9nx                                              8m    system:node:qe-jialiu310z-node-2                          Approved,Issued
csr-vpkjz                                              12m   system:admin                                              Approved,Issued
node-csr-AqJ8l3NzCYjVlx7Cr389kIx-DHdMvOmUO0tySt2Hy4k   8m    system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
node-csr-WxMB-aiwjgta9QJFSLzf4xuWnGuoTYtizzSe99yK4uw   8m    system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
node-csr-kZL50Fxsyh2eeXSKyn14LqidM9krUXsIhRZncUcETLM   8m    system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
node-csr-wx62rG28vKVtf-PcteavQlW8hVXr6zMEW70vuNRfUzA   8m    system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
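The manual check above (every bootstrap hostname should appear as a `system:node:<hostname>` requestor with an approved CSR) could be scripted roughly as follows. This is only a sketch: the helper names `node_names_with_approved_csr` and `missing_nodes` are hypothetical, it works on the plain-text `oc get csr` table rather than calling the cluster, and it is not how the installer task itself performs the check.

```python
def node_names_with_approved_csr(csr_table: str) -> set:
    """Return node names that have at least one Approved CSR row.

    Assumes the four-column `oc get csr` layout shown in this report:
    NAME, AGE, REQUESTOR, CONDITION.
    """
    prefix = "system:node:"
    names = set()
    for line in csr_table.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a full four-column row.
        if len(fields) < 4 or fields[0] == "NAME":
            continue
        requestor, condition = fields[2], fields[3]
        if requestor.startswith(prefix) and "Approved" in condition.split(","):
            names.add(requestor[len(prefix):])
    return names


def missing_nodes(expected_hosts, csr_table: str) -> list:
    """List expected hostnames that have no approved system:node CSR."""
    return sorted(set(expected_hosts) - node_names_with_approved_csr(csr_table))


# A few rows copied from the output above, plus one host left out on purpose.
sample = """\
NAME AGE REQUESTOR CONDITION
csr-5txzl 8m system:node:qe-jialiu310z-node-1 Approved,Issued
csr-tw9nx 8m system:node:qe-jialiu310z-node-2 Approved,Issued
csr-pvmtv 12m system:admin Approved,Issued
"""
expected = [
    "qe-jialiu310z-node-1",
    "qe-jialiu310z-node-2",
    "qe-jialiu310z-node-infra-1",
]
print(missing_nodes(expected, sample))  # prints ['qe-jialiu310z-node-infra-1']
```

An empty list would correspond to the passing case in this comment, where all seven hostnames have an approved node CSR.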
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:2578

Moving my comment https://bugzilla.redhat.com/show_bug.cgi?id=1622945#c10 here. I'm seeing the same issue even with the updated package: openshift-ansible-3.10.41-1.git.0.fd15dd7.el7.noarch. I think it may be related to the inventory I use, which has one master (openshift_node_group_name=node-config-master-infra) and two nodes (openshift_node_group_name=node-config-compute). It fails every time with the updated package. When I downgrade to openshift-ansible-3.10.34-1.git.0.48df172None.noarch, everything works.

Since this bug has already been transitioned to CLOSED by the errata tool, can you please open a new bug? In the new bug, please include your complete inventory as well as the output of `oc get nodes` and `oc get csr -o yaml`. The latter will likely contain private data for signed certificates, so please mark it as a private attachment unless it's just a test environment you don't care about. My suspicion is that the name on the CSR is different from what we're expecting it to be.

@Scott Dodson, as suggested, I opened a new bug: https://bugzilla.redhat.com/show_bug.cgi?id=1625911
I also feel there could be something wrong in CSR name generation.
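One way to test the CSR-name suspicion discussed above is to compare the hostnames the installer expects with the node names that actually appear on the CSRs, and flag pairs that look like the same machine under a different spelling. The sketch below assumes the mismatch would be a domain suffix or letter-case difference; that is a guess for illustration, not something this report confirms. The function names and the `example.com` hostnames are hypothetical.

```python
def short_name(host: str) -> str:
    """Lower-case the host and drop everything after the first dot."""
    return host.split(".", 1)[0].lower()


def find_name_mismatches(expected_hosts, csr_node_names) -> list:
    """Pair each expected host with CSR node names that match it only
    after stripping the domain suffix and ignoring case."""
    csr_by_short = {}
    for name in csr_node_names:
        csr_by_short.setdefault(short_name(name), []).append(name)

    exact = set(csr_node_names)
    mismatches = []
    for host in expected_hosts:
        if host in exact:
            continue  # exact match, nothing suspicious
        for candidate in csr_by_short.get(short_name(host), []):
            mismatches.append((host, candidate))
    return mismatches


# Hypothetical names: the inventory uses short names, one CSR carries an FQDN.
expected = ["master1", "node1"]
on_csrs = ["master1", "node1.example.com"]
print(find_name_mismatches(expected, on_csrs))
# prints [('node1', 'node1.example.com')]
```

The CSR node names could be taken from the `system:node:` requestors in `oc get csr`; a non-empty result would suggest that approval is waiting on a name that never shows up in that exact form.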