Bug 1623204 - [3.10] Installation stuck at TASK [Approve node certificates when bootstrapping]
Summary: [3.10] Installation stuck at TASK [Approve node certificates when bootstrapping]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.10.z
Assignee: Michael Gugino
QA Contact: Johnny Liu
URL:
Whiteboard:
Duplicates: 1623248
Depends On: 1622945 1623248 1625817
Blocks:
 
Reported: 2018-08-28 17:50 UTC by Scott Dodson
Modified: 2018-09-18 08:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1622945
Environment:
Last Closed: 2018-09-04 07:10:49 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2018:2578  0        None      None    None     2018-09-04 07:11:05 UTC

Comment 1 Scott Dodson 2018-08-28 20:16:27 UTC
*** Bug 1623248 has been marked as a duplicate of this bug. ***

Comment 4 Johnny Liu 2018-08-30 02:50:33 UTC
Verified this bug with openshift-ansible-3.10.41-1.git.0.fd15dd7.el7.noarch; the result is PASS.

Test environment: 3 masters + 2 infra nodes + 2 compute nodes

TASK [Dump the bootstrap hostnames] ********************************************
Thursday 30 August 2018  10:41:10 +0800 (0:00:00.492)       0:18:10.996 ******* 
ok: [qe-jialiu310z-master-etcd-1.0830-tir.qe.rhcloud.com] => {
    "msg": [
        "qe-jialiu310z-master-etcd-1", 
        "qe-jialiu310z-master-etcd-2", 
        "qe-jialiu310z-master-etcd-3", 
        "qe-jialiu310z-node-infra-1", 
        "qe-jialiu310z-node-infra-2", 
        "qe-jialiu310z-node-1", 
        "qe-jialiu310z-node-2"
    ]
}

TASK [Approve node certificates when bootstrapping] ****************************
Thursday 30 August 2018  10:41:11 +0800 (0:00:00.051)       0:18:11.047 ******* 
FAILED - RETRYING: Approve node certificates when bootstrapping (30 retries left).

FAILED - RETRYING: Approve node certificates when bootstrapping (29 retries left).

changed: [qe-jialiu310z-master-etcd-1.0830-tir.qe.rhcloud.com] => {"attempts": 3, "changed": true, "client_approve_results": ["certificatesigningrequest.certificates.k8s.io \"node-csr-AqJ8l3NzCYjVlx7Cr389kIx-DHdMvOmUO0tySt2Hy4k\" approved\n", "certificatesigningrequest.certificates.k8s.io \"node-csr-WxMB-aiwjgta9QJFSLzf4xuWnGuoTYtizzSe99yK4uw\" approved\n", "certificatesigningrequest.certificates.k8s.io \"node-csr-wx62rG28vKVtf-PcteavQlW8hVXr6zMEW70vuNRfUzA\" approved\n", "certificatesigningrequest.certificates.k8s.io \"node-csr-kZL50Fxsyh2eeXSKyn14LqidM9krUXsIhRZncUcETLM\" approved\n"], "failed": false, "rc": 0, "server_approve_results": ["certificatesigningrequest.certificates.k8s.io \"csr-tw9nx\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-5txzl\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-nr74w\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-qbjv9\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-nt66h\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-pvmtv\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-d6s9x\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-g4d85\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-4lstl\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-4l4hg\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-nn4mj\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-l8ngc\" approved\n", "certificatesigningrequest.certificates.k8s.io \"csr-5xwds\" approved\n"]}


[root@qe-jialiu310z-master-etcd-1 ~]# oc get node
NAME                          STATUS    ROLES     AGE       VERSION
qe-jialiu310z-master-etcd-1   Ready     master    12m       v1.10.0+b81c8f8
qe-jialiu310z-master-etcd-2   Ready     master    12m       v1.10.0+b81c8f8
qe-jialiu310z-master-etcd-3   Ready     master    12m       v1.10.0+b81c8f8
qe-jialiu310z-node-1          Ready     compute   8m        v1.10.0+b81c8f8
qe-jialiu310z-node-2          Ready     compute   8m        v1.10.0+b81c8f8
qe-jialiu310z-node-infra-1    Ready     infra     8m        v1.10.0+b81c8f8
qe-jialiu310z-node-infra-2    Ready     infra     8m        v1.10.0+b81c8f8


[root@qe-jialiu310z-master-etcd-1 ~]# oc get csr
NAME                                                   AGE       REQUESTOR                                                 CONDITION
csr-4l4hg                                              12m       system:admin                                              Approved,Issued
csr-4lstl                                              12m       system:admin                                              Approved,Issued
csr-5txzl                                              8m        system:node:qe-jialiu310z-node-1                          Approved,Issued
csr-5xwds                                              8m        system:node:qe-jialiu310z-node-infra-1                    Approved,Issued
csr-b6pcz                                              12m       system:admin                                              Approved,Issued
csr-d6s9x                                              9m        system:node:qe-jialiu310z-master-etcd-2                   Approved,Issued
csr-g4d85                                              8m        system:node:qe-jialiu310z-master-etcd-3                   Approved,Issued
csr-l8ngc                                              8m        system:node:qe-jialiu310z-node-infra-2                    Approved,Issued
csr-nn4mj                                              9m        system:node:qe-jialiu310z-master-etcd-3                   Approved,Issued
csr-nr74w                                              8m        system:node:qe-jialiu310z-master-etcd-1                   Approved,Issued
csr-nt66h                                              9m        system:node:qe-jialiu310z-master-etcd-1                   Approved,Issued
csr-pvmtv                                              12m       system:admin                                              Approved,Issued
csr-qbjv9                                              8m        system:node:qe-jialiu310z-master-etcd-2                   Approved,Issued
csr-sfxk8                                              12m       system:admin                                              Approved,Issued
csr-tw9nx                                              8m        system:node:qe-jialiu310z-node-2                          Approved,Issued
csr-vpkjz                                              12m       system:admin                                              Approved,Issued
node-csr-AqJ8l3NzCYjVlx7Cr389kIx-DHdMvOmUO0tySt2Hy4k   8m        system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
node-csr-WxMB-aiwjgta9QJFSLzf4xuWnGuoTYtizzSe99yK4uw   8m        system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
node-csr-kZL50Fxsyh2eeXSKyn14LqidM9krUXsIhRZncUcETLM   8m        system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
node-csr-wx62rG28vKVtf-PcteavQlW8hVXr6zMEW70vuNRfUzA   8m        system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
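
The manual check above (every CSR in the `Approved,Issued` condition) could be scripted roughly as follows. This is a minimal sketch and not part of openshift-ansible: the helper name `pending_csrs` is hypothetical, and it assumes the plain-text column layout of `oc get csr` quoted above (NAME, AGE, REQUESTOR, CONDITION).

```python
def pending_csrs(oc_get_csr_output):
    """Return names of CSRs whose CONDITION column is not 'Approved,Issued'.

    Assumption: whitespace-separated columns NAME, AGE, REQUESTOR, CONDITION,
    as in the plain-text `oc get csr` output quoted in this comment.
    """
    pending = []
    lines = oc_get_csr_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        name, condition = fields[0], fields[-1]
        if condition != "Approved,Issued":
            pending.append(name)
    return pending


# Illustrative input only: the "Pending" row is invented for this example
# and does not appear in the verified run above.
sample = """\
NAME        AGE   REQUESTOR                          CONDITION
csr-4l4hg   12m   system:admin                       Approved,Issued
csr-5txzl   8m    system:node:qe-jialiu310z-node-1   Pending
"""
print(pending_csrs(sample))
```

An empty result would correspond to the state shown in the verified run, where no CSR is left waiting for approval.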

Comment 6 errata-xmlrpc 2018-09-04 07:10:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2578

Comment 7 Marek Goldmann 2018-09-05 07:20:12 UTC
Moving my comment https://bugzilla.redhat.com/show_bug.cgi?id=1622945#c10 here.

I'm seeing the same issue even with the updated package: openshift-ansible-3.10.41-1.git.0.fd15dd7.el7.noarch.

I think it may be related to the inventory I use, which has one master (openshift_node_group_name=node-config-master-infra) and two nodes (openshift_node_group_name=node-config-compute). It fails every time with the updated package. When I downgrade to openshift-ansible-3.10.34-1.git.0.48df172None.noarch, everything works.

Comment 8 Scott Dodson 2018-09-05 17:20:07 UTC
Since this bug has already been transitioned to CLOSED by the errata tool, can you please open a new bug? In the new bug, please include your complete inventory as well as the output of `oc get nodes` and `oc get csr -o yaml`.

The last will likely contain private data for signed certificates so please mark it as a private attachment unless it's just a test environment you don't care about.

My suspicion is that the name on the CSR is different than what we're expecting it to be.
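
One way to test that suspicion by hand is to compare the node names embedded in the CSR requestors with the hostnames printed by the `Dump the bootstrap hostnames` task. The sketch below is not the installer's logic: `missing_node_csrs` is a hypothetical helper, and it assumes requestor strings of the form `system:node:<name>` as seen in comment 4.

```python
NODE_PREFIX = "system:node:"


def missing_node_csrs(expected_hostnames, requestors):
    """Return expected hostnames that have no CSR requested by system:node:<name>.

    Assumption: node-originated CSRs carry a requestor of the form
    'system:node:<name>'; all other requestors are ignored.
    """
    seen = {r[len(NODE_PREFIX):] for r in requestors if r.startswith(NODE_PREFIX)}
    return sorted(set(expected_hostnames) - seen)


expected = ["qe-jialiu310z-node-1", "qe-jialiu310z-node-2"]
requestors = [
    "system:admin",
    "system:node:qe-jialiu310z-node-1",
    # Hypothetical mismatch for illustration: the CSR carries an FQDN
    # instead of the short name the caller expects.
    "system:node:qe-jialiu310z-node-2.example.com",
]
print(missing_node_csrs(expected, requestors))
```

Any hostname reported here would be a candidate for the kind of name mismatch described above, where the expected name never matches a CSR.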

Comment 9 Serena Cortopassi 2018-09-06 08:36:41 UTC
@Scott Dodson, as suggested I opened https://bugzilla.redhat.com/show_bug.cgi?id=1625911

I also suspect that something is wrong in CSR name generation.

