Bug 1625817
Summary: | [3.10] Installation stuck at TASK [Approve node certificates when bootstrapping] | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Wei Sun <wsun> |
Component: | Installer | Assignee: | Michael Gugino <mgugino> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Weihua Meng <wmeng> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 3.10.0 | CC: | aos-bugs, fabio.martinelli, jialiu, jokerman, juzhao, kumarmn, mgoldman, mgugino, mmccomas, nils.ketelsen, roxenham, scortopa, wabouham, wmeng, wsun |
Target Milestone: | --- | Keywords: | Regression |
Target Release: | 3.10.z | Flags: | sdodson: needinfo- |
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
The node CSR approval process has been refactored to address several process deficiencies. This process now approves certificates for relevant nodes and waits for the certificate to be verifiable via the API.
In the event that this new process fails, the logs will include relevant debugging information required by support to diagnose any remaining issues. Please make sure you capture these logs and provide them to support in the event of a failure.
|
Story Points: | --- |
Clone Of: | 1622945 | Environment: | |
Last Closed: | 2019-01-03 17:34:48 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1622945 | ||
Bug Blocks: | 1479956, 1565405, 1623204, 1623248 |
Comment 4
Michael Gugino
2018-09-18 17:29:26 UTC
I can hit this every time on a Power 8 bare-metal node with OCP 3.10:

[root@rhel-ocpapp2 openshift-ansible]# oc project openshift-sdn
Now using project "openshift-sdn" on server "https://rhel-ocpapp2:8443".
[root@rhel-ocpapp2 openshift-ansible]# oc get all
NAME            READY   STATUS             RESTARTS   AGE
pod/ovs-j25wz   1/1     Running            0          5m
pod/sdn-h9c8k   0/1     CrashLoopBackOff   6          5m

NAME                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/ovs   1         1         1       1            1           <none>          5m
daemonset.apps/sdn   1         1         0       1            0           <none>          5m

NAME                                  DOCKER REPO                                           TAGS    UPDATED
imagestream.image.openshift.io/node   docker-registry.default.svc:5000/openshift-sdn/node   v3.10   5 minutes ago
[root@rhel-ocpapp2 openshift-ansible]# oc logs -f pod/sdn-h9c8k
Error from server: Get https://rhel-ocpapp2:10250/containerLogs/openshift-sdn/sdn-h9c8k/sdn?follow=true: remote error: tls: internal error

A number of CSR approval changes have been backported from 3.11 to 3.10 and may have addressed this. Can we please test with the latest 3.10 code?

I am willing to test it out on Power if you can send me the changes.

I tried different test matrices and did not hit this issue.

openshift-ansible-3.10.51-1.git.0.44a646c.el7.noarch
x86
EC2, GCP, OpenStack
docker, cri-o
HA, non-HA
with/without proxy
with/without system-container
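
For anyone triaging a similar failure: a `tls: internal error` on the kubelet port (10250) is commonly seen when the node's serving-certificate CSR has not been approved yet. The sketch below is not the installer's actual logic. It is a minimal illustration, assuming the output of `oc get csr -o json` has been captured as a string, of how to pick out CSRs that are still pending for a given set of nodes. The function name `pending_node_csrs` and the sample data are hypothetical, and the sketch assumes the CSR was requested by the node itself (requester `system:node:<name>`), so it would not match bootstrap client CSRs submitted by a service account.

```python
import json


def pending_node_csrs(csr_list_json, node_names):
    """Return names of CSRs that are still pending for the given nodes.

    Assumption: a CSR with no entries under status.conditions has been
    neither approved nor denied, and node-requested CSRs carry
    spec.username == "system:node:<node name>".
    """
    wanted = {"system:node:" + name for name in node_names}
    pending = []
    for item in json.loads(csr_list_json).get("items", []):
        conditions = item.get("status", {}).get("conditions")
        requester = item.get("spec", {}).get("username", "")
        if not conditions and requester in wanted:
            pending.append(item["metadata"]["name"])
    return pending


# Hypothetical sample shaped like `oc get csr -o json` output.
SAMPLE = json.dumps({
    "items": [
        {   # pending CSR from the node of interest
            "metadata": {"name": "csr-aaaaa"},
            "spec": {"username": "system:node:rhel-ocpapp2"},
            "status": {},
        },
        {   # already approved CSR from the same node
            "metadata": {"name": "csr-bbbbb"},
            "spec": {"username": "system:node:rhel-ocpapp2"},
            "status": {"conditions": [{"type": "Approved"}]},
        },
        {   # pending CSR from a different node
            "metadata": {"name": "csr-ccccc"},
            "spec": {"username": "system:node:other-node"},
            "status": {},
        },
    ]
})

print(pending_node_csrs(SAMPLE, ["rhel-ocpapp2"]))  # ['csr-aaaaa']
```

Each returned name could then be reviewed and, if legitimate, approved manually with `oc adm certificate approve <name>`.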