Bug 1871814
| Summary: | OCP installation times out but after few minutes the cluster is up | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | rlobillo |
| Component: | Installer | Assignee: | Michał Dulko <mdulko> |
| Installer sub component: | OpenShift on OpenStack | QA Contact: | GenadiC <gcheresh> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | bbennett, itbrown, ltomasbo |
| Version: | 4.5 | Keywords: | AutomationBlocker |
| Target Milestone: | --- | ||
| Target Release: | 4.7.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-09-23 09:51:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Set the target to 4.7 because I don't think this will block 4.6. However, please feel free to work on it, and if you have a PR that is ready to merge, please update the target to 4.6.

This also happened using OpenShiftSDN with OCP 4.5.3 and 4.5.8.

It is also worth mentioning that we sometimes see that kube-controller-manager cannot reach api-int:

E0907 04:49:57.854644 1 leaderelection.go:321] error retrieving resource lock kube-system/kube-controller-manager: Get "https://api-int.ostest.shiftstack.com:6443/api/v1/namespaces/kube-system/configmaps/kube-controller-manager?timeout=10s": dial tcp 10.196.0.5:6443: connect: connection refused
Description of problem:
Approximately one in two times, the OCP installer times out, but the installation completes successfully a few minutes later. The kuryr-controller logs contain a large number of the following errors:

ERROR kuryr_kubernetes.handlers.logging requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

So the k8s internal API is aborting connections for some time, which delays the installation.

Version-Release number of selected component (if applicable):
openshift_puddle: 4.5.0-0.nightly-2020-08-21-084032

How reproducible:

Steps to Reproduce:
1. Install OSP16.1 + OVN + Ceph + TLS-everywhere
2. Install OCP4.5.

Actual results:
Unstable results on installation.

Expected results:
Stable successful installation.

Additional info:
Logs for two different executions:
- https://rhos-ci-staging-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/osasinfra/view/shiftstack_on_vms/job/DFG-osasinfra-shiftstack_on_vms-ocp_verification-osp16.1/53/artifact/.sh/ir-openshift-install.log
- https://rhos-ci-staging-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/osasinfra/view/shiftstack_on_vms/job/DFG-osasinfra-shiftstack_on_vms-ocp_verification-osp16.1/54/artifact/.sh/ir-openshift-install.log
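
To compare how often the two connection errors quoted in this report occur between runs, a saved log can be scanned for their signatures. The following is only a rough sketch: the function name `count_connection_errors` and the signature labels are made up for illustration, and the patterns are taken from the two log excerpts in this report, so they may need adjusting for other log formats.

```python
import re
from collections import Counter

# Signatures copied from the log excerpts quoted in this report.
# The dictionary keys are illustrative labels, not official error names.
SIGNATURES = {
    "connection_reset": re.compile(r"ConnectionResetError\(104"),
    "connection_refused": re.compile(r"connect: connection refused"),
}


def count_connection_errors(lines):
    """Count how many log lines match each known error signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

It could be used by passing an open file object for a saved kuryr-controller or kube-controller-manager log, for example `count_connection_errors(open(path, errors="replace"))`, where `path` is wherever the log was saved locally.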