Bug 1687292
| Summary: | [OSP] Sometimes masters fail to get ignition from load balancer vm and got error "dial tcp <LB ip>:22623: i/o timeout" | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | weiwei jiang <wjiang> |
| Component: | Installer | Assignee: | Eric Duen <eduen> |
| Installer sub component: | openshift-installer | QA Contact: | Tomas Sedovic <tsedovic> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | ||
| Priority: | high | CC: | aos-bugs, bleanhar, jokerman, mifiedle, mmccomas, xtian |
| Version: | 4.1.0 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.2.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | CLOSED / CURRENTRELEASE | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-10-16 06:27:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
weiwei jiang
2019-03-11 08:40:15 UTC
Are you still seeing this? I haven't seen issues related to masters not getting the ignition config in a while.

(In reply to Flavio Percoco from comment #1)
> Are you still seeing this? I haven't seen issues related to masters not
> getting the ignition config in a while

Checked with
[openshift@dhcp-140-70 installer]$ bin/openshift-install version
bin/openshift-install unreleased-master-560-g974d9b0848866f03d4dd8c577d8b7ef28756a1d5-dirty
built from commit 974d9b0848866f03d4dd8c577d8b7ef28756a1d5

But unfortunately got this: https://bugzilla.redhat.com/show_bug.cgi?id=1687241#c2

Checked again and met https://bugzilla.redhat.com/show_bug.cgi?id=1687241#c3
[openshift@dhcp-140-70 installer]$ bin/openshift-install version
bin/openshift-install unreleased-master-560-g974d9b0848866f03d4dd8c577d8b7ef28756a1d5
built from commit 974d9b0848866f03d4dd8c577d8b7ef28756a1d5

(openstack) server list --name wjiang
+--------------------------------------+----------------------------+--------+-------------------------------------------------------+-------+----------------+
| ID                                   | Name                       | Status | Networks                                              | Image | Flavor         |
+--------------------------------------+----------------------------+--------+-------------------------------------------------------+-------+----------------+
| 78cdfb63-cd5e-4fc2-8f0c-e14be3e6d91f | wjiang-ocp-fvkd5-master-1  | ACTIVE | wjiang-ocp-fvkd5-openshift=192.168.0.11               | rhcos | ci.m1.medlarge |
| 984783ba-5303-4d35-a26a-fe7e9b784e3d | wjiang-ocp-fvkd5-master-2  | ACTIVE | wjiang-ocp-fvkd5-openshift=192.168.0.5                | rhcos | ci.m1.medlarge |
| 0f4812e9-9f18-4336-b1c8-5a356a90a8e1 | wjiang-ocp-fvkd5-master-0  | ACTIVE | wjiang-ocp-fvkd5-openshift=192.168.0.9                | rhcos | ci.m1.medlarge |
| e4347058-6889-4dc7-a5ad-d98115e468f4 | wjiang-ocp-fvkd5-api       | ACTIVE | wjiang-ocp-fvkd5-openshift=192.168.128.13, 10.0.77.71 | rhcos | ci.m1.medlarge |
| bfb430a2-a6b7-4899-9117-6bb3bca7a181 | wjiang-ocp-fvkd5-bootstrap | ACTIVE | wjiang-ocp-fvkd5-openshift=192.168.0.10               | rhcos | ci.m1.medlarge |
+--------------------------------------+----------------------------+--------+-------------------------------------------------------+-------+----------------+

After I disabled the creation of trunks for the masters on upshift OpenStack, everything works well.

DEBUG OpenShift Installer unreleased-master-601-g1c1b2bb6f64b25c3eccacd07f031a3ec5b2ab29d-dirty
DEBUG Built from commit 1c1b2bb6f64b25c3eccacd07f031a3ec5b2ab29d
INFO Waiting up to 30m0s for the Kubernetes API at https://api.wjiang-ocp.shiftstack.com:6443...
DEBUG Still waiting for the Kubernetes API: Get https://api.wjiang-ocp.shiftstack.com:6443/version?timeout=32s: dial tcp 10.0.76.214:6443: connect: connection refused
DEBUG Still waiting for the Kubernetes API: Get https://api.wjiang-ocp.shiftstack.com:6443/version?timeout=32s: EOF
DEBUG Still waiting for the Kubernetes API: Get https://api.wjiang-ocp.shiftstack.com:6443/version?timeout=32s: EOF
DEBUG Still waiting for the Kubernetes API: Get https://api.wjiang-ocp.shiftstack.com:6443/version?timeout=32s: EOF
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: Get https://api.wjiang-ocp.shiftstack.com:6443/version?timeout=32s: EOF
INFO API v1.12.4+8156b0c up
INFO Waiting up to 30m0s for the bootstrap-complete event...
DEBUG added kube-controller-manager.158f2bc94596490e: wjiang-ocp-ctshd-bootstrap.wjiang-ocp.shiftstack.com_f9e39ec3-4ee5-11e9-ad40-fa163ef33bb6 became leader
DEBUG added kube-scheduler.158f2bc97583d42a: wjiang-ocp-ctshd-bootstrap.wjiang-ocp.shiftstack.com_f5af79b9-4ee5-11e9-bb52-fa163ef33bb6 became leader
DEBUG modified kube-controller-manager.158f2bc94596490e: wjiang-ocp-ctshd-bootstrap.wjiang-ocp.shiftstack.com_f9e39ec3-4ee5-11e9-ad40-fa163ef33bb6 became leader
DEBUG modified kube-scheduler.158f2bc97583d42a: wjiang-ocp-ctshd-bootstrap.wjiang-ocp.shiftstack.com_f5af79b9-4ee5-11e9-bb52-fa163ef33bb6 became leader
DEBUG added kube-controller-manager.158f2c018fddb679: wjiang-ocp-ctshd-bootstrap.wjiang-ocp.shiftstack.com_55e4225b-4ee6-11e9-8463-fa163ef33bb6 became leader
DEBUG added kube-scheduler.158f2c01dbc01f68: wjiang-ocp-ctshd-bootstrap.wjiang-ocp.shiftstack.com_54a353be-4ee6-11e9-bcbf-fa163ef33bb6 became leader
DEBUG added openshift-master-controllers.158f2c027a6b2309: controller-manager-rbxq9 became leader
DEBUG added bootstrap-success: Required control plane pods have been created
DEBUG added openshift-master-controllers.158f2c11437d5e36: controller-manager-5lcpm became leader
DEBUG added bootstrap-complete: cluster bootstrapping has completed
INFO Destroying the bootstrap resources...

One workaround is to use service_port_ip even when lb_ip is defined, so that communication within the cluster goes through the same network.
diff --git a/data/data/openstack/service/main.tf b/data/data/openstack/service/main.tf
index 534762e18..41a494ee1 100644
--- a/data/data/openstack/service/main.tf
+++ b/data/data/openstack/service/main.tf
@@ -200,7 +200,7 @@ $ORIGIN ${var.cluster_domain}.
3600 ; minimum (1 hour)
)
-${length(var.lb_floating_ip) == 0 ? "api IN A ${var.service_port_ip}" : "api IN A ${var.lb_floating_ip}"}
+api IN A ${var.service_port_ip}
${length(var.lb_floating_ip) == 0 ? "*.apps IN A ${var.service_port_ip}" : "*.apps IN A ${var.lb_floating_ip}"}
bootstrap.${var.cluster_domain} IN A ${var.bootstrap_ip}
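
The record selection changed by the diff above can be sketched in a few lines of Python. This is only an illustration of the ternary logic in the zone template, not installer code: the function name `zone_records`, the `force_internal_api` flag, and the example IP addresses are hypothetical and were chosen for this sketch.

```python
from typing import List


def zone_records(service_port_ip: str,
                 lb_floating_ip: str = "",
                 force_internal_api: bool = False) -> List[str]:
    """Return the `api` and `*.apps` A records as the zone template would.

    Hypothetical helper: mirrors the Terraform conditional
    `length(var.lb_floating_ip) == 0 ? service_port_ip : lb_floating_ip`.
    """
    # Original behaviour: prefer the floating IP when one is defined.
    external = lb_floating_ip if len(lb_floating_ip) > 0 else service_port_ip

    # Workaround from the diff: always point `api` at the service port IP,
    # so in-cluster traffic to the API stays on the internal network.
    api_target = service_port_ip if force_internal_api else external

    # The `*.apps` record is left unchanged by the diff.
    return [f"api IN A {api_target}", f"*.apps IN A {external}"]


if __name__ == "__main__":
    # Illustrative addresses only.
    for line in zone_records("192.168.128.13", "10.0.77.71",
                             force_internal_api=True):
        print(line)
```

With the flag set, only the `api` record moves to the internal address; `*.apps` still resolves to the floating IP when one is defined, so traffic to the routes keeps going through the external load balancer address.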
This also blocks all the routes. All the routes target the external IP of the load balancer, which breaks the web console, since it requires the authentication routes.

[openshift@dhcp-140-70 installer]$ oc -n openshift-console logs console-d9d875c95-tww2b
2019/04/2 10:59:58 cmd/main: cookies are secure!
2019/04/2 11:00:03 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://openshift-authentication-openshift-authentication.apps.wjiang-ocp.shiftstack.com/oauth/token failed: Head https://openshift-authentication-openshift-authentication.apps.wjiang-ocp.shiftstack.com: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2019/04/2 11:00:18 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://openshift-authentication-openshift-authentication.apps.wjiang-ocp.shiftstack.com/oauth/token failed: Head https://openshift-authentication-openshift-authentication.apps.wjiang-ocp.shiftstack.com: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2019/04/2 11:00:33 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://openshift-authentication-openshift-authentication.apps.wjiang-ocp.shiftstack.com/oauth/token failed: Head https://openshift-authentication-openshift-authentication.apps.wjiang-ocp.shiftstack.com: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2019/04/2 11:00:48 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://openshift-authentication-openshift-authentication.apps.wjiang-ocp.shiftstack.com/oauth/token failed: Head https://openshift-authentication-openshift-authentication.apps.wjiang-ocp.shiftstack.com: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

[openshift@dhcp-140-70 installer]$ oc get pods -n openshift-console -o wide
NAME                         READY   STATUS    RESTARTS   AGE    IP            NODE                                                  NOMINATED NODE
console-d9d875c95-chq7f      0/1     Running   22         105m   10.129.0.28   wjiang-ocp-5hrhk-master-1.wjiang-ocp.shiftstack.com   <none>
console-d9d875c95-tww2b      0/1     Running   22         105m   10.128.0.22   wjiang-ocp-5hrhk-master-0.wjiang-ocp.shiftstack.com   <none>
downloads-77f7688f6c-pjrkp   1/1     Running   0          105m   10.128.0.21   wjiang-ocp-5hrhk-master-0.wjiang-ocp.shiftstack.com   <none>
downloads-77f7688f6c-txp92   1/1     Running   0          105m   10.130.0.20   wjiang-ocp-5hrhk-master-2.wjiang-ocp.shiftstack.com   <none>

Checked with 4.2.0-0.nightly-2019-07-31-162901, and this is no longer an issue.

Returning to QE to close out since the BZ has been validated to work on a nightly.

Verified on 4.2.0-0.nightly-2019-08-05-223032; it looks like this is just an upshift OSP issue.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922