Description of problem:
Failed test: https://prow.k8s.io/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-gcp-serial-4.2/24

Failed error:
fail [k8s.io/kubernetes/test/e2e/e2e.go:104]: Unexpected error:
    <*url.Error | 0xc003334300>: {
        Op: "Get",
        URL: "https://api.ci-op-x3fpxir9-03113.origin-ci-int-gce.dev.openshift.com:6443/api/v1/nodes?fieldSelector=spec.unschedulable%3Dfalse&resourceVersion=0",
        Err: {
            Op: "dial",
            Net: "tcp",
            Source: nil,
            Addr: nil,
            Err: {
                Err: "no such host",
                Name: "api.ci-op-x3fpxir9-03113.origin-ci-int-gce.dev.openshift.com",
                Server: "10.142.15.249:53",
                IsTimeout: false,
                IsTemporary: false,
            },
        },
    }
    Get https://api.ci-op-x3fpxir9-03113.origin-ci-int-gce.dev.openshift.com:6443/api/v1/nodes?fieldSelector=spec.unschedulable%3Dfalse&resourceVersion=0: dial tcp: lookup api.ci-op-x3fpxir9-03113.origin-ci-int-gce.dev.openshift.com on 10.142.15.249:53: no such host
occurred

Aug 20 10:32:03.010 E kube-apiserver Kube API is not responding to GET requests
Aug 20 10:32:03.010 E openshift-apiserver OpenShift API is not responding to GET requests

Version-Release number of selected component (if applicable):
redhat-openshift-release-informing#redhat-canary-openshift-ocp-installer-e2e-gcp-serial-4.2

How reproducible:
always
Looks like something related to the DNS record for the API server, which is part of the installer. The DNS component is for cluster DNS bugs (e.g. CoreDNS). Routing would be appropriate for DNS issues related to routes. Hope that helps clarify. I reassigned this to the Installer component. Let me know if that was a mistake!
```
E0820 10:28:31.227539 244 reflector.go:126] github.com/openshift/origin/pkg/monitor/operator.go:126: Failed to list *v1.ClusterOperator: Get https://api.ci-op-x3fpxir9-03113.origin-ci-int-gce.dev.openshift.com:6443/apis/config.openshift.io/v1/clusteroperators?limit=500&resourceVersion=0: dial tcp: lookup api.ci-op-x3fpxir9-03113.origin-ci-int-gce.dev.openshift.com on 10.142.15.249:53: no such host
```

The DNS server being queried is `10.142.15.249:53`.

> https://github.com/openshift/installer/blob/63bb767efaafde1b0daf9638b7f0889af97cff8f/pkg/types/defaults/installconfig.go#L17-L19

The cluster network (pod CIDR) is 10.128.0.0/14 (first IP 10.128.0.0, last IP 10.131.255.255).
The machine network (machine CIDR) is 10.0.0.0/16 (first IP 10.0.0.0, last IP 10.0.255.255).

So this IP doesn't belong to the virtual network or the pod network of the cluster. That means the request was made from the `test` pod of the CI run. So either DNS failed in the CI cluster or GCP had a hiccup; this doesn't seem like an installer problem.

On another run, DNS resolves but the connection to the API fails:
https://prow.k8s.io/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-gcp-serial-4.2/26#0:build-log.txt%3A71035
and then, a few seconds later, DNS stops resolving at all:
https://prow.k8s.io/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-gcp-serial-4.2/26#0:build-log.txt%3A71042
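The CIDR reasoning above can be double-checked with a short script. This is only a sketch using Python's standard `ipaddress` module. The two ranges are the defaults quoted in this comment (10.128.0.0/14 and 10.0.0.0/16), and the helper name `owning_network` is made up for illustration; it is not part of the installer or of the CI tooling.

```python
import ipaddress

# Default ranges quoted in the comment above (assumed unchanged for this run).
CLUSTER_NETWORK = ipaddress.ip_network("10.128.0.0/14")  # pod CIDR
MACHINE_NETWORK = ipaddress.ip_network("10.0.0.0/16")    # machine CIDR


def owning_network(ip):
    """Return which cluster range contains ip, or None if neither does.

    Hypothetical helper, for illustration only.
    """
    addr = ipaddress.ip_address(ip)
    for name, net in (("cluster", CLUSTER_NETWORK), ("machine", MACHINE_NETWORK)):
        if addr in net:
            return name
    return None


if __name__ == "__main__":
    # DNS server address taken from the "no such host" error in the log.
    resolver = "10.142.15.249"
    print("cluster range:", CLUSTER_NETWORK[0], "-", CLUSTER_NETWORK[-1])
    print("machine range:", MACHINE_NETWORK[0], "-", MACHINE_NETWORK[-1])
    print(resolver, "belongs to:", owning_network(resolver))
```

If the helper returns `None` for the resolver address, the lookup did not go to a DNS server inside either of the cluster's default ranges, which is consistent with the request originating from the CI `test` pod rather than from within the cluster under test.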
e2e-gcp-serial is running tests that are failing. Since the serial suite runs its tests one at a time, this causes the test to time out, and therefore the `no such host` errors happen towards the end of the run, as the CI cluster is being torn down. A class of these failures is tracked here: https://bugzilla.redhat.com/show_bug.cgi?id=1745720
*** Bug 1748760 has been marked as a duplicate of this bug. ***
All the jobs are failing on the 4.3 branch, so I have to wait.
It is fixed. I verified it with https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-serial-4.3/137
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062