Bug 1793592

Summary: Control plane unreachable during e2e run on GCP
Product: OpenShift Container Platform
Reporter: Clayton Coleman <ccoleman>
Component: Machine Config Operator
Assignee: Antonio Murdaca <amurdaca>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Michael Nguyen <mnguyen>
Severity: high
Priority: unspecified
Version: 4.4
CC: aos-bugs, mfojtik, mifiedle, skumari, smilner
Target Milestone: ---
Target Release: 4.5.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-04-24 22:17:58 UTC
Type: Bug

Description Clayton Coleman 2020-01-21 15:58:37 UTC
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-serial-4.4/920

Looks like a node died, but apiservers both dropped out:

Jan 21 08:59:24.478 - 132s  I test="[sig-apps] Daemon set [Serial] should rollback without unnecessary restarts [Conformance] [Suite:openshift/conformance/serial/minimal] [Suite:k8s]" running
Jan 21 08:59:33.347 W clusteroperator/marketplace changed Available to False: OperatorExited: The operator has exited
Jan 21 08:59:33.657 I ns/openshift-kube-apiserver pod/kube-apiserver-ci-op-vnt4n-m-2.c.openshift-gce-devel-ci.internal Received signal to terminate, becoming unready, but keeping serving
Jan 21 08:59:33.899 I ns/openshift-kube-apiserver pod/kube-apiserver-ci-op-vnt4n-m-2.c.openshift-gce-devel-ci.internal All pre-shutdown hooks have been finished
Jan 21 08:59:34.506 I ns/openshift-kube-apiserver pod/kube-apiserver-ci-op-vnt4n-m-0.c.openshift-gce-devel-ci.internal Received signal to terminate, becoming unready, but keeping serving
Jan 21 08:59:36.711 E kube-apiserver failed contacting the API: Get https://api.ci-op-j5grvql8-9bdfc.origin-ci-int-gce.dev.openshift.com:6443/apis/config.openshift.io/v1/clusterversions?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dversion&resourceVersion=19327&timeout=8m21s&timeoutSeconds=501&watch=true: unexpected EOF
Jan 21 08:59:36.711 E kube-apiserver Kube API started failing: Get https://api.ci-op-j5grvql8-9bdfc.origin-ci-int-gce.dev.openshift.com:6443/api/v1/namespaces/kube-system?timeout=3s: unexpected EOF
Jan 21 08:59:36.711 I openshift-apiserver OpenShift API started failing: Get https://api.ci-op-j5grvql8-9bdfc.origin-ci-int-gce.dev.openshift.com:6443/apis/image.openshift.io/v1/namespaces/openshift-apiserver/imagestreams/missing?timeout=3s: unexpected EOF
Jan 21 08:59:36.714 E kube-apiserver failed contacting the API: Get https://api.ci-op-j5grvql8-9bdfc.origin-ci-int-gce.dev.openshift.com:6443/api/v1/pods?allowWatchBookmarks=true&resourceVersion=27400&timeout=9m17s&timeoutSeconds=557&watch=true: dial tcp 35.237.14.166:6443: connect: connection refused
Jan 21 08:59:36.714 E kube-apiserver failed contacting the API: Get https://api.ci-op-j5grvql8-9bdfc.origin-ci-int-gce.dev.openshift.com:6443/apis/config.openshift.io/v1/clusteroperators?allowWatchBookmarks=true&resourceVersion=28191&timeout=7m12s&timeoutSeconds=432&watch=true: dial tcp 35.237.14.166:6443: connect: connection refused
Jan 21 08:59:38.868 - 479s  E kube-apiserver Kube API is not responding to GET requests
Jan 21 08:59:38.868 - 479s  E openshift-apiserver OpenShift API is not responding to GET requests

Needs to be routed; urgent because the control plane should never become completely unreachable.

Comment 3 Michal Fojtik 2020-02-13 14:28:01 UTC
Termination events:

Jan 21 08:59:33.657 I ns/openshift-kube-apiserver pod/kube-apiserver-ci-op-vnt4n-m-2.c.openshift-gce-devel-ci.internal Received signal to terminate, becoming unready, but keeping serving
Jan 21 08:59:33.899 I ns/openshift-kube-apiserver pod/kube-apiserver-ci-op-vnt4n-m-2.c.openshift-gce-devel-ci.internal All pre-shutdown hooks have been finished
Jan 21 08:59:34.506 I ns/openshift-kube-apiserver pod/kube-apiserver-ci-op-vnt4n-m-0.c.openshift-gce-devel-ci.internal Received signal to terminate, becoming unready, but keeping serving
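
The termination events above can be checked for overlap with a short script. This is only a sketch: the regular expression assumes the line shape seen in these excerpts (month, day, time, level, ns/..., pod/..., message), the year is not present in the log and is passed in as an assumption, and the function names are hypothetical helpers rather than part of any OpenShift tooling.

```python
import re
from datetime import datetime

# Assumed line shape, inferred from the excerpts in this bug:
#   "<Mon> <DD> <HH:MM:SS.mmm> <I|W|E> ns/<namespace> pod/<pod> <message>"
EVENT_RE = re.compile(
    r"^(?P<ts>\w{3} \d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}) (?P<level>[IWE]) "
    r"ns/(?P<ns>\S+) pod/(?P<pod>\S+) (?P<msg>.*)$"
)


def termination_times(lines, year=2020):
    """Map each pod to the time of its first 'Received signal to terminate' event."""
    first_seen = {}
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if not match or "Received signal to terminate" not in match["msg"]:
            continue
        # The log omits the year, so it is supplied by the caller (assumption).
        ts = datetime.strptime(f"{year} {match['ts']}", "%Y %b %d %H:%M:%S.%f")
        first_seen.setdefault(match["pod"], ts)
    return first_seen


def termination_spread_seconds(lines, year=2020):
    """Seconds between the earliest and latest pod termination signal, or None."""
    times = sorted(termination_times(lines, year).values())
    if len(times) < 2:
        return None
    return (times[-1] - times[0]).total_seconds()
```

Fed the three termination events listed in this comment, the sketch reports two distinct kube-apiserver pods (m-2 and m-0) receiving the termination signal about 0.849 seconds apart, which is consistent with both API servers dropping out together.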

Comment 11 Red Hat Bugzilla 2023-09-14 05:50:26 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days