Bug 1886055
Summary: | After OCP upgrade, two of three openshift-apiserver pods in a CrashLoopBackOff mode | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | baiesi |
Component: | openshift-apiserver | Assignee: | Stefan Schimanski <sttts> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Xingxing Xia <xxia> |
Severity: | high | Docs Contact: | |
Priority: | low | | |
Version: | 4.3.z | CC: | annair, aos-bugs, baiesi, eparis, jokerman, mfojtik, milei, nstielau, wking, wlewis |
Target Milestone: | --- | Flags: | mfojtik: needinfo? |
Target Release: | 4.7.0 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | LifecycleReset | | |
Fixed In Version: | | Doc Type: | No Doc Update |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2021-01-21 09:07:20 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1869362 | | |
Description
baiesi
2020-10-07 14:49:14 UTC
apiserver-99xl5-openshift-apiserver.log:
Copying system trust bundle
I1007 11:50:30.551927 1 audit.go:368] Using audit backend: ignoreErrors<log>
I1007 11:50:30.557010 1 plugins.go:84] Registered admission plugin "NamespaceLifecycle"
I1007 11:50:30.557041 1 plugins.go:84] Registered admission plugin "ValidatingAdmissionWebhook"
I1007 11:50:30.557051 1 plugins.go:84] Registered admission plugin "MutatingAdmissionWebhook"
I1007 11:50:30.559354 1 admission.go:48] Admission plugin "project.openshift.io/ProjectRequestLimit" is not configured so it will be disabled.
I1007 11:50:30.560653 1 plugins.go:158] Loaded 5 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,build.openshift.io/BuildConfigSecretInjector,image.openshift.io/ImageLimitRange,image.openshift.io/ImagePolicy,MutatingAdmissionWebhook.
I1007 11:50:30.560687 1 plugins.go:161] Loaded 8 validating admission controller(s) successfully in the following order: OwnerReferencesPermissionEnforcement,build.openshift.io/BuildConfigSecretInjector,build.openshift.io/BuildByStrategy,image.openshift.io/ImageLimitRange,image.openshift.io/ImagePolicy,quota.openshift.io/ClusterResourceQuota,ValidatingAdmissionWebhook,ResourceQuota.
I1007 11:50:30.572544 1 client.go:361] parsed scheme: "endpoint"
I1007 11:50:30.572628 1 endpoint.go:68] ccResolverWrapper: sending new addresses to cc: [{https://etcd.openshift-etcd.svc:2379 0 <nil>}]
I1007 11:50:35.591672 1 client.go:361] parsed scheme: "endpoint"
I1007 11:50:35.591730 1 endpoint.go:68] ccResolverWrapper: sending new addresses to cc: [{https://etcd.openshift-etcd.svc:2379 0 <nil>}]
F1007 11:50:55.591913 1 openshift_apiserver.go:420] context deadline exceeded
W1007 11:50:55.592702 1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {https://etcd.openshift-etcd.svc:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp: operation was canceled". Reconnecting...
I1007 11:50:55.591981 1 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"

Updates team is responsible for the updates framework and cluster-version operator manifest application. If a component pod is crashlooping, that's the responsibility of that component's team. In this case, maybe the pod needs to be more robust in the face of etcd connection, or maybe etcd is down, or maybe there is some networking issue between the pod and etcd. But updates folks are not maintaining any of those components, so it's hard for us to know where the issue is.

Other topics had higher prio. Adding UpcomingSprint.

This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.

This doesn't seem like a 4.7 blocker, but does seem high severity.

The LifecycleStale keyword was removed because the bug got commented on recently. The bug assignee was notified.

4.3 is EOL. Closing. Please reopen if you see this with a more current version.
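
For anyone triaging a similar crashloop, it can help to pull the fatal (F) entry out of the pod log and see how long the process ran before giving up. The Python sketch below is only an illustration, not tooling from this bug: it assumes the standard klog header layout (severity letter, month/day, time, process/thread id, file:line), and the names `parse_entries`, `seconds_until_fatal`, and `SAMPLE` are hypothetical. The sample text is a shortened excerpt of the log in the description.

```python
import re
from datetime import datetime

# klog header, e.g. "F1007 11:50:55.591913 1 openshift_apiserver.go:420] "
# The lookbehind keeps the pattern from matching in the middle of a word.
KLOG_HEADER = re.compile(
    r"(?<!\S)(?P<sev>[IWEF])(?P<mmdd>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<pid>\d+) "
    r"(?P<src>[\w.\-]+:\d+)\] "
)

# Shortened excerpt of the log from the description (entries joined by spaces,
# as they appear when the log is pasted without line breaks).
SAMPLE = (
    'I1007 11:50:30.572544 1 client.go:361] parsed scheme: "endpoint" '
    "I1007 11:50:30.572628 1 endpoint.go:68] ccResolverWrapper: sending new "
    "addresses to cc: [{https://etcd.openshift-etcd.svc:2379 0 <nil>}] "
    "F1007 11:50:55.591913 1 openshift_apiserver.go:420] context deadline exceeded"
)


def parse_entries(text):
    """Split klog output into entries, whether or not line breaks survived."""
    matches = list(KLOG_HEADER.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs from the end of this header to the next header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append({
            "severity": m["sev"],
            # klog headers carry no year; only the time of day is parsed here,
            # so a log that crosses midnight would need extra handling.
            "time": datetime.strptime(m["time"], "%H:%M:%S.%f"),
            "source": m["src"],
            "message": text[m.end():end].strip(),
        })
    return entries


def seconds_until_fatal(entries):
    """Seconds between the first parsed entry and the first fatal entry."""
    fatal = next((e for e in entries if e["severity"] == "F"), None)
    if fatal is None:
        return None
    return (fatal["time"] - entries[0]["time"]).total_seconds()


if __name__ == "__main__":
    parsed = parse_entries(SAMPLE)
    for entry in parsed:
        if entry["severity"] == "F":
            print(entry["source"], entry["message"])
    print(seconds_until_fatal(parsed))
```

On the excerpt, the fatal "context deadline exceeded" entry comes about 25 seconds after the first etcd client line. That gap by itself does not say whether etcd was down, the service name did not resolve, or the network path was blocked; those would still need to be checked separately.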