Bug 1764670 - cannot access cluster after some period of time "error: EOF" [NEEDINFO]
Summary: cannot access cluster after some period of time "error: EOF"
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-apiserver
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.3.0
Assignee: David Eads
QA Contact: Xingxing Xia
Duplicates: 1769247
Depends On:
Reported: 2019-10-23 14:53 UTC by Dan Mace
Modified: 2020-06-12 07:55 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2019-12-03 08:19:07 UTC
Target Upstream Version:
deads: needinfo? (dmace)
sttts: needinfo? (dmace)

Description Dan Mace 2019-10-23 14:53:02 UTC
Manual clone of https://bugzilla.redhat.com/show_bug.cgi?id=1759523 (see https://bugzilla.redhat.com/show_bug.cgi?id=1759523#c25 for rationale).

Description of problem:

After a successful install of OCP 4.2 and some period of waiting, we can no longer log in to the cluster, either from a browser or from the oc command line. We would like to know if there is a way to recover. At the moment I need to destroy my cluster and rebuild it.

Version-Release number of selected component (if applicable):
4.2 RC1

How reproducible:

Steps to Reproduce:
1. Install OpenShift.
2. Wait 24 hours or less.
3. Try to log in.

Actual results:
oc login -u kubeadmin -p <redacted> https://api.simple-sunfish.purple-chesterfield.com:6443
error: EOF

Expected results:
Successful login

Additional info:
This is intermittent, 1 out of 3 clusters has hit this issue.

Comment 2 Dan Mace 2019-11-06 14:42:51 UTC
*** Bug 1769247 has been marked as a duplicate of this bug. ***

Comment 3 David Eads 2019-11-12 21:00:31 UTC
Use the certificate-based admin.kubeconfig to run must-gather so we have a starting point for debugging. Also, did these cluster-admins ever manually modify any configmap in `openshift-config-managed`? For a while there were some bad directions in a KCS article that resulted in corrupted trust bundles.
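
For reference, the two requests above could be carried out roughly as follows. This is a minimal sketch, not a procedure confirmed in this bug: it assumes the certificate-based kubeconfig is the one the installer wrote to `<install-dir>/auth/kubeconfig`, and `INSTALL_DIR`, `ADMIN_KUBECONFIG`, `OC`, and both function names are hypothetical names introduced only for illustration.

```shell
#!/usr/bin/env bash
# Assumption: the installer-generated, certificate-based kubeconfig lives at
# <install-dir>/auth/kubeconfig. INSTALL_DIR is a hypothetical placeholder.
INSTALL_DIR="${INSTALL_DIR:-./install-dir}"
ADMIN_KUBECONFIG="${INSTALL_DIR}/auth/kubeconfig"
OC="${OC:-oc}"   # path to the oc binary

# Run must-gather with the certificate-based credentials, so that the
# failing `oc login` (token-based) path is not involved.
collect_must_gather() {
  "$OC" --kubeconfig "$ADMIN_KUBECONFIG" adm must-gather --dest-dir="${1:-./must-gather}"
}

# List the configmaps in openshift-config-managed so they can be reviewed
# for manual modifications.
list_managed_configmaps() {
  "$OC" --kubeconfig "$ADMIN_KUBECONFIG" get configmaps -n openshift-config-managed
}
```

Calling `collect_must_gather ./must-gather` writes the collected data to the given directory, which can then be archived and attached to the bug; `list_managed_configmaps` only reads from the cluster and changes nothing.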
