Bug 1695048

Summary: Conformance tests failing with "Unauthorized" or "You must be logged in to the server"
Product: OpenShift Container Platform
Reporter: Devan Goodwin <dgoodwin>
Component: apiserver-auth
Assignee: Mo <mkhan>
Status: CLOSED DUPLICATE
QA Contact: Chuan Yu <chuyu>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.1.0
CC: aos-bugs, ccoleman, evb, gmontero, jokerman, mmccomas, nagrawal, slaznick
Target Milestone: ---
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-04-05 13:22:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Devan Goodwin 2019-04-02 11:41:23 UTC
Description of problem:

While investigating failures as build cop today, I saw some conformance tests failing with "Unauthorized" or "You must be logged in to the server" errors.

The failures appear to be related to OpenShift types (image streams, deployment configs, etc.).

e2e failures: https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/596/pull-ci-openshift-machine-config-operator-master-e2e-aws/2875

In the openshift-apiserver logs I see a lot of:

E0402 09:46:06.727061       1 webhook.go:192] Failed to make webhook authorizer request: .authorization.k8s.io "" is invalid: spec.user: Invalid value: "": at least one of user or group must be specified
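
For reference, the validation rule that the webhook error above reports can be illustrated with a small Python sketch. This is not the Kubernetes implementation (which is written in Go); the function name `validate_sar_spec` and the sample specs are hypothetical, and the check only mirrors the rule stated in the log message.

```python
def validate_sar_spec(spec):
    """Return validation errors for a SubjectAccessReview-like spec dict.

    Hypothetical helper: it only mirrors the rule quoted in the log line,
    i.e. that a spec must name a user or at least one group.
    """
    errors = []
    user = spec.get("user") or ""
    groups = spec.get("groups") or []
    if not user and not groups:
        errors.append(
            'spec.user: Invalid value: "": '
            "at least one of user or group must be specified"
        )
    return errors


# Hypothetical specs: the first has neither user nor groups, so it is rejected.
empty_subject = {"resourceAttributes": {"verb": "get", "resource": "imagestreams"}}
with_user = dict(empty_subject, user="system:serviceaccount:example-ns:example-sa")

print(validate_sar_spec(empty_subject))
print(validate_sar_spec(with_user))
```

A request shaped like `empty_subject` is the kind that would produce the message seen in the openshift-apiserver logs.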

Not sure if this is master or auth, please redirect if I'm wrong.

Version-Release number of selected component (if applicable):

See build link above.


How reproducible:

Looks like maybe 14 hits in the last week: https://search.svc.ci.openshift.org/?search=error%3A+You+must+be+logged+in+to+the+server&maxAge=168h&context=2&type=all

Comment 1 Devan Goodwin 2019-04-02 11:50:39 UTC
Moving to auth.

Michal Fojtik   [3 minutes ago]
I think the flakes are caused by something creating a SAR request without a user/group

Michal Fojtik   [2 minutes ago]
@auth-team should check the audit logs, grep create calls for subject access reviews and find the one returning 403 (likely)

Michal Fojtik   [2 minutes ago]
then identify the service account and track down what is making that invalid request
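
The audit-log search suggested above could be sketched roughly as follows. This assumes the audit log is available as JSON lines in the Kubernetes audit `Event` format (fields `verb`, `objectRef.resource`, `responseStatus.code`, `user.username`, `requestReceivedTimestamp`). The function name and the sample events are hypothetical, and the status code is a parameter because 403 is only a guess in the discussion.

```python
import json
from collections import Counter


def failed_sar_creates(lines, status=403):
    """Yield (username, timestamp) for SubjectAccessReview create events
    whose response code equals `status`. Unparseable lines are skipped."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        if event.get("verb") != "create":
            continue
        if (event.get("objectRef") or {}).get("resource") != "subjectaccessreviews":
            continue
        if (event.get("responseStatus") or {}).get("code") != status:
            continue
        username = (event.get("user") or {}).get("username", "")
        yield username, event.get("requestReceivedTimestamp", "")


# Hypothetical audit events, for illustration only (not taken from the CI run).
sample = [
    json.dumps({
        "verb": "create",
        "objectRef": {"resource": "subjectaccessreviews"},
        "responseStatus": {"code": 403},
        "user": {"username": "system:serviceaccount:example-ns:example-sa"},
        "requestReceivedTimestamp": "2019-04-02T09:46:06.000000Z",
    }),
    json.dumps({
        "verb": "create",
        "objectRef": {"resource": "subjectaccessreviews"},
        "responseStatus": {"code": 201},
        "user": {"username": "system:serviceaccount:example-ns:other-sa"},
    }),
    "not json",
]

# Tally offending callers so the most frequent one stands out.
by_user = Counter(user for user, _ in failed_sar_creates(sample))
print(by_user.most_common())
```

Against a real log, `lines` would be an open file handle on the audit log, and the resulting user names point at the service account to track down.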

Comment 2 Clayton Coleman 2019-04-03 14:09:04 UTC
Moving to urgent: this is a top flake and causes failures/flakes in almost every single run.

Comment 3 Gabe Montero 2019-04-03 14:27:38 UTC
This looks at least related to https://bugzilla.redhat.com/show_bug.cgi?id=1694878, if not a duplicate of it.

I am also seeing the same TLS errors in the authentication server

For example:  I0402 09:19:48.937291       1 log.go:172] http: TLS handshake error from 10.131.0.6:44300: EOF

from the run Devan mentioned in his description.
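
To compare how often these handshake errors occur across runs, the log lines could be tallied by source address. The sketch below assumes the logs use the standard klog header layout (severity letter, `mmdd`, time, thread id, `file:line]`, message), as the quoted lines do; the function names are hypothetical.

```python
import re
from collections import Counter

# klog header: severity, mmdd, hh:mm:ss.uuuuuu, thread id, file:line], message
KLOG_RE = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) ([^:\s]+):(\d+)\] (.*)$"
)
TLS_RE = re.compile(r"TLS handshake error from (\S+):(\d+): (.*)$")


def parse_klog(line):
    """Return (severity, source_file, message), or None if the line does not match."""
    match = KLOG_RE.match(line.strip())
    if match is None:
        return None
    severity, _date, _time, _thread, source_file, _lineno, message = match.groups()
    return severity, source_file, message


def tally_tls_errors(lines):
    """Count TLS handshake errors per remote IP address."""
    counts = Counter()
    for line in lines:
        parsed = parse_klog(line)
        if parsed is None:
            continue
        tls = TLS_RE.search(parsed[2])
        if tls:
            counts[tls.group(1)] += 1
    return counts


# The log line quoted in this comment.
lines = [
    "I0402 09:19:48.937291       1 log.go:172] http: TLS handshake error from 10.131.0.6:44300: EOF",
]
print(tally_tls_errors(lines))
```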

Adding Standa to the CC list. Standa, do you agree?

Comment 4 Standa Laznicka 2019-04-03 14:38:00 UTC
Indeed, the TLS handshake errors seem to be the same in both cases.

Note that the logged error 'Failed to make webhook authorizer request: .authorization.k8s.io "" is invalid: spec.user: Invalid value: "": at least one of user or group must be specified' does not seem to have any actual influence on the test result, as explained in https://bugzilla.redhat.com/show_bug.cgi?id=1694878#c1.

I still need to figure out what causes the TLS errors.

Comment 5 Erica von Buelow 2019-04-05 12:39:55 UTC
Assigning to Mo, who has been helping to debug the issue.

Comment 6 Erica von Buelow 2019-04-05 13:22:21 UTC

*** This bug has been marked as a duplicate of bug 1694878 ***