Bug 1709575

Summary: Unable to oauth authenticate with github/keycloak to openshift jenkins instance
Product: OpenShift Container Platform
Reporter: Justin Pierce <jupierce>
Component: ImageStreams
Assignee: Gabe Montero <gmontero>
Status: CLOSED ERRATA
QA Contact: XiuJuan Wang <xiuwang>
Severity: unspecified
Priority: unspecified
Version: 4.1.0
CC: adam.kaplan, aos-bugs, gmontero, jokerman, mhepburn, mkhan, mmccomas, pweil, wzheng
Target Release: 4.2.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Doc Text:
Cause: The changes to OpenShift OAuth support in 4.x allow the certificate configuration of the Jenkins service account to differ from the certificate the router uses for the OAuth server, and the OpenShift Jenkins login plugin needed to be updated to account for that.
Consequence: You could not log in to the Jenkins console in such scenarios.
Fix: The OpenShift Jenkins login plugin was updated to attempt TLS connections with the default certs available to the JVM in addition to the certs mounted into its pod.
Result: You can log in to the Jenkins console in such scenarios.
Clones: 1712240
Last Closed: 2019-10-16 06:28:42 UTC
Type: Bug
Bug Blocks: 1712240
Attachments:
- Querying auth endpoints within Jenkins pod
- full exception
- jenkins pod logs
- logs from jenkins image: docker.io/gmontero/jenkins-login-plugin-justin-test:latest

Description Justin Pierce 2019-05-13 20:52:27 UTC
Created attachment 1568092 [details]
Querying auth endpoints within Jenkins pod

Description of problem:
I receive the following error when I try to authenticate through GitHub or Keycloak on the Jenkins 4.1 image.

com.google.api.client.auth.oauth2.TokenResponseException: 403 Forbidden
{
  "kind" : "Status",
  "apiVersion" : "v1",
  "metadata" : { },
  "status" : "Failure",
  "message" : "forbidden: User \"system:anonymous\" cannot post path \"/oauth/token\"",
  "reason" : "Forbidden",
  "details" : { },
  "code" : 403
}
	at com.google.api.client.auth.oauth2.TokenResponseException.from(TokenResponseException.java:105)
	at com.google.api.client.auth.oauth2.TokenRequest.executeUnparsed(TokenRequest.java:287)
	at com.google.api.client.auth.openidconnect.IdTokenResponse.execute(IdTokenResponse.java:120)
	at org.openshift.jenkins.plugins.openshiftlogin.OpenShiftOAuth2SecurityRealm$9.onSuccess(OpenShiftOAuth2SecurityRealm.java:890)
	at org.openshift.jenkins.plugins.openshiftlogin.OAuthSession.doFinishLogin(OAuthSession.java:129)
	at org.openshift.jenkins.plugins.openshiftlogin.OpenShiftOAuth2SecurityRealm.doFinishLogin(OpenShiftOAuth2SecurityRealm.java:1142)
	at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627)
	at org.kohsuke.stapler.Function$MethodFunction.invoke(Function.java:396)
	at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:408)
	at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:212)
	at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:145)
	at org.kohsuke.stapler.MetaClass$11.doDispatch(MetaClass.java:537)
...  (see attachments for full log)

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Configure github or keycloak authentication
2. Provision a jenkins master with image: quay.io/openshift/origin-jenkins@sha256:285df29bc3565de215f4bca0b1724b91d43def15b0438da3be7599798b83752e   (4.1 build, built today)
3. Login through keycloak or github and authorize the application

Comment 1 Justin Pierce 2019-05-13 20:52:55 UTC
Created attachment 1568094 [details]
full exception

Comment 2 Justin Pierce 2019-05-13 20:53:32 UTC
Created attachment 1568095 [details]
jenkins pod logs

Comment 3 Gabe Montero 2019-05-13 21:09:19 UTC
Justin, please pull docker.io/gmontero/jenkins-login-plugin-justin-test:latest and tag it into the jenkins:2 and jenkins:latest imagestreamtags in the openshift namespace in your cluster.

Let's see if the switch from GET to HEAD per Mo's suggestion has any bearing.

If not, grab the jenkins pod logs and we'll see what the logs with "GGM" in the string say.

Comment 4 Justin Pierce 2019-05-14 01:12:13 UTC
Created attachment 1568215 [details]
logs from jenkins image: docker.io/gmontero/jenkins-login-plugin-justin-test:latest

Comment 5 Wenjing Zheng 2019-05-14 06:35:45 UTC
Actually, I think you are using an image that doesn't include the fix for bug https://bugzilla.redhat.com/show_bug.cgi?id=1671633 .
QE has tested github/keycloak, and both of them work well.

Comment 9 Adam Kaplan 2019-05-15 18:49:28 UTC
Updating components:

1. Affects v4.1.0 (appears in release candidate)
2. Target is v4.2.0.

@gmontero - if we decide to backport we will clone this BZ.

Comment 10 Gabe Montero 2019-05-15 19:55:30 UTC
Debugging today has revealed that the discrepancy between Justin's results and what was previously tried/tested comes down to differing certs,

i.e. the SA cert mounted by default in the jenkins pod vs. the (router) cert the oauth server uses, along with cert-specific configuration variances.

We are debating various approaches, configuration restrictions, retries, and workarounds in slack. The first of these is available for Justin to try (it is not expected to work, but it should produce debug output that verifies whether one of the workarounds will work if applied universally).

Comment 12 Gabe Montero 2019-05-16 00:01:37 UTC
OK, we've got a prototype that works in Justin's non-default cert env.

We are trying one more permutation. Based on its results, Mo, Justin, and I will decide how we move forward.

Comment 14 Gabe Montero 2019-05-16 16:54:46 UTC
PR https://github.com/openshift/jenkins-openshift-login-plugin/pull/71 is up for the code change to the login plugin to handle the scenario Justin has here.

A couple of reminders:
- this scenario is not something you'll see with an out of the box install on AWS via try.openshift.com
- after this PR merges, and before handing this over to QA, I need to
  a) cut a new version of the login plugin at the update center
  b) craft an openshift/jenkins PR to bump the version of the login plugin in the image
  c) submit a new jenkins pipeline run to update the plugin RPM in distgit/brew
  d) we then wait for a brew openshift/jenkins image with the new version
    NOTE: for Justin's immediate need on starter, we'll point him to the pre 4.2 GA brew image with the new login plugin version

Comment 15 Gabe Montero 2019-05-16 16:55:33 UTC
I'll clone the bug / backport to 4.1.z once we've vetted the version from this bug.

Comment 16 Gabe Montero 2019-05-17 14:03:20 UTC
Turns out the 4.2 release plumbing on the osbs/brew side is not quite there yet (the impending migration off of buildvm may get in the way at some point as well).
I'm working with ART to track when it will be available.

https://github.com/openshift/jenkins/pull/854 is up, which updates the master branch (meaning 4.2) openshift/jenkins image with v1.0.17 of the login plugin, which has the fix Justin, Mo, and I have worked out.

Depending on which comes first, 4.2 osbs/brew being ready or 4.1.z opening up, we'll see about merging openshift/jenkins PRs and associated brew RPM updates to get Justin an image for starter.

Comment 18 Gabe Montero 2019-05-22 14:00:56 UTC
XiuJuan, per my comment in the clone of this bug: yep, this problem does not occur with a "default" cluster install.

Reach out to either Justin or Mo (both on cc: here) via needinfo to get a specific recipe for the steps needed to alter the cert configurations.

Comment 19 XiuJuan Wang 2019-05-23 02:38:35 UTC
Could you help provide the steps to reproduce this bug in OCP 4.1?
My steps are in comment #17 #18.

Comment 20 Justin Pierce 2019-05-23 13:39:03 UTC
1. Install a cluster
2. Configure the ingresscontroller's default certificate with a certificate signed by a trusted root authority (e.g. let's encrypt)
3. Run Jenkins on the cluster
4. Attempt to log into Jenkins using oauth

After installing a cluster, you need to configure a trusted (i.e. signed by a root authority) default certificate in the ingresscontroller (until you do this, ingress will be using a certificate signed by the cluster crt). You can use "let's encrypt" to create a trusted certificate for your cluster and configure ingress. When you do this, the certificate for incoming routes (including oauth) will be different from the internal cert used by the API. It is this configuration that prevents Jenkins from performing the oauth flow. Prior to Gabe's change, Jenkins would fail when trying to communicate with the oauth route because, although the route's certificate was signed by a root authority, it was not signed by the internal API certificate.
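
The fallback described in this bug (trust the certs mounted into the pod, and additionally the default certs available to the JVM) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the login plugin PR: the class name FallbackTrustManager and its helper methods are hypothetical, and the idea of passing in a mounted CA bundle path is an assumption about how the mounted certs are located.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateException;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper (not the plugin's actual class): a certificate chain is
// accepted if ANY of the wrapped trust managers accepts it.
final class FallbackTrustManager implements X509TrustManager {
    private final List<X509TrustManager> delegates;

    FallbackTrustManager(List<X509TrustManager> delegates) {
        this.delegates = new ArrayList<>(delegates);
    }

    // Builds a trust manager from a KeyStore; null selects the JVM default trust store.
    static X509TrustManager fromKeyStore(KeyStore ks) throws Exception {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(ks);
        for (TrustManager tm : tmf.getTrustManagers()) {
            if (tm instanceof X509TrustManager) {
                return (X509TrustManager) tm;
            }
        }
        throw new IllegalStateException("no X509TrustManager available");
    }

    // Loads every certificate of a PEM bundle into an in-memory KeyStore.
    static KeyStore keyStoreFromPem(Path pemBundle) throws Exception {
        KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
        ks.load(null, null);
        try (InputStream in = Files.newInputStream(pemBundle)) {
            int i = 0;
            for (Certificate c : CertificateFactory.getInstance("X.509").generateCertificates(in)) {
                ks.setCertificateEntry("ca-" + i++, c);
            }
        }
        return ks;
    }

    // Assumed wiring: mounted CA bundle first, JVM default trust store second.
    static SSLContext contextFor(Path mountedCaBundle) throws Exception {
        X509TrustManager mounted = fromKeyStore(keyStoreFromPem(mountedCaBundle));
        X509TrustManager jvmDefault = fromKeyStore(null);
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null,
                new TrustManager[] {new FallbackTrustManager(Arrays.asList(mounted, jvmDefault))},
                null);
        return ctx;
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType)
            throws CertificateException {
        CertificateException last = new CertificateException("no trust managers configured");
        for (X509TrustManager tm : delegates) {
            try {
                tm.checkServerTrusted(chain, authType);
                return; // the first trust source that accepts the chain wins
            } catch (CertificateException e) {
                last = e; // remember the failure and try the next trust source
            }
        }
        throw last;
    }

    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType)
            throws CertificateException {
        CertificateException last = new CertificateException("no trust managers configured");
        for (X509TrustManager tm : delegates) {
            try {
                tm.checkClientTrusted(chain, authType);
                return;
            } catch (CertificateException e) {
                last = e;
            }
        }
        throw last;
    }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
        List<X509Certificate> all = new ArrayList<>();
        for (X509TrustManager tm : delegates) {
            all.addAll(Arrays.asList(tm.getAcceptedIssuers()));
        }
        return all.toArray(new X509Certificate[0]);
    }
}
```

In a pod, a caller might pass the conventional Kubernetes service account CA path, /var/run/secrets/kubernetes.io/serviceaccount/ca.crt, to contextFor(...) and use the resulting SSLContext for requests to the oauth route. With the ingress configuration above, the mounted CA would reject the route's certificate and the JVM default trust store would accept it. Whether the plugin orders or combines the two trust sources this way is not stated in this bug.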

Comment 21 XiuJuan Wang 2019-05-27 12:02:26 UTC
Thanks Justin,
I can reproduce this bug on OCP now, but I failed to install a 4.2 cluster. I will try to verify on 4.2 tomorrow.

Comment 22 XiuJuan Wang 2019-05-28 08:50:43 UTC
Verified this issue with jenkins registry.svc.ci.openshift.org/ocp/4.2-2019-05-27-211046@sha256:f2d60cfc24ce881c4def3bc3bc4b1c8ab5fd031508c12b3a6bd85f1dbb52081a (taken from 4.2.0-0.ci-2019-05-27-211046).
The jenkins-login plugin is 1.0.17.

I could log in to the Jenkins web console successfully after configuring a custom certificate signed by a trusted root authority in the cluster.

Comment 23 errata-xmlrpc 2019-10-16 06:28:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.