Created attachment 1568092 [details]
Querying auth endpoints within Jenkins pod

Description of problem:
Receive the following when I try to authenticate through github or keycloak on Jenkins 4.1 image.

com.google.api.client.auth.oauth2.TokenResponseException: 403 Forbidden
{
  "kind" : "Status",
  "apiVersion" : "v1",
  "metadata" : { },
  "status" : "Failure",
  "message" : "forbidden: User \"system:anonymous\" cannot post path \"/oauth/token\"",
  "reason" : "Forbidden",
  "details" : { },
  "code" : 403
}
	at com.google.api.client.auth.oauth2.TokenResponseException.from(TokenResponseException.java:105)
	at com.google.api.client.auth.oauth2.TokenRequest.executeUnparsed(TokenRequest.java:287)
	at com.google.api.client.auth.openidconnect.IdTokenResponse.execute(IdTokenResponse.java:120)
	at org.openshift.jenkins.plugins.openshiftlogin.OpenShiftOAuth2SecurityRealm$9.onSuccess(OpenShiftOAuth2SecurityRealm.java:890)
	at org.openshift.jenkins.plugins.openshiftlogin.OAuthSession.doFinishLogin(OAuthSession.java:129)
	at org.openshift.jenkins.plugins.openshiftlogin.OpenShiftOAuth2SecurityRealm.doFinishLogin(OpenShiftOAuth2SecurityRealm.java:1142)
	at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627)
	at org.kohsuke.stapler.Function$MethodFunction.invoke(Function.java:396)
	at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:408)
	at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:212)
	at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:145)
	at org.kohsuke.stapler.MetaClass$11.doDispatch(MetaClass.java:537)
	...
(see attachments for full log)

Version-Release number of selected component (if applicable):
4.1.0-rc.0

How reproducible:
100%

Steps to Reproduce:
1. Configure github or keycloak authentication
2. Provision a jenkins master with image: quay.io/openshift/origin-jenkins@sha256:285df29bc3565de215f4bca0b1724b91d43def15b0438da3be7599798b83752e (4.1 build, built today)
3. Login through keycloak or github and authorize the application
Created attachment 1568094 [details] full exception
Created attachment 1568095 [details] jenkins pod logs
Justin, please pull docker.io/gmontero/jenkins-login-plugin-justin-test:latest and tag it into the jenkins:2 and jenkins:latest imagestreamtags in the openshift namespace in your cluster. Let's see if the switch from GET to HEAD per Mo's suggestion has any bearing. If not, grab the jenkins pod logs and we'll see what the log lines containing "GGM" say.
Created attachment 1568215 [details] logs from jenkins image: docker.io/gmontero/jenkins-login-plugin-justin-test:latest
Actually, I think you are using an image which doesn't include the fix for bug https://bugzilla.redhat.com/show_bug.cgi?id=1671633 . QE has tested github/keycloak, and both of them work well.
Updating components: 1. Affects v4.1.0 (appears in release candidate) 2. Target is v4.2.0. @gmontero - if we decide to backport we will clone this BZ.
So debugging today has revealed that the discrepancy between the results of Justin's attempts and what was previously tried/tested comes down to differing certs, i.e. the SA cert mounted by default in the jenkins pod vs. the (router) cert the oauth server uses, along with cert-specific configuration differences. We are debating various approaches, configuration restrictions, retries, and workarounds in slack. The first of these is available for Justin to try (it is not expected to work, but it should produce debug output that verifies whether one of the workarounds will work if universally applied).
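For anyone wanting to check the same discrepancy by hand, here is a minimal sketch (not what the login plugin does) that tries a TLS handshake against a host under several trust configurations and reports which ones accept the server certificate. The helper names `tls_handshake`, `probe_trust` and `pod_contexts` are made up for illustration, and the CA path is assumed to be the usual service account mount location inside a pod.

```python
import socket
import ssl

# Assumption: the usual in-pod mount path of the service account CA bundle.
SA_CA_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def tls_handshake(host, port, context, timeout=5.0):
    """Open a TLS connection; raises if certificate verification fails."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host):
            pass


def probe_trust(host, port, contexts, handshake=tls_handshake):
    """Try each named SSLContext and record whether verification succeeds.

    contexts maps a label to an ssl.SSLContext. handshake is injectable so
    the logic can be exercised without a network connection.
    """
    results = {}
    for name, context in contexts.items():
        try:
            handshake(host, port, context)
            results[name] = "ok"
        except ssl.SSLCertVerificationError as err:
            results[name] = "untrusted: " + str(err)
        except OSError as err:
            results[name] = "error: " + str(err)
    return results


def pod_contexts(sa_ca_path=SA_CA_PATH):
    """Hypothetical helper: the two trust configurations discussed here."""
    return {
        # Trusts only the CA bundle mounted into the pod.
        "service-account CA": ssl.create_default_context(cafile=sa_ca_path),
        # Trusts the system's default root authorities.
        "system default": ssl.create_default_context(),
    }
```

Run inside the jenkins pod as `probe_trust(<oauth route host>, 443, pod_contexts())`. In an environment like Justin's one would expect the service account entry to report "untrusted" while the system default entry reports "ok", but that expectation is an inference from this thread, not a captured result.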
OK, we've got a prototype that works in Justin's non-default cert env. We are trying one more permutation. Based on its results, Mo, Justin, and I will decide how we move forward.
PR https://github.com/openshift/jenkins-openshift-login-plugin/pull/71 is up for the code change to the login plugin to handle the scenario Justin has here.

A couple of reminders:
- this scenario is not something you'll see with an out of the box install on AWS via try.openshift.com
- after this PR merges, before putting this on QA, I need to
  a) cut a new version of the login plugin at the update center
  b) craft an openshift/jenkins PR to bump the version of the login plugin in the image
  c) submit a new jenkins pipeline run to update the plugin RPM in distgit/brew
  d) wait for a brew openshift/jenkins image with the new version

NOTE: for Justin's immediate need on starter, we'll point him to the pre 4.2 GA brew image with the new login plugin version
I'll clone the bug / backport to 4.1.z once we've vetted the version from this bug.
Turns out the 4.2 release plumbing on the osbs/brew side is not quite there yet (the impending migration off of buildvm may get in the way at some point as well). I'm engaged with ART on tracking when it will be available. https://github.com/openshift/jenkins/pull/854 is up, which updates the master branch (meaning 4.2) openshift/jenkins image with v1.0.17 of the login plugin, which has the fix Justin, Mo, and I have worked out. Depending on which is ready first, 4.2 osbs/brew or 4.1.z opening up, we'll see about merging openshift/jenkins PRs and the associated brew RPM updates to get Justin an image for starter.
Per my comment in the clone of this bug: XiuJuan, yep, this problem does not occur with a "default" cluster install. Reach out to either Justin or Mo (both on cc: here) via needinfo to get a specific recipe for the steps needed to alter the cert configurations.
Justin, could you help provide the steps to reproduce this bug in OCP 4.1? My steps are in comments #17 and #18. Thanks~
1. Install a cluster
2. Configure the ingresscontroller's default certificate with a certificate signed by a trusted root authority (e.g. Let's Encrypt)
3. Run Jenkins on the cluster
4. Attempt to log into Jenkins using oauth

Detail: After installing a cluster, you need to configure a trusted (i.e. signed by a root authority) default certificate in the ingresscontroller (until you do this, ingress will be using a certificate signed by the cluster cert). You can use Let's Encrypt to create a trusted certificate for your cluster and configure ingress. When you do this, the certificate for incoming routes (including oauth) will be different from the internal cert used by the API. It is this configuration that prevents Jenkins from performing the oauth flow. Prior to Gabe's change, Jenkins would fail when trying to communicate with the oauth route because, although the route's certificate was signed by a root authority, it was not signed by the internal API certificate.
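To make the failure mode above concrete: a client that trusts only the cluster's internal cert rejects the route's publicly signed certificate, while a client that can also consult the default root authorities accepts it. This thread does not describe what the login plugin change does internally, so the following is only a generic sketch of one way a client could tolerate both configurations, by retrying with a second trust store when certificate verification fails. The names `open_url` and `fetch_with_trust_fallback` are hypothetical.

```python
import ssl
import urllib.error
import urllib.request


def open_url(url, context, timeout=10.0):
    """Fetch a URL, verifying the server against the given SSLContext."""
    try:
        with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
            return resp.read()
    except urllib.error.URLError as err:
        # urlopen wraps TLS failures in URLError; surface the original error
        # so callers can tell a trust failure from other network problems.
        if isinstance(err.reason, ssl.SSLCertVerificationError):
            raise err.reason from err
        raise


def fetch_with_trust_fallback(url, contexts, fetch=open_url):
    """Return the first response whose certificate check passes.

    contexts is an ordered list of SSLContext objects, most specific first.
    Only certificate verification failures trigger a fallback; any other
    error propagates immediately.
    """
    last_error = None
    for context in contexts:
        try:
            return fetch(url, context)
        except ssl.SSLCertVerificationError as err:
            last_error = err
    if last_error is None:
        raise ValueError("no SSL contexts supplied")
    raise last_error
```

A caller would pass a context built from the pod's mounted CA bundle first and `ssl.create_default_context()` second. Trying the narrower trust store first keeps default installs behaving as before and only widens trust when the internal cert cannot verify the endpoint.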
Thanks Justin, I can reproduce this bug on OCP now. But I failed to install a 4.2 cluster, so I will try to verify on 4.2 tomorrow.
Verified this issue with jenkins registry.svc.ci.openshift.org/ocp/4.2-2019-05-27-211046@sha256:f2d60cfc24ce881c4def3bc3bc4b1c8ab5fd031508c12b3a6bd85f1dbb52081a (taken from 4.2.0-0.ci-2019-05-27-211046). The jenkins-login plugin is 1.0.17. I could log in to the Jenkins web console successfully with a custom certificate signed by a trusted root authority configured in the cluster.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922