Bug 1764704

Summary: console pods should not use ca.crt from ServiceAccount to validate ingress certificate
Product: OpenShift Container Platform
Component: Management Console
Version: 4.2.0
Target Release: 4.3.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED DUPLICATE
Severity: medium
Priority: unspecified
Reporter: Seth Jennings <sjenning>
Assignee: Jakub Hadvig <jhadvig>
QA Contact: Yadan Pei <yapei>
CC: aos-bugs, cewong, ChetRHosey, deads, decarr, jhadvig, jokerman, spadgett, wking, yapei
Flags: yapei: needinfo? (jhadvig)
Last Closed: 2019-11-19 14:35:28 UTC
Type: Bug

Description Seth Jennings 2019-10-23 15:10:40 UTC
Description of problem:

Currently the console pods assume the router-ca that signs the ingress wildcard certificate is in the ca.crt bundle that is distributed in the ServiceAccount secret. The console pods should not make this assumption.

The console operator should copy the router-ca configmap out of the openshift-config-managed namespace (and watch it for changes) and mount that into the console pods and use that CA for ingress certificate verification when connecting to the oauth Route.
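
The failure mode described above can be reproduced locally, without a cluster. The following is a minimal sketch, assuming only that `openssl` is installed. The file names, the subject names, and the `router-ca` / `sa-ca` labels are illustrative stand-ins and not the actual cluster artifacts. It signs a wildcard certificate with one CA and then verifies it against two different trust bundles:

```shell
#!/bin/sh
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# Hypothetical stand-ins: "router-ca" signs the ingress wildcard certificate,
# "sa-ca" plays the role of a ServiceAccount ca.crt that does NOT contain it.
for ca in router-ca sa-ca; do
    openssl genrsa -out "$ca.key" 2048 2>/dev/null
    openssl req -x509 -new -key "$ca.key" -out "$ca.crt" -days 1 -subj "/CN=$ca"
done

# Wildcard serving certificate signed by router-ca only.
openssl genrsa -out wildcard.key 2048 2>/dev/null
openssl req -new -key wildcard.key -out wildcard.csr -subj "/CN=*.apps.example.test"
openssl x509 -req -in wildcard.csr -CA router-ca.crt -CAkey router-ca.key \
    -CAcreateserial -out wildcard.crt -days 1 2>/dev/null

# Report whether wildcard.crt chains to the given CA bundle.
check() {
    if openssl verify -CAfile "$1" wildcard.crt >/dev/null 2>&1; then
        echo "$1: trusted"
    else
        echo "$1: unknown authority"
    fi
}

check router-ca.crt
check sa-ca.crt
```

With a typical OpenSSL installation, the first check should report the certificate as trusted and the second should not, which mirrors a client that only trusts a bundle missing the router CA.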

Version-Release number of selected component (if applicable):
4.2.0

How reproducible:
Always

Steps to Reproduce:
1. Run the KCM (kube-controller-manager) without the router-ca in the CA bundle specified by the --root-ca-file flag
2. Observe that the console pods never become ready

Actual results:
2019/10/23 14:56:06 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.hosted-apps.lab.variantweb.net/oauth/token failed: Head https://oauth-openshift.hosted-apps.lab.variantweb.net: x509: certificate signed by unknown authority

Expected results:
No error and console pods become ready.


Additional info:

Comment 1 Samuel Padgett 2019-10-24 13:01:29 UTC
This looks like it's the same as Bug 1712525.

Comment 4 Yadan Pei 2019-11-15 08:22:57 UTC
Hi Jakub, I used the following steps to verify the bug:

1. Generate a CA and certificate (for testing):

    BASE_DOMAIN="$(oc get dns.config/cluster -o 'jsonpath={.spec.baseDomain}')"
    INGRESS_DOMAIN="$(oc get ingress.config/cluster -o 'jsonpath={.spec.domain}')"
    openssl genrsa -out example-ca.key 2048
    openssl req -x509 -new -key example-ca.key -out example-ca.crt -days 1 -subj "/C=US/ST=NC/L=Chocowinity/O=OS3/OU=Eng/CN=$BASE_DOMAIN"
    openssl genrsa -out example.key 2048
    openssl req -new -key example.key -out example.csr -subj "/C=US/ST=NC/L=Chocowinity/O=OS3/OU=Eng/CN=*.$INGRESS_DOMAIN"
    openssl x509 -req -in example.csr -CA example-ca.crt -CAkey example-ca.key -CAcreateserial -out example.crt -days 1

2. Configure the certificate as the ingresscontroller's default certificate:
[yapei@dhcp-141-192 test-files]$ oc -n openshift-ingress create secret tls custom-default-cert --cert=example.crt --key=example.key
secret/custom-default-cert created
[yapei@dhcp-141-192 test-files]$ oc -n openshift-ingress-operator patch ingresscontrollers/default --type=merge --patch='{"spec":{"defaultCertificate":{"name":"custom-default-cert"}}}'
ingresscontroller.operator.openshift.io/default patched

3. The ingress pods are re-created:
[yapei@dhcp-141-192 test-files]$ oc get pods -n openshift-ingress
NAME                              READY   STATUS    RESTARTS   AGE
router-default-798959bfd4-kzddm   1/1     Running   0          26s
router-default-798959bfd4-l6snk   1/1     Running   0          38s

4. Check console accessibility: opening the console still has issues.

However, this appears to be blocked by a new issue, which has been opened as bug 1772759.

Comment 5 Yadan Pei 2019-11-15 08:25:28 UTC
Also, after we configure the ingress operator to use a custom certificate, there is no cm/router-ca in the openshift-config-managed namespace. This is tracked in a separate bug, bug 1772775.

Comment 6 Yadan Pei 2019-11-19 06:13:09 UTC
I tried again, following the steps in comment 4.

This time bug 1772759 was not reproduced, but accessing the console still hit the same issue (an error page and looping, see attached).


$ oc logs -f console-operator-68c4b88777-kgfql -n openshift-console-operator
I1119 06:00:29.143631       1 event.go:255] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-console-operator", Name:"console-operator", UID:"b005894c-c166-4282-9d28-a12793b596e6", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TargetConfigDeleted' Deleted target configmap openshift-console/router-ca because source config does not exist

$ oc logs -f console-657df767dc-mqvpl -n openshift-console
2019/11/19 06:05:12 server: authentication failed: unauthenticated
2019/11/19 06:05:12 auth: failed to get latest auth source data: request to OAuth issuer endpoint https://oauth-openshift.apps.qe-yapei43.qe.devcluster.openshift.com/oauth/token failed: Head https://oauth-openshift.apps.qe-yapei43.qe.devcluster.openshift.com: x509: certificate signed by unknown authority
2019/11/19 06:05:12 server: authentication failed: unauthenticated
2019/11/19 06:05:12 server: authentication failed: unauthenticated
2019/11/19 06:05:12 auth: failed to get latest auth source data: request to OAuth issuer endpoint https://oauth-openshift.apps.qe-yapei43.qe.devcluster.openshift.com/oauth/token failed: Head https://oauth-openshift.apps.qe-yapei43.qe.devcluster.openshift.com: x509: certificate signed by unknown authority
2019/11/19 06:05:12 auth: failed to get latest auth source data: request to OAuth issuer endpoint https://oauth-openshift.apps.qe-yapei43.qe.devcluster.openshift.com/oauth/token failed: Head https://oauth-openshift.apps.qe-yapei43.qe.devcluster.openshift.com: x509: certificate signed by unknown authority
2019/11/19 06:05:12 auth: unable to verify auth code with issuer: Post https://oauth-openshift.apps.qe-yapei43.qe.devcluster.openshift.com/oauth/token: x509: certificate signed by unknown authority
2019/11/19 06:05:13 auth: failed to get latest auth source data: request to OAuth issuer endpoint https://oauth-openshift.apps.qe-yapei43.qe.devcluster.openshift.com/oauth/token failed: Head https://oauth-openshift.apps.qe-yapei43.qe.devcluster.openshift.com: x509: certificate signed by unknown authority
2019/11/19 06:05:13 server: authentication failed: unauthenticated
2019/11/19 06:05:13 server: authentication failed: unauthenticated
2019/11/19 06:05:13 server: authentication failed: unauthenticated
2019/11/19 06:05:13 server: authentication failed: unauthenticated
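
When narrowing down an `x509: certificate signed by unknown authority` error like the ones logged above, it can help to compare the issuer of the served certificate with the subject of the CA the client is expected to trust. The following is a minimal sketch, assuming `openssl` is installed. The helper names `issuer_of` and `subject_of` and all file and subject names are hypothetical, and the certificates are generated locally purely for illustration:

```shell
#!/bin/sh
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# Print the issuer / subject DN of a PEM certificate in RFC 2253 form.
issuer_of()  { openssl x509 -in "$1" -noout -issuer  -nameopt RFC2253 | sed 's/^issuer= *//'; }
subject_of() { openssl x509 -in "$1" -noout -subject -nameopt RFC2253 | sed 's/^subject= *//'; }

# Illustrative test material: a throwaway CA and a wildcard leaf it signs.
openssl genrsa -out ca.key 2048 2>/dev/null
openssl req -x509 -new -key ca.key -out ca.crt -days 1 -subj "/CN=example-test-ca"
openssl genrsa -out leaf.key 2048 2>/dev/null
openssl req -new -key leaf.key -out leaf.csr -subj "/CN=*.apps.example.test"
openssl x509 -req -in leaf.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
    -out leaf.crt -days 1 2>/dev/null

# A name match is only a hint; it is not a cryptographic verification.
if [ "$(issuer_of leaf.crt)" = "$(subject_of ca.crt)" ]; then
    echo "issuer matches CA subject"
else
    echo "issuer does not match CA subject"
fi
```

Against a live endpoint, the served certificate could be saved first with something like `openssl s_client -connect HOST:443 -servername HOST </dev/null | openssl x509 -out served.crt`, where HOST is a placeholder for the oauth route host, and then passed to `issuer_of`.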



Confirmed that the fix PR has already landed in 4.3.0-0.nightly-2019-11-18-175710:
$ oc get pods -n openshift-console-operator -o yaml | grep -i image
      - name: IMAGE
      image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6d83281dcc25dddd4ac7f0c58e488cdacfec5d47568ed278143f8fe86e5ecc98
      imagePullPolicy: IfNotPresent
    imagePullSecrets:
      image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6d83281dcc25dddd4ac7f0c58e488cdacfec5d47568ed278143f8fe86e5ecc98
      imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6d83281dcc25dddd4ac7f0c58e488cdacfec5d47568ed278143f8fe86e5ecc98
$ oc image info quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6d83281dcc25dddd4ac7f0c58e488cdacfec5d47568ed278143f8fe86e5ecc98 | grep -i commit
Environment: SOURCE_GIT_COMMIT=365e1e43ea2c523c040324e4f8e674bf819732e4
             io.openshift.build.commit.id=365e1e43ea2c523c040324e4f8e674bf819732e4
             io.openshift.build.commit.url=https://github.com/openshift/console-operator/commit/365e1e43ea2c523c040324e4f8e674bf819732e4
[yapei@dhcp-141-192 console-operator]$ git log 365e1e43ea2c523c040324e4f8e674bf819732e4 | grep '#328'
    Merge pull request #328 from jhadvig/bz1764704


Let me know if the steps are wrong.

Comment 7 Samuel Padgett 2019-11-19 14:35:28 UTC
Based on https://bugzilla.redhat.com/show_bug.cgi?id=1772775#c1, it looks like the proposed fix won't work. I'm going to duplicate this to bug 1712525 since they're really the same issue.

*** This bug has been marked as a duplicate of bug 1712525 ***