Bug 1824934 - Console operator inverts logic for picking up the default-ingress-cert [NEEDINFO]
Summary: Console operator inverts logic for picking up the default-ingress-cert
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Jakub Hadvig
QA Contact: Yadan Pei
URL:
Whiteboard:
Depends On:
Blocks: 1824935
Reported: 2020-04-16 17:08 UTC by bpeterse
Modified: 2020-07-13 17:28 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Fix falsely stated that the configmap would be absent if the administrator provided a default certificate for every ingress controller. Since 4.5 we are mounting default-ingress-cert into the console pod. Consequence: The default CA was used inside the console pod. Fix: Configure the console to use the default-ingress-cert configmap if the configmap exists, or else to use the default CA if the configmap is absent. Result: The default-ingress-cert configmap is mounted into the console pod and used.
Clone Of:
: 1824935 (view as bug list)
Environment:
Last Closed: 2020-07-13 17:27:59 UTC
Target Upstream Version:
jhadvig: needinfo? (bpeterse)


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:2409 None None None 2020-07-13 17:28:23 UTC

Description bpeterse 2020-04-16 17:08:24 UTC
Description of problem:

See this PR: https://github.com/openshift/console-operator/pull/403
Merged.

Comment 3 shahan 2020-04-17 08:46:53 UTC
1. Delete the cm default-ingress-cert and confirm it is gone from openshift-console
$ oc -n openshift-cluster-version scale deployments/cluster-version-operator --replicas=0
$ oc -n openshift-ingress-operator scale deploy/ingress-operator --replicas=0
$ oc -n openshift-config-managed delete configmaps/default-ingress-cert
$ oc get cm default-ingress-cert -n openshift-console
Error from server (NotFound): configmaps "default-ingress-cert" not found

2. After a while, check the console deployment:
      containers:
      - command:
        - /opt/bridge/bin/bridge
        - --public-dir=/opt/bridge/static
        - --config=/var/console-config/console-config.yaml
        - --service-ca-file=/var/service-ca/service-ca.crt
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3032707cb3bd2f6089e0ec01460bf7837bea020c445cd01b20ebb21ba8fe6983
        imagePullPolicy: IfNotPresent
        lifecycle:
          preStop:
            exec:
              command:
              - sleep
              - "25"
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /health
            port: 8443
            scheme: HTTPS
          initialDelaySeconds: 150
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: console
        ports:
        - containerPort: 443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /health
            port: 8443
            scheme: HTTPS
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 10m
            memory: 100Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /var/serving-cert
          name: console-serving-cert
          readOnly: true
        - mountPath: /var/oauth-config
          name: console-oauth-config
          readOnly: true
        - mountPath: /var/console-config
          name: console-config
          readOnly: true
        - mountPath: /var/service-ca
          name: service-ca
          readOnly: true
        - mountPath: /var/default-ingress-cert
          name: default-ingress-cert
          readOnly: true
        - mountPath: /etc/pki/ca-trust/extracted/pem
          name: trusted-ca-bundle
          readOnly: true
      dnsPolicy: ClusterFirst
      nodeSelector:
        node-role.kubernetes.io/master: ""
      priorityClassName: system-cluster-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: console
      serviceAccountName: console
      terminationGracePeriodSeconds: 40
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 120
      - effect: NoExecute
        key: node.kubernetes.io/not-reachable
        operator: Exists
        tolerationSeconds: 120
      volumes:
      - name: console-serving-cert
        secret:
          defaultMode: 420
          secretName: console-serving-cert
      - name: console-oauth-config
        secret:
          defaultMode: 420
          secretName: console-oauth-config
      - configMap:
          defaultMode: 420
          name: console-config
        name: console-config
      - configMap:
          defaultMode: 420
          name: service-ca
        name: service-ca
      - configMap:
          defaultMode: 420
          name: default-ingress-cert
        name: default-ingress-cert
      - configMap:
          defaultMode: 420
          items:
          - key: ca-bundle.crt
            path: tls-ca-bundle.pem
          name: trusted-ca-bundle
        name: trusted-ca-bundle

There is no oauthEndpointCAFile that refers to the service account CA (/var/run/secrets/kubernetes.io/serviceaccount/ca.crt).
$ oc logs console-operator-7fb9b776dc-4hpd4 -n openshift-console-operator
E0417 07:55:03.663020       1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
I0417 07:55:03.675821       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-console-operator", Name:"console-operator", UID:"41f103e4-274b-41c4-961a-86cf04d30b54", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/console changed: Degraded message changed from "RouteHealthDegraded: failed to read CA to check route health: configmaps \"default-ingress-cert\" not found" to "RouteHealthDegraded: failed to read CA to check route health: configmaps \"default-ingress-cert\" not found\nDefaultIngressCertValidationDegraded: default-ingress-cert configmap not found"
E0417 07:55:04.855146       1 status.go:74] RouteHealthDegraded FailedLoadCA failed to read CA to check route health: configmaps "default-ingress-cert" not found
E0417 07:55:06.453848       1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
E0417 07:55:06.453998       1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
E0417 07:55:08.453589       1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
E0417 07:55:08.453713       1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
E0417 07:55:10.453553       1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
E0417 07:55:10.453666       1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
E0417 07:55:12.453723       1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found

4.5.0-0.ci-2020-04-17-071134

Comment 4 bpeterse 2020-04-17 18:25:56 UTC
Operator behavior should be as follows:
- default-ingress-cert should be used if it exists, otherwise
- router-ca should be used
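
The rule above can be sketched in Go as follows. This is a minimal illustration, not the operator's actual code: the function name and the `ca-bundle.crt` file name are assumptions, the mount path is the one shown in the deployment in comment 3, and the fallback uses the service account CA path from comment 3 as a stand-in for whichever fallback CA applies.

```go
package main

import "fmt"

const (
	// Mount path taken from the deployment in comment 3; the file name
	// "ca-bundle.crt" is an assumption made for this sketch.
	defaultIngressCertFile = "/var/default-ingress-cert/ca-bundle.crt"
	// Stand-in fallback: the service account CA path quoted in comment 3.
	fallbackCAFile = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
)

// oauthEndpointCAFile (hypothetical helper) returns the CA file the console
// should trust: the default-ingress-cert bundle when the configmap exists,
// otherwise the fallback CA.
func oauthEndpointCAFile(ingressCertConfigMapExists bool) string {
	if ingressCertConfigMapExists {
		return defaultIngressCertFile
	}
	return fallbackCAFile
}

func main() {
	fmt.Println(oauthEndpointCAFile(true))  // /var/default-ingress-cert/ca-bundle.crt
	fmt.Println(oauthEndpointCAFile(false)) // /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
}
```

The inverted logic this bug describes corresponds to the two return values being swapped.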

Comment 5 Standa Laznicka 2020-04-22 09:42:36 UTC
router-ca no longer exists in 4.5 and https://bugzilla.redhat.com/show_bug.cgi?id=1824934#c3 is unsupported. Shahan, please explain which supported scenario you are trying to mock and why you're mocking it instead of actually performing the scenario.

Comment 6 Standa Laznicka 2020-04-22 09:44:26 UTC
Nvm, discard my previous comment, I did not read the PR...

Comment 7 Standa Laznicka 2020-04-22 11:01:46 UTC
A couple of observations:
- shahan's test reveals that the operator health check demands the existence of default-ingress-cert, yet the operator is willing to deploy the pods even when it does not exist, which makes little sense
- according to the enhancement https://github.com/openshift/enhancements/blob/master/enhancements/network/default-ingress-cert-configmap.md#implementation-history, default-ingress-cert always exists, so it does not make sense to test removing it
- /var/run/secrets/kubernetes.io/serviceaccount/ca.crt should contain the very same ingress cert from the CM that you're using (among other CAs), so you may not need this fallback unless you need to deploy console pods before the certificate becomes available

I would remove the fallback as a whole unless you have a good reason to keep it. Otherwise you should fix your health checks :)
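
One way to keep the health check and the deployment consistent is to have both resolve the CA through the same helper. The Go sketch below is only an illustration under stated assumptions: `routeCAPool` and the `ca-bundle.crt` key are hypothetical names, not taken from the console-operator source. It prefers the configmap bundle, falls back to a supplied default bundle, and only fails when neither yields a usable certificate.

```go
package main

import (
	"crypto/x509"
	"errors"
	"fmt"
)

// caBundleKey is the key assumed (for this sketch) to hold the PEM bundle
// inside the default-ingress-cert configmap.
const caBundleKey = "ca-bundle.crt"

// routeCAPool (hypothetical helper) builds the certificate pool for the
// route health check. cmData is the configmap's data (nil when the configmap
// is absent); fallbackPEM is the default CA bundle used in that case.
func routeCAPool(cmData map[string]string, fallbackPEM []byte) (*x509.CertPool, error) {
	pemBytes := fallbackPEM
	if bundle, ok := cmData[caBundleKey]; ok {
		pemBytes = []byte(bundle)
	}
	if len(pemBytes) == 0 {
		return nil, errors.New("no CA bundle available")
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pemBytes) {
		return nil, errors.New("CA bundle contains no valid PEM certificates")
	}
	return pool, nil
}

func main() {
	// Neither the configmap nor a fallback bundle is present.
	_, err := routeCAPool(nil, nil)
	fmt.Println(err) // no CA bundle available
}
```

If the fallback is removed as suggested, the same helper reduces to the configmap branch alone.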

Comment 8 Jakub Hadvig 2020-04-27 08:45:06 UTC
In that case I guess we should be OK to close this issue. Maybe create a story to remove the fallback?

Comment 9 Jakub Hadvig 2020-05-08 14:00:28 UTC
We did not have time to address this during this sprint.

Comment 11 Jakub Hadvig 2020-05-18 12:46:17 UTC
So the issue here is that the bug failed QA, but the tested scenario is unsupported:
the default-ingress-cert CM always exists, so it does not make sense to test removing it
or scaling the ingress-operator and CVO down to zero. We agreed on that with @Standa.

For that case we should either test that the changes delivered in the PR are reflected, or
close it down.

Comment #8 is there just as a reminder that we can now remove the fallback for a missing
default-ingress-cert CM, for which we can create a story or open a BZ (which would be more
appropriate IMO).

Comment 12 Jakub Hadvig 2020-05-18 12:46:43 UTC
Also no additional backports are needed.

Comment 13 Jakub Hadvig 2020-05-19 10:11:00 UTC
After a chat with Ben, putting this ON_QA since the scenario QA used in the FailedQA case was invalid,
please check https://bugzilla.redhat.com/show_bug.cgi?id=1824934#c7

We created a story for removing the fallback.

Comment 14 shahan 2020-05-21 10:44:57 UTC
        - mountPath: /var/default-ingress-cert
          name: default-ingress-cert
The console uses /var/default-ingress-cert by default, as expected.
Verified this bug on:
4.5.0-0.nightly-2020-05-20-183547

Comment 15 errata-xmlrpc 2020-07-13 17:27:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

