Bug 1824934
Summary: | Console operator inverts logic for picking up the default-ingress-cert | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | bpeterse | |
Component: | Management Console | Assignee: | Jakub Hadvig <jhadvig> | |
Status: | CLOSED ERRATA | QA Contact: | Yadan Pei <yapei> | |
Severity: | medium | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 4.4 | CC: | aos-bugs, hasha, jhadvig, jokerman, pweil, slaznick | |
Target Milestone: | --- | |||
Target Release: | 4.5.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause: A fix falsely stated that the configmap would be absent if the administrator provided a default certificate for every ingress controller. Since 4.5, default-ingress-cert is mounted into the console pod.
Consequence: The default CA was used inside the console pod.
Fix: Configure the console to use the default-ingress-cert configmap if the configmap exists, or else to use the default CA if the configmap is absent.
Result: The default-ingress-cert configmap is mounted into the console pod and used.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1824935 (view as bug list) | Environment: | ||
Last Closed: | 2020-07-13 17:27:59 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1824935 |
Description
bpeterse
2020-04-16 17:08:24 UTC
1. Delete the cm default-ingress-cert from openshift-console

$ oc -n openshift-cluster-version scale deployments/cluster-version-operator --replicas=0
$ oc -n openshift-ingress-operator scale deploy/ingress-operator --replicas=0
$ oc -n openshift-config-managed delete configmaps/default-ingress-cert
$ oc get cm default-ingress-cert -n openshift-console
Error from server (NotFound): configmaps "default-ingress-cert" not found

2. After a while, check the deployment console

containers:
- command:
  - /opt/bridge/bin/bridge
  - --public-dir=/opt/bridge/static
  - --config=/var/console-config/console-config.yaml
  - --service-ca-file=/var/service-ca/service-ca.crt
  image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3032707cb3bd2f6089e0ec01460bf7837bea020c445cd01b20ebb21ba8fe6983
  imagePullPolicy: IfNotPresent
  lifecycle:
    preStop:
      exec:
        command:
        - sleep
        - "25"
  livenessProbe:
    failureThreshold: 3
    httpGet:
      path: /health
      port: 8443
      scheme: HTTPS
    initialDelaySeconds: 150
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
  name: console
  ports:
  - containerPort: 443
    name: https
    protocol: TCP
  readinessProbe:
    failureThreshold: 3
    httpGet:
      path: /health
      port: 8443
      scheme: HTTPS
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
  resources:
    requests:
      cpu: 10m
      memory: 100Mi
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: FallbackToLogsOnError
  volumeMounts:
  - mountPath: /var/serving-cert
    name: console-serving-cert
    readOnly: true
  - mountPath: /var/oauth-config
    name: console-oauth-config
    readOnly: true
  - mountPath: /var/console-config
    name: console-config
    readOnly: true
  - mountPath: /var/service-ca
    name: service-ca
    readOnly: true
  - mountPath: /var/default-ingress-cert
    name: default-ingress-cert
    readOnly: true
  - mountPath: /etc/pki/ca-trust/extracted/pem
    name: trusted-ca-bundle
    readOnly: true
dnsPolicy: ClusterFirst
nodeSelector:
  node-role.kubernetes.io/master: ""
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: console
serviceAccountName: console
terminationGracePeriodSeconds: 40
tolerations:
- effect: NoSchedule
  key: node-role.kubernetes.io/master
  operator: Exists
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 120
- effect: NoExecute
  key: node.kubernetes.io/not-reachable
  operator: Exists
  tolerationSeconds: 120
volumes:
- name: console-serving-cert
  secret:
    defaultMode: 420
    secretName: console-serving-cert
- name: console-oauth-config
  secret:
    defaultMode: 420
    secretName: console-oauth-config
- configMap:
    defaultMode: 420
    name: console-config
  name: console-config
- configMap:
    defaultMode: 420
    name: service-ca
  name: service-ca
- configMap:
    defaultMode: 420
    name: default-ingress-cert
  name: default-ingress-cert
- configMap:
    defaultMode: 420
    items:
    - key: ca-bundle.crt
      path: tls-ca-bundle.pem
    name: trusted-ca-bundle
  name: trusted-ca-bundle

There is no oauthEndpointCAFile referring to the service account CA (/var/run/secrets/kubernetes.io/serviceaccount/ca.crt)

$ oc logs console-operator-7fb9b776dc-4hpd4 -n openshift-console-operator
417 07:55:03.663020 1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
I0417 07:55:03.675821 1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-console-operator", Name:"console-operator", UID:"41f103e4-274b-41c4-961a-86cf04d30b54", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/console changed: Degraded message changed from "RouteHealthDegraded: failed to read CA to check route health: configmaps \"default-ingress-cert\" not found" to "RouteHealthDegraded: failed to read CA to check route health: configmaps \"default-ingress-cert\" not found\nDefaultIngressCertValidationDegraded: default-ingress-cert configmap not found"
E0417 07:55:04.855146 1 status.go:74] RouteHealthDegraded FailedLoadCA failed to read CA to check route health: configmaps "default-ingress-cert" not found
E0417 07:55:06.453848 1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
E0417 07:55:06.453998 1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
E0417 07:55:08.453589 1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
E0417 07:55:08.453713 1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
E0417 07:55:10.453553 1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
E0417 07:55:10.453666 1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
E0417 07:55:12.453723 1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found

4.5.0-0.ci-2020-04-17-071134

Operator behavior should be as follows:
- default-ingress-cert should be used if it exists, otherwise
- router-ca should be used

router-ca no longer exists in 4.5 and https://bugzilla.redhat.com/show_bug.cgi?id=1824934#c3 is unsupported. Shanan, please explain which supported scenario you are trying to mock and why you're mocking it instead of actually performing the scenario.

Nvm, discard my previous comment, I did not read the PR...
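The intended selection logic described in this report (use default-ingress-cert when it exists, otherwise fall back) could be sketched roughly as below. This is only an illustration, not the operator's actual code: the function name `pick_ca_file`, the `ca-bundle.crt` file name under the mount, and the choice of the service account CA as the fallback are all assumptions made for the example.

```python
from typing import Mapping

# Name of the configmap discussed in this bug.
DEFAULT_INGRESS_CERT_CM = "default-ingress-cert"

# Assumed file name under the /var/default-ingress-cert mount (hypothetical).
DEFAULT_INGRESS_CERT_FILE = "/var/default-ingress-cert/ca-bundle.crt"

# Fallback CA; the service account CA path quoted in this report is used
# here purely for illustration.
FALLBACK_CA_FILE = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def pick_ca_file(configmaps: Mapping[str, Mapping[str, str]]) -> str:
    """Return the CA file the console should use.

    `configmaps` maps configmap names to their data, standing in for a
    lookup in the console namespace.
    """
    # Correct direction of the check: prefer the configmap when it EXISTS.
    # An inverted condition here would pick the fallback exactly when the
    # configmap is available.
    if DEFAULT_INGRESS_CERT_CM in configmaps:
        return DEFAULT_INGRESS_CERT_FILE
    return FALLBACK_CA_FILE


if __name__ == "__main__":
    print(pick_ca_file({"default-ingress-cert": {"ca-bundle.crt": "PEM data"}}))
    print(pick_ca_file({}))
```

Keeping the existence check in one small function makes the direction of the condition easy to cover with a unit test for both the present and absent cases.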
Couple of observations:
- shana reveals that the operator health check demands the existence of the default-ingress-cert but is willing to deploy the pods even when it does not exist, which makes little sense
- according to the enhancement https://github.com/openshift/enhancements/blob/master/enhancements/network/default-ingress-cert-configmap.md#implementation-history, default-ingress-cert always exists, so it does not make sense to test removing it
- /var/run/secrets/kubernetes.io/serviceaccount/ca.crt should contain the very same ingress cert from the CM that you're using (among other CAs), so you may not need this fallback unless you need to deploy console pods before the certificate becomes available

I would remove the fallback as a whole unless you have a good reason to keep it. Otherwise you should fix your health checks :)

In that case I guess we should be OK to close this issue. Maybe create a story to remove the fallback?

We did not have time to address this during this sprint.

So the issue here is that the bug failed QA, but the tested scenario is unsupported: the default-ingress-cert CM always exists, so it does not make sense to test removing it while also scaling the ingress-operator and CVO down to zero. On that we agreed with @Standa. For that case we should either test that the changes delivered in the PR are reflected, or close it down.
Comment #8 is there just as a reminder that we can now remove the fallback for the missing default-ingress-cert CM, for which we can create a story or open a BZ (which would be more appropriate IMO). Also, no additional backports are needed.

After a chat with Ben, putting this ON_QA since the scenario QA used in the FailedQA case was invalid; please check https://bugzilla.redhat.com/show_bug.cgi?id=1824934#c7

We created a story for removing the fallback.

- mountPath: /var/default-ingress-cert
  name: default-ingress-cert

The console uses /var/default-ingress-cert by default, as expected.
Verified this bug on 4.5.0-0.nightly-2020-05-20-183547.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
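The verification step in this report (confirming that the console pod mounts the default-ingress-cert configmap at /var/default-ingress-cert) could be automated along these lines. This is a minimal sketch that works on an already-parsed pod spec; the helper name `mounts_default_ingress_cert` is hypothetical, and fetching the deployment from the cluster is left out.

```python
from typing import Any, Dict

CM_NAME = "default-ingress-cert"
MOUNT_PATH = "/var/default-ingress-cert"


def mounts_default_ingress_cert(pod_spec: Dict[str, Any]) -> bool:
    """Check that the pod spec defines and mounts the configmap volume."""
    # A volume named default-ingress-cert must be backed by the configmap
    # of the same name.
    has_volume = any(
        v.get("name") == CM_NAME
        and v.get("configMap", {}).get("name") == CM_NAME
        for v in pod_spec.get("volumes", [])
    )
    # At least one container must mount that volume at the expected path.
    has_mount = any(
        m.get("name") == CM_NAME and m.get("mountPath") == MOUNT_PATH
        for c in pod_spec.get("containers", [])
        for m in c.get("volumeMounts", [])
    )
    return has_volume and has_mount


if __name__ == "__main__":
    # Trimmed-down pod spec modelled on the deployment quoted in this report.
    example_spec = {
        "containers": [
            {
                "name": "console",
                "volumeMounts": [
                    {"name": CM_NAME, "mountPath": MOUNT_PATH, "readOnly": True},
                ],
            }
        ],
        "volumes": [
            {"name": CM_NAME, "configMap": {"name": CM_NAME, "defaultMode": 420}},
        ],
    }
    print(mounts_default_ingress_cert(example_spec))
```

The check only inspects the spec; it says nothing about whether the configmap itself exists in the namespace, which the operator's health check reports separately.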