Bug 1824934
| Summary: | Console operator inverts logic for picking up the default-ingress-cert | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | bpeterse | |
| Component: | Management Console | Assignee: | Jakub Hadvig <jhadvig> | |
| Status: | CLOSED ERRATA | QA Contact: | Yadan Pei <yapei> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.4 | CC: | aos-bugs, hasha, jhadvig, jokerman, pweil, slaznick | |
| Target Milestone: | --- | |||
| Target Release: | 4.5.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: |
Cause: Fix falsely stated that the configmap would be absent if the administrator provided a default certificate for every ingress controller. Since 4.5 we are mounting default-ingress-cert into the console pod.
Consequence: Default CA was used inside the console pod.
Fix: Configure the console to use the default-ingress-cert configmap if the configmap exists, or else to use the default CA if the configmap is absent.
Result: Use and mount default-ingress-cert configmap into the console pod.
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 1824935 (view as bug list) | Environment: | ||
| Last Closed: | 2020-07-13 17:27:59 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1824935 | |||
|
Description
bpeterse
2020-04-16 17:08:24 UTC
1. delete the cm default-ingress-cert from openshift-console
$ oc -n openshift-cluster-version scale deployments/cluster-version-operator --replicas=0
$ oc -n openshift-ingress-operator scale deploy/ingress-operator --replicas=0
$ oc -n openshift-config-managed delete configmaps/default-ingress-cert
$ oc get cm default-ingress-cert -n openshift-console
Error from server (NotFound): configmaps "default-ingress-cert" not found
2. after a while, check the deployment console
containers:
- command:
  - /opt/bridge/bin/bridge
  - --public-dir=/opt/bridge/static
  - --config=/var/console-config/console-config.yaml
  - --service-ca-file=/var/service-ca/service-ca.crt
  image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3032707cb3bd2f6089e0ec01460bf7837bea020c445cd01b20ebb21ba8fe6983
  imagePullPolicy: IfNotPresent
  lifecycle:
    preStop:
      exec:
        command:
        - sleep
        - "25"
  livenessProbe:
    failureThreshold: 3
    httpGet:
      path: /health
      port: 8443
      scheme: HTTPS
    initialDelaySeconds: 150
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
  name: console
  ports:
  - containerPort: 443
    name: https
    protocol: TCP
  readinessProbe:
    failureThreshold: 3
    httpGet:
      path: /health
      port: 8443
      scheme: HTTPS
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
  resources:
    requests:
      cpu: 10m
      memory: 100Mi
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: FallbackToLogsOnError
  volumeMounts:
  - mountPath: /var/serving-cert
    name: console-serving-cert
    readOnly: true
  - mountPath: /var/oauth-config
    name: console-oauth-config
    readOnly: true
  - mountPath: /var/console-config
    name: console-config
    readOnly: true
  - mountPath: /var/service-ca
    name: service-ca
    readOnly: true
  - mountPath: /var/default-ingress-cert
    name: default-ingress-cert
    readOnly: true
  - mountPath: /etc/pki/ca-trust/extracted/pem
    name: trusted-ca-bundle
    readOnly: true
dnsPolicy: ClusterFirst
nodeSelector:
  node-role.kubernetes.io/master: ""
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: console
serviceAccountName: console
terminationGracePeriodSeconds: 40
tolerations:
- effect: NoSchedule
  key: node-role.kubernetes.io/master
  operator: Exists
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 120
- effect: NoExecute
  key: node.kubernetes.io/not-reachable
  operator: Exists
  tolerationSeconds: 120
volumes:
- name: console-serving-cert
  secret:
    defaultMode: 420
    secretName: console-serving-cert
- name: console-oauth-config
  secret:
    defaultMode: 420
    secretName: console-oauth-config
- configMap:
    defaultMode: 420
    name: console-config
  name: console-config
- configMap:
    defaultMode: 420
    name: service-ca
  name: service-ca
- configMap:
    defaultMode: 420
    name: default-ingress-cert
  name: default-ingress-cert
- configMap:
    defaultMode: 420
    items:
    - key: ca-bundle.crt
      path: tls-ca-bundle.pem
    name: trusted-ca-bundle
  name: trusted-ca-bundle
There is no oauthEndpointCAFile referring to the service account CA (/var/run/secrets/kubernetes.io/serviceaccount/ca.crt).
$ oc logs console-operator-7fb9b776dc-4hpd4 -n openshift-console-operator
417 07:55:03.663020 1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
I0417 07:55:03.675821 1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-console-operator", Name:"console-operator", UID:"41f103e4-274b-41c4-961a-86cf04d30b54", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/console changed: Degraded message changed from "RouteHealthDegraded: failed to read CA to check route health: configmaps \"default-ingress-cert\" not found" to "RouteHealthDegraded: failed to read CA to check route health: configmaps \"default-ingress-cert\" not found\nDefaultIngressCertValidationDegraded: default-ingress-cert configmap not found"
E0417 07:55:04.855146 1 status.go:74] RouteHealthDegraded FailedLoadCA failed to read CA to check route health: configmaps "default-ingress-cert" not found
E0417 07:55:06.453848 1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
E0417 07:55:06.453998 1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
E0417 07:55:08.453589 1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
E0417 07:55:08.453713 1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
E0417 07:55:10.453553 1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
E0417 07:55:10.453666 1 controller.go:129] {Console Console} failed with: default-ingress-cert configmap not found
E0417 07:55:12.453723 1 status.go:74] DefaultIngressCertValidationDegraded FailedGet default-ingress-cert configmap not found
4.5.0-0.ci-2020-04-17-071134
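When triaging operator output like the excerpt above, it can help to collapse the repeated lines into a count per degraded condition. The following is a minimal sketch, not part of the console operator: the regular expression is an assumption derived from the log lines quoted in this report, and `summarize_degraded` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed layout of a klog-style line, based only on the excerpt in this report:
# <severity><MMDD> <time> <pid> <file>:<line>] <message>
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>[\d:.]+)\s+\d+ "
    r"(?P<source>[\w.]+:\d+)\] (?P<message>.*)$"
)


def summarize_degraded(lines):
    """Count (condition, reason) pairs reported by status.go error lines."""
    counts = Counter()
    for line in lines:
        match = KLOG_LINE.match(line.strip())
        if not match or match["severity"] != "E":
            continue  # skip info lines and anything that does not parse
        if not match["source"].startswith("status.go"):
            continue  # only status.go lines carry "<Condition> <Reason> ..."
        parts = match["message"].split()
        if len(parts) < 2:
            continue
        counts[(parts[0], parts[1])] += 1
    return counts


# Sample lines copied from the log excerpt in this report.
SAMPLE_LOG = [
    'E0417 07:55:04.855146 1 status.go:74] RouteHealthDegraded FailedLoadCA '
    'failed to read CA to check route health: configmaps "default-ingress-cert" not found',
    "E0417 07:55:06.453848 1 status.go:74] DefaultIngressCertValidationDegraded "
    "FailedGet default-ingress-cert configmap not found",
    "E0417 07:55:06.453998 1 controller.go:129] {Console Console} failed with: "
    "default-ingress-cert configmap not found",
    "E0417 07:55:08.453589 1 status.go:74] DefaultIngressCertValidationDegraded "
    "FailedGet default-ingress-cert configmap not found",
]

if __name__ == "__main__":
    for (condition, reason), count in summarize_degraded(SAMPLE_LOG).items():
        print(f"{condition} {reason}: {count}")
```

The sketch deliberately ignores controller.go lines, since they repeat the same failure without naming a condition.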
Operator behavior should be as follows:
- default-ingress-cert should be used if it exists, otherwise
- router-ca should be used

router-ca no longer exists in 4.5 and https://bugzilla.redhat.com/show_bug.cgi?id=1824934#c3 is unsupported. Shanan, please explain which supported scenario you are trying to mock and why you're mocking it instead of actually performing the scenario.

Nvm, discard my previous comment, I did not read the PR...

Couple of observations:
- shana reveals that the operator health check demands existence of the default-ingress-cert but is willing to deploy the pods even when that does not exist, that makes little sense
- according to the enhancement https://github.com/openshift/enhancements/blob/master/enhancements/network/default-ingress-cert-configmap.md#implementation-history, default-ingress-cert always exists, so it does not make sense to test removing it
- /var/run/secrets/kubernetes.io/serviceaccount/ca.crt should contain the very same ingress cert from the CM that you're using (among other CAs), so you may not need this fallback unless you needed to deploy console pods before the certificate becomes available

I would remove the fallback as a whole unless you have a good reason to keep it. Otherwise you should fix your health checks :)

In that case I guess we should be OK to close this issue. Maybe create a story to remove the fallback?

We did not have time to address this during this sprint.

So the issue here is that the bug failed the QA, but the tested scenario is unsupported, meaning the default-ingress-cert CM always exists, so it does not make sense to test removing it and also scale down to zero the ingress-operator and CVO. On that we agreed with @Standa. For that case we should either test that the changes delivered in the PR are reflected, or close it down.
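The intended selection rule discussed in this thread (prefer default-ingress-cert when it exists, otherwise fall back) can be sketched in a few lines. This is an illustration only, not the operator's actual code, which is not quoted in this report. The function name `pick_ca_file` and the file name under /var/default-ingress-cert are hypothetical; the mount path and the service account CA path are the ones quoted in this report.

```python
import os

# Mount path taken from the deployment in this report; the file name inside it
# is a hypothetical placeholder.
INGRESS_CA = "/var/default-ingress-cert/ca-bundle.crt"
# Service account CA path as quoted in this report.
SERVICE_ACCOUNT_CA = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def pick_ca_file(preferred, fallback, exists=os.path.isfile):
    """Return `preferred` when it exists, otherwise `fallback`.

    Swapping the two branches reproduces the kind of inverted logic named in
    the bug summary: the fallback CA would be chosen exactly when the
    preferred one is present.
    """
    return preferred if exists(preferred) else fallback


if __name__ == "__main__":
    print(pick_ca_file(INGRESS_CA, SERVICE_ACCOUNT_CA))
```

The `exists` parameter is injectable so the rule can be exercised without touching the filesystem. If the fallback is later removed, as suggested in the comments, the function reduces to returning the preferred path.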
Comment #8 is there just as a reminder that we can now remove the fallback for the missing default-ingress-cert CM, for which we can create a story or open a BZ (which would be more appropriate IMO). Also no additional backports are needed.

After a chat with Ben, putting this ON_QA since the scenario QA used in the FailedQA case was invalid, please check https://bugzilla.redhat.com/show_bug.cgi?id=1824934#c7

We created a story for removing the fallback.

- mountPath: /var/default-ingress-cert
  name: default-ingress-cert
The console uses /var/default-ingress-cert by default, as expected.
Verify this bug
4.5.0-0.nightly-2020-05-20-183547
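The verification above amounts to confirming that the console deployment both declares a volume backed by the default-ingress-cert configmap and mounts it at /var/default-ingress-cert. A minimal sketch of that check, assuming the deployment has been exported with `oc get deployment console -n openshift-console -o json`, follows. `mounts_configmap` is a hypothetical helper name, and the sample document is trimmed to the fields the check reads.

```python
import json


def mounts_configmap(deployment, configmap_name, mount_path):
    """Return True when the pod template has a volume backed by
    `configmap_name` and some container mounts that volume at `mount_path`."""
    pod_spec = deployment["spec"]["template"]["spec"]
    volume_names = {
        volume["name"]
        for volume in pod_spec.get("volumes", [])
        if volume.get("configMap", {}).get("name") == configmap_name
    }
    return any(
        mount.get("name") in volume_names and mount.get("mountPath") == mount_path
        for container in pod_spec.get("containers", [])
        for mount in container.get("volumeMounts", [])
    )


# Trimmed sample mirroring the volume and mount shown in this report.
SAMPLE_DEPLOYMENT = json.loads("""
{
  "spec": {"template": {"spec": {
    "containers": [{
      "name": "console",
      "volumeMounts": [
        {"mountPath": "/var/default-ingress-cert",
         "name": "default-ingress-cert", "readOnly": true}
      ]
    }],
    "volumes": [
      {"name": "default-ingress-cert",
       "configMap": {"defaultMode": 420, "name": "default-ingress-cert"}}
    ]
  }}}
}
""")

if __name__ == "__main__":
    print(mounts_configmap(
        SAMPLE_DEPLOYMENT, "default-ingress-cert", "/var/default-ingress-cert"
    ))
```

The check only inspects the deployment spec. It does not confirm that the console process actually reads the mounted certificate.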
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days