Bug 1777593
| Summary: | Metrics endpoints of catalog-operator and olm-operator are potentially broken by service CA rotation | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Evan Cordell <ecordell> |
| Component: | OLM | Assignee: | Jeff Peeler <jpeeler> |
| OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | ||
| Priority: | unspecified | CC: | ecordell, jiazha, jpeeler, mnewby, nhale |
| Version: | 4.3.0 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.3.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1771811 | Environment: | |
| Last Closed: | 2020-01-23 11:14:45 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1771811 | ||
| Bug Blocks: | 1775250 | ||
Description
Evan Cordell
2019-11-27 22:02:45 UTC
Hi, Jeff
I tested this in a cluster without the fix PR, but I couldn't reproduce the issue. Details follow:
Cluster version is 4.3.0-0.nightly-2019-12-03-032607
The OLM version below does not include the fix PR.
mac:~ jianzhang$ oc exec catalog-operator-6cfdcd86fd-xwpsh -- olm --version
OLM version: 0.13.0
git commit: ba10413e72cfe23724edc588ff25f36dfdbeb37e
1, Delete olm-operator-serving-cert and catalog-operator-serving-cert.
mac:~ jianzhang$ oc get secret
NAME TYPE DATA AGE
builder-dockercfg-lv7jr kubernetes.io/dockercfg 1 23h
builder-token-2k476 kubernetes.io/service-account-token 4 23h
builder-token-zj659 kubernetes.io/service-account-token 4 23h
catalog-operator-serving-cert kubernetes.io/tls 2 6m19s
default-dockercfg-kzn65 kubernetes.io/dockercfg 1 23h
default-token-lbgz5 kubernetes.io/service-account-token 4 23h
default-token-x554m kubernetes.io/service-account-token 4 23h
deployer-dockercfg-pdrmt kubernetes.io/dockercfg 1 23h
deployer-token-mclc7 kubernetes.io/service-account-token 4 23h
deployer-token-q9jtd kubernetes.io/service-account-token 4 23h
olm-operator-serviceaccount-dockercfg-zqfnc kubernetes.io/dockercfg 1 23h
olm-operator-serviceaccount-token-4vtnf kubernetes.io/service-account-token 4 23h
olm-operator-serviceaccount-token-vgfxq kubernetes.io/service-account-token 4 23h
olm-operator-serving-cert kubernetes.io/tls 2 6m19s
v1.packages.operators.coreos.com-cert kubernetes.io/tls 2 23h
2, Forward the port to my localhost.
mac:~ jianzhang$ oc port-forward catalog-operator-6cfdcd86fd-xwpsh 8081:8081
Forwarding from 127.0.0.1:8081 -> 8081
Forwarding from [::1]:8081 -> 8081
Handling connection for 8081
3, In another terminal, run `openssl s_client -connect`. The connection works.
mac:~ jianzhang$ openssl s_client -connect localhost:8081
CONNECTED(00000005)
depth=1 CN = openshift-service-serving-signer@1575455354
verify error:num=19:self signed certificate in certificate chain
verify return:0
---
Certificate chain
0 s:/CN=catalog-operator-metrics.openshift-operator-lifecycle-manager.svc
i:/CN=openshift-service-serving-signer@1575455354
1 s:/CN=openshift-service-serving-signer@1575455354
i:/CN=openshift-service-serving-signer@1575455354
---
Server certificate
-----BEGIN CERTIFICATE-----
...
Start Time: 1575538998
Timeout : 7200 (sec)
Verify return code: 19 (self signed certificate in certificate chain)
4, Check the metrics in Prometheus. They are still collected. See the screenshot: https://user-images.githubusercontent.com/15416633/70224969-1cd68280-1789-11ea-8aa9-669a4c9c9f0d.png
So, what are the steps to reproduce this issue?
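Step 1 above shows the secret listing after the deletion but not the delete command itself. A minimal sketch of that step follows. It assumes the secrets live in the `openshift-operator-lifecycle-manager` namespace (inferred from the certificate CN in the `openssl` output), and the `OC` and `NS` variables are hypothetical conveniences so the command can be dry-run with `echo`; neither comes from the report.

```shell
#!/bin/sh
# Sketch only: delete the two serving-cert secrets named in step 1.
# Assumptions (not from the report): the default namespace below and the
# OC override, which allows a dry run such as `OC=echo`.
OC="${OC:-oc}"
NS="${NS:-openshift-operator-lifecycle-manager}"

delete_serving_certs() {
    # `oc delete secret NAME... -n NAMESPACE` removes the named secrets.
    "$OC" delete secret catalog-operator-serving-cert olm-operator-serving-cert -n "$NS"
}
```

Running `delete_serving_certs` and then `oc get secret -n "$NS"` should show both secrets with a fresh age, as in the listing above.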
My reference to using openssl s_client was a pointer to get started, not the entire test itself. Without the PR, the original certificate will stay in use until the container is restarted. I don't see much value in testing anything without the PR, but if you really wanted to, you can verify that the certificate is still the same after you delete the certs in the OLM namespace.
With the PR, do something like this after you've set up the port forwarding you had before:
$ echo | openssl s_client -connect localhost:8081 2>&1 | sed --quiet '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > olm.crt
$ openssl x509 -in olm.crt -purpose -noout -text
Do the above before and after deleting the certificate in the OLM namespace. The result should be that the certificate is different and I assume the validity (not before / not after) will be slightly different too.
Hi Jeff,
Many thanks for your information! I tested it in a cluster with the fix PR. Details follow:
Cluster version is 4.3.0-0.nightly-2019-12-05-213858
mac:~ jianzhang$ oc exec catalog-operator-8fcc9bc76-bjzz6 -- olm --version
OLM version: 0.13.0
git commit: 7dfd4517e5368fa19c48dab9b9e126798f3c3f40
mac:~ jianzhang$ oc port-forward catalog-operator-8fcc9bc76-kvctw 8081:8081
Forwarding from 127.0.0.1:8081 -> 8081
Forwarding from [::1]:8081 -> 8081
Handling connection for 8081
...
mac:~ jianzhang$ echo | openssl s_client -connect localhost:8081 2>&1 | gsed --quiet '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > olm4.crt
Delete only the secrets catalog-operator-serving-cert and olm-operator-serving-cert:
mac:~ jianzhang$ oc get secret
NAME TYPE DATA AGE
builder-dockercfg-mpcgh kubernetes.io/dockercfg 1 42m
builder-token-4dpzn kubernetes.io/service-account-token 4 43m
builder-token-p8vm9 kubernetes.io/service-account-token 4 43m
catalog-operator-serving-cert kubernetes.io/tls 2 64s
default-dockercfg-4zkbl kubernetes.io/dockercfg 1 42m
default-token-dhdst kubernetes.io/service-account-token 4 51m
default-token-j2vdg kubernetes.io/service-account-token 4 43m
deployer-dockercfg-v979w kubernetes.io/dockercfg 1 42m
deployer-token-54pmq kubernetes.io/service-account-token 4 43m
deployer-token-tr248 kubernetes.io/service-account-token 4 43m
olm-operator-serviceaccount-dockercfg-ldx5g kubernetes.io/dockercfg 1 43m
olm-operator-serviceaccount-token-kbshw kubernetes.io/service-account-token 4 43m
olm-operator-serviceaccount-token-knwvr kubernetes.io/service-account-token 4 51m
olm-operator-serving-cert kubernetes.io/tls 2 64s
v1.packages.operators.coreos.com-cert kubernetes.io/tls 2 47m
mac:~ jianzhang$ echo | openssl s_client -connect localhost:8081 2>&1 | gsed --quiet '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > olm5.crt
Check whether olm4.crt and olm5.crt are the same:
mac:~ jianzhang$ diff olm4.crt olm5.crt
2c2
< MIIEVjCCAz6gAwIBAgIIKAf5qP8BYvcwDQYJKoZIhvcNAQELBQAwNjE0MDIGA1UE
---
> MIIEVjCCAz6gAwIBAgIIBZzzc3WJ7kYwDQYJKoZIhvcNAQELBQAwNjE0MDIGA1UE
4c4
< Fw0xOTEyMDYwNTM0NDRaFw0yMTEyMDUwNTM0NDVaMEwxSjBIBgNVBAMTQWNhdGFs
---
> Fw0xOTEyMDYwNTQzMDlaFw0yMTEyMDUwNTQzMTBaMEwxSjBIBgNVBAMTQWNhdGFs
6,14c6,14
< LW1hbmFnZXIuc3ZjMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAxxpz
< w7/wDb5aJGsu9cVGzF08wVpXMWYW6VfFa0oiipLO/RttLOgm8UUsjqgH+w/bwaCl
< X1zxdVBbpqvHX3NDxvb72GM24qhTKoWXuQX0Vt6pzn8vhzzvnzFcy4sjXx7fOmC2
< tc4b4dGiwmYh9hqy/Jtv19QTU7LI+/Prk+2oYe/fRK5PDH1UEFLWx3nfzmjstZGE
< 9aRnh5wTba2iCnmP8i/BYa9yVdt58Mb7touBA+/Nj3iTL0KgNBkJQLEoiIcmuE7C
< jgUQRMxRfRVVdXR7XMHrQerr96tajZwnSjbcM4SYEcigoRJVa+o/g019mRfktajH
< o8d+6fuf6AHt8uNA0QIDAQABo4IBUDCCAUwwDgYDVR0PAQH/BAQDAgWgMBMGA1Ud
< JQQMMAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwHQYDVR0OBBYEFO5v1gBY0qlr
< f59f05f4V6IEMAP/MB8GA1UdIwQYMBaAFNxpLrspblVMh04UbIaYlAYneWO2MIGf
---
> LW1hbmFnZXIuc3ZjMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAsagD
> npKrLfRay3Re7RMBpRl4MtCoZrqR9I5Aps575G8k0uBGwXf2F4YURHjpXvD0zfly
> mbTy3U/oeStX+HDQ54mfLjDhGqkizpFmYHASwtqXdDxsrRbeGRKzWCYsYWaBZTAq
> KrniFtPCiAOCEAbJBvUmcv2ahR6CVajXNiUSz9j+ptPoGCyfpQ4CO1kSF6X0Y5Gy
> R8kTExhXua6bs30jpdhE9vcENpc8YjGrh/81HtMZRohwWyZNeAz3dwbIxuX1YfVB
> dz1AT9O5ebciy3cs4EaU5wr5bj6/63I4DF5rQa7NZJPLlCurBFLYpR5F4Mk0a1TD
> LQ3c4DQRM+6wgLSqmwIDAQABo4IBUDCCAUwwDgYDVR0PAQH/BAQDAgWgMBMGA1Ud
> JQQMMAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwHQYDVR0OBBYEFHAevST4r1DZ
> WJAsxGXaL/OxucL/MB8GA1UdIwQYMBaAFNxpLrspblVMh04UbIaYlAYneWO2MIGf
19,25c19,25
< NDllOS05ZDQ5LWEyMzI4ODc5NDM1YTANBgkqhkiG9w0BAQsFAAOCAQEAW3QGBOxR
< 7dzGifds6qnei4JjFx85Jgq6eLUKZSvz3RLfToKtWs96LCQIp0cxPdnJtFAzfzEO
< 3vk04ZXfgG2FnomlQ0h7SOZQH03+khwErVjIwfoHyHvVIzLXEI9p6yyHCWArkS3L
< YrIqbCMN+hP6BNi9+iFXRuF80H0POMwXIz96Sk6hOxZOqg6lb8NBiJusf2Av6Np0
< DduWZJC/Xef9paiDkLKzXJkginNNQ0MZCWnTgl5+weXJJYeQauk8zUyGunDu4Os6
< hSYi16xPKHryIlsWEPMnMdKlye8pn3UDT4E+5xKjBf26ML5kiPSYbCav/pt7olkF
< DbxbG5OYu9KKWQ==
---
> NDllOS05ZDQ5LWEyMzI4ODc5NDM1YTANBgkqhkiG9w0BAQsFAAOCAQEADUdpNgTW
> HjwfQorMzRKVMYdvSGC/Ku/SaSBJd65mbQFexNeYiloX+UcogM5IawFqDw6haK6m
> DJlG5hR+uBgdSIgSYlRvUPkLU/iRgtUXnMydb8OTOs3cxTFTEloaaA4BzJNz7qn8
> M0TggdR5jKDHa29h1IyO30jvQnz52mMpLfXt+QrRoWQ+Gs+Pv1mLjomMUkPcgxOS
> s5JKJ0AVcrEQmQbZPuLTmispVtZ3v1YD4mvI4Fc5HsMRXSQwIYVOioimC9ownK0n
> 6ldi9gDEPE/JjaDOj53McVP2TSnaEaGdDksVPei5Y45Y+MmrHqWlTIcKfnax53R+
> Ec7NcsKMB0QCPg==
They are different. LGTM, verify it.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:0062
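The before/after comparison used for verification can be wrapped in two small helpers. This is only a sketch: the function names `extract_pem` and `certs_differ` are hypothetical, and it uses POSIX `sed -n` in place of the `sed --quiet` / `gsed` spelling in the transcripts so that it does not depend on GNU sed.

```shell
#!/bin/sh
# Sketch only; helper names are illustrative and not part of OLM or oc.

# Print every PEM certificate block found on stdin, dropping all other lines.
# `sed -n` is the POSIX spelling of GNU sed's `--quiet`.
extract_pem() {
    sed -n '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
}

# Exit 0 if the two files differ, 1 if they are byte-identical,
# 2 if cmp itself failed (for example, a missing file).
certs_differ() {
    cmp -s "$1" "$2"
    case $? in
        0) return 1 ;;
        1) return 0 ;;
        *) return 2 ;;
    esac
}
```

With the port forward running, `echo | openssl s_client -connect localhost:8081 2>&1 | extract_pem > before.crt` captures the served certificate. Repeating the capture into `after.crt` once the secrets have been deleted and recreated, `certs_differ before.crt after.crt && echo rotated` reports whether the operator picked up the new certificate.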