Bug 1829492
| Summary: | Upgrade playbook fails if certificates are going to expire in less than 183 days and the openshift_certificate_expiry_warning_days has been set in the inventory file | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Joel Rosental R. <jrosenta> |
| Component: | Installer | Assignee: | Russell Teague <rteague> |
| Installer sub component: | openshift-ansible | QA Contact: | Gaoyun Pei <gpei> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | apjagtap, bleanhar |
| Version: | 3.11.0 | ||
| Target Milestone: | --- | ||
| Target Release: | 3.11.z | ||
| Hardware: | Unspecified | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Cause: The variable openshift_certificate_expiry_warning_days was hard-coded for one part of the code calling the openshift_certificate_expiry role during upgrades. Consequence: This prevented overriding the variable in the inventory. Fix: Replaced the hard-coded value with a task to set a value of six months if the variable has not been defined by the user. Result: Override possible in inventory and upgrades will default to six months. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-05-28 05:44:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
*** Bug 1829232 has been marked as a duplicate of this bug. ***
Verify this bug with openshift-ansible-3.11.218-1.git.0.6f55149.el7.noarch.
When certificates are going to expire in less than 183 days:
1) By default, the upgrade playbook fails because certificates are close to expiring
TASK [openshift_certificate_expiry : Check cert expirys on host] ****************************************************
ok: [ci-vm-10-0-149-234.hosted.upshift.rdu2.redhat.com] =>
..."days_remaining": 18, "expiry": "2022-05-19 03:38:23", "health": "warning", "issuer": "CN=openshift-signer@1589859503 ", "path": "/etc/origin/master/master.kubelet-client.crt", "serial": 3, "serial_hex": "0x3"}], "registry": [], "router": []}, "msg": "Checked 16 total certificates. Expired/Warning/OK: 0/7/9. Warning window: 183 days", "rc": 0, "summary": {"etcd_certificates": 3, "expired": 0, "kubeconfig_certificates": 4, "ok": 9, "registry_certs": 0, "router_certs": 0, "system_certificates": 9, "total": 16, "warning": 7}, "warn_certs": true}
...
TASK [openshift_certificate_expiry : Fail when certs are near or already expired] ***********************************
fatal: [ci-vm-10-0-149-234.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 183 days of expiring. You may view the report at /root/cert-expiry-report.20220501T000040.html or /root/cert-expiry-report.20220501T000040.json.\n"}
2) With openshift_certificate_expiry_warning_days set to a smaller number, the playbook continues.
openshift_certificate_expiry_warning_days=7
TASK [openshift_certificate_expiry : Check cert expirys on host] ****************************************************
ok: [ci-vm-10-0-149-234.hosted.upshift.rdu2.redhat.com] => {"changed": false, "check_results": {"etcd": [], "kubeconfigs": [], "meta": {"checked_at_time": "2022-05-01 00:02:50.357203", "show_all": "False", "warn_before_date": "2022-05-08 00:02:50.357203", "warning_days": 7}, "ocp_certs": [], "registry": [], "router": []}, "msg": "Checked 16 total certificates. Expired/Warning/OK: 0/0/16. Warning window: 7 days", "rc": 0, "summary": {"etcd_certificates": 3, "expired": 0, "kubeconfig_certificates": 4, "ok": 16, "registry_certs": 0, "router_certs": 0, "system_certificates": 9, "total": 16, "warning": 0}, "warn_certs": false}
...
TASK [openshift_certificate_expiry : Fail when certs are near or already expired] ***********************************
skipping: [ci-vm-10-0-149-234.hosted.upshift.rdu2.redhat.com] => {"changed": false, "skip_reason": "Conditional result was False"}
3) The check failure can be bypassed by setting openshift_certificate_expiry_fail_on_warn=false
TASK [openshift_certificate_expiry : Check cert expirys on host] ****************************************************
ok: [ci-vm-10-0-149-234.hosted.upshift.rdu2.redhat.com] =>
... "days_remaining": 18, "expiry": "2022-05-19 03:38:23", "health": "warning", "issuer": "CN=openshift-signer@1589859503 ", "path": "/etc/origin/master/master.kubelet-client.crt", "serial": 3, "serial_hex": "0x3"}], "registry": [], "router": []}, "msg": "Checked 16 total certificates. Expired/Warning/OK: 0/7/9. Warning window: 183 days", "rc": 0, "summary": {"etcd_certificates": 3, "expired": 0, "kubeconfig_certificates": 4, "ok": 9, "registry_certs": 0, "router_certs": 0, "system_certificates": 9, "total": 16, "warning": 7}, "warn_certs": true}
...
TASK [openshift_certificate_expiry : Fail when certs are near or already expired] ***********************************
skipping: [ci-vm-10-0-149-234.hosted.upshift.rdu2.redhat.com] => {"changed": false, "skip_reason": "Conditional result was False"}
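The three outcomes above can be summarised with a small model of the check. This is only a sketch and not the code of the openshift_certificate_expiry role: the function name `check_certs`, the `CheckResult` type, and the per-certificate day counts are hypothetical, and the exact boundary comparison (fewer than `warning_days` days remaining counts as a warning) is an assumption. Only the variable names, the 183-day default, and the 0/7/9 and 0/0/16 counts come from the logs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CheckResult:
    expired: int
    warning: int
    ok: int
    should_fail: bool


def check_certs(days_remaining: List[int],
                warning_days: int = 183,
                fail_on_warn: bool = True) -> CheckResult:
    """Model of the expiry check; illustrative only, not the role's code."""
    # Assumption: a negative number of days remaining means already expired.
    expired = sum(1 for d in days_remaining if d < 0)
    # Assumption: a cert inside the warning window is flagged as "warning".
    warning = sum(1 for d in days_remaining if 0 <= d < warning_days)
    ok = len(days_remaining) - expired - warning
    warn_certs = (expired + warning) > 0
    # The fail task only fires when certs are flagged and fail_on_warn is true.
    return CheckResult(expired, warning, ok, warn_certs and fail_on_warn)


# Hypothetical cluster: 7 certs with 18 days left, 9 certs with 700 days left.
certs = [18] * 7 + [700] * 9

default_run = check_certs(certs)                      # case 1: default window
small_window = check_certs(certs, warning_days=7)     # case 2: smaller window
no_fail = check_certs(certs, fail_on_warn=False)      # case 3: bypass
```

Under these assumptions the first call flags 7 certificates and fails, the second flags none, and the third flags 7 but does not fail, matching the task results in the three runs.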
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2215
Description of problem:
The upgrade_control_plane.yml playbook fails with the following error if any of the cluster certificates expire in less than 183 days, even when "openshift_certificate_expiry_warning_days" has been set in the inventory file to a lower value (e.g. 90):

"1. Hosts: master01.myexample.com
Play: Inspect cluster certificates
Task: Fail when certs are near or already expired
Message: Cluster certificates found to be expired or within 183 days of expiring. You may view the report at /root/cert-expiry-report.20200416T193315.html or /root/cert-expiry-report.20200416T193315.json."

The reason seems to be that this value is hard-coded as a variable passed to this task [0], and it overrides any value set in the inventory because it has a higher precedence when Ansible evaluates variables. This was not present in openshift-ansible-3.11.170-2.git.5.8802564.el7.noarch.

[0]: https://github.com/openshift/openshift-ansible/blob/release-3.11/playbooks/common/openshift-cluster/upgrades/init.yml#L20

Version-Release number of the following components:
openshift-ansible-3.11.200-1.git.0.3f37acb.el7.noarch
ansible-2.6.20-1.el7ae.noarch

How reproducible:
Always if conditions are met, i.e. if any cluster certificate is expiring in less than 183 days and the openshift_certificate_expiry_warning_days variable has been set through the inventory file.

Steps to Reproduce:
1. Set "openshift_certificate_expiry_warning_days" in the inventory to any value
2. Run the /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_11/upgrade_control_plane.yml playbook

Actual results:
2020-04-16 19:59:26,132 p=27757 u=root | TASK [openshift_certificate_expiry : Fail when certs are near or already expired] ***************************************************************************************************
2020-04-16 19:59:26,132 p=27757 u=root | Thursday 16 April 2020 19:59:26 +0200 (0:00:10.114) 0:26:48.588 ********
2020-04-16 19:59:26,514 p=27757 u=root | fatal: [master01.myexample.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 183 days of expiring. You may view the report at /root/cert-expiry-report.20200416T193315.html or /root/cert-expiry-report.20200416T193315.json.\n"}
2020-04-16 19:59:27,711 p=27757 u=root | fatal: [master02.myexample.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 183 days of expiring. You may view the report at /root/cert-expiry-report.20200416T193242.html or /root/cert-expiry-report.20200416T193242.json.\n"}
2020-04-16 19:59:27,830 p=27757 u=root | fatal: [master03.myexample.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 183 days of expiring. You may view the report at /root/cert-expiry-report.20200416T193242.html or /root/cert-expiry-report.20200416T193242.json.\n"}
2020-04-16 19:59:27,832 p=27757 u=root | NO MORE HOSTS LEFT ******************************************************************************************************************************************************************

Expected results:
The value of this variable set in the inventory should not be overridden.

Additional info:
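The precedence problem described in this report, and the behaviour of the fix, can be modelled roughly as follows. This is a simplified sketch in Python rather than the actual Ansible change: the function names `resolve_before_fix` and `resolve_after_fix` are hypothetical, and Ansible's variable precedence is reduced to "the hard-coded play-level value always wins over the inventory". The variable name and the 183-day default are taken from the report.

```python
from typing import Dict

VAR = "openshift_certificate_expiry_warning_days"
DEFAULT_DAYS = 183  # roughly six months, the window shown in the error message


def resolve_before_fix(inventory_vars: Dict[str, int]) -> int:
    """Hard-coded value passed alongside the role: masks the inventory."""
    effective = dict(inventory_vars)
    # Simplification: the higher-precedence value overwrites whatever
    # the inventory defined.
    effective[VAR] = DEFAULT_DAYS
    return effective[VAR]


def resolve_after_fix(inventory_vars: Dict[str, int]) -> int:
    """Default applied only when the user has not defined the variable."""
    return inventory_vars.get(VAR, DEFAULT_DAYS)


inventory = {VAR: 90}  # value set by the user in the inventory file

before = resolve_before_fix(inventory)    # inventory value is ignored
after = resolve_after_fix(inventory)      # inventory value is honoured
after_unset = resolve_after_fix({})       # falls back to the default
```

In this model the pre-fix resolution always yields 183 regardless of the inventory, while the post-fix resolution returns the inventory value when present and 183 otherwise.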