Bug 1520971 - [RFE] Add expiry information for certs master.kubelet-client.crt and master.proxy-client.crt in easy-mode.yaml playbook report
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: high
Target Milestone: ---
Target Release: 3.10.0
Assignee: Vadim Rutkovsky
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-12-05 14:43 UTC by Joel Rosental R.
Modified: 2018-08-28 17:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-28 17:21:39 UTC
Target Upstream Version:
Embargoed:



Description Joel Rosental R. 2017-12-05 14:43:35 UTC
Description of problem:
The report created by running the certificate expiration playbook (easy-mode.yaml) doesn't include expiration information for files like:

/etc/origin/master/master.kubelet-client.crt and /etc/origin/master/master.proxy-client.crt

Version-Release number of selected component (if applicable):
Customer is running: 

openshift v3.5.5.31.36

How reproducible:
Always

Steps to Reproduce:
1. Execute the easy-mode.yaml playbook to check certificate expiration information

Actual results:
These certs are not currently checked.

Expected results:
These files should be checked as well.

Additional info:
Because these certs had expired, the customer ran into issues such as being unable to run:

$ oc rsh <pod>

which failed with an error like:

error: unable to upgrade connection: Unauthorized

The error went away as soon as these two certs were regenerated.

P.S: They were not using `oc proxy`.

Comment 1 Vadim Rutkovsky 2018-04-11 10:00:51 UTC
Created PR for 3.10 - https://github.com/openshift/openshift-ansible/pull/7904

Comment 2 Vadim Rutkovsky 2018-04-13 08:29:41 UTC
Fix for 3.10 is in openshift-ansible-3.10.0-0.21.0

Created PRs for 
3.6: https://github.com/openshift/openshift-ansible/pull/7943
3.7: https://github.com/openshift/openshift-ansible/pull/7942
3.9: https://github.com/openshift/openshift-ansible/pull/7941

Comment 3 Gaoyun Pei 2018-04-18 08:45:05 UTC
Tried with openshift-ansible-3.10.0-0.22.0.git.0.b6ec617.el7.noarch.

Certificate expiry check playbook failed as below:

[root@gpei-preserved ~]# ansible-playbook -i host/host /usr/share/ansible/openshift-ansible/playbooks/openshift-checks/certificate_expiry/easy-mode.yaml -v
Using /etc/ansible/ansible.cfg as config file

PLAY [Check cert expirys] ***************************************************************************************************************************************************

TASK [openshift_certificate_expiry : Check cert expirys on host] ************************************************************************************************************
fatal: [qe-gpei-310test2node-registry-router-1.0418-0gu.qe.rhcloud.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to qe-gpei-310test2node-registry-router-1.0418-0gu.qe.rhcloud.com closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_796x07/ansible_module_openshift_cert_expiry.py\", line 805, in <module>\r\n    main()\r\n  File \"/tmp/ansible_796x07/ansible_module_openshift_cert_expiry.py\", line 507, in main\r\n    cert_meta['certFile'] = os.path.join(cfg_path, cfg['servingInfo']['certFile'])\r\nKeyError: 'certFile'\r\n", "msg": "MODULE FAILURE", "rc": 0}
fatal: [qe-gpei-310test2master-etcd-1.0418-0gu.qe.rhcloud.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to qe-gpei-310test2master-etcd-1.0418-0gu.qe.rhcloud.com closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_jJD8pE/ansible_module_openshift_cert_expiry.py\", line 805, in <module>\r\n    main()\r\n  File \"/tmp/ansible_jJD8pE/ansible_module_openshift_cert_expiry.py\", line 507, in main\r\n    cert_meta['certFile'] = os.path.join(cfg_path, cfg['servingInfo']['certFile'])\r\nKeyError: 'certFile'\r\n", "msg": "MODULE FAILURE", "rc": 0}
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/openshift-checks/certificate_expiry/easy-mode.retry

PLAY RECAP ******************************************************************************************************************************************************************
qe-gpei-310test2master-etcd-1.0418-0gu.qe.rhcloud.com : ok=0    changed=0    unreachable=0    failed=1   
qe-gpei-310test2node-registry-router-1.0418-0gu.qe.rhcloud.com : ok=0    changed=0    unreachable=0    failed=1

Comment 4 Vadim Rutkovsky 2018-04-18 10:39:23 UTC
(In reply to Gaoyun Pei from comment #3)
> File
> \"/tmp/ansible_jJD8pE/ansible_module_openshift_cert_expiry.py\", line 507,
> in main\r\n    cert_meta['certFile'] = os.path.join(cfg_path,
> cfg['servingInfo']['certFile'])\r\nKeyError: 'certFile'\r\n",

Right, that happens when we're checking a node certificate and there is no such field there.

Created https://github.com/openshift/openshift-ansible/pull/8017 to fix it
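
For illustration only, a minimal sketch of how the lookup from the traceback could tolerate a config without that field. This is not the code from the PR above; the helper name `serving_cert_path` and the sample configs are hypothetical, and only the `servingInfo` / `certFile` keys come from the traceback.

```python
import os


def serving_cert_path(cfg, cfg_path):
    """Return the serving cert path, or None when the config has no certFile.

    `cfg` is a parsed master or node config (a dict); `cfg_path` is the
    directory the config file lives in. Hypothetical helper, not the module's API.
    """
    # `or {}` also covers a servingInfo key that is present but empty/null.
    serving_info = cfg.get('servingInfo') or {}
    cert_file = serving_info.get('certFile')
    if not cert_file:
        # Nothing to check on this host; the caller can skip it.
        return None
    return os.path.join(cfg_path, cert_file)


# Hypothetical sample configs: one with the field, one without it.
master_cfg = {'servingInfo': {'certFile': 'master.server.crt'}}
node_cfg = {'servingInfo': {'bindAddress': '0.0.0.0:10250'}}

print(serving_cert_path(master_cfg, '/etc/origin/master'))
print(serving_cert_path(node_cfg, '/etc/origin/node'))
```

Returning None instead of raising lets the caller skip that entry and carry on with the remaining certificates on the host.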

Comment 5 Vadim Rutkovsky 2018-04-20 08:54:55 UTC
Fix is available in openshift-ansible-3.10.0-0.25.0

Comment 6 Gaoyun Pei 2018-04-24 03:38:55 UTC
Tested with openshift-ansible-3.10.0-0.27.0.git.0.abed3b7.el7.noarch; the cert-expiry check playbook fails on the node.

[root@gpei-preserved ~]# ansible-playbook -i host/host /usr/share/ansible/openshift-ansible/playbooks/openshift-checks/certificate_expiry/easy-mode.yaml -v
Using /etc/ansible/ansible.cfg as config file

PLAY [Check cert expirys] ***************************************************************************************************************************************************

TASK [openshift_certificate_expiry : Check cert expirys on host] ************************************************************************************************************
fatal: [qe-gpei-3102node-registry-router-1.0423-2l7.qe.rhcloud.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to qe-gpei-3102node-registry-router-1.0423-2l7.qe.rhcloud.com closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_b4DFlC/ansible_module_openshift_cert_expiry.py\", line 826, in <module>\r\n    main()\r\n  File \"/tmp/ansible_b4DFlC/ansible_module_openshift_cert_expiry.py\", line 590, in main\r\n    c = cfg['users'][0]['user']['client-certificate-data']\r\nKeyError: 'client-certificate-data'\r\n", "msg": "MODULE FAILURE", "rc": 0}

ok: [qe-gpei-3102master-etcd-1.0423-2l7.qe.rhcloud.com] => {"changed": false, "check_results": {"etcd": [{"cert_cn": "CN:etcd-signer@1524539393", "days_remaining": 1825, "expiry": "2023-04-23 03:10:03", "health": "ok", "path": "/etc/etcd/ca.crt", "serial": 15685371651196948480, "serial_hex"...

Comment 7 Vadim Rutkovsky 2018-04-25 11:16:36 UTC
Good catch, it does fail on dedicated nodes.

Created https://github.com/openshift/openshift-ansible/pull/8132 to fix this
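
As a rough sketch of the failure mode in comment 6 (again, not the code from the PR above): a kubeconfig user entry may embed the certificate as base64 in `client-certificate-data`, or point to a file through `client-certificate`. The helper name `client_cert_pem` is hypothetical, and the sketch assumes any `client-certificate` path is readable as given.

```python
import base64


def client_cert_pem(kubeconfig):
    """Return the first user's client certificate as PEM text, or None.

    `kubeconfig` is a parsed kubeconfig (a dict). Hypothetical helper.
    """
    users = kubeconfig.get('users') or []
    if not users:
        return None
    user = users[0].get('user') or {}

    # Case 1: certificate embedded in the kubeconfig as base64.
    data = user.get('client-certificate-data')
    if data:
        return base64.b64decode(data).decode('utf-8')

    # Case 2: certificate stored in a separate file referenced by path.
    # Assumption: the path is absolute or valid from the current directory.
    path = user.get('client-certificate')
    if path:
        with open(path) as cert_file:
            return cert_file.read()

    # Neither form present (for example, token-based auth): nothing to check.
    return None
```

Using `.get()` throughout means a kubeconfig with neither key yields None rather than the `KeyError` shown in the traceback.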

Comment 8 Vadim Rutkovsky 2018-04-27 08:33:53 UTC
Fix is available in openshift-ansible-3.10.0-0.30.0

Comment 9 Gaoyun Pei 2018-04-28 07:26:28 UTC
Verified this bug with openshift-ansible-3.10.0-0.30.0.git.0.4f02952.el7.noarch.

Running the easy-mode.yaml playbook generates the detailed cert report as /tmp/cert-expiry-report.html and /tmp/cert-expiry-report.json by default.

List the certs checked by the playbook:
[root@gpei-preserved host]# grep path /tmp/cert-expiry-report.json
          "path": "/etc/etcd/ca.crt", 
          "path": "/etc/etcd/server.crt", 
          "path": "/etc/etcd/peer.crt", 
          "path": "/etc/origin/node/node.kubeconfig", 
          "path": "/etc/origin/node/node.kubeconfig", 
          "path": "/etc/origin/master/admin.kubeconfig", 
          "path": "/etc/origin/master/openshift-master.kubeconfig", 
          "path": "/etc/origin/master/master.server.crt", 
          "path": "/etc/origin/master/master.proxy-client.crt", 
          "path": "/etc/origin/master/master.kubelet-client.crt", 
          "path": "/etc/origin/master/service-signer.crt", 
          "path": "/etc/origin/master/master.etcd-client.crt", 
          "path": "/etc/origin/master/master.etcd-ca.crt", 
          "path": "/etc/origin/master/ca.crt", 
          "path": "/etc/origin/node/client-ca.crt", 
          "path": "/etc/origin/node/client-ca.crt", 
          "path": "/api/v1/namespaces/default/secrets/registry-certificates", 
          "path": "/api/v1/namespaces/default/secrets/router-certs", 
          "path": "/etc/origin/node/client-ca.crt", 
          "path": "/etc/origin/node/client-ca.crt", 

master.kubelet-client.crt and master.proxy-client.crt were checked.
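
The same check can be scripted instead of reading the grep output by eye. The sketch below makes no assumption about the exact layout of the JSON report: it walks the parsed document and collects every string stored under a `path` key. The function names and the sample report are hypothetical; the sample only imitates the shape of the `check_results` output in comment 6.

```python
import json


def load_report(report_file):
    """Parse a JSON report file, e.g. /tmp/cert-expiry-report.json."""
    with open(report_file) as handle:
        return json.load(handle)


def collect_paths(node):
    """Recursively collect every string stored under a 'path' key."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == 'path' and isinstance(value, str):
                paths.append(value)
            else:
                paths.extend(collect_paths(value))
    elif isinstance(node, list):
        for item in node:
            paths.extend(collect_paths(item))
    return paths


def missing_certs(report, expected):
    """Return the expected cert paths that the report does not mention."""
    found = set(collect_paths(report))
    return [path for path in expected if path not in found]


EXPECTED = [
    '/etc/origin/master/master.kubelet-client.crt',
    '/etc/origin/master/master.proxy-client.crt',
]

# Hypothetical sample, shaped like the check_results output in comment 6.
sample_report = {
    'check_results': {
        'ocp_certs': [
            {'path': '/etc/origin/master/master.proxy-client.crt', 'health': 'ok'},
        ],
    },
}

print(missing_certs(sample_report, EXPECTED))
```

Against a real report, `missing_certs(load_report('/tmp/cert-expiry-report.json'), EXPECTED)` should return an empty list once both certs are covered.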

