Description of problem: I am upgrading from 3.9 cluster to 3.10 cluster, upgrade playbook failing with following error. The full traceback is: File "/tmp/ansible_fVxihL/ansible_module_oc_adm_csr.py", line 47, in <module> import ruamel.yaml as yaml fatal: [ec2-34-214-193-12.us-west-2.compute.amazonaws.com -> ec2-34-216-221-226.us-west-2.compute.amazonaws.com]: FAILED! => { "changed": true, "failed": true, "finished": false, "invocation": { "module_args": { "approve_all": false, "debug": false, "fail_on_timeout": true, "kubeconfig": "/etc/origin/master/admin.kubeconfig", "nodes": [ "ip-172-31-8-53.us-west-2.compute.internal" ], "service_account": "node-bootstrapper", "state": "approve", "timeout": 60 } }, "msg": "Timed out accepting certificate signing requests. Failing as requested.", "nodes": [ { "client_accepted": false, "csrs": { "csr-zcc5w": { "apiVersion": "certificates.k8s.io/v1beta1", "kind": "CertificateSigningRequest", "metadata": { "creationTimestamp": "2018-06-29T15:39:03Z", "generateName": "csr-", "name": "csr-zcc5w", "namespace": "", "resourceVersion": "100820", "selfLink": "/apis/certificates.k8s.io/v1beta1/certificatesigningrequests/csr-zcc5w", "uid": "8d2bbfc0-7bb2-11e8-b178-02cabb34edd4" }, "spec": { "groups": [ "system:nodes", "system:authenticated" ], "request": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlJQ1JUQ0NBZXNDQVFBd1Z6RVZNQk1HQTFVRUNoTU1jM2x6ZEdWdE9tNXZaR1Z6TVQ0d1BBWURWUVFERXpWegplWE4wWlcwNmJtOWtaVHBwY0MweE56SXRNekV0T0MwMU15NTFjeTEzWlhOMExUSXVZMj l0Y0hWMFpTNXBiblJsCmNtNWhiREJaTUJNR0J5cUdTTTQ5QWdFR0NDcUdTTTQ5QXdFSEEwSUFCSEczUGp3QytjTk0valcwY0VGQXBwVHkKR2Exam5zbEV2V3dNY2pISjBqMCtSQU5tQ3hmazk5SFQ3alNVM05PQ3hHdUloWHNIYUNRR1lDSWlsUzR4MmZtZwpnZ0V3TUlJQkxBWUpLb1pJaHZjTkFRa09NWUlCSFRDQ0FS a3dnZ0VWQmdOVkhSRUVnZ0VNTUlJQkNJSXBhWEF0Ck1UY3lMVE14TFRndE5UTXVkWE10ZDJWemRDMHlMbU52YlhCMWRHVXVhVzUwWlhKdVlXeUNLV2x3TFRFM01pMHoKTVMwNExUVXpMblZ6TFhkbGMzUXRNaTVqYjIxd2RYUmxMbWx1ZEdWeWJtRnNnaWxwY0MweE56SXRNekV0T0MwMQpNeTUxY3kxM1pYTjBMVEl1WT 
I5dGNIVjBaUzVwYm5SbGNtNWhiSUl4WldNeUxUTTBMVEl4TkMweE9UTXRNVEl1CmRYTXRkMlZ6ZEMweUxtTnZiWEIxZEdVdVlXMWhlbTl1WVhkekxtTnZiWWNFckI4SU5ZY1Evb0FBQUFBQUFBQUEKeWpuLy9rTkRPb2NFckJFQUFZY1Evb0FBQUFBQUFBQ0U2Mi8vL3BLRWZJY0VyQlFBQVljUS9vQUFBQUFBQUFBQQpV VGovL3FFcXpJY0VyQjhJTlljRUl0YkJEREFLQmdncWhrak9QUVFEQWdOSUFEQkZBaUIvbDJVbzljVGlEY0RhCnMxdzNNZXJaNVVoNTBGeUFvTUx5WEp5LytkR3R1d0loQU9kZ1BlQjV0L1BaZEJZSU5jRXJ4T21RYzJFVVBPM00KOXF2Ymh4VFRYVkpzCi0tLS0tRU5EIENFUlRJRklDQVRFIFJFUVVFU1QtLS0tLQo=", "usages": [ "digital signature", "key encipherment", "server auth" ], "username": "system:node:ip-172-31-8-53.us-west-2.compute.internal" }, "status": { "certificate": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURzVENDQXBtZ0F3SUJBZ0lVTkdlbXk3ZHBwenI3NXFyUDAxRnNQRUR4dzJVd0RRWUpLb1pJaHZjTkFRRUwKQlFBd0pqRWtNQ0lHQTFVRUF3d2JiM0JsYm5Ob2FXWjBMWE5wWjI1bGNrQXhOVE13TWpjNE1qZ$ RNQjRYRFRFNApNRFl5T1RFMU16UXdNRm9YRFRFNU1EWXlPVEUxTXpRd01Gb3dWekVWTUJNR0ExVUVDaE1NYzNsemRHVnRPbTV2ClpHVnpNVDR3UEFZRFZRUURFelZ6ZVhOMFpXMDZibTlrWlRwcGNDMHhOekl0TXpFdE9DMDFNeTUxY3kxM1pYTjAKTFRJdVkyOXRjSFYwWlM1cGJuUmxjbTVoYkRCWk1CTUdCeXFHU00$ OUFnRUdDQ3FHU000OUF3RUhBMElBQkhHMwpQandDK2NOTS9qVzBjRUZBcHBUeUdhMWpuc2xFdld3TWNqSEowajArUkFObUN4Zms5OUhUN2pTVTNOT0N4R3VJCmhYc0hhQ1FHWUNJaWxTNHgyZm1qZ2dGdk1JSUJhekFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3cKQ2dZSUt3WUJCUVVIQXdFd0RBWURWU$ BUQVFIL0JBSXdBREFkQmdOVkhRNEVGZ1FVQXVIcDlIWTd4ZXhSN3FLQwpxOUtDY1dyMU11NHdnZ0VWQmdOVkhSRUVnZ0VNTUlJQkNJSXBhWEF0TVRjeUxUTXhMVGd0TlRNdWRYTXRkMlZ6CmRDMHlMbU52YlhCMWRHVXVhVzUwWlhKdVlXeUNLV2x3TFRFM01pMHpNUzA0TFRVekxuVnpMWGRsYzNRdE1pNWoKYjIxd2R$ UmxMbWx1ZEdWeWJtRnNnaWxwY0MweE56SXRNekV0T0MwMU15NTFjeTEzWlhOMExUSXVZMjl0Y0hWMApaUzVwYm5SbGNtNWhiSUl4WldNeUxUTTBMVEl4TkMweE9UTXRNVEl1ZFhNdGQyVnpkQzB5TG1OdmJYQjFkR1V1CllXMWhlbTl1WVhkekxtTnZiWWNFckI4SU5ZY1Evb0FBQUFBQUFBQUF5am4vL2tORE9vY0VyQ$ 
VBQVljUS9vQUEKQUFBQUFBQ0U2Mi8vL3BLRWZJY0VyQlFBQVljUS9vQUFBQUFBQUFBQVVUai8vcUVxekljRXJCOElOWWNFSXRiQgpEREFOQmdrcWhraUc5dzBCQVFzRkFBT0NBUUVBT2xYQXRDYjZrK1loOVBheGh2TllJcThXcDY2emVkU3c0MXI1Cnd6ckRsbXJpaU9na2Z4ZlNocUhxek5XYVBkVVZDVXZ4N2NBc2l$ b3dNMm9TTlMxVmJMa09XM0NoUEUrQW5QY1IKbm4rakg2QWNRcnZVQkZ6MjBHTndDTmxRUS92dklRM2kraUJaK2NVQnUvWFJhR3Mxb2VqTGU0MUllK25aajFQZgoyS2N0aWVXTVBuZERNY1RCT0lBWVQyK20xNmpkcVBQMFFDdmFiclkrV3FyZ0N3b3pWMFZGWkVFNmhVOWd2dDdRCmh6STlyUDYwcmdtY1RhV1RWSkNjO$ 9xQ1RDMjRiazNGdHFkVnd1OVpFQ0hQN0FidGJGNzNNYUVnOWxBMjJmcC8KNjR5Mzd3eW5BTDc3TERJYmFGY0tXUTdNbW9tbGVIcC9PUHp5Y1NwM0M3WnhBd1NCV0E9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==", "conditions": [ { "lastUpdateTime": "2018-06-29T15:39:05Z", "message": "This CSR was approved by kubectl certificate approve.", "reason": "KubectlApprove", "type": "Approved" } ] } }, "node-csr-IweiOoeJ_LwhsNJ9-C3kusDKB4eI5EzPwznMHqhLd7g": { "apiVersion": "certificates.k8s.io/v1beta1", "kind": "CertificateSigningRequest", "metadata": { "creationTimestamp": "2018-06-29T15:39:02Z", "name": "node-csr-IweiOoeJ_LwhsNJ9-C3kusDKB4eI5EzPwznMHqhLd7g", "namespace": "", "resourceVersion": "100800", "selfLink": "/apis/certificates.k8s.io/v1beta1/certificatesigningrequests/node-csr-IweiOoeJ_LwhsNJ9-C3kusDKB4eI5EzPwznMHqhLd7g", "uid": "8c417ba3-7bb2-11e8-b178-02cabb34edd4" }, "spec": { "groups": [ "system:nodes", "system:authenticated" ], "request": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlJQkV6Q0J1UUlCQURCWE1SVXdFd1lEVlFRS0V3eHplWE4wWlcwNmJtOWtaWE14UGpBOEJnTlZCQU1UTlhONQpjM1JsYlRwdWIyUmxPbWx3TFRFM01pMHpNUzA0TFRVekxuVnpMWGRsYzNRdE1pNWpiM$ F3ZFhSbExtbHVkR1Z5CmJtRnNNRmt3RXdZSEtvWkl6ajBDQVFZSUtvWkl6ajBEQVFjRFFnQUVBL2swU2NQblRvYlViSTI4REh3NEZPT0IKZ2ZadFF0czRteWkrTGMxbXNLSUJLcDliVHpQQXdtc244NDBGU3VLc2lNTVJxTE1CS1UvUkg2R2gwZnFPcXFBQQpNQW9HQ0NxR1NNNDlCQU1DQTBrQU1FWUNJUURKSDJES0M$ c3ZVaHBZcTVCbmFjSHVxNTU4WTU2R01BSUdWQWRZCm4yRmdsUUloQUxiRWJrZWowdjgzV0lGNlBmK3lENDBqUjdQOE9UZkZSTmNYbmtCTm5scXgKLS0tLS1FTkQgQ0VSVElGSUNBVEUgUkVRVUVTVC0tLS0tCg==", 
"usages": [ "digital signature", "key encipherment", "client auth" ], "username": "system:node:ip-172-31-8-53.us-west-2.compute.internal" }, "status": { "certificate": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUNsRENDQVh5Z0F3SUJBZ0lVUGxBanVmbWtDblFMUVh3NWwzNDlXbzRoRGNJd0RRWUpLb1pJaHZjTkFRRUwKQlFBd0pqRWtNQ0lHQTFVRUF3d2JiM0JsYm5Ob2FXWjBMWE5wWjI1bGNrQXhOVE13TWpjNE1qZz RNQjRYRFRFNApNRFl5T1RFMU16UXdNRm9YRFRFNU1EWXlPVEUxTXpRd01Gb3dWekVWTUJNR0ExVUVDaE1NYzNsemRHVnRPbTV2ClpHVnpNVDR3UEFZRFZRUURFelZ6ZVhOMFpXMDZibTlrWlRwcGNDMHhOekl0TXpFdE9DMDFNeTUxY3kxM1pYTjAKTFRJdVkyOXRjSFYwWlM1cGJuUmxjbTVoYkRCWk1CTUdCeXFHU000 OUFnRUdDQ3FHU000OUF3RUhBMElBQkFQNQpORW5ENTA2RzFHeU52QXg4T0JUamdZSDJiVUxiT0pzb3ZpM05ackNpQVNxZlcwOHp3TUpySi9PTkJVcmlySWpECkVhaXpBU2xQMFIraG9kSDZqcXFqVkRCU01BNEdBMVVkRHdFQi93UUVBd0lGb0RBVEJnTlZIU1VFRERBS0JnZ3IKQmdFRkJRY0RBakFNQmdOVkhSTUJBZj hFQWpBQU1CMEdBMVVkRGdRV0JCUnBEdmxIb29OajNVTU1BanV2cUxJagpIQzVqUGpBTkJna3Foa2lHOXcwQkFRc0ZBQU9DQVFFQVN5QnJlUzZLYVAvVGZIRUtVVHhMQ3BKdzY2a1ljKzh2CmgxQW9McnlyYitzUyt3K3c4Z00vUDFhNWxMRnkwR0NwY2Z5d0FNQU1HVVBWS1FJVzduVzVhTnNNc3lzVGRwSlUKQnpWOWxL OWdiUDlCeVB2TVB6OHBpTVZNTWl5WWxpaHlNbFh5NElsMmtJeGttY1lkZTJqS1Bzc29oMEhiaXlpTwpmM3JxL3dkZjdRczlZbElNV0V2VERlb0tPb3ZGSUpSUWFSeXl3K0prc1A4TDZNbmFUTVIwQ1V5SzUwMU5HU1JNCnAvSXZiNFF1NHJkL2FaODdYZFFiL2laKzF3L0RGd0VvUGg0aXZoaGc2b1JobWF0VTUyL2pXSD d1eXhtS250bC8KUVJlTUZQcDBoMjk5ODgvQjAxeGNLbEpMdHdaMDFJamM4NFBQamh6Y2N4K2M1ZGwwNTN2ZE9RPT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=", "conditions": [ { "lastUpdateTime": "2018-06-29T15:39:03Z", "message": "This CSR was approved by kubectl certificate approve.", "reason": "KubectlApprove", "type": "Approved" } ] } } }, "denied": false, "name": "ip-172-31-8-53.us-west-2.compute.internal", "server_accepted": true } ], "results": [ { "cmd": "/usr/bin/oc adm certificate approve node-csr-IweiOoeJ_LwhsNJ9-C3kusDKB4eI5EzPwznMHqhLd7g", "results": {}, "returncode": 0 }, { "cmd": "/usr/bin/oc adm certificate approve csr-zcc5w", "results": {}, "returncode": 0 } ], "state": "approve", 
"timeout": true }

Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-3.10.10-1.git.248.0bb6b58.el7.noarch
rpm -q ansible
ansible-2.4.5.0-1.el7ae.noarch
ansible --version
ansible 2.4.5.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Jun 12 2018, 10:42:23) [GCC 4.8.5 20150623 (Red Hat 4.8.5-34)]

Steps to Reproduce:
1. Create a 3.9 cluster: 3 masters, 3 etcd (separate), 2 infra, 3 nodes, 1 lb
2. Start the upgrade using the playbook

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

Expected results:
The playbook finishes successfully.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Created attachment 1455548 [details] ansible log with -vvv
In this environment, the username field on the client-side certificate is `system:node:ip-172-31-8-53.us-west-2.compute.internal`, whereas in all other environments we see that the username field is `system:admin`. We're trying to figure out why this is happening.
*** Bug 1597219 has been marked as a duplicate of this bug. ***
Seth / Clayton,

The comment here says that the client cert should have the service account as its username; however, during upgrades it seems to have "username": "system:node:ip-172-31-8-53.us-west-2.compute.internal"

https://github.com/openshift/openshift-ansible/commit/19cd3550718ad466e5e0a648d611efe45ac5cafa#diff-f9afe10ac5cdf6294e144f2853bf9d05R1609

Is this expected?
If I delete everything in /etc/origin/node/certificates and restart the kubelet:

$ kubectl get csr
NAME                                                   AGE   REQUESTOR                                                 CONDITION
node-csr-SOBked3RehDQ0okkusaCqi0ul-1qt-hbh8XT0jvc3x4   43s   system:serviceaccount:openshift-infra:node-bootstrapper   Pending

[centos@master ~]$ oc adm certificate approve node-csr-SOBked3RehDQ0okkusaCqi0ul-1qt-hbh8XT0jvc3x4
certificatesigningrequest.certificates.k8s.io/node-csr-SOBked3RehDQ0okkusaCqi0ul-1qt-hbh8XT0jvc3x4 approved

[centos@master ~]$ kubectl get csr
NAME                                                   AGE   REQUESTOR                                                 CONDITION
csr-s9btc                                              2s    system:node:node.lab.variantweb.net                       Pending
node-csr-SOBked3RehDQ0okkusaCqi0ul-1qt-hbh8XT0jvc3x4   1m    system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued

[centos@master ~]$ oc adm certificate approve csr-s9btc
certificatesigningrequest.certificates.k8s.io/csr-s9btc approved

[centos@master ~]$ kubectl get csr
NAME                                                   AGE   REQUESTOR                                                 CONDITION
csr-s9btc                                              22s   system:node:node.lab.variantweb.net                       Approved,Issued
node-csr-SOBked3RehDQ0okkusaCqi0ul-1qt-hbh8XT0jvc3x4   1m    system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued

The CSR submitted by the node-bootstrapper is for the kubelet client cert. Once it is approved, the kubelet uses its own SA, identified by its newly signed client cert, to create a CSR for its server cert, which it uses to accept TLS connections from the API server.

My understanding is that the kubelet uses its own client cert (i.e. its own serviceaccount) to rotate the client cert. It only uses the node-bootstrapper SA to obtain the initial client cert.

If you want to force the kubelet to start the bootstrapping process from the beginning, you can remove everything in /etc/origin/node/certificates.
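To make the client/server distinction above easier to check against the failure output, here is a minimal Python sketch that classifies CSRs by their requested usages and reports who submitted them. The helper names (`classify_csr`, `is_approved`, `summarize`) are hypothetical and are not taken from openshift-ansible or `oc_adm_csr`; the sample data is a trimmed copy of the two CSRs in the original report, assuming only the `spec.usages`, `spec.username`, and `status.conditions` fields matter.

```python
# Hypothetical helpers for illustration only; this is NOT the oc_adm_csr logic.

def classify_csr(csr):
    """Return 'client', 'server', or 'unknown' from the CSR's requested usages."""
    usages = csr.get("spec", {}).get("usages", [])
    if "client auth" in usages:
        return "client"
    if "server auth" in usages:
        return "server"
    return "unknown"


def is_approved(csr):
    """True if the CSR carries an 'Approved' condition."""
    conditions = csr.get("status", {}).get("conditions", [])
    return any(c.get("type") == "Approved" for c in conditions)


def summarize(csrs):
    """Map each CSR name to a (kind, requestor, approved) tuple."""
    return {
        name: (classify_csr(csr), csr.get("spec", {}).get("username"), is_approved(csr))
        for name, csr in csrs.items()
    }


# Trimmed copies of the two CSRs from the failure output in this report.
NODE_USER = "system:node:ip-172-31-8-53.us-west-2.compute.internal"
csrs = {
    "csr-zcc5w": {
        "spec": {
            "usages": ["digital signature", "key encipherment", "server auth"],
            "username": NODE_USER,
        },
        "status": {"conditions": [{"type": "Approved"}]},
    },
    "node-csr-IweiOoeJ_LwhsNJ9-C3kusDKB4eI5EzPwznMHqhLd7g": {
        "spec": {
            "usages": ["digital signature", "key encipherment", "client auth"],
            "username": NODE_USER,
        },
        "status": {"conditions": [{"type": "Approved"}]},
    },
}

summary = summarize(csrs)
for name, (kind, requestor, approved) in sorted(summary.items()):
    print(name, kind, requestor, approved)
```

In the reported failure both CSRs are approved, and the client CSR was requested by the node's own identity rather than by the node-bootstrapper service account, which matches a client cert rotation rather than an initial bootstrap.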
Still not fixed in the latest v3.10.12. This bug blocks all upgrade tests.
https://github.com/openshift/openshift-ansible/pull/9079
The PR https://github.com/openshift/openshift-ansible/pull/9098 has been merged to openshift-ansible-3.10.15-1
We did not hit the issue on v3.10.15.
Moving to VERIFIED per comment 19.