Created attachment 1459081 [details]
ansible-log

Description of problem:
After running openshift-ansible/playbooks/redeploy-certificates.yml to redeploy certificates, the console can not be accessed. See the ansible log in the attachment.

Version-Release number of selected component (if applicable):
docker.io/openshift/origin-console   latest   fb4295d1e2f7   2 days ago   259 MB
openshift v3.10.18

How reproducible:
Always

Steps to Reproduce:
1. Access the console URL in a browser.
2. Redeploy certificates by running the playbook:
   ansible-playbook -i /path/to/inventory openshift-ansible/playbooks/redeploy-certificates.yml
3. Access the console URL in the browser again.

Actual results:
1. Could log in successfully.
2. Secret and pod in project openshift-console are recreated.
# oc get pod -n openshift-console
NAME                       READY     STATUS    RESTARTS   AGE
console-7db7fdfcdb-4p7gw   1/1       Running   0          1h
# oc logs console-7db7fdfcdb-4p7gw -n openshift-console
2018/07/16 07:14:08 cmd/main: cookies are secure!
2018/07/16 07:14:08 cmd/main: Binding to 0.0.0.0:8443...
2018/07/16 07:14:08 cmd/main: using TLS
2018/07/16 07:23:27 http: TLS handshake error from 10.128.0.1:35114: tls: first record does not look like a TLS handshake
2018/07/16 07:23:41 http: TLS handshake error from 10.128.0.1:35164: remote error: tls: unknown certificate authority
# oc get secret -n openshift-console
NAME                       TYPE                                  DATA      AGE
builder-dockercfg-hjr7x    kubernetes.io/dockercfg               1         2h
builder-token-8vs67        kubernetes.io/service-account-token   4         2h
builder-token-lp5lh        kubernetes.io/service-account-token   4         2h
console-dockercfg-fptw2    kubernetes.io/dockercfg               1         2h
console-oauth-config       Opaque                                2         2h
console-serving-cert       kubernetes.io/tls                     2         1h
console-token-6pp6b        kubernetes.io/service-account-token   4         2h
console-token-p8qfl        kubernetes.io/service-account-token   4         2h
default-dockercfg-v9nwh    kubernetes.io/dockercfg               1         2h
default-token-4txz9        kubernetes.io/service-account-token   4         2h
default-token-m6c4j        kubernetes.io/service-account-token   4         2h
deployer-dockercfg-whmh6   kubernetes.io/dockercfg               1         2h
deployer-token-bxjmh       kubernetes.io/service-account-token   4         2h
deployer-token-dcq29       kubernetes.io/service-account-token   4         2h
3. Could not access the console URL; it returns a 503 (Service Unavailable) error.

Expected results:
3. Should access the console successfully.

Additional info:
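One way to confirm the CA mismatch suggested by the pod log is to compare the certificate presented at the console route with the one stored in the console-serving-cert secret (the route hostname below is a placeholder):

# Certificate the router presents for the console route
openssl s_client -connect console.apps.example.com:443 \
  -servername console.apps.example.com </dev/null 2>/dev/null | \
  openssl x509 -noout -issuer -subject -dates

# Certificate held in the serving-cert secret
oc get secret console-serving-cert -n openshift-console \
  -o 'jsonpath={.data.tls\.crt}' | base64 -d | \
  openssl x509 -noout -issuer -subject -dates

If the issuers disagree, whatever sits in front of the console pod is still serving (or validating against) the old CA.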
Created attachment 1459098 [details] console-secret
Best I can tell, there seems to be a problem with the router after redeploying certificates. The console itself gets updated correctly in my testing.
vrutkovs - any chance you could help look at this?
PR: https://github.com/openshift/openshift-ansible/pull/8891

Fixed in openshift-ansible-3.11.0-0.5.0
openshift-ansible-3.11.0-0.7.0.git.0.6e3e78eNone.noarch

After I tried redeploying certificates with the playbook from the above package, the router pod could not run normally.
# oc get pod
NAME                       READY     STATUS             RESTARTS   AGE
docker-registry-1-q6zjw    1/1       Running            2          5h
docker-registry-2-deploy   0/1       Error              0          40m
docker-registry-3-deploy   0/1       Error              0          7m
registry-console-1-96cb8   1/1       Running            2          5h
router-1-2ggqb             0/1       CrashLoopBackOff   7          5h
# oc logs router-1-2ggqb
I0723 09:21:06.424116       1 template.go:244] Starting template router (v3.10.23)
I0723 09:21:06.449899       1 metrics.go:147] Router health and metrics port listening at 0.0.0.0:1936 on HTTP and HTTPS
E0723 09:21:06.512063       1 haproxy.go:392] can't scrape HAProxy: dial unix /var/lib/haproxy/run/haproxy.sock: connect: no such file or directory
I0723 09:21:07.382251       1 router.go:252] Router is including routes in all namespaces
E0723 09:21:07.616668       1 haproxy.go:392] can't scrape HAProxy: dial unix /var/lib/haproxy/run/haproxy.sock: connect: no such file or directory
E0723 09:21:07.634792       1 limiter.go:137] error reloading router: exit status 1
[ALERT] 203/092107 (28) : parsing [/var/lib/haproxy/conf/haproxy.config:112] : 'bind 127.0.0.1:10444' : unable to load SSL private key from PEM file '/etc/pki/tls/private/tls.crt'.
[ALERT] 203/092107 (28) : parsing [/var/lib/haproxy/conf/haproxy.config:147] : 'bind 127.0.0.1:10443' : unable to load SSL private key from PEM file '/etc/pki/tls/private/tls.crt'.
[ALERT] 203/092107 (28) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 203/092107 (28) : Fatal errors found in configuration.

I will try again to confirm whether the failure is caused by redeploying certificates.
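The HAProxy alerts suggest the PEM file the router loads no longer contains a usable private key (HAProxy expects the key in the same PEM as the certificate). A minimal check, assuming the router still reads the router-certs secret in the default namespace, is to confirm the cert and key in the secret still pair up:

# The two digests should match if the key belongs to the certificate
oc get secret router-certs -n default -o 'jsonpath={.data.tls\.crt}' | \
  base64 -d | openssl x509 -noout -modulus | md5sum
oc get secret router-certs -n default -o 'jsonpath={.data.tls\.key}' | \
  base64 -d | openssl rsa -noout -modulus | md5sum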
(In reply to Yanping Zhang from comment #6)
> openshift-ansible-3.11.0-0.7.0.git.0.6e3e78eNone.noarch
> After I tried redeploying certificates with the playbook from the above
> package, the router pod could not run normally.

Correct, filed bz 1607391 to track this
Will verify the bug after https://bugzilla.redhat.com/show_bug.cgi?id=1607391 is fixed.
openshift-ansible-3.11.0-0.13.0.git.0.16dc599None.noarch
openshift-ansible-3.11.0-0.14.0.git.0.7bd4429None.noarch

After trying to redeploy certificates with the playbook from the above packages, the admin console pod and web console are both gone, and the master-controllers pod has also crashed (though the router pod is running normally). The ansible log is in the attachment.

=========================
Resources before redeploy:

# oc get secret -n openshift-console
NAME                       TYPE                                  DATA      AGE
builder-dockercfg-dppjv    kubernetes.io/dockercfg               1         7h
builder-token-qm2r5        kubernetes.io/service-account-token   4         7h
builder-token-znkck        kubernetes.io/service-account-token   4         7h
console-dockercfg-gfkcl    kubernetes.io/dockercfg               1         7h
console-oauth-config       Opaque                                1         7h
console-serving-cert       kubernetes.io/tls                     2         7h
console-token-5m9d7        kubernetes.io/service-account-token   4         7h
console-token-ntdb7        kubernetes.io/service-account-token   4         7h
default-dockercfg-bmz7z    kubernetes.io/dockercfg               1         7h
default-token-gfmdl        kubernetes.io/service-account-token   4         7h
default-token-v98lw        kubernetes.io/service-account-token   4         7h
deployer-dockercfg-dtxjp   kubernetes.io/dockercfg               1         7h
deployer-token-9kjxj       kubernetes.io/service-account-token   4         7h
deployer-token-lmgmf       kubernetes.io/service-account-token   4         7h

[root@preserve-ui-yapei-311-master-etcd-1 ~]# oc get secret -n openshift-web-console
NAME                         TYPE                                  DATA      AGE
builder-dockercfg-jrj8c      kubernetes.io/dockercfg               1         7h
builder-token-n9v82          kubernetes.io/service-account-token   4         7h
builder-token-tvn97          kubernetes.io/service-account-token   4         7h
default-dockercfg-nmmms      kubernetes.io/dockercfg               1         7h
default-token-82f2t          kubernetes.io/service-account-token   4         7h
default-token-87q7s          kubernetes.io/service-account-token   4         7h
deployer-dockercfg-pz747     kubernetes.io/dockercfg               1         7h
deployer-token-sxq4j         kubernetes.io/service-account-token   4         7h
deployer-token-wp95r         kubernetes.io/service-account-token   4         7h
webconsole-dockercfg-gzwsl   kubernetes.io/dockercfg               1         7h
webconsole-serving-cert      kubernetes.io/tls                     2         7h
webconsole-token-4djkh       kubernetes.io/service-account-token   4         7h
webconsole-token-wrx7w       kubernetes.io/service-account-token   4         7h

# oc get pod --all-namespaces
NAMESPACE                           NAME                                                      READY     STATUS      RESTARTS   AGE
default                             docker-registry-ntq2h                                     1/1       Running     0          7h
default                             dockergc-2k847                                            1/1       Running     0          7h
default                             dockergc-fc6ts                                            1/1       Running     0          7h
default                             registry-console-1-deploy                                 0/1       Error       0          7h
default                             registry-console-2-n6cbp                                  1/1       Running     0          12m
default                             router-1-ntxb2                                            1/1       Running     0          7h
dof-u                               database-1-hs8mc                                          1/1       Running     0          6h
dof-u                               frontend-1-btwp8                                          1/1       Running     0          6h
dof-u                               frontend-1-nq8dh                                          1/1       Running     0          6h
dof-u                               python-sample-build-1-build                               0/1       Completed   0          6h
hasha-pro1                          mongodb-1-mhdsk                                           1/1       Running     0          7h
hasha-pro1                          nodejs-mongo-persistent-1-build                           0/1       Completed   0          7h
hasha-pro1                          nodejs-mongo-persistent-1-hbv85                           1/1       Running     0          7h
install-test                        mongodb-1-8w99m                                           1/1       Running     0          7h
install-test                        nodejs-mongodb-example-1-7kpd4                            1/1       Running     0          7h
install-test                        nodejs-mongodb-example-1-build                            0/1       Completed   0          7h
kube-service-catalog                apiserver-6qj8x                                           1/1       Running     0          7h
kube-service-catalog                controller-manager-qpxmc                                  1/1       Running     0          7h
kube-system                         master-api-preserve-ui-yapei-311-master-etcd-1            1/1       Running     1          7h
kube-system                         master-controllers-preserve-ui-yapei-311-master-etcd-1    1/1       Running     0          7h
kube-system                         master-etcd-preserve-ui-yapei-311-master-etcd-1           1/1       Running     0          7h
openshift-ansible-service-broker    asb-1-deploy                                              0/1       Error       0          7h
openshift-console                   console-788c8fc9dd-tt5bd                                  1/1       Running     0          7h
openshift-node                      sync-7qcfp                                                1/1       Running     0          7h
openshift-node                      sync-wrz55                                                1/1       Running     0          7h
openshift-node                      sync-xwfkm                                                1/1       Running     0          7h
openshift-sdn                       ovs-jpgrg                                                 1/1       Running     0          7h
openshift-sdn                       ovs-lvphc                                                 1/1       Running     0          7h
openshift-sdn                       ovs-txvmw                                                 1/1       Running     0          7h
openshift-sdn                       sdn-9nk6r                                                 1/1       Running     0          7h
openshift-sdn                       sdn-mwm5k                                                 1/1       Running     0          7h
openshift-sdn                       sdn-r4zj8                                                 1/1       Running     0          7h
openshift-template-service-broker   apiserver-nx59l                                           1/1       Running     0          7h
openshift-web-console               webconsole-6b8bdf69cf-9cms6                               1/1       Running     0          7h
xiaocwan-t                          dapi-env-test-pod                                         0/1       Completed   0          6h
xiaocwan-t                          ruby-hello-world-1-5jcvf                                  1/1       Running     0          2h
xiaocwan-t                          ruby-hello-world-1-build                                  0/1       Completed   0          2h

[root@preserve-ui-yapei-311-master-etcd-1 ~]# oc get secrets -n default
NAME                       TYPE                                  DATA      AGE
builder-dockercfg-5kj5c    kubernetes.io/dockercfg               1         7h
builder-token-gs4hp        kubernetes.io/service-account-token   4         7h
builder-token-z6tbp        kubernetes.io/service-account-token   4         7h
default-dockercfg-fb5qn    kubernetes.io/dockercfg               1         7h
default-token-49vzd        kubernetes.io/service-account-token   4         7h
default-token-qv6q8        kubernetes.io/service-account-token   4         7h
deployer-dockercfg-qfhb2   kubernetes.io/dockercfg               1         7h
deployer-token-bvshg       kubernetes.io/service-account-token   4         7h
deployer-token-d2kcm       kubernetes.io/service-account-token   4         7h
dockergc-dockercfg-2z4bj   kubernetes.io/dockercfg               1         7h
dockergc-token-lkfb8       kubernetes.io/service-account-token   4         7h
dockergc-token-n7pc4       kubernetes.io/service-account-token   4         7h
registry-certificates      Opaque                                2         7h
registry-config            Opaque                                1         7h
registry-dockercfg-lzllq   kubernetes.io/dockercfg               1         7h
registry-token-hqglz       kubernetes.io/service-account-token   4         7h
registry-token-m8t2t       kubernetes.io/service-account-token   4         7h
router-certs               kubernetes.io/tls                     2         7h
router-dockercfg-sjnmp     kubernetes.io/dockercfg               1         7h
router-metrics-tls         kubernetes.io/tls                     2         7h
router-token-rjcc7         kubernetes.io/service-account-token   4         7h
router-token-zc967         kubernetes.io/service-account-token   4         7h

===============================
Resources after redeploy:

[root@preserve-ui-yapei-311-master-etcd-1 ~]# oc get secrets -n default
NAME                       TYPE                                  DATA      AGE
builder-dockercfg-5kj5c    kubernetes.io/dockercfg               1         9h
builder-token-gs4hp        kubernetes.io/service-account-token   4         9h
builder-token-z6tbp        kubernetes.io/service-account-token   4         9h
default-dockercfg-fb5qn    kubernetes.io/dockercfg               1         9h
default-token-49vzd        kubernetes.io/service-account-token   4         9h
default-token-qv6q8        kubernetes.io/service-account-token   4         9h
deployer-dockercfg-qfhb2   kubernetes.io/dockercfg               1         9h
deployer-token-bvshg       kubernetes.io/service-account-token   4         9h
deployer-token-d2kcm       kubernetes.io/service-account-token   4         9h
dockergc-dockercfg-2z4bj   kubernetes.io/dockercfg               1         9h
dockergc-token-lkfb8       kubernetes.io/service-account-token   4         9h
dockergc-token-n7pc4       kubernetes.io/service-account-token   4         9h
registry-certificates      Opaque                                2         9h
registry-config            Opaque                                1         9h
registry-dockercfg-lzllq   kubernetes.io/dockercfg               1         9h
registry-token-hqglz       kubernetes.io/service-account-token   4         9h
registry-token-m8t2t       kubernetes.io/service-account-token   4         9h
router-certs               kubernetes.io/tls                     2         17m
router-dockercfg-sjnmp     kubernetes.io/dockercfg               1         9h
router-metrics-tls         kubernetes.io/tls                     2         9h
router-token-rjcc7         kubernetes.io/service-account-token   4         9h
router-token-zc967         kubernetes.io/service-account-token   4         9h

[root@preserve-ui-yapei-311-master-etcd-1 ~]# oc get secrets -n openshift-console
NAME                       TYPE                                  DATA      AGE
builder-dockercfg-dppjv    kubernetes.io/dockercfg               1         9h
builder-token-qm2r5        kubernetes.io/service-account-token   4         9h
builder-token-znkck        kubernetes.io/service-account-token   4         9h
console-dockercfg-gfkcl    kubernetes.io/dockercfg               1         9h
console-oauth-config       Opaque                                1         9h
console-token-5m9d7        kubernetes.io/service-account-token   4         9h
console-token-ntdb7        kubernetes.io/service-account-token   4         9h
default-dockercfg-bmz7z    kubernetes.io/dockercfg               1         9h
default-token-gfmdl        kubernetes.io/service-account-token   4         9h
default-token-v98lw        kubernetes.io/service-account-token   4         9h
deployer-dockercfg-dtxjp   kubernetes.io/dockercfg               1         9h
deployer-token-9kjxj       kubernetes.io/service-account-token   4         9h
deployer-token-lmgmf       kubernetes.io/service-account-token   4         9h

[root@preserve-ui-yapei-311-master-etcd-1 ~]# oc get secrets -n openshift-web-console
NAME                         TYPE                                  DATA      AGE
builder-dockercfg-jrj8c      kubernetes.io/dockercfg               1         9h
builder-token-n9v82          kubernetes.io/service-account-token   4         9h
builder-token-tvn97          kubernetes.io/service-account-token   4         9h
default-dockercfg-nmmms      kubernetes.io/dockercfg               1         9h
default-token-82f2t          kubernetes.io/service-account-token   4         9h
default-token-87q7s          kubernetes.io/service-account-token   4         9h
deployer-dockercfg-pz747     kubernetes.io/dockercfg               1         9h
deployer-token-sxq4j         kubernetes.io/service-account-token   4         9h
deployer-token-wp95r         kubernetes.io/service-account-token   4         9h
webconsole-dockercfg-gzwsl   kubernetes.io/dockercfg               1         9h
webconsole-token-4djkh       kubernetes.io/service-account-token   4         9h
webconsole-token-wrx7w       kubernetes.io/service-account-token   4         9h

[root@preserve-ui-yapei-311-master-etcd-1 ~]# oc get event -n openshift-console
LAST SEEN   FIRST SEEN   COUNT   NAME   KIND   SUBOBJECT   TYPE   REASON   SOURCE   MESSAGE
23m   23m   1   console-788c8fc9dd-tt5bd.154abae4e4c56d69   Pod   spec.containers{console}   Normal   Killing   kubelet, preserve-ui-yapei-311-master-etcd-1   Killing container with id cri-o://console:Need to kill Pod

[root@preserve-ui-yapei-311-master-etcd-1 ~]# oc get pod -n openshift-console
No resources found.
[root@preserve-ui-yapei-311-master-etcd-1 ~]# oc get pod -n openshift-web-console
No resources found.

[root@preserve-ui-yapei-311-master-etcd-1 ~]# oc get pod --all-namespaces
NAMESPACE                           NAME                                                      READY     STATUS             RESTARTS   AGE
default                             docker-registry-ntq2h                                     1/1       Running            0          9h
default                             dockergc-2k847                                            1/1       Running            0          9h
default                             dockergc-fc6ts                                            1/1       Running            0          9h
default                             registry-console-1-deploy                                 0/1       Error              0          9h
default                             registry-console-2-n6cbp                                  1/1       Running            0          1h
default                             router-1-ntxb2                                            1/1       Running            0          9h
dof-u                               database-1-hs8mc                                          1/1       Running            0          7h
dof-u                               frontend-1-btwp8                                          1/1       Running            0          7h
dof-u                               frontend-1-nq8dh                                          1/1       Running            0          7h
dof-u                               python-sample-build-1-build                               0/1       Completed          0          7h
hasha-pro1                          mongodb-1-mhdsk                                           1/1       Running            0          9h
hasha-pro1                          nodejs-mongo-persistent-1-build                           0/1       Completed          0          9h
hasha-pro1                          nodejs-mongo-persistent-1-hbv85                           1/1       Running            0          9h
install-test                        mongodb-1-8w99m                                           1/1       Running            0          9h
install-test                        nodejs-mongodb-example-1-7kpd4                            1/1       Running            0          9h
install-test                        nodejs-mongodb-example-1-build                            0/1       Completed          0          9h
kube-service-catalog                apiserver-6qj8x                                           1/1       Running            0          9h
kube-service-catalog                controller-manager-qpxmc                                  0/1       CrashLoopBackOff   10         9h
kube-system                         master-api-preserve-ui-yapei-311-master-etcd-1            1/1       Running            3          9h
kube-system                         master-controllers-preserve-ui-yapei-311-master-etcd-1    0/1       CrashLoopBackOff   10         9h
kube-system                         master-etcd-preserve-ui-yapei-311-master-etcd-1           1/1       Running            1          9h
openshift-ansible-service-broker    asb-1-deploy                                              0/1       Error              0          9h
openshift-node                      sync-7qcfp                                                1/1       Running            0          9h
openshift-node                      sync-wrz55                                                1/1       Running            0          9h
openshift-node                      sync-xwfkm                                                1/1       Running            0          9h
openshift-sdn                       ovs-jpgrg                                                 1/1       Running            0          9h
openshift-sdn                       ovs-lvphc                                                 1/1       Running            0          9h
openshift-sdn                       ovs-txvmw                                                 1/1       Running            0          9h
openshift-sdn                       sdn-9nk6r                                                 1/1       Running            0          9h
openshift-sdn                       sdn-mwm5k                                                 1/1       Running            0          9h
openshift-sdn                       sdn-r4zj8                                                 1/1       Running            0          9h
openshift-template-service-broker   apiserver-nx59l                                           1/1       Running            0          9h
xiaocwan-t                          dapi-env-test-pod                                         0/1       Completed          0          7h
xiaocwan-t                          ruby-hello-world-1-5jcvf                                  1/1       Running            0          3h
xiaocwan-t                          ruby-hello-world-1-build                                  0/1       Completed          0          3h
Created attachment 1475828 [details] playbooklog
Since the only controller has crashed, new console pods are not being created.

Does 'oc get rc -n openshift-web-console' show any replication controllers? What's in the logs of the 'master-controllers-preserve-ui-yapei-311-master-etcd-1' pod in kube-system? Is this reproducible?

I haven't seen this happening with docker, gonna try cri-o.
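Concretely, something like this (pod name taken from the output above; --previous pulls the logs of the crashed container):

oc get rc -n openshift-web-console
oc logs master-controllers-preserve-ui-yapei-311-master-etcd-1 \
  -n kube-system --previous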
Works fine with cri-o here, so it must be a failing controller
Cannot reproduce the failing controller; is this still happening on the latest build?
Clearing NEEDINFO since comment 13 answered the question.
openshift-ansible-3.11.0-0.25.0.git.0.7497e69.el7.noarch

Run the playbook again:
/usr/share/ansible/openshift-ansible/playbooks/redeploy-certificates.yml

After redeploying certs, the admin console still could not be accessed. The console pod and master api/controller pods are all running. The old web console and applications can be accessed. Here is some resource info: http://pastebin.test.redhat.com/639223
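For reference, a few checks that could narrow down where the 503 comes from while the console pod itself is Running (the route host is a placeholder):

oc get route console -n openshift-console
oc get endpoints console -n openshift-console
# -k skips CA validation on purpose, to separate routing problems from trust problems
curl -kv https://console.apps.example.com/ >/dev/null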
openshift v3.11.0-0.32.0
kubernetes v1.11.0+d4cacc0
openshift-ansible-3.11.0-0.32.0.git.0.b27b349.el7.noarch

Run the playbook:
/usr/share/ansible/openshift-ansible/playbooks/redeploy-certificates.yml

After redeploying certificates, the admin console can be accessed now. The docker-registry deploy pod has the same issue mentioned in Comment 19:
# oc logs docker-registry-2-deploy
--> Scaling up docker-registry-2 from 0 to 1, scaling down docker-registry-1 from 1 to 0 (keep 1 pods available, don't exceed 2 pods)
    Scaling docker-registry-2 up to 1
--> Error listing events for replication controller docker-registry-2: Get https://172.30.0.1:443/api/v1/namespaces/default/events?fieldSelector=involvedObject.uid%3D3957f9bb-b4a8-11e8-a28f-0eee935ab702%2CinvolvedObject.name%3Ddocker-registry-2%2CinvolvedObject.namespace%3Ddefault%2CinvolvedObject.kind%3DReplicationController: dial tcp 172.30.0.1:443: connect: connection refused
--> Error listing events for replication controller docker-registry-1: Get https://172.30.0.1:443/api/v1/namespaces/default/events?fieldSelector=involvedObject.name%3Ddocker-registry-1%2CinvolvedObject.namespace%3Ddefault%2CinvolvedObject.kind%3DReplicationController%2CinvolvedObject.uid%3D0cfbf2b5-b49c-11e8-9a9a-0eee935ab702: dial tcp 172.30.0.1:443: connect: connection refused
The connection to the server 172.30.0.1:443 was refused - did you specify the right host or port?

The bug can be verified. As for the docker-registry deploy pod issue, we can track it in a new bug.
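The connection-refused errors line up with the masters restarting mid-rollout, so once the API is back the failed deployment should be retryable; a sketch, using the names from the output above:

# Retry the registry rollout that died while the API was down
oc rollout retry dc/docker-registry -n default
oc rollout status dc/docker-registry -n default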
Verified this bug according to Comment 20
Closing bugs that were verified and targeted for GA but for some reason were not picked up by errata. This bug fix should be present in current 3.11 release content.