Bug 1596557
Summary: | After running redeploy-certificates.yml playbook in OCP 3.9 ansible service broker stop working | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Yadan Pei <yapei>
Component: | Installer | Assignee: | Vadim Rutkovsky <vrutkovs>
Status: | CLOSED ERRATA | QA Contact: | Yadan Pei <yapei>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 3.9.0 | CC: | aos-bugs, aship, chezhang, dmoessne, farandac, gsapienz, jiazha, jokerman, jrosenta, mifiedle, mmccomas, nate.childers, oarribas, openshift-bugs-escalate, vrutkovs, yanpzhan, yapei, zitang
Target Milestone: | --- | Keywords: | NeedsTestCase
Target Release: | 3.9.z | |
Hardware: | Unspecified | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1592303 | |
: | 1623987 | Environment: |
Last Closed: | 2018-09-22 04:53:09 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1592303, 1596233, 1667981 | |
Bug Blocks: | 1623987 | |
Attachments: | | |
Comment 1
Yadan Pei
2018-06-29 08:52:30 UTC
The above output was obtained after running openshift-ansible/playbooks/redeploy-certificates.yml.

Besides the ansible service broker, it seems the template service broker also needs a fix.

Created https://github.com/openshift/openshift-ansible/pull/9585

It also seems to fix TSB.

After running /usr/share/ansible/openshift-ansible/playbooks/redeploy-certificates.yml with openshift-ansible-3.9.43-1.git.0.d0bc600.el7.noarch, the asb-* pods are running but the apiserver-* pods were not started correctly:

    # oc get pods --all-namespaces
    NAMESPACE                           NAME                          READY   STATUS             RESTARTS   AGE
    kube-service-catalog                apiserver-4kt52               0/1     CrashLoopBackOff   9          29m
    kube-service-catalog                controller-manager-qzp64      0/1     CrashLoopBackOff   5          29m
    openshift-ansible-service-broker    asb-1-ph77j                   1/1     Running            0          15m
    openshift-ansible-service-broker    asb-etcd-1-4dq8t              1/1     Running            0          15m
    openshift-template-service-broker   apiserver-4rchm               1/1     Running            2          27m
    openshift-template-service-broker   apiserver-mx4pc               1/1     Running            1          27m
    openshift-web-console               webconsole-7d7cbcf74c-7w64w   1/1     Running            0          13m

    # oc logs -f apiserver-4kt52 -n kube-service-catalog
    I0911 02:40:14.459147 1 feature_gate.go:184] feature gates: map[OriginatingIdentity:true]
    I0911 02:40:14.459291 1 hyperkube.go:188] Service Catalog version v3.9.43 (built 2018-09-08T02:18:49Z)
    W0911 02:40:14.751120 1 authentication.go:229] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLE_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
    Error: Get https://172.30.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication: dial tcp 172.30.0.1:443: connect: network is unreachable

I don't see TSB pods are re-created, Vadim, can you help confirm?

Created attachment 1482254
ansiblelogs
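
When re-checking a cluster after the playbook run, the `oc get pods --all-namespaces` listing shown above can be filtered down to the pods that need attention. The following is only a minimal sketch: the function name `unhealthy_pods` is hypothetical, and it assumes the six-column layout (NAMESPACE NAME READY STATUS RESTARTS AGE) seen in this report. It flags any pod whose STATUS is not `Running` or whose READY counts differ, so pods that finished normally (for example, status `Completed`) would be flagged as well.

```shell
# Hypothetical helper: reads `oc get pods --all-namespaces` text on stdin
# and prints "namespace/name STATUS" for every pod that is not fully ready.
# Assumes the column order NAMESPACE NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
    awk '
        $1 == "NAMESPACE" { next }          # skip the header row if present
        NF >= 4 {
            split($3, ready, "/")           # READY looks like "0/1"
            if ($4 != "Running" || ready[1] != ready[2])
                print $1 "/" $2 " " $4
        }
    '
}

# Example (requires a logged-in oc client):
#   oc get pods --all-namespaces | unhealthy_pods
```

Parsing the plain-text listing keeps the sketch dependency-free; on a real cluster, structured output from the client would be a more robust input than column positions.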
(In reply to Yadan Pei from comment #9)
> I don't see TSB pods are re-created, Vadim, can you help confirm?

These tasks have run:

> TASK [ansible_service_broker : Remove ASB pods] ********************************
> changed: [host-8-244-4.host.centralci.eng.rdu2.redhat.com] => (item=asb)
> changed: [host-8-244-4.host.centralci.eng.rdu2.redhat.com] => (item=asb-etcd)

Please attach the output of `ansible-playbook -vvv` for more information.

> dial tcp 172.30.0.1:443: connect: network is unreachable

Some network problem? Is it reproducible? Can new APBs be provisioned?

The above network error is reproducible; we will debug it and open a separate bug if it turns out to be an issue.

Despite the network errors, I can confirm that some ASB secrets are re-created and the pods are re-created as well.

    # oc get secret -n openshift-ansible-service-broker    // these secrets are re-created
    NAME                      TYPE                                  DATA   AGE
    asb-client                kubernetes.io/service-account-token   4      16m
    asb-tls                   kubernetes.io/tls                     2      16m
    broker-etcd-auth-secret   Opaque                                2      16m
    etcd-auth-secret          Opaque                                1      16m
    etcd-tls                  kubernetes.io/tls                     2      16m

    # oc get pods -n openshift-ansible-service-broker    // all ASB pods are running
    NAME               READY   STATUS    RESTARTS   AGE
    asb-1-mbhpg        1/1     Running   0          16m
    asb-etcd-1-smn26   1/1     Running   0          16m

Another point I need to confirm: I don't see the TSB secrets/pods being re-created. Do we need to re-create them as well?

    # oc get pods -n openshift-template-service-broker
    NAME              READY   STATUS             RESTARTS   AGE
    apiserver-k5xl5   0/1     CrashLoopBackOff   9          2h
    apiserver-t54c7   1/1     Running            1          2h

Please attach the following info:
1) versions
2) inventory
3) apiserver container logs

The network issue is not reproduced on EC2, so it is no longer an issue.
The only remaining concern is whether we need to re-create the TSB secrets/pods.

openshift-ansible-3.9.43-1.git.0.d0bc600.el7.noarch

API container logs are still required to find out why it is broken.

One of the pods failed to connect to the kube API server:
"dial tcp 172.30.0.1:443: connect: network is unreachable"

The other one works fine, so the fix worked, but network issues prevent the first pod from starting correctly.

Moving to VERIFIED per comment 12 and comment 20.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2658