openshift-ansible needs to update an existing router service when its annotation changes.

PR: https://github.com/openshift/openshift-ansible/pull/11122

+++ This bug was initially created as a clone of Bug #1635613 +++

Description of problem:

When a customer defines openshift_hosted_routers and deploys the routers using:

```
ansible-playbook -i <ansible inventory> \
  /usr/share/ansible/openshift-ansible/playbooks/openshift-hosted/deploy_router.yml
```

the routers won't work because the router-metrics-tls secret is missing. If openshift_hosted_routers is not defined, oc_adm_router works as expected and creates the secret. I have tried reproducing using the exact same openshift_hosted_routers configuration.

Version-Release number of selected component (if applicable):
openshift-ansible-3.10.47-1.git.0.95bc2d2.el7_5.noarch

How reproducible:
The customer always faces this issue; I'm unable to reproduce with the same values.

Steps to Reproduce:
1. ansible-playbook -i <ansible inventory> /usr/share/ansible/openshift-ansible/playbooks/openshift-hosted/deploy_router.yml

Actual results:
router-metrics-tls is absent and therefore the router fails to start.

Expected results:
The OpenShift router works as expected.

Additional info:

Workaround:
1. Deploy the routers without custom openshift_hosted_routers
2. oc delete dc router -n default
3. Deploy the routers with custom openshift_hosted_routers

--- Additional comment from Scott Dodson on 2018-10-18 13:49:23 UTC ---

Juan,

Can you confirm whether in your scenario this is happening during a clean install or in an upgraded environment?

--- Additional comment from Juan Luis de Sousa-Valadas on 2018-10-22 07:21:10 UTC ---

Scott,

It was in an upgraded environment, but I had manually deleted all the router components.

--- Additional comment from Dan Mace on 2018-11-15 17:32:50 UTC ---

The `router-metrics-tls` secret is provided by the serving cert signer component, which generates certificate secrets based on annotated services.
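To make that mechanism concrete: the signer watches for services carrying the `service.alpha.openshift.io/serving-cert-secret-name` annotation and creates a TLS secret with the named value. A router service that would trigger generation of `router-metrics-tls` looks roughly like this (an abbreviated sketch, not a complete manifest):

```yaml
# Abbreviated sketch of the annotated service that `oc adm router` creates.
# The serving cert signer sees the annotation below and generates a TLS
# secret named router-metrics-tls in the same namespace.
apiVersion: v1
kind: Service
metadata:
  name: router
  namespace: default
  annotations:
    service.alpha.openshift.io/serving-cert-secret-name: router-metrics-tls
spec:
  ports:
  - name: 1936-tcp
    port: 1936
    targetPort: 1936
  selector:
    router: router
```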
To help us narrow down the issue, please reproduce the problem and then provide the output of the following command:

```
$ oc get -n default services -o yaml
```

What we expect is a service with the following annotation:

```
service.alpha.openshift.io/serving-cert-secret-name: router-metrics-tls
```

Normally, the annotated service is created by the `oc adm router` command. The presence of that annotated service is what causes the service cert signer component to generate the `router-metrics-tls` secret for use by the router deployment. We can continue diagnosing once we have the output of the `oc` command I listed. Thanks!

--- Additional comment from on 2018-12-20 09:28:57 UTC ---

Hi, I can confirm this issue on an OpenShift Origin installation upgraded from v3.9 to v3.11. The mentioned secret is not shown. Possibly this service annotation should have been added by the ansible-playbook upgrade.yaml, but was forgotten?

```yaml
> oc get -n default services -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: 2018-07-11T12:19:48Z
    labels:
      docker-registry: default
    name: docker-registry
    namespace: default
    resourceVersion: "30194106"
    selfLink: /api/v1/namespaces/default/services/docker-registry
    uid: b47a8907-8504-11e8-b082-5cf3fce5f1c8
  spec:
    clusterIP: 172.30.76.0
    ports:
    - name: 5000-tcp
      port: 5000
      protocol: TCP
      targetPort: 5000
    selector:
      docker-registry: default
    sessionAffinity: ClientIP
    sessionAffinityConfig:
      clientIP:
        timeoutSeconds: 10800
    type: ClusterIP
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: 2018-07-11T12:10:31Z
    labels:
      component: apiserver
      provider: kubernetes
    name: kubernetes
    namespace: default
    resourceVersion: "40893"
    selfLink: /api/v1/namespaces/default/services/kubernetes
    uid: 684f7838-8503-11e8-aa0c-5cf3fce5f1c8
  spec:
    clusterIP: 172.30.0.1
    ports:
    - name: https
      port: 443
      protocol: TCP
      targetPort: 8443
    - name: dns
      port: 53
      protocol: UDP
      targetPort: 8053
    - name: dns-tcp
      port: 53
      protocol: TCP
      targetPort: 8053
    sessionAffinity: ClientIP
    sessionAffinityConfig:
      clientIP:
        timeoutSeconds: 10800
    type: ClusterIP
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"createdBy":"registry-console-template","name":"registry-console"},"name":"registry-console","namespace":"default"},"spec":{"ports":[{"name":"registry-console","port":9000,"protocol":"TCP","targetPort":9090}],"selector":{"name":"registry-console"},"type":"ClusterIP"}}
      openshift.io/generated-by: OpenShiftNewApp
    creationTimestamp: 2018-07-11T12:52:31Z
    labels:
      app: registry-console
      createdBy: registry-console-template
      name: registry-console
    name: registry-console
    namespace: default
    resourceVersion: "30054148"
    selfLink: /api/v1/namespaces/default/services/registry-console
    uid: 464c8afa-8509-11e8-b082-5cf3fce5f1c8
  spec:
    clusterIP: 172.30.165.61
    ports:
    - name: registry-console
      port: 9000
      protocol: TCP
      targetPort: 9090
    selector:
      name: registry-console
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      prometheus.io/port: "1936"
      prometheus.io/scrape: "true"
      prometheus.openshift.io/password: <deleted>
      prometheus.openshift.io/username: <deleted>
    creationTimestamp: 2018-07-11T12:19:39Z
    labels:
      router: router
    name: router
    namespace: default
    resourceVersion: "42089"
    selfLink: /api/v1/namespaces/default/services/router
    uid: aea585f0-8504-11e8-b082-5cf3fce5f1c8
  spec:
    clusterIP: 172.30.77.211
    ports:
    - name: 80-tcp
      port: 80
      protocol: TCP
      targetPort: 80
    - name: 443-tcp
      port: 443
      protocol: TCP
      targetPort: 443
    - name: 1936-tcp
      port: 1936
      protocol: TCP
      targetPort: 1936
    selector:
      router: router
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```
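Note that in the output above the `router` service carries only the Prometheus annotations, with no `service.alpha.openshift.io/serving-cert-secret-name` entry, so the signer never generates the secret. This is what the linked PR addresses: the playbook must reconcile the annotation on an existing service rather than only set it at creation. A minimal sketch of that reconciliation check in Python (the function name and service-dict shape are illustrative assumptions, not openshift-ansible's actual module API):

```python
# Sketch: decide whether an existing router service needs the
# serving-cert annotation patched. The helper name and dict layout
# are illustrative, not the real openshift-ansible module API.

ANNOTATION = "service.alpha.openshift.io/serving-cert-secret-name"

def needs_annotation_patch(service, secret_name="router-metrics-tls"):
    """Return True if the service lacks (or has a stale) annotation."""
    annotations = service.get("metadata", {}).get("annotations") or {}
    return annotations.get(ANNOTATION) != secret_name

# The upgraded cluster's router service from the report: only the
# Prometheus annotations are present, so the secret is never generated.
upgraded_router = {
    "metadata": {
        "name": "router",
        "annotations": {
            "prometheus.io/port": "1936",
            "prometheus.io/scrape": "true",
        },
    }
}

# A freshly created router service carries the annotation.
fresh_router = {
    "metadata": {
        "name": "router",
        "annotations": {ANNOTATION: "router-metrics-tls"},
    }
}

print(needs_annotation_patch(upgraded_router))  # True: patch required
print(needs_annotation_patch(fresh_router))     # False: nothing to do
```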
Verified with openshift-ansible-3.10.110-1.git.0.1e03ab3.el7; the issue has been fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0328