Bug 1672454 - [3.10] oc_adm_router doesn't create router-metrics-tls secret
Summary: [3.10] oc_adm_router doesn't create router-metrics-tls secret
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.10.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.10.z
Assignee: Miciah Dashiel Butler Masters
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On: 1635613
Blocks:
Reported: 2019-02-05 00:55 UTC by Miciah Dashiel Butler Masters
Modified: 2022-08-04 22:20 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: If a router had previously been deployed with an older version of openshift-ansible, its service could be missing the service.alpha.openshift.io/serving-cert-secret-name annotation. openshift-ansible did not add the missing annotation.
Consequence: The service serving cert controller did not create the router-metrics-tls secret, and as a result, the newly deployed router would fail to start.
Fix: openshift-ansible was changed to update any existing router service to have the needed annotation so that the service serving cert controller will create the router-metrics-tls secret.
Result: openshift-ansible can now deploy a functioning router even if an old router service that is missing the annotation exists.
Clone Of: 1635613
Environment:
Last Closed: 2019-02-20 10:11:10 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift openshift-ansible pull 11122 0 None closed [release-3.10] Update router service when its annotation changes 2020-10-11 08:11:16 UTC
Red Hat Product Errata RHBA-2019:0328 0 None None None 2019-02-20 10:11:13 UTC

Description Miciah Dashiel Butler Masters 2019-02-05 00:55:23 UTC
openshift-ansible needs to update an existing router service when its annotation changes.

PR: https://github.com/openshift/openshift-ansible/pull/11122
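
The reconciliation described above can be sketched roughly as follows. This is an illustration only, not the code from the linked PR: the helper name `reconcile_annotations` and the dict-based representation of a service's annotations are assumptions. Given the annotations found on an existing router service and the annotations the installer expects, it reports whether the service needs updating and what the merged annotations would be.

```python
# Illustrative sketch only; not the openshift-ansible implementation.
# reconcile_annotations is a hypothetical helper name.

SERVING_CERT_ANNOTATION = "service.alpha.openshift.io/serving-cert-secret-name"


def reconcile_annotations(existing, desired):
    """Return (needs_update, merged) for a service's annotations.

    existing: annotations currently on the service (dict or None).
    desired:  annotations the installer expects (dict).
    Annotations that are not mentioned in `desired` are preserved.
    """
    current = dict(existing or {})
    merged = dict(current)
    merged.update(desired)
    return merged != current, merged


# A router service deployed by an older installer: no serving-cert annotation.
old_service_annotations = {
    "prometheus.io/port": "1936",
    "prometheus.io/scrape": "true",
}
desired = {SERVING_CERT_ANNOTATION: "router-metrics-tls"}

needs_update, merged = reconcile_annotations(old_service_annotations, desired)
print(needs_update)
print(merged[SERVING_CERT_ANNOTATION])
```

Running the same function a second time against the merged annotations reports that no update is needed, which is the idempotent behaviour one would want from a playbook task.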

+++ This bug was initially created as a clone of Bug #1635613 +++

Description of problem:

When a customer defines openshift_hosted_routers and deploys the routers using:
ansible-playbook -i <ansible inventory> \
/usr/share/ansible/openshift-ansible/playbooks/openshift-hosted/deploy_router.yml

The routers won't work because the router-metrics-tls secret is missing. If openshift_hosted_routers is not defined, oc_adm_router works as expected and creates the secret.

I have tried to reproduce this using the exact same configuration for openshift_hosted_routers, without success.

Version-Release number of selected component (if applicable):
openshift-ansible-3.10.47-1.git.0.95bc2d2.el7_5.noarch

How reproducible:
The customer always faces this issue; I'm unable to reproduce it with the same values.

Steps to Reproduce:
1. ansible-playbook -i <ansible inventory> /usr/share/ansible/openshift-ansible/playbooks/openshift-hosted/deploy_router.yml

Actual results:
The router-metrics-tls secret is absent, and therefore the router fails to start.

Expected results:
OpenShift router works as expected

Additional info:
Workaround: 
1. Deploy the routers without custom openshift_hosted_routers
2. oc delete dc router -n default
3. Deploy the routers with custom openshift_hosted_routers

--- Additional comment from Scott Dodson on 2018-10-18 13:49:23 UTC ---

Juan,

Can you confirm whether in your scenario this is happening during a clean install or was this happening in an upgraded environment?

--- Additional comment from Juan Luis de Sousa-Valadas on 2018-10-22 07:21:10 UTC ---

Scott,
It was in an upgraded environment, but I had manually deleted all the router components.

--- Additional comment from Dan Mace on 2018-11-15 17:32:50 UTC ---

The `router-metrics-tls` secret is provided by the serving cert signer component, which generates certificate secrets based on annotated services. To help us narrow down the issue, please reproduce the problem and then provide the output of the following command:

  $ oc get -n default services -o yaml

What we expect is a service with the following annotation:

  service.alpha.openshift.io/serving-cert-secret-name: router-metrics-tls

Normally, the annotated service is created by the `oc adm router` command. The presence of that annotated service is what causes the service cert signer component to generate the `router-metrics-tls` secret for use by the router deployment.

We can continue diagnosing once we have the output of the `oc` command I listed.

Thanks!
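
The check described above can also be scripted. The following is a minimal sketch under stated assumptions: it reads the service list as JSON (that is, from `oc get -n default services -o json` rather than the `-o yaml` form requested above, so that only the Python standard library is needed), and the function name `requested_serving_cert_secrets` and the sample data are hypothetical. It maps each service to the secret name requested by its serving-cert annotation, or to `None` when the annotation is absent.

```python
import json

SERVING_CERT_ANNOTATION = "service.alpha.openshift.io/serving-cert-secret-name"


def requested_serving_cert_secrets(service_list):
    """Map service name -> requested secret name (None if not annotated)."""
    result = {}
    for item in service_list.get("items", []):
        metadata = item.get("metadata", {})
        # "annotations" may be missing entirely or explicitly null.
        annotations = metadata.get("annotations") or {}
        result[metadata.get("name")] = annotations.get(SERVING_CERT_ANNOTATION)
    return result


# Trimmed, hypothetical sample in the shape of a v1 List of Services.
sample = json.loads("""
{
  "kind": "List",
  "items": [
    {"kind": "Service",
     "metadata": {"name": "router", "namespace": "default",
                  "annotations": {"prometheus.io/port": "1936"}}},
    {"kind": "Service",
     "metadata": {"name": "kubernetes", "namespace": "default"}}
  ]
}
""")

secrets = requested_serving_cert_secrets(sample)
for name, secret in sorted(secrets.items()):
    print(name, "->", secret or "no serving-cert annotation")
```

A router service that maps to `None` here is one for which the `router-metrics-tls` secret would not be expected to be generated.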

--- Additional comment from  on 2018-12-20 09:28:57 UTC ---

Hi,
I can confirm this issue on an OpenShift Origin cluster upgraded from v3.9 to v3.11.
The mentioned secret is not shown. Possibly this service should have been added by the ansible-playbook upgrade.yaml, but was forgotten?


> oc get -n default services -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: 2018-07-11T12:19:48Z
    labels:
      docker-registry: default
    name: docker-registry
    namespace: default
    resourceVersion: "30194106"
    selfLink: /api/v1/namespaces/default/services/docker-registry
    uid: b47a8907-8504-11e8-b082-5cf3fce5f1c8
  spec:
    clusterIP: 172.30.76.0
    ports:
    - name: 5000-tcp
      port: 5000
      protocol: TCP
      targetPort: 5000
    selector:
      docker-registry: default
    sessionAffinity: ClientIP
    sessionAffinityConfig:
      clientIP:
        timeoutSeconds: 10800
    type: ClusterIP
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: 2018-07-11T12:10:31Z
    labels:
      component: apiserver
      provider: kubernetes
    name: kubernetes
    namespace: default
    resourceVersion: "40893"
    selfLink: /api/v1/namespaces/default/services/kubernetes
    uid: 684f7838-8503-11e8-aa0c-5cf3fce5f1c8
  spec:
    clusterIP: 172.30.0.1
    ports:
    - name: https
      port: 443
      protocol: TCP
      targetPort: 8443
    - name: dns
      port: 53
      protocol: UDP
      targetPort: 8053
    - name: dns-tcp
      port: 53
      protocol: TCP
      targetPort: 8053
    sessionAffinity: ClientIP
    sessionAffinityConfig:
      clientIP:
        timeoutSeconds: 10800
    type: ClusterIP
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"createdBy":"registry-console-template","name":"registry-console"},"name":"registry-console","namespace":"default"},"spec":{"ports":[{"name":"registry-console","port":9000,"protocol":"TCP","targetPort":9090}],"selector":{"name":"registry-console"},"type":"ClusterIP"}}
      openshift.io/generated-by: OpenShiftNewApp
    creationTimestamp: 2018-07-11T12:52:31Z
    labels:
      app: registry-console
      createdBy: registry-console-template
      name: registry-console
    name: registry-console
    namespace: default
    resourceVersion: "30054148"
    selfLink: /api/v1/namespaces/default/services/registry-console
    uid: 464c8afa-8509-11e8-b082-5cf3fce5f1c8
  spec:
    clusterIP: 172.30.165.61
    ports:
    - name: registry-console
      port: 9000
      protocol: TCP
      targetPort: 9090
    selector:
      name: registry-console
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      prometheus.io/port: "1936"
      prometheus.io/scrape: "true"
      prometheus.openshift.io/password: <deleted>
      prometheus.openshift.io/username: <deleted>
    creationTimestamp: 2018-07-11T12:19:39Z
    labels:
      router: router
    name: router
    namespace: default
    resourceVersion: "42089"
    selfLink: /api/v1/namespaces/default/services/router
    uid: aea585f0-8504-11e8-b082-5cf3fce5f1c8
  spec:
    clusterIP: 172.30.77.211
    ports:
    - name: 80-tcp
      port: 80
      protocol: TCP
      targetPort: 80
    - name: 443-tcp
      port: 443
      protocol: TCP
      targetPort: 443
    - name: 1936-tcp
      port: 1936
      protocol: TCP
      targetPort: 1936
    selector:
      router: router
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""


Comment 2 Hongan Li 2019-02-12 10:00:35 UTC
Verified with openshift-ansible-3.10.110-1.git.0.1e03ab3.el7; the issue has been fixed.

Comment 4 errata-xmlrpc 2019-02-20 10:11:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0328

