Bug 1635613
Summary: | oc_adm_router doesn't create router-metrics-tls secret | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Juan Luis de Sousa-Valadas <jdesousa>
Component: | Networking | Assignee: | Miciah Dashiel Butler Masters <mmasters>
Networking sub component: | router | QA Contact: | Hongan Li <hongli>
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | high | |
Priority: | urgent | CC: | adeshpan, aos-bugs, bmchugh, hongli, jdesousa, jjerezro, jkaur, jokerman, jrosenta, mmasters, mmccomas, rbost, rsevilla, sdodson, sople, steffen.seckler, tkimura, vhernand
Version: | 3.10.0 | |
Target Milestone: | --- | |
Target Release: | 3.11.z | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: |
Cause: If a router had previously been deployed with an older version of openshift-ansible, its service could be missing the service.alpha.openshift.io/serving-cert-secret-name annotation. openshift-ansible did not add the missing annotation.
Consequence: The service serving cert controller was not creating the router-metrics-tls secret, and as a result, the newly deployed router would fail to start.
Fix: openshift-ansible was changed to update any existing router service to have the needed annotation so that the service serving cert controller will create the router-metrics-tls secret.
Result: openshift-ansible can now deploy a functioning router even if an old router service that is missing the annotation exists.
|
Story Points: | --- | |
Clone Of: | | |
: | 1672454 (view as bug list) | Environment: |
Last Closed: | 2019-02-20 14:11:01 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1672454 | |
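The fix described in the Doc Text amounts to ensuring the router service carries the serving-cert annotation. As a minimal sketch (field values taken from the service dump later in this report), a correctly annotated router service would contain a fragment like the following:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: router
  namespace: default
  annotations:
    # This annotation tells the service serving cert controller to
    # generate and maintain the router-metrics-tls secret.
    service.alpha.openshift.io/serving-cert-secret-name: router-metrics-tls
```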
Description
Juan Luis de Sousa-Valadas
2018-10-03 11:42:30 UTC
Juan, can you confirm whether in your scenario this is happening during a clean install, or was it happening in an upgraded environment?

Scott, it was in an upgraded environment, but I had manually deleted all the router components.

I had this exact issue when deleting my routers and redeploying from the 3.10.45 playbook. Editing the router deployment config with `oc` to not mount `router-metrics-tls`, and then removing both environment variables concerning metrics TLS, will "fix" this issue. I hope this gets some attention soon, because it is concerning that Red Hat ships playbooks that literally do not work in its enterprise products.

The `router-metrics-tls` secret is provided by the serving cert signer component, which generates certificate secrets based on annotated services. To help us narrow down the issue, please reproduce the problem and then provide the output of the following command: `oc get -n default services -o yaml`. What we expect is a service with the following annotation: `service.alpha.openshift.io/serving-cert-secret-name: router-metrics-tls`. Normally, the annotated service is created by the `oc adm router` command. The presence of that annotated service is what causes the service cert signer component to generate the `router-metrics-tls` secret for use by the router deployment. We can continue diagnosing once we have the output of the `oc` command I listed. Thanks!

Hi, I can confirm this issue on an OpenShift Origin version updated from v3.9 to v3.11. The mentioned secret is not shown. Possibly this service should have been added by the ansible-playbook upgrade.yaml, but was forgotten?
> oc get -n default services -o yaml

```yaml
apiVersion: v1
items:
- apiVersion: v1
kind: Service
metadata:
creationTimestamp: 2018-07-11T12:19:48Z
labels:
docker-registry: default
name: docker-registry
namespace: default
resourceVersion: "30194106"
selfLink: /api/v1/namespaces/default/services/docker-registry
uid: b47a8907-8504-11e8-b082-5cf3fce5f1c8
spec:
clusterIP: 172.30.76.0
ports:
- name: 5000-tcp
port: 5000
protocol: TCP
targetPort: 5000
selector:
docker-registry: default
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10800
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Service
metadata:
creationTimestamp: 2018-07-11T12:10:31Z
labels:
component: apiserver
provider: kubernetes
name: kubernetes
namespace: default
resourceVersion: "40893"
selfLink: /api/v1/namespaces/default/services/kubernetes
uid: 684f7838-8503-11e8-aa0c-5cf3fce5f1c8
spec:
clusterIP: 172.30.0.1
ports:
- name: https
port: 443
protocol: TCP
targetPort: 8443
- name: dns
port: 53
protocol: UDP
targetPort: 8053
- name: dns-tcp
port: 53
protocol: TCP
targetPort: 8053
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10800
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Service
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"createdBy":"registry-console-template","name":"registry-console"},"name":"registry-console","namespace":"default"},"spec":{"ports":[{"name":"registry-console","port":9000,"protocol":"TCP","targetPort":9090}],"selector":{"name":"registry-console"},"type":"ClusterIP"}}
openshift.io/generated-by: OpenShiftNewApp
creationTimestamp: 2018-07-11T12:52:31Z
labels:
app: registry-console
createdBy: registry-console-template
name: registry-console
name: registry-console
namespace: default
resourceVersion: "30054148"
selfLink: /api/v1/namespaces/default/services/registry-console
uid: 464c8afa-8509-11e8-b082-5cf3fce5f1c8
spec:
clusterIP: 172.30.165.61
ports:
- name: registry-console
port: 9000
protocol: TCP
targetPort: 9090
selector:
name: registry-console
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Service
metadata:
annotations:
prometheus.io/port: "1936"
prometheus.io/scrape: "true"
prometheus.openshift.io/password: <deleted>
prometheus.openshift.io/username: <deleted>
creationTimestamp: 2018-07-11T12:19:39Z
labels:
router: router
name: router
namespace: default
resourceVersion: "42089"
selfLink: /api/v1/namespaces/default/services/router
uid: aea585f0-8504-11e8-b082-5cf3fce5f1c8
spec:
clusterIP: 172.30.77.211
ports:
- name: 80-tcp
port: 80
protocol: TCP
targetPort: 80
- name: 443-tcp
port: 443
protocol: TCP
targetPort: 443
- name: 1936-tcp
port: 1936
protocol: TCP
targetPort: 1936
selector:
router: router
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
kind: List
metadata:
resourceVersion: ""
selfLink: ""
```
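In a dump like the one above, the missing annotation is easy to overlook. As a sketch (assuming `oc` access to the cluster; not part of the original report), a jsonpath query can list each service alongside its serving-cert annotation, printing a blank value for services that lack it:

```shell
# List each service in the default namespace with its
# serving-cert-secret-name annotation (blank if missing).
# Dots inside the annotation key must be escaped with '\.'.
oc get services -n default -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.service\.alpha\.openshift\.io/serving-cert-secret-name}{"\n"}{end}'
```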
Does manually adding the annotation `service.alpha.openshift.io/serving-cert-secret-name: router-metrics-tls` to the `router` service in the `default` namespace work around the issue?

*** Bug 1671626 has been marked as a duplicate of this bug. ***

3.10 backport: https://bugzilla.redhat.com/show_bug.cgi?id=1672454

Verified with openshift-ansible-3.11.82-1.git.0.f29227a.el7: the issue has been fixed, and the router can be deployed by the Ansible playbook.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0326
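The manual workaround asked about above can be sketched as two `oc` commands (a sketch assuming cluster-admin rights; the secret name and namespace are taken from this report):

```shell
# Add the annotation the service serving cert controller watches for.
oc annotate service router -n default \
    service.alpha.openshift.io/serving-cert-secret-name=router-metrics-tls

# Within a few seconds the controller should create the secret;
# verify that it now exists before redeploying the router.
oc get secret router-metrics-tls -n default
```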