Bug 1418032
| Summary: | [3.2] Update router and registry certificates in the redeploy-certificates.yml | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Francesco Marchioni <fmarchio> |
| Component: | Installer | Assignee: | Andrew Butcher <abutcher> |
| Status: | CLOSED ERRATA | QA Contact: | Gaoyun Pei <gpei> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.2.0 | CC: | aos-bugs, clichybi, gpei, jialiu, jokerman, lmeyer, mmccomas |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-05-17 17:38:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Francesco Marchioni
2017-01-31 16:27:03 UTC
Tested this bug with openshift-ansible-3.2.55-1.git.0.5feab7c.el7.noarch.
The cert redeploy playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-certificates.yml now also includes the registry cert and router cert redeployment playbooks.
1. The registry certificate redeployment playbook works well against an ocp-3.2 cluster that has a "registry-certificates" secret for docker-registry.
After cert redeployment, a new registry.crt/key pair was generated under /etc/origin/master and the "registry-certificates" secret was updated with the new cert files. docker-registry was redeployed successfully and the sti-build test passed.
2. The router certificate redeployment playbook did not regenerate the "router-certs" secret, so the router pod stayed in ContainerCreating because secrets "router-certs" was not found.
The ansible log shows no problem when running router cert redeployment:
TASK [Update router environment variables] *************************************
skipping: [ec2-54-146-165-55.compute-1.amazonaws.com] => {"changed": false, "skip_reason": "Conditional check failed", "skipped": true}
TASK [Delete existing router certificate secret] *******************************
changed: [ec2-54-146-165-55.compute-1.amazonaws.com] => {"changed": true, "cmd": ["oc", "delete", "secret/router-certs", "--config=/tmp/openshift-ansible-dqqYTg/admin.kubeconfig", "-n", "default"], "delta": "0:00:00.503799", "end": "2017-04-14 01:49:38.548672", "rc": 0, "start": "2017-04-14 01:49:38.044873", "stderr": "", "stdout": "secret \"router-certs\" deleted", "stdout_lines": ["secret \"router-certs\" deleted"], "warnings": []}
TASK [Remove router service annotations] ***************************************
changed: [ec2-54-146-165-55.compute-1.amazonaws.com] => {"changed": true, "cmd": ["oc", "annotate", "service/router", "service.alpha.openshift.io/serving-cert-secret-name-", "service.alpha.openshift.io/serving-cert-signed-by-", "--config=/tmp/openshift-ansible-dqqYTg/admin.kubeconfig", "-n", "default"], "delta": "0:00:00.497830", "end": "2017-04-14 01:49:40.743203", "rc": 0, "start": "2017-04-14 01:49:40.245373", "stderr": "", "stdout": "service \"router\" annotated", "stdout_lines": ["service \"router\" annotated"], "warnings": []}
TASK [Add serving-cert-secret annotation to router service] ********************
changed: [ec2-54-146-165-55.compute-1.amazonaws.com] => {"changed": true, "cmd": ["oc", "annotate", "service/router", "service.alpha.openshift.io/serving-cert-secret-name=router-certs", "--config=/tmp/openshift-ansible-dqqYTg/admin.kubeconfig", "-n", "default"], "delta": "0:00:00.517662", "end": "2017-04-14 01:49:42.896729", "rc": 0, "start": "2017-04-14 01:49:42.379067", "stderr": "", "stdout": "service \"router\" annotated", "stdout_lines": ["service \"router\" annotated"], "warnings": []}
TASK [Redeploy router] *********************************************************
changed: [ec2-54-146-165-55.compute-1.amazonaws.com] => {"changed": true, "cmd": ["oc", "deploy", "dc/router", "--latest", "--config=/tmp/openshift-ansible-dqqYTg/admin.kubeconfig", "-n", "default"], "delta": "0:00:00.512418", "end": "2017-04-14 01:49:45.097579", "rc": 0, "start": "2017-04-14 01:49:44.585161", "stderr": "", "stdout": "Started deployment #3", "stdout_lines": ["Started deployment #3"], "warnings": []}
...
However, the step "Add serving-cert-secret annotation to router service" did not actually regenerate the "router-certs" secret. Here is a manual reproduction:
1). After installation, check router pod status and router-certs secret
[root@ip-172-18-9-176 ~]# oc get pod |grep router
router-1-0hknf 1/1 Running 0 1m
[root@ip-172-18-9-176 ~]# oc get secret|grep router-certs
router-certs kubernetes.io/tls 2 1m
2). Delete existing router certificate secret
[root@ip-172-18-9-176 ~]# oc delete secret router-certs
secret "router-certs" deleted
[root@ip-172-18-9-176 ~]# oc get secret|grep router-certs
[root@ip-172-18-9-176 ~]#
3). Remove router service annotations
[root@ip-172-18-9-176 ~]# oc annotate service router \
> service.alpha.openshift.io/serving-cert-secret-name- \
> service.alpha.openshift.io/serving-cert-signed-by-
service "router" annotated
4). Add serving-cert-secret annotation to router service
[root@ip-172-18-9-176 ~]# oc annotate service router \
> service.alpha.openshift.io/serving-cert-secret-name=router-certs
service "router" annotated
[root@ip-172-18-9-176 ~]# oc get secret|grep router-certs
[root@ip-172-18-9-176 ~]#
5). Redeploy router
[root@ip-172-18-9-176 ~]# oc deploy dc/router --latest
Started deployment #2
[root@ip-172-18-9-176 ~]# oc get pod
NAME READY STATUS RESTARTS AGE
router-2-qbopc 0/1 ContainerCreating 0 1m
[root@ip-172-18-9-176 ~]# oc describe pod router-2-qbopc
Name: router-2-qbopc
Namespace: default
Node: ip-172-18-3-88.ec2.internal/172.18.3.88
Start Time: Fri, 14 Apr 2017 02:27:30 -0400
Labels: deployment=router-2,deploymentconfig=router,router=router
Status: Pending
IP: 172.18.3.88
Controllers: ReplicationController/router-2
Containers:
router:
Container ID:
Image: x.com/openshift3/ose-haproxy-router:v3.2.1.31
Image ID:
Ports: 80/TCP, 443/TCP, 1936/TCP
QoS Tier:
memory: BestEffort
cpu: BestEffort
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Liveness: http-get http://localhost:1936/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://localhost:1936/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment Variables:
DEFAULT_CERTIFICATE_PATH: /etc/pki/tls/private/tls.crt
ROUTER_EXTERNAL_HOST_HOSTNAME:
ROUTER_EXTERNAL_HOST_HTTPS_VSERVER:
ROUTER_EXTERNAL_HOST_HTTP_VSERVER:
ROUTER_EXTERNAL_HOST_INSECURE: false
ROUTER_EXTERNAL_HOST_PARTITION_PATH:
ROUTER_EXTERNAL_HOST_PASSWORD:
ROUTER_EXTERNAL_HOST_PRIVKEY: /etc/secret-volume/router.pem
ROUTER_EXTERNAL_HOST_USERNAME:
ROUTER_SERVICE_HTTPS_PORT: 443
ROUTER_SERVICE_HTTP_PORT: 80
ROUTER_SERVICE_NAME: router
ROUTER_SERVICE_NAMESPACE: default
ROUTER_SUBDOMAIN:
STATS_PASSWORD: paYRXO8NPM
STATS_PORT: 1936
STATS_USERNAME: admin
Conditions:
Type Status
Ready False
Volumes:
server-certificate:
Type: Secret (a volume populated by a Secret)
SecretName: router-certs
router-token-bjbv6:
Type: Secret (a volume populated by a Secret)
SecretName: router-token-bjbv6
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 {default-scheduler } Normal Scheduled Successfully assigned router-2-qbopc to ip-172-18-3-88.ec2.internal
1m 5s 7 {kubelet ip-172-18-3-88.ec2.internal} Warning FailedMount Unable to mount volumes for pod "router-2-qbopc_default(6f960196-20db-11e7-b29a-0e2a308162cc)": secrets "router-certs" not found
1m 5s 7 {kubelet ip-172-18-3-88.ec2.internal} Warning FailedSync Error syncing pod, skipping: secrets "router-certs" not found
[root@ip-172-18-9-176 ~]# oc get secret|grep router-certs
[root@ip-172-18-9-176 ~]#
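The manual run above redeploys the router even though "router-certs" never reappeared. A minimal sketch of a guard, assuming it is acceptable to poll for the secret before redeploying: the function name wait_for_secret, its arguments, and the retry defaults are hypothetical, and only `oc get secret NAME -n NAMESPACE` is taken from the oc CLI. This is not what the playbook does.

```shell
# Sketch: poll until a secret exists before redeploying the router.
# wait_for_secret and its defaults are hypothetical, for illustration only.
wait_for_secret() {
  name=$1
  namespace=${2:-default}
  attempts=${3:-10}   # how many times to check
  delay=${4:-3}       # seconds to sleep between checks
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # `oc get` exits non-zero when the secret is not found.
    if oc get secret "$name" -n "$namespace" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "secret \"$name\" not found in namespace \"$namespace\"" >&2
  return 1
}
```

It could be used as `wait_for_secret router-certs default && oc deploy dc/router --latest -n default`, so that the redeploy is skipped while the secret is still missing.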
How was this cluster prepared? I don't see a router-certs secret specified in the router deploymentConfig when installing a 3.2 cluster with openshift-ansible.
# oc get dc/router -o jsonpath='{.spec.template.spec.volumes}'
[]
Steps which remove and add the service serving certificate secret annotation will only run when the secret is specified in the router deploymentConfig. If there are no secrets or environment variables then the router will just be redeployed.
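The condition described above can be approximated from the command line. This is a rough sketch, not the playbook's actual conditional: router_uses_secret is a hypothetical helper that does a plain substring match on the text printed by the jsonpath query shown earlier.

```shell
# Sketch: decide whether the router certificate steps would apply.
# router_uses_secret is hypothetical; it only does a substring match.
router_uses_secret() {
  volumes=$1   # text printed by: oc get dc/router -o jsonpath='{.spec.template.spec.volumes}'
  secret=$2    # secret name to look for, e.g. router-certs
  case "$volumes" in
    *"$secret"*) return 0 ;;   # the volumes mention the secret
    *) return 1 ;;             # no reference, e.g. the output was []
  esac
}
```

For example, `router_uses_secret "$(oc get dc/router -o jsonpath='{.spec.template.spec.volumes}')" router-certs` fails on a cluster where the query prints `[]`, which matches the case where only "Redeploy router" runs.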
TASK [Update router environment variables] *************************************
skipping: [master1.abutcher.com]
TASK [Delete existing router certificate secret] *******************************
skipping: [master1.abutcher.com]
TASK [Remove router service annotations] ***************************************
skipping: [master1.abutcher.com]
TASK [Add serving-cert-secret annotation to router service] ********************
skipping: [master1.abutcher.com]
TASK [Redeploy router] *********************************************************
changed: [master1.abutcher.com]
@Andrew, I checked the previous installation log; I had openshift_hosted_router_certificate specified in the Ansible inventory:
openshift_hosted_router_certificate={"certfile": "/files/router_1.crt", "keyfile": "/files/router_1.key","cafile": "/files/router_1_rootca.crt"}
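Before re-running the playbooks with an inventory like this, it may help to confirm that the referenced files are readable on the host that resolves these paths. A small sketch, assuming the three paths from the inventory line above; check_router_cert_files is a hypothetical helper and not part of openshift-ansible.

```shell
# Sketch: report any certificate-related file that is not readable.
# check_router_cert_files is hypothetical, for illustration only.
check_router_cert_files() {
  missing=0
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "not readable: $f" >&2
      missing=1
    fi
  done
  # 0 when every file is readable, 1 otherwise.
  return "$missing"
}
```

With the values above this would be called as `check_router_cert_files /files/router_1.crt /files/router_1.key /files/router_1_rootca.crt`.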
@Gaoyun, the redeploy playbooks were not taking custom router certificates into account, and this problem exists in all versions of the installer. I've created https://bugzilla.redhat.com/show_bug.cgi?id=1446737 for 3.5 and cloned it for other versions.
3.4 https://bugzilla.redhat.com/show_bug.cgi?id=1446745
3.3 https://bugzilla.redhat.com/show_bug.cgi?id=1446745
Proposed fix for 3.2: https://github.com/openshift/openshift-ansible/pull/4043
Verified this bug with openshift-ansible-3.2.56-1.git.0.b844ab7.el7.noarch.
When a custom router certificate was provided during install via openshift_hosted_router_certificate and the redeploy cert playbook was run against the cluster, the custom router cert was retained and the router pod ran well.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:1244