Bug 1715322

Summary: redeploy of certificates doesn't recreate the ASB pod secrets
Product: OpenShift Container Platform
Reporter: Roberto <rdiazgav>
Component: Installer
Assignee: Joseph Callen <jcallen>
Installer sub component: openshift-installer
QA Contact: Jian Zhang <jiazha>
Status: CLOSED ERRATA
Docs Contact:
Severity: low
Priority: low
CC: jcallen, jiazha, mrobson
Version: 3.11.0
Target Milestone: ---
Target Release: 3.11.z
Hardware: All
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: There was no playbook to update the certificates for the ASB, TSB, or SC.
Consequence: The certificates were not updated, causing the SC and related services to fail.
Fix: Added a playbook/tasks to support updating the certificates.
Result: The certificates are now updated for the ASB, TSB, and SC.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-07-23 19:56:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Roberto 2019-05-30 06:53:47 UTC
Description of problem:

Ansible Service Broker (ASB) secrets are not recreated when a redeployment of certificates is performed (redeploy-certificates.yml), so the ASB fails.

How to reproduce:

// get info from the current status of the pod and secrets

# oc get pod
NAME          READY     STATUS    RESTARTS   AGE
asb-1-585qk   1/1       Running   6          15h

# oc get secrets | grep asb
asb-client                             kubernetes.io/service-account-token   4         59d
asb-client-dockercfg-pz8gh             kubernetes.io/dockercfg               1         59d
asb-client-token-p7f6q                 kubernetes.io/service-account-token   4         59d
asb-client-token-tl5kd                 kubernetes.io/service-account-token   4         59d
asb-dockercfg-hrjvz                    kubernetes.io/dockercfg               1         59d
asb-registry-auth                      Opaque                                2         59d
asb-tls                                kubernetes.io/tls                     2         15h
asb-token-nf6j2                        kubernetes.io/service-account-token   4         59d
asb-token-p7hll                        kubernetes.io/service-account-token   4         15h

# curl -vvv --cacert /etc/origin/master/service-signer.crt https://asb.openshift-ansible-service-broker.svc:1338
* About to connect() to asb.openshift-ansible-service-broker.svc port 1338 (#0)
*   Trying 172.30.194.125...
* Connected to asb.openshift-ansible-service-broker.svc (172.30.194.125) port 1338 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/origin/master/service-signer.crt
  CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* 	subject: CN=asb.openshift-ansible-service-broker.svc
* 	start date: may 09 14:23:42 2019 GMT
* 	expire date: may 08 14:23:43 2021 GMT
* 	common name: asb.openshift-ansible-service-broker.svc
* 	issuer: CN=openshift-service-serving-signer@1557410941
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: asb.openshift-ansible-service-broker.svc:1338
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Fri, 10 May 2019 05:46:31 GMT
< Content-Length: 162
< 
{
  "paths": [
    "/apis",
    "/healthz",
    "/healthz/ping",
    "/healthz/poststarthook/generic-apiserver-start-informers",
    "/metrics",
    "/osb/"
  ]
* Connection #0 to host asb.openshift-ansible-service-broker.svc left intact

// so far everything is working as expected, so I've redeployed the certificates:

# ansible-playbook -vvv -i ~/hosts redeploy-certificates.yml
[...]

PLAY RECAP *************************************************************************************************************************************************************
infra1.ocplab.com          : ok=20   changed=2    unreachable=0    failed=0   
localhost                  : ok=15   changed=0    unreachable=0    failed=0   
master1.ocplab.com         : ok=216  changed=56   unreachable=0    failed=0   
node1.ocplab.com           : ok=20   changed=2    unreachable=0    failed=0   


INSTALLER STATUS *******************************************************************************************************************************************************
Initialization  : Complete (0:00:58)

// check svc again and it fails

# curl -vvv --cacert /etc/origin/master/service-signer.crt https://asb.openshift-ansible-service-broker.svc:1338
* About to connect() to asb.openshift-ansible-service-broker.svc port 1338 (#0)
*   Trying 172.30.194.125...
* Connected to asb.openshift-ansible-service-broker.svc (172.30.194.125) port 1338 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/origin/master/service-signer.crt
  CApath: none
* Server certificate:
* 	subject: CN=asb.openshift-ansible-service-broker.svc
* 	start date: may 09 14:23:42 2019 GMT
* 	expire date: may 08 14:23:43 2021 GMT
* 	common name: asb.openshift-ansible-service-broker.svc
* 	issuer: CN=openshift-service-serving-signer@1557410941
* NSS error -8172 (SEC_ERROR_UNTRUSTED_ISSUER)
* Peer's certificate issuer has been marked as not trusted by the user.
* Closing connection 0
curl: (60) Peer's certificate issuer has been marked as not trusted by the user.
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.
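
// Why it fails: redeploy-certificates.yml rotated the service-serving signer
// on disk, but the running pod still serves a certificate issued by the old
// signer. Comparing the two makes the mismatch visible (a sketch; assumes
// openssl is available on the master, and the subject shown below is the new
// signer that appears later in this report):

# openssl x509 -noout -subject -in /etc/origin/master/service-signer.crt
subject= /CN=openshift-service-serving-signer@1557467649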

// let's take a look at the status of the pod and secrets again

# oc get pod
NAME          READY     STATUS    RESTARTS   AGE
asb-1-585qk   1/1       Running   6          15h

# oc get secrets | grep asb
asb-client                             kubernetes.io/service-account-token   4         59d
asb-client-dockercfg-pz8gh             kubernetes.io/dockercfg               1         59d
asb-client-token-p7f6q                 kubernetes.io/service-account-token   4         59d
asb-client-token-tl5kd                 kubernetes.io/service-account-token   4         59d
asb-dockercfg-hrjvz                    kubernetes.io/dockercfg               1         59d
asb-registry-auth                      Opaque                                2         59d
asb-tls                                kubernetes.io/tls                     2         15h
asb-token-nf6j2                        kubernetes.io/service-account-token   4         59d
asb-token-p7hll                        kubernetes.io/service-account-token   4         15h

// no changes: the AGE column shows the same values as before

// let's recreate the secrets after redeploying the certificates

# oc get pod asb-1-585qk -o yaml
[...]
  - configMap:
      defaultMode: 420
      items:
      - key: broker-config
        path: config.yaml
      name: broker-config
    name: config-volume
  - name: asb-tls     <<<<<<<<
    secret:
      defaultMode: 420
      secretName: asb-tls
  - name: asb-token-p7hll  <<<<<<<
    secret:
      defaultMode: 420
      secretName: asb-token-p7hll
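
// to list just the secrets a pod mounts, without reading through the full
// YAML, a grep over the pod spec is enough (a minimal sketch):

# oc get pod asb-1-585qk -o yaml | grep secretName
      secretName: asb-tls
      secretName: asb-token-p7hll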


# oc delete secret asb-tls asb-token-p7hll
secret "asb-tls" deleted
secret "asb-token-p7hll" deleted

// the secrets are automatically recreated

# oc get secrets | grep asb
asb-client                             kubernetes.io/service-account-token   4         59d
asb-client-dockercfg-pz8gh             kubernetes.io/dockercfg               1         59d
asb-client-token-p7f6q                 kubernetes.io/service-account-token   4         59d
asb-client-token-tl5kd                 kubernetes.io/service-account-token   4         59d
asb-dockercfg-hrjvz                    kubernetes.io/dockercfg               1         59d
asb-registry-auth                      Opaque                                2         59d
asb-tls                                kubernetes.io/tls                     2         1m
asb-token-nf6j2                        kubernetes.io/service-account-token   4         59d
asb-token-rrjpt                        kubernetes.io/service-account-token   4         1m
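
// the recreation is automatic: the asb service carries the serving-cert
// annotation, and the service-serving-cert signer controller reissues the
// secret when it disappears. A quick check (a sketch; assumes the service is
// annotated in the usual OpenShift 3.x way):

# oc get svc asb -o yaml | grep serving-cert
    service.alpha.openshift.io/serving-cert-secret-name: asb-tls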

// now, with the new secrets in place, I delete the pod

# oc delete pod asb-1-585qk
pod "asb-1-585qk" deleted

# oc get pod
NAME          READY     STATUS    RESTARTS   AGE
asb-1-96fwp   1/1       Running   0          2m

// check the svc again, and it works as expected now
# curl -vvv --cacert /etc/origin/master/service-signer.crt https://asb.openshift-ansible-service-broker.svc:1338
* About to connect() to asb.openshift-ansible-service-broker.svc port 1338 (#0)
*   Trying 172.30.194.125...
* Connected to asb.openshift-ansible-service-broker.svc (172.30.194.125) port 1338 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/origin/master/service-signer.crt
  CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* 	subject: CN=asb.openshift-ansible-service-broker.svc
* 	start date: may 10 08:07:02 2019 GMT
* 	expire date: may 09 08:07:03 2021 GMT
* 	common name: asb.openshift-ansible-service-broker.svc
* 	issuer: CN=openshift-service-serving-signer@1557467649
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: asb.openshift-ansible-service-broker.svc:1338
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Fri, 10 May 2019 08:12:00 GMT
< Content-Length: 162
< 
{
  "paths": [
    "/apis",
    "/healthz",
    "/healthz/ping",
    "/healthz/poststarthook/generic-apiserver-start-informers",
    "/metrics",
    "/osb/"
  ]
* Connection #0 to host asb.openshift-ansible-service-broker.svc left intact

The secrets should be recreated automatically when the certificates are redeployed.
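
A playbook task pair along these lines would cover it (a hypothetical sketch of the requested behavior, not the shipped fix; the secret, namespace, and DC names are taken from this report):

- name: Delete the ASB serving-cert secret so the signer reissues it
  command: oc delete secret asb-tls -n openshift-ansible-service-broker --ignore-not-found
  # the mounted service-account token secret may need the same treatment;
  # its name is cluster-specific (asb-token-p7hll above)
- name: Redeploy the ASB so the new pod mounts the regenerated secret
  command: oc rollout latest dc/asb -n openshift-ansible-service-broker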

Comment 5 Jian Zhang 2019-06-20 02:43:14 UTC
After running "ansible-playbook -i qe-inventory-host-file playbooks/redeploy-certificates.yml -vvv", these secrets (asb-client, asb-tls, apiserver-serving-cert, templateservicebroker-client) were updated as expected, as shown below:
[root@qe-phunt-preserve-merrn-1 ~]# oc get secret -n openshift-ansible-service-broker
NAME                         TYPE                                  DATA      AGE
asb-client                   kubernetes.io/service-account-token   4         5m
asb-client-dockercfg-rrplx   kubernetes.io/dockercfg               1         6h
asb-client-token-8t55n       kubernetes.io/service-account-token   4         6h
asb-client-token-hdp2p       kubernetes.io/service-account-token   4         6h
asb-dockercfg-lzknv          kubernetes.io/dockercfg               1         6h
asb-registry-auth            Opaque                                2         6h
asb-tls                      kubernetes.io/tls                     2         5m
asb-token-gg4g7              kubernetes.io/service-account-token   4         6h
...

[root@qe-phunt-preserve-merrn-1 ~]# oc get secret -n openshift-template-service-broker
NAME                                           TYPE                                  DATA      AGE
apiserver-dockercfg-qrprr                      kubernetes.io/dockercfg               1         6h
apiserver-serving-cert                         kubernetes.io/tls                     2         5m
apiserver-token-67drq                          kubernetes.io/service-account-token   4         6h
apiserver-token-dpdmr                          kubernetes.io/service-account-token   4         6h
builder-dockercfg-5288k                        kubernetes.io/dockercfg               1         6h
builder-token-jxlbd                            kubernetes.io/service-account-token   4         6h
builder-token-mfw2z                            kubernetes.io/service-account-token   4         6h
default-dockercfg-gfzpm                        kubernetes.io/dockercfg               1         6h
default-token-7pz2t                            kubernetes.io/service-account-token   4         6h
default-token-bvjv8                            kubernetes.io/service-account-token   4         6h
deployer-dockercfg-7v548                       kubernetes.io/dockercfg               1         6h
deployer-token-bmghk                           kubernetes.io/service-account-token   4         6h
deployer-token-dcdfk                           kubernetes.io/service-account-token   4         6h
templateservicebroker-client                   kubernetes.io/service-account-token   4         5m
...
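
To confirm that the secret contents were rotated, and not just the AGE reset, the certificate inside the secret can be inspected directly (a sketch; assumes openssl is available on the host, with the expected output matching the serving certificate shown below):

# oc get secret asb-tls -n openshift-ansible-service-broker -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -dates -issuer
notBefore=Jun 20 02:11:19 2019 GMT
notAfter=Jun 19 02:11:20 2021 GMT
issuer= /CN=openshift-service-serving-signer@1560996300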

And checking the svc again, the ASB works well:
[root@qe-phunt-preserve-merrn-1 ~]# curl -vvv --cacert /etc/origin/master/service-signer.crt https://asb.openshift-ansible-service-broker.svc:1338
* About to connect() to asb.openshift-ansible-service-broker.svc port 1338 (#0)
*   Trying 172.30.7.188...
* Connected to asb.openshift-ansible-service-broker.svc (172.30.7.188) port 1338 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/origin/master/service-signer.crt
  CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* 	subject: CN=asb.openshift-ansible-service-broker.svc
* 	start date: Jun 20 02:11:19 2019 GMT
* 	expire date: Jun 19 02:11:20 2021 GMT
* 	common name: asb.openshift-ansible-service-broker.svc
* 	issuer: CN=openshift-service-serving-signer@1560996300
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: asb.openshift-ansible-service-broker.svc:1338
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Thu, 20 Jun 2019 02:17:57 GMT
< Content-Length: 162
< 
{
  "paths": [
    "/apis",
    "/healthz",
    "/healthz/ping",
    "/healthz/poststarthook/generic-apiserver-start-informers",
    "/metrics",
    "/osb/"
  ]
* Connection #0 to host asb.openshift-ansible-service-broker.svc left intact


TSB works well too.
[root@qe-phunt-preserve-merrn-1 ~]# curl -vvv --cacert /etc/origin/master/service-signer.crt https://apiserver.openshift-template-service-broker.svc:443
* About to connect() to apiserver.openshift-template-service-broker.svc port 443 (#0)
*   Trying 172.30.241.218...
* Connected to apiserver.openshift-template-service-broker.svc (172.30.241.218) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/origin/master/service-signer.crt
  CApath: none
* NSS: client certificate not found (nickname not specified)
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* 	subject: CN=apiserver.openshift-template-service-broker.svc
* 	start date: Jun 20 02:12:05 2019 GMT
* 	expire date: Jun 19 02:12:06 2021 GMT
* 	common name: apiserver.openshift-template-service-broker.svc
* 	issuer: CN=openshift-service-serving-signer@1560996300
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: apiserver.openshift-template-service-broker.svc
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Thu, 20 Jun 2019 02:31:57 GMT
< Content-Length: 214
< 
{
  "paths": [
    "/apis",
    "/brokers/template.openshift.io/v2",
    "/healthz",
    "/healthz/log",
    "/healthz/ping",
    "/healthz/poststarthook/template-service-broker-synctemplates",
    "/metrics"
  ]
* Connection #0 to host apiserver.openshift-template-service-broker.svc left intact

Verified, thanks!

mac:openshift-ansible jianzhang$ ansible-playbook -i qe-inventory-host-file playbooks/redeploy-certificates.yml -vvv
...
...
INSTALLER STATUS ********************************************************************************************************************************************
Initialization  : Complete (0:01:05)
Thursday 20 June 2019  10:13:04 +0800 (0:00:35.639)       0:12:29.888 ********* 
=============================================================================== 
Gathering Facts ------------------------------------------------------------------------------------------------------------------------------------- 37.53s
/Users/jianzhang/project/openshift-ansible/playbooks/init/basic_facts.yml:2 --------------------------------------------------------------------------------
openshift_service_catalog : Verify that the apiserver is running ------------------------------------------------------------------------------------ 37.04s
/Users/jianzhang/project/openshift-ansible/roles/openshift_service_catalog/tasks/restart_pods.yml:19 -------------------------------------------------------
openshift_service_catalog : Verify that the controller-manager is running --------------------------------------------------------------------------- 35.98s
/Users/jianzhang/project/openshift-ansible/roles/openshift_service_catalog/tasks/restart_pods.yml:40 -------------------------------------------------------
template_service_broker : Verify that the apiserver is running -------------------------------------------------------------------------------------- 35.64s
/Users/jianzhang/project/openshift-ansible/roles/template_service_broker/tasks/restart_pods.yml:9 ----------------------------------------------------------
Remove generated certificates ----------------------------------------------------------------------------------------------------------------------- 29.03s
/Users/jianzhang/project/openshift-ansible/playbooks/openshift-master/private/certificates-backup.yml:28 ---------------------------------------------------
ansible_service_broker : Verify that the ASB is running --------------------------------------------------------------------------------------------- 24.32s
/Users/jianzhang/project/openshift-ansible/roles/ansible_service_broker/tasks/restart_pods.yml:20 ----------------------------------------------------------
openshift_control_plane : verify API server --------------------------------------------------------------------------------------------------------- 18.66s
/Users/jianzhang/project/openshift-ansible/roles/openshift_control_plane/handlers/main.yml:13 --------------------------------------------------------------
openshift_control_plane : verify API server --------------------------------------------------------------------------------------------------------- 17.14s
/Users/jianzhang/project/openshift-ansible/roles/openshift_control_plane/handlers/main.yml:13 --------------------------------------------------------------
openshift_console : Copy console templates to temp directory ---------------------------------------------------------------------------------------- 15.64s
/Users/jianzhang/project/openshift-ansible/roles/openshift_console/tasks/install.yml:19 --------------------------------------------------------------------
openshift_console : Waiting for console rollout to complete ----------------------------------------------------------------------------------------- 14.76s
/Users/jianzhang/project/openshift-ansible/roles/openshift_console/tasks/start.yml:2 -----------------------------------------------------------------------
template_service_broker : Remove apiserver pods ----------------------------------------------------------------------------------------------------- 14.32s
/Users/jianzhang/project/openshift-ansible/roles/template_service_broker/tasks/restart_pods.yml:2 ----------------------------------------------------------
etcd : restart etcd --------------------------------------------------------------------------------------------------------------------------------- 12.01s
/Users/jianzhang/project/openshift-ansible/roles/etcd/tasks/restart.yml:2 ----------------------------------------------------------------------------------
openshift_service_catalog : Generating API Server keys ---------------------------------------------------------------------------------------------- 11.61s
/Users/jianzhang/project/openshift-ansible/roles/openshift_service_catalog/tasks/generate_certs.yml:29 -----------------------------------------------------
Wait for master API to come back online ------------------------------------------------------------------------------------------------------------- 11.51s
/Users/jianzhang/project/openshift-ansible/playbooks/openshift-node/private/restart.yml:54 -----------------------------------------------------------------
Remove web console pods ----------------------------------------------------------------------------------------------------------------------------- 10.87s
/Users/jianzhang/project/openshift-ansible/playbooks/openshift-web-console/private/redeploy-certificates.yml:16 --------------------------------------------
etcd : Retrieve etcd ca cert tarball ---------------------------------------------------------------------------------------------------------------- 10.34s
/Users/jianzhang/project/openshift-ansible/roles/etcd/tasks/certificates/fetch_server_certificates_from_ca.yml:165 -----------------------------------------
etcd : template -------------------------------------------------------------------------------------------------------------------------------------- 9.45s
/Users/jianzhang/project/openshift-ansible/roles/etcd/tasks/certificates/deploy_ca.yml:32 ------------------------------------------------------------------
etcd : Unarchive cert tarball ------------------------------------------------------------------------------------------------------------------------ 9.27s
/Users/jianzhang/project/openshift-ansible/roles/etcd/tasks/certificates/fetch_server_certificates_from_ca.yml:149 -----------------------------------------
etcd : copy ------------------------------------------------------------------------------------------------------------------------------------------ 8.95s
/Users/jianzhang/project/openshift-ansible/roles/etcd/tasks/certificates/deploy_ca.yml:63 ------------------------------------------------------------------
openshift_hosted : Create OpenShift router ----------------------------------------------------------------------------------------------------------- 8.46s
/Users/jianzhang/project/openshift-ansible/roles/openshift_hosted/tasks/router.yml:85 ----------------------------------------------------------------------

Comment 8 errata-xmlrpc 2019-07-23 19:56:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1753

Comment 9 Red Hat Bugzilla 2023-09-14 05:29:28 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days