Gaoyun Pei, I didn't see the attached ansible output. Could you please run this command before and after the upgrade?
===
$ oc volume dc/router -ndefault
deploymentconfigs/router
  secret/router-certs as server-certificate
    mounted at /etc/pki/tls/private
===
My guess is that the router certificates secret (router-certs) has a different name than the one being updated, and is therefore not referenced when the update occurs. Please post the results of the previous command and the ansible -vvv log of redeploy-router-certificates.yml. This would greatly help in determining what the issue is.

Include:
- Output of "oc volume dc/router -ndefault"
- Output of "redeploy-router-certificates.yml" with -vvv
- Output of "oc get secret router-certs -o yaml"
- Inventory file

Thanks.
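To make the before/after comparison requested above easier, here is a minimal sketch that pulls the mounted secret names out of the `oc volume` text output. It assumes the output has the same shape as the sample in the comment above; the function name `mounted_secrets` is hypothetical and not part of openshift-ansible or `oc`.

```python
import re

def mounted_secrets(oc_volume_output):
    """Return {secret_name: mount_path} parsed from `oc volume dc/<name>` text.

    Assumes lines of the form
    'secret/<name> as <volume> mounted at <path>', possibly wrapped.
    """
    # Collapse all whitespace so wrapped output is matched the same way
    # as single-line output.
    text = " ".join(oc_volume_output.split())
    pattern = re.compile(r"secret/(\S+) as (\S+) mounted at (\S+)")
    return {m.group(1): m.group(3) for m in pattern.finditer(text)}

# Sample copied from the comment above.
sample = """deploymentconfigs/router
  secret/router-certs as server-certificate
    mounted at /etc/pki/tls/private"""

secrets = mounted_secrets(sample)
print(secrets)
# If 'router-certs' is missing here, the playbook would be updating a
# secret that the router deployment config does not mount.
print("router-certs" in secrets)
```

Running it against the output captured before and after the redeploy shows whether the router still mounts the secret that the playbook updates.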
Suggested fix: https://github.com/openshift/openshift-ansible/pull/7306 This was caused by a refactor to the openshift_hosted role. This should be fixed once merged.
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/d1ea03e2709ab0d3717b27aaf6c3948561249268
Merge pull request #7306 from kwoodson/fix_redeploy_router

Automatic merge from submit-queue.

[bug 1543256] Fix redeploy router from openshift_hosted refactor.

It appears that after a recent refactor of `openshift_hosted` role the `redeploy-router-certificates.yml` was not calling the `router.yml` required to re-roll the certificates.

https://bugzilla.redhat.com/attachment.cgi?id=1401169
Tried to verify this bug with openshift-ansible-3.9.2-1.git.0.1a855b3.el7.noarch. The router cert redeployment playbook failed at step "Redeploy router" with the following error:

TASK [openshift_hosted : Create OpenShift router] ***************************************************************************************************************************
changed: [ec2-35-173-252-112.compute-1.amazonaws.com] => (item={u'name': u'router', u'certificate': {u'keyfile': u'/etc/origin/master/openshift-router.key', u'certfile': u'/etc/origin/master/openshift-router.crt', u'cafile': u'/etc/origin/master/ca.crt'}, u'replicas': u'1', u'serviceaccount': u'router', u'namespace': u'default', u'stats_port': 1936, u'edits': [{u'action': u'put', u'key': u'spec.strategy.rollingParams.intervalSeconds', u'value': 1}, {u'action': u'put', u'key': u'spec.strategy.rollingParams.updatePeriodSeconds', u'value': 1}, {u'action': u'put', u'key': u'spec.strategy.activeDeadlineSeconds', u'value': 21600}], u'images': u'registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}', u'selector': u'role=node,router=enabled', u'ports': [u'80:80', u'443:443']}) => {"changed": true, "failed": false, "item": {"certificate": {"cafile": "/etc/origin/master/ca.crt", "certfile": "/etc/origin/master/openshift-router.crt", "keyfile": "/etc/origin/master/openshift-router.key"}, "edits": [{"action": "put", "key": "spec.strategy.rollingParams.intervalSeconds", "value": 1}, {"action": "put", "key": "spec.strategy.rollingParams.updatePeriodSeconds", "value": 1}, {"action": "put", "key": "spec.strategy.activeDeadlineSeconds", "value": 21600}], "images": "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}", "name": "router", "namespace": "default", "ports": ["80:80", "443:443"], "replicas": "1", "selector": "role=node,router=enabled", "serviceaccount": "router", "stats_port": 1936}, "results": {"results": [{"cmd": "/usr/bin/oc replace -f /tmp/SecretFfx3S0 -n default", "results": {}, "returncode": 0}, {"cmd": "/usr/bin/oc replace -f /tmp/DeploymentConfigj5GNDr -n default", "results": {}, "returncode": 0}], "returncode": 0}, "state": "present"}

TASK [Redeploy router] ******************************************************************************************************************************************************
fatal: [ec2-35-173-252-112.compute-1.amazonaws.com]: FAILED! => {"changed": true, "cmd": ["oc", "rollout", "latest", "dc/router", "--config=/tmp/openshift-ansible-9GE7Xp/admin.kubeconfig", "-n", "default"], "delta": "0:00:00.641174", "end": "2018-03-04 22:02:44.850810", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2018-03-04 22:02:44.209636", "stderr": "error: #2 is already in progress (Pending).", "stderr_lines": ["error: #2 is already in progress (Pending)."], "stdout": "", "stdout_lines": []}
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/openshift-hosted/redeploy-router-certificates.retry
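The log shows `oc replace` on the DeploymentConfig succeeding immediately before `oc rollout latest` fails with "#2 is already in progress (Pending)", which suggests a rollout had already started by the time the explicit redeploy ran. The sketch below shows one way a wrapper could tell this case apart from a real failure. The helper name `rollout_already_running` is hypothetical, the pattern is taken only from the stderr text in this log, and this is not necessarily how the proposed fix handles it.

```python
import re

# Pattern derived from the stderr in the log above; other oc versions
# may word this message differently (assumption).
IN_PROGRESS = re.compile(r"#(\d+) is already in progress")

def rollout_already_running(rc, stderr):
    """Return the deployment number if a non-zero exit only means a
    rollout is already underway; return None otherwise."""
    if rc == 0:
        return None  # the command succeeded, nothing to classify
    match = IN_PROGRESS.search(stderr)
    return int(match.group(1)) if match else None

# Values copied from the failed task in the log above.
deployment = rollout_already_running(
    1, "error: #2 is already in progress (Pending)."
)
print(deployment)
```

A caller could treat a non-None result as "a rollout is already happening" and wait for that deployment instead of failing the play.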
Proposed fix: https://github.com/openshift/openshift-ansible/pull/7386
Follow-up: https://github.com/openshift/openshift-ansible/pull/7390
The PR has already been merged into openshift-ansible-3.9.3-1.git.0.e166207.el7.noarch; tested again with this package. The router cert redeployment playbook works well. The cert is updated and the router pod is running after the redeployment.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489