Bug 1419255
| Summary: | Fail to redeploy certificates due to restart node's delay |
|---|---|
| Product: | OpenShift Container Platform |
| Component: | Installer |
| Version: | 3.5.0 |
| Status: | CLOSED ERRATA |
| Severity: | medium |
| Priority: | medium |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Reporter: | liujia <jiajliu> |
| Assignee: | Andrew Butcher <abutcher> |
| QA Contact: | Gaoyun Pei <gpei> |
| CC: | aos-bugs, gpei, jokerman, mmccomas, tdawson |
| Doc Type: | No Doc Update |
| Type: | Bug |
| Last Closed: | 2017-04-11 21:24:30 UTC |
Created attachment 1247674: redeploy log
Created attachment 1247675: node service
(In reply to Andrew Butcher from comment #4)
> The new certificate redeploy playbooks have merged from
> https://github.com/openshift/openshift-ansible/pull/2671 so moving to ON_QA
> to try with the new changes.
>
> @Gaoyun, have you encountered a similar while verifying the new playbooks?

I didn't encounter this issue during testing.

There are three cert redeploy playbooks that may trigger node restarts:

playbooks/byo/openshift-cluster/redeploy-openshift-ca.yml
playbooks/byo/openshift-cluster/redeploy-certificates.yml
playbooks/byo/openshift-cluster/redeploy-node-certificates.yml

I tried all of them against various ocp-3.5 environments, including containerized and rpm installs, multi-master, and a 7-node cluster. All node restarts were successful.

Moving this bug to verified with openshift-ansible-3.5.6-1.git.0.5e6099d.el7.noarch.
Description of problem:
Running the redeploy-certificates playbook against ocp3.5 (master/node/etcd) fails and exits at task [restart node] of play [Restart nodes]. However, the node service reaches the "running" state after the playbook halts.

TASK [restart node] ************************************************************
fatal: [x.x.x.x]: FAILED! => {
    "changed": false,
    "failed": true
}

MSG:

Unable to restart service atomic-openshift-node: Job for atomic-openshift-node.service failed because the control process exited with error code. See "systemctl status atomic-openshift-node.service" and "journalctl -xe" for details.

    to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-certificates.retry

Version-Release number of selected component (if applicable):
openshift-ansible-3.5.3-1.git.0.80c2436.el7.noarch
ansible-2.2.0.0-1.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Container install ocp3.5 on atomic host.
2. Run the redeploy certificates playbook:
# ansible-playbook -i /tmp/hosts /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-certificates.yml -v | tee /tmp/redeploy.log

Actual results:
The playbook exits at task [restart node].

Expected results:
It should redeploy certificates successfully.

Additional info:
The redeploy log and the atomic-openshift-node log are attached.
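
Since the node service does reach "running" shortly after the restart task reports failure, one way to confirm the delay on an affected host is to poll the unit state for a while after the playbook halts. The following is only a minimal sketch for that check, not the fix that was merged in openshift-ansible. The function names, the 120-second timeout, and the 5-second interval are assumptions chosen for illustration; the only external command used is `systemctl is-active --quiet`, which exits with status 0 when the unit is active.

```python
import subprocess
import time
from typing import Callable


def node_service_is_active(unit: str = "atomic-openshift-node") -> bool:
    """Return True when `systemctl is-active` reports the unit as active."""
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit],
        check=False,  # a non-zero exit only means "not active (yet)"
    )
    return result.returncode == 0


def wait_until_active(
    is_active: Callable[[], bool] = node_service_is_active,
    timeout: float = 120.0,   # assumed value, tune for the environment
    interval: float = 5.0,    # assumed value, tune for the environment
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_active() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if is_active():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until_active()` on the node right after the failed task returns `True` if the service becomes active within the timeout, which would match the behaviour described above. The checker and sleep function are injectable so the loop can be exercised without systemd.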