Bug 1696413

Summary:	Task failure restart docker while running redeploy-certificates
Product:	OpenShift Container Platform	Reporter:	Joseph Callen <jcallen>
Component:	Installer	Assignee:	Joseph Callen <jcallen>
Installer sub component:	openshift-ansible	QA Contact:	Gaoyun Pei <gpei>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	medium
Priority:	medium	CC:	gpei, jialiu, shiywang
Version:	3.10.0
Target Milestone:	---
Target Release:	3.10.z
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	1695856	Environment:
Last Closed:	2019-06-11 09:30:48 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1695856
Bug Blocks:

Description Joseph Callen 2019-04-04 19:43:34 UTC

+++ This bug was initially created as a clone of Bug #1695856 +++

Description of problem:

When infra and compute nodes use CRI-O, running `playbooks/redeploy-certificates.yml` fails at the "Restart docker" task because docker is not installed on those nodes.

Version-Release number of the following components:



$ ansible --version
ansible 2.7.9

openshift-ansible - dc63ae8a3b1c018568720d7fe66324ecce2a7b91

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
TASK [Restart docker] *************************************************************************************************************************************
task path: /var/home/jlcallen/Development/oa-testing/aws-c2/openshift-ansible/playbooks/openshift-node/private/restart.yml:11
Using module file /usr/lib/python3.7/site-packages/ansible/modules/system/systemd.py
<ec2-54-80-147-83.compute-1.amazonaws.com> ESTABLISH SSH CONNECTION FOR USER: ec2-user
<ec2-54-80-147-83.compute-1.amazonaws.com> SSH: EXEC ssh -o ControlMaster=auto -o ControlPersist=600s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=ec2-user -o ConnectTimeout=30 -o ControlPath=/var/home/jlcallen/.ansible/cp/%h-%r ec2-54-80-147-83.compute-1.amazonaws.com '/bin/sh -c '"'"'sudo -H -S -n -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-ajwqxzmkclwksuuyupbtydpezevsatqt; /usr/bin/python'"'"'"'"'"'"'"'"' && sleep 0'"'"''
Escalation succeeded
<ec2-54-80-147-83.compute-1.amazonaws.com> (1, b'\n{"msg": "Could not find the requested service docker: host", "failed": true, "invocation": {"module_args": {"no_block": false, "force": null, "name": "docker", "enabled": null, "daemon_reload": false, "state": "restarted", "masked": null, "scope": null, "user": null}}}\n', b'')
<ec2-54-80-147-83.compute-1.amazonaws.com> Failed to connect to the host via ssh:
fatal: [ec2-54-80-147-83.compute-1.amazonaws.com]: FAILED! => {
    "attempts": 3,
    "changed": false,
    "invocation": {
        "module_args": {
            "daemon_reload": false,
            "enabled": null,
            "force": null,
            "masked": null,
            "name": "docker",
            "no_block": false,
            "scope": null,
            "state": "restarted",
            "user": null
        }
    },
    "msg": "Could not find the requested service docker: host"
}



Expected results:
The docker service is not restarted on nodes where it is not installed.
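
One way to get this behavior is to guard the task with a conditional. The following is only a minimal sketch and is not taken from the linked PR. It assumes the inventory variable openshift_use_crio_only (the one named in the verification comment) is what identifies CRI-O-only nodes. The registered variable name and the delay value are hypothetical.

```yaml
# Sketch only: the real task is in playbooks/openshift-node/private/restart.yml
# and may differ from this.
- name: Restart docker
  systemd:
    name: docker
    state: restarted
  register: docker_restart_result        # hypothetical variable name
  until: docker_restart_result is succeeded
  retries: 3
  delay: 30                              # assumed value
  # Skip on CRI-O-only nodes, where no docker unit exists.
  when: not (openshift_use_crio_only | default(false) | bool)
```

With a guard like this, Ansible reports the task as skipped on CRI-O-only hosts and does not call the systemd module for a unit that is not present.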


Additional info:
Please attach logs from ansible-playbook with the -vvv flag

--- Additional comment from Joseph Callen on 2019-04-03 20:20:55 UTC ---

PR: https://github.com/openshift/openshift-ansible/pull/11456

Comment 1 Joseph Callen 2019-04-04 19:45:24 UTC

PR: https://github.com/openshift/openshift-ansible/pull/11463

Comment 6 Gaoyun Pei 2019-04-18 09:36:58 UTC

Verified this bug with openshift-ansible-3.10.139-1.git.0.02bc5db.el7.noarch.rpm
When nodes are installed with openshift_use_crio_only=true, the redeploy-certificates.yml playbook no longer tries to restart the docker service.

PLAY [Restart nodes] **********************************************************************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
ok: [ec2-34-207-196-240.compute-1.amazonaws.com]

TASK [Restart docker] *********************************************************************************************************************************************************************************************
skipping: [ec2-34-207-196-240.compute-1.amazonaws.com] => {"changed": false, "skip_reason": "Conditional result was False"}

Comment 8 errata-xmlrpc 2019-06-11 09:30:48 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0786