Bug 1451693

Summary: Reinstall failed at "openshift_master : Start and enable master" because the systemd daemon was not reloaded
Product: OpenShift Container Platform Reporter: Gan Huang <ghuang>
Component: Installer Assignee: Tim Bielawa <tbielawa>
Status: CLOSED ERRATA QA Contact: Gan Huang <ghuang>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.6.0 CC: aos-bugs, jokerman, mmccomas
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: Older logic was missing a condition under which the systemd unit files should be reloaded.
Consequence: Updated or changed service unit files were not detected.
Fix: The Ansible installer master/node roles were updated to ensure that the 'reload system units' action is triggered.
Result: Updated service unit files are correctly detected. Users no longer receive a 'Could not find the requested service' error.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-10 05:24:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Gan Huang 2017-05-17 10:09:18 UTC
Description of problem:
Reinstall (uninstall first, then retrigger the install) failed at task "openshift_master : Start and enable master" because the systemd daemon was not reloaded.

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.68-1.git.0.9cbe2b7.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Trigger a containerized install, then uninstall the cluster
2. Run the BYO playbook to reinstall the same cluster

Actual results:

<--snip-->

TASK [openshift_master : Create the systemd unit files] ************************
changed: [openshift-143.lab.sjc.redhat.com]

TASK [openshift_master : debug] ************************************************
ok: [openshift-143.lab.sjc.redhat.com] => {
    "create_master_unit_file": {
        "changed": true, 
        "checksum": "c9cc1f43779bce480e5aeb981abc23cfdf831eb0", 
        "dest": "/etc/systemd/system/atomic-openshift-master.service", 
        "gid": 0, 
        "group": "root", 
        "md5sum": "2fd439b6d5c712e4128c2941f9a4ea31", 
        "mode": "0644", 
        "owner": "root", 
        "secontext": "system_u:object_r:systemd_unit_file_t:s0", 
        "size": 794, 
        "src": "/root/.ansible/tmp/ansible-tmp-1495014933.6-155179993919746/source", 
        "state": "file", 
        "uid": 0
    }
}

TASK [openshift_master : Install Master service file] **************************
skipping: [openshift-143.lab.sjc.redhat.com]

TASK [openshift_master : debug] ************************************************
ok: [openshift-143.lab.sjc.redhat.com] => {
    "create_master_unit_file": {
        "changed": false, 
        "skip_reason": "Conditional check failed", 
        "skipped": true
    }
}

TASK [openshift_master : command] **********************************************
skipping: [openshift-143.lab.sjc.redhat.com]

<--snip-->

TASK [openshift_master : Start and enable master] ******************************
FAILED - RETRYING: TASK: openshift_master : Start and enable master (1 retries left).
fatal: [openshift-143.lab.sjc.redhat.com]: FAILED! => {
    "attempts": 1, 
    "changed": false, 
    "failed": true
}

MSG:

Could not find the requested service atomic-openshift-master: cannot enable


NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************


Expected results:
No errors

Additional info:

Workaround: log in to the master host and run `systemctl daemon-reload` manually. After that, the atomic-openshift-master service can be restarted and enabled successfully.
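
The manual workaround can be sketched as a small script. This is only an illustration: the function names and the dry-run default are hypothetical, and the order of `enable` before `restart` is an assumption, since the report only says "restart/enable". The service name is taken from the error message above.

```python
import subprocess

# Service name taken from the error message in this report.
SERVICE = "atomic-openshift-master"


def workaround_commands(service=SERVICE):
    """Return the systemctl commands for the manual workaround.

    The enable-then-restart order is an assumption; the report only
    states that both succeed once the daemon has been reloaded.
    """
    return [
        ["systemctl", "daemon-reload"],      # make systemd re-read unit files
        ["systemctl", "enable", service],
        ["systemctl", "restart", service],
    ]


def apply_workaround(service=SERVICE, dry_run=True):
    """Print the commands (dry run) or execute them on the master host."""
    for cmd in workaround_commands(service):
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError if a command fails.
            subprocess.run(cmd, check=True)


# Dry run by default, so nothing is changed on the host.
apply_workaround()
```

Calling `apply_workaround(dry_run=False)` as root on the master host would actually run the commands.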

The master's `systemctl daemon-reload` task was not executed because `create_master_unit_file` was overwritten with `"changed": false` by the skipped task `Install Master service file`.
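
The overwrite described above can be modelled with a short simulation. This is a rough sketch, not the installer's code: the `run_task` helper, the `first_condition` flag, and the second set of variable names are hypothetical, and using distinct registered names is only one possible remedy, not necessarily the fix that shipped. The skipped-task result mirrors the debug output in the log above.

```python
def run_task(registry, name, condition, result):
    """Simulate a task that registers its result under `name`.

    As the debug output in this report shows, a skipped task still
    writes a result to the registered variable.
    """
    if condition:
        registry[name] = dict(result)
    else:
        registry[name] = {
            "changed": False,
            "skip_reason": "Conditional check failed",
            "skipped": True,
        }
    return registry[name]


def needs_daemon_reload(registry, name):
    """The reload step only runs when the registered result changed."""
    return bool(registry.get(name, {}).get("changed", False))


# Hypothetical flag: the first task runs, the second one is skipped.
first_condition = True

# Buggy pattern: both tasks register the same variable name.
registry = {}
run_task(registry, "create_master_unit_file", first_condition, {"changed": True})
run_task(registry, "create_master_unit_file", not first_condition, {"changed": True})
buggy_reload = needs_daemon_reload(registry, "create_master_unit_file")

# One possible remedy (an assumption): register under distinct names
# and reload if either task reported a change.
fixed = {}
run_task(fixed, "unit_file_first", first_condition, {"changed": True})
run_task(fixed, "unit_file_second", not first_condition, {"changed": True})
fixed_reload = any(needs_daemon_reload(fixed, n) for n in fixed)

print("buggy pattern reloads:", buggy_reload)
print("distinct names reload:", fixed_reload)
```

In the buggy pattern the skipped task's result replaces the earlier `"changed": true`, so the reload condition evaluates to false even though a unit file was written.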

Comment 3 Gan Huang 2017-06-13 05:02:45 UTC
Verified with openshift-ansible-3.6.98-1.git.0.e651d65.el7.noarch.rpm

Comment 5 errata-xmlrpc 2017-08-10 05:24:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716