Bug 1312203

Summary: openshift-ansible gets stuck when running on the host to be deployed
Product: OpenShift Container Platform
Reporter: Gan Huang <ghuang>
Component: Installer
Assignee: Jason DeTiberus <jdetiber>
Status: CLOSED NOTABUG
QA Contact: Ma xiaoqiang <xiama>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.1.0
CC: aos-bugs, jokerman, mmccomas, xtian
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-26 16:19:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Gan Huang 2016-02-26 05:39:43 UTC
Description of problem:
When openshift-ansible is run on a host that is itself being deployed, it gets stuck at the "Start and enable iptables service" task. There is an existing bug (https://bugzilla.redhat.com/show_bug.cgi?id=1274201) about this for atomic-openshift-installer, but it is not fixed in openshift-ansible.

Version-Release number of selected component (if applicable):
https://github.com/openshift/openshift-ansible.git -b master
(Feb 26, 2016)

How reproducible:
Always

Steps to Reproduce:
1. Run openshift-ansible on the host (the master in the QE environment) that is to be deployed.

Actual results:
TASK: [os_firewall | Start and enable iptables service] *********************** 

Then there is no response for a long time. After several hours, it fails with the following error message:

fatal: [openshift-138.lab.eng.nay.redhat.com] => SSH Error: Shared connection to openshift-138.lab.eng.nay.redhat.com closed.
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/root/config.retry

localhost                  : ok=10   changed=0    unreachable=0    failed=0   
openshift-138.lab.eng.nay.redhat.com : ok=26   changed=5    unreachable=1    failed=0  


Expected results:
The installation completes successfully.

Additional info:
After pressing "Ctrl+C" to interrupt the stuck run and re-running openshift-ansible, the tasks execute smoothly.

Comment 1 Jason DeTiberus 2016-02-26 16:19:57 UTC
While not ideal, this happens because the host is connecting to itself over SSH. When creating an Ansible inventory that includes the local system, you will need to override ansible_connection for that system.

In this case I suspect you will want the following:
[masters]
openshift-138.lab.eng.nay.redhat.com ansible_connection=local <other vars>...

[nodes]
openshift-138.lab.eng.nay.redhat.com ansible_connection=local <other vars>...
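
If the inventory is generated by a script rather than written by hand, the override above could be applied automatically. The following is only a rough sketch, not part of openshift-ansible: the helper names inventory_line and render_inventory are hypothetical, and it assumes the control host can be recognized by comparing inventory names against the local hostname and FQDN, which may not hold in every environment.

```python
import socket


def inventory_line(host, local_names=None, extra_vars=""):
    """Build one inventory line for `host`.

    Hypothetical helper: adds ansible_connection=local when `host`
    is one of the names of the machine running Ansible.
    """
    if local_names is None:
        # Assumption: the local machine is identified by these names.
        local_names = {"localhost", socket.gethostname(), socket.getfqdn()}
    parts = [host]
    if host in local_names:
        parts.append("ansible_connection=local")
    if extra_vars:
        parts.append(extra_vars)
    return " ".join(parts)


def render_inventory(groups, local_names=None):
    """Render a dict of {group: [hosts]} as INI-style inventory text."""
    lines = []
    for group, hosts in groups.items():
        lines.append(f"[{group}]")
        lines.extend(inventory_line(h, local_names) for h in hosts)
        lines.append("")  # blank line between groups
    return "\n".join(lines)


if __name__ == "__main__":
    me = "openshift-138.lab.eng.nay.redhat.com"
    # Pass the local name explicitly so the example does not depend
    # on the hostname of the machine it is run on.
    print(render_inventory({"masters": [me], "nodes": [me]}, local_names={me}))
```

Passing local_names explicitly, as in the example, avoids relying on hostname detection; any additional per-host variables would go in extra_vars.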