Bug 1272226

Summary: ansible step unreliable
Product: OpenShift Container Platform
Reporter: Aleksandar Kostadinov <akostadi>
Component: Installer
Assignee: Jason DeTiberus <jdetiber>
Status: CLOSED NOTABUG
QA Contact: Ma xiaoqiang <xiama>
Severity: low
Docs Contact:
Priority: low
Version: 3.0.0
CC: aos-bugs, bleanhar, jokerman, mmccomas, xtian
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-01-12 16:13:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 2 Brenton Leanhardt 2015-10-16 14:45:26 UTC
I don't think we have enough information to reproduce this.  I feel like we can require network connectivity for ansible to work.  Please reopen this if you see it again and know a way to reproduce it.

Comment 3 Aleksandar Kostadinov 2015-10-16 19:26:18 UTC
At the moment I'm a total Ansible newbie and don't have time to debug it. But I see it is capable of failing early on some errors and of retrying operations. I hoped somebody who knows our setup would be able to make the setup steps more reliable. Setting up 3 instances normally takes ~50 minutes, sometimes longer, and it's normal for network glitches to happen during that period. It saves a lot of time when environment setup is reliable.

Anyway, leaving it like that for the time being unless somebody can chime in.

Comment 4 Brenton Leanhardt 2015-10-16 19:38:33 UTC
It looks like it could be an sshd configuration problem.  I wouldn't expect it to fail randomly, though.  Maybe try the suggestion here:

http://stackoverflow.com/questions/23899028/ansible-failed-to-transfer-file-to
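The suggestion linked above generally amounts to forcing Ansible to use SCP instead of SFTP for file transfers, which works around some sshd setups where the SFTP subsystem is the part that fails. A minimal sketch of the relevant ansible.cfg fragment (assuming the failures are in the file-transfer step; this is the workaround from the linked answer, not a confirmed fix for this bug):

```ini
# ansible.cfg — fall back to SCP when SFTP transfers are unreliable
[ssh_connection]
scp_if_ssh = True
```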

In any case, I don't think this has anything to do with our playbook.  Could you report your Ansible version?

I'm going to reopen this.  I don't think it's a release blocker though.  If anything, we'll work with Ansible to get this fixed. :)

Comment 5 Aleksandar Kostadinov 2015-10-16 21:24:32 UTC
I don't think it's a configuration issue, since repeating the same thing again just worked (I'm running against newly launched OpenStack instances, so they are always in a clean state before Ansible runs).
I think it was a network glitch or who knows what.
The only fact is that the file transfer failed and Ansible didn't immediately stop; it didn't retry either (as far as I can tell).

I *thought* it could be made more reliable by tuning the steps we do. As I said, I can't help right now, unfortunately.
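For reference, individual Ansible tasks can be made to retry transient failures with `retries`/`delay`/`until`; a minimal sketch of what such a tuned step might look like (the task name and paths are hypothetical, not taken from the actual playbooks, and the `succeeded` test syntax assumes a reasonably recent Ansible):

```yaml
# Hypothetical example of retrying a flaky file transfer instead of
# failing the whole run on the first network glitch.
- name: Copy config file (retry on transient failures)
  copy:
    src: files/example.conf   # hypothetical source path
    dest: /etc/example.conf   # hypothetical destination
  register: copy_result
  retries: 3
  delay: 10
  until: copy_result is succeeded
```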

So feel free to handle this as you see fit. I'm just starting with frequent environment creation, so I guess after a few weeks I'll have some statistics. Definitely not a release blocker!