Bug 1272226

Summary: ansible step unreliable
Product: OpenShift Container Platform
Reporter: Aleksandar Kostadinov <akostadi>
Component: Installer
Assignee: Jason DeTiberus <jdetiber>
Status: CLOSED NOTABUG
QA Contact: Ma xiaoqiang <xiama>
Severity: low
Docs Contact:
Priority: low
Version: 3.0.0
CC: aos-bugs, bleanhar, jokerman, mmccomas, xtian
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-01-12 16:13:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 2 Brenton Leanhardt 2015-10-16 14:45:26 UTC
I don't think we have enough information to reproduce this.  I feel like we can require network connectivity for ansible to work.  Please reopen this if you see it again and know a way to reproduce it.

Comment 3 Aleksandar Kostadinov 2015-10-16 19:26:18 UTC
At the moment I'm a total Ansible newbie and don't have time to debug it. But I see it is capable of failing early on some errors and of retrying operations. I hoped somebody who knows our setup would be able to make the setup steps more reliable. Setting up 3 instances normally takes ~50 minutes, sometimes longer, and it's normal for network glitches to happen during that period. It saves a lot of time when environment setup is reliable.

Anyway, leaving it like that for the time being unless somebody can chime in.

Comment 4 Brenton Leanhardt 2015-10-16 19:38:33 UTC
It looks like it could be an sshd configuration problem.  I wouldn't expect it to fail randomly, though.  Maybe try the suggestion here:

http://stackoverflow.com/questions/23899028/ansible-failed-to-transfer-file-to
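The suggestion linked above generally amounts to forcing Ansible to use SCP instead of SFTP for file transfers, which works around some sshd setups where the SFTP subsystem is the part that fails. A minimal sketch of the relevant ansible.cfg fragment (assuming the failures are in the file-transfer step; this is the workaround from the linked answer, not a confirmed fix for this bug):

```ini
# ansible.cfg — fall back to SCP when SFTP transfers are unreliable
[ssh_connection]
scp_if_ssh = True
```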

In any case, I don't think this has anything to do with our playbook.  Could you report your Ansible version?

I'm going to reopen this.  I don't think it's a release blocker though.  If anything, we'll work with Ansible to get this fixed. :)

Comment 5 Aleksandar Kostadinov 2015-10-16 21:24:32 UTC
I don't think it's a configuration issue, since repeating the same thing again just worked (I'm running against newly launched OpenStack instances, so they are always in a clean state before Ansible runs).
I think it was a network glitch or who knows what.
The only fact is that the file transfer failed and Ansible didn't immediately stop; it didn't retry either (as far as I can tell).

I *thought* it could be made more reliable by tuning the steps we do. As I said, I can't help right now, unfortunately.
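For reference, individual Ansible tasks can be made to retry transient failures with `retries`/`delay`/`until`; a minimal sketch of what such a tuned step might look like (the task name and paths are hypothetical, not taken from the actual playbooks, and the `succeeded` test syntax assumes a reasonably recent Ansible):

```yaml
# Hypothetical example of retrying a flaky file transfer instead of
# failing the whole run on the first network glitch.
- name: Copy config file (retry on transient failures)
  copy:
    src: files/example.conf   # hypothetical source path
    dest: /etc/example.conf   # hypothetical destination
  register: copy_result
  retries: 3
  delay: 10
  until: copy_result is succeeded
```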

So feel free to handle this as you see fit. I'm just starting with frequent environment creation, so I guess after a few weeks I'll have some statistics. Definitely not a release blocker!