Bug 1645725 - docker_creds.py module has a short timeout
Summary: docker_creds.py module has a short timeout
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.11.0
Hardware: x86_64
OS: Linux
Target Milestone: ---
Target Release: 3.11.z
Assignee: Patrick Dillon
QA Contact: Gaoyun Pei
Depends On:
Reported: 2018-11-03 02:32 UTC by Gerald
Modified: 2019-10-18 01:34 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2019-10-18 01:34:36 UTC
Target Upstream Version:

Attachments
host used to deploy the lab env (10.41 KB, application/octet-stream)
2018-11-03 02:32 UTC, Gerald
Ansible debug (8.94 KB, text/plain)
2018-11-03 02:33 UTC, Gerald

System | ID | Priority | Status | Summary | Last Updated
Github | openshift openshift-ansible pull 11922 | 'None' | closed | Bug 1645725: Increase timeout for Docker images and creds | 2020-03-23 23:29:40 UTC
Red Hat Product Errata | RHBA-2019:3139 | None | None | None | 2019-10-18 01:34:58 UTC

Description Gerald 2018-11-03 02:32:56 UTC
Created attachment 1500752
host used to deploy the lab env

Description of problem:

Installing OpenShift on a lab server with network latency caused the prerequisites.yml playbook to fail. I ran the failed command[1] manually and it took 10.5 seconds to respond, while the defined timeout is 10 seconds.
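
As a rough illustration of the failure mode described above (a command that needs slightly longer than its time limit), here is a minimal Python sketch. It is not the code from docker_creds.py: the helper names run_with_timeout and slow_command are hypothetical, and a sleeping child process stands in for the registry call. The durations are scaled down so the demo finishes quickly.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd; return True if it finished in time, False if it timed out.

    subprocess.run kills the child and raises TimeoutExpired once the
    limit passes, which is the situation reported here (10.5 s of work
    under a 10 s limit).
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_seconds,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        return False


def slow_command(seconds):
    """Hypothetical stand-in for a slow registry call: a child that sleeps."""
    return [sys.executable, "-c", "import time; time.sleep(%s)" % seconds]


if __name__ == "__main__":
    # Scaled-down version of the report: the work outlasts the limit.
    print("tight limit ok:", run_with_timeout(slow_command(2), 0.5))
    # With a generous limit the same kind of command completes.
    print("generous limit ok:", run_with_timeout(slow_command(0), 30))
```

The point of the sketch is only that a fixed limit turns a small amount of extra latency into a hard failure, so the limit needs headroom over the slowest expected network.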

Version-Release number of selected component (if applicable):

Steps to Reproduce:
1. Install openshift-ansible.
2. Run the prerequisites.yml playbook.
3. Observe that the "Create credentials for oreg_url" task fails.
Actual results:

Expected results:

Additional info:

Comment 1 Gerald 2018-11-03 02:33:41 UTC
Created attachment 1500765
Ansible debug

Comment 3 Chris Murphy 2019-09-12 17:17:59 UTC
We are currently giving our users these instructions to work around this issue:

Once install finishes, ssh to master-0 and run the following commands:
sudo sed 's/default=20/default=60/' -i openshift-ansible/roles/lib_utils/library/docker_creds.py
sudo sed 's/timeout 30/timeout 60/' -i openshift-ansible/roles/openshift_health_checker/openshift_checks/docker_image_availability.py
sudo sed 's/timeout 30/timeout 60/' -i openshift-ansible/roles/openshift_health_checker/test/docker_image_availability_test.py

This shows the other two tests with hardcoded timeouts that will fail under the same conditions.
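
For readers who want to see what those sed substitutions do before running them, the sketch below mimics them in Python on in-memory text. The function name bump_timeout and both sample lines are hypothetical, written only for illustration; the real contents of the three files may look different, and this does not modify anything on disk.

```python
def bump_timeout(text, old, new):
    """Replace the first occurrence of old on each line.

    This mirrors sed 's/old/new/' without the g flag. The patterns used
    in the workaround contain no regex metacharacters, so a plain string
    replacement behaves the same way.
    """
    return "\n".join(line.replace(old, new, 1) for line in text.split("\n"))


# Hypothetical sample lines, not copied from the real modules.
creds_sample = "test_timeout=dict(type='int', default=20),"
check_sample = "timeout 30 skopeo inspect docker://example.invalid/image"

if __name__ == "__main__":
    print(bump_timeout(creds_sample, "default=20", "default=60"))
    print(bump_timeout(check_sample, "timeout 30", "timeout 60"))
```

Because the match is a literal string, any other line that happens to contain "default=20" or "timeout 30" would also be rewritten, so it is worth reviewing the changed files after applying the workaround.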


Comment 4 Patrick Dillon 2019-09-23 15:34:57 UTC
Gerald, based on the 10 second timeout and commit version, it looks like your code does not include a bump to the timeout in October: https://github.com/openshift/openshift-ansible/commit/32636fde0e07af35df53f90a03a89a30c1ef7e52 Any chance you can update your version? 

I opened a PR to bump the timeouts based on Chris's suggestions: 

Comment 5 Gerald 2019-09-25 16:36:23 UTC
Thanks Patrick, 

60 seconds worked perfectly! As Chris wrote, I used the same workaround at the time this BZ was reported, which is why I can confirm that 60 seconds is more than enough to rule out slow networks during package downloads.


Comment 7 Gaoyun Pei 2019-10-10 07:24:15 UTC
Verified this bug with openshift-ansible-3.11.152-1.git.0.3e13655.el7.noarch.rpm.

The `skopeo` command took only 2.5s in QE's environment, so the timeout leaves ample headroom for running the checks.

# time skopeo inspect '--creds=xxx' docker://registry.redhat.io/openshift3/ose
    "Name": "registry.redhat.io/openshift3/ose",
    "Digest": "sha256:f4064c56127c75efb83a79e91c3de44f48df930f5a9b9b829bbcfc81ceeffd19",
    "RepoTags": [

real	0m2.593s
user	0m0.080s
sys	0m0.033s

After the PR was applied, no regressions were found.

Comment 9 errata-xmlrpc 2019-10-18 01:34:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

