Bug 1645725
| Field | Value |
|---|---|
| Summary: | docker_creds.py module has a short timeout |
| Product: | OpenShift Container Platform |
| Component: | Installer |
| Installer sub component: | openshift-ansible |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | unspecified |
| Version: | 3.11.0 |
| Target Milestone: | --- |
| Target Release: | 3.11.z |
| Hardware: | x86_64 |
| OS: | Linux |
| Reporter: | Gerald <gschmidt> |
| Assignee: | Patrick Dillon <padillon> |
| QA Contact: | Gaoyun Pei <gpei> |
| CC: | aos-bugs, chmurphy, gpei, gschmidt, jliberma |
| Doc Type: | If docs needed, set a value |
| Story Points: | --- |
| Last Closed: | 2019-10-18 01:34:36 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| Category: | --- |
| oVirt Team: | --- |
| Cloudforms Team: | --- |

Created attachment 1500765 [details]
Ansible debug
We are currently giving our users these instructions to work around this issue:

Once the install finishes, ssh to master-0 and run the following commands:

    sudo sed 's/default=20/default=60/' -i openshift-ansible/roles/lib_utils/library/docker_creds.py
    sudo sed 's/timeout 30/timeout 60/' -i openshift-ansible/roles/openshift_health_checker/openshift_checks/docker_image_availability.py
    sudo sed 's/timeout 30/timeout 60/' -i openshift-ansible/roles/openshift_health_checker/test/docker_image_availability_test.py

This shows the other two tests with hardcoded timeouts that will fail under the same conditions.

Chris

Gerald, based on the 10 second timeout and commit version, it looks like your code does not include the timeout bump made in October: https://github.com/openshift/openshift-ansible/commit/32636fde0e07af35df53f90a03a89a30c1ef7e52

Any chance you can update your version?

I opened a PR to bump the timeouts based on Chris's suggestions: https://github.com/openshift/openshift-ansible/pull/11922

Thanks Patrick, 60 seconds worked perfectly! As Chris wrote, I used the same workaround at the time this bz was reported, which is why I can confirm that 60 seconds is more than enough to cope with slow networks during package downloads.

Gerald

Verified this bug with openshift-ansible-3.11.152-1.git.0.3e13655.el7.noarch.rpm.
The `skopeo` command took only about 2.6s in QE's environment, so the timeout is more than sufficient for running the checks.
# time skopeo inspect '--creds=xxx' docker://registry.redhat.io/openshift3/ose
{
    "Name": "registry.redhat.io/openshift3/ose",
    "Digest": "sha256:f4064c56127c75efb83a79e91c3de44f48df930f5a9b9b829bbcfc81ceeffd19",
    "RepoTags": [
        "v3.5.5.5",
        ...
        "sha256:2f87e3b75838689a5d28de304b2b012888cf2afd00256d89094d706d5dda0cf6",
        "sha256:7bf92aa152acaa30396ee2f099dbf92d906ea5574148629f93d05581e5d3cf3f"
    ]
}
real 0m2.593s
user 0m0.080s
sys 0m0.033s
After the PR was applied, no regressions were found.
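
The failure mode in this report is a command that needs slightly longer than its limit (10.5 seconds against a 10-second timeout), while QE's `skopeo` call finished in about 2.6 seconds. A minimal sketch of that behavior follows, assuming Python's standard `subprocess` timeout rather than the actual logic in `docker_creds.py`, which is not shown in this report. `run_with_timeout` is a hypothetical helper, and the slow command is a stand-in for a registry call with the durations scaled down by a factor of ten.

```python
import subprocess
import sys
import time


def run_with_timeout(cmd, timeout_s):
    """Run cmd with a time limit; return (finished, elapsed_seconds).

    Hypothetical helper for illustration only, not code from openshift-ansible.
    """
    start = time.monotonic()
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        finished = True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        finished = False
    return finished, time.monotonic() - start


if __name__ == "__main__":
    # Stand-in for a slow registry call: sleeps 1.05 s
    # (the report's 10.5 s scaled down by ten).
    slow_cmd = [sys.executable, "-c", "import time; time.sleep(1.05)"]
    # 1.0 s mirrors the old 10 s limit, 6.0 s mirrors the 60 s workaround.
    for limit in (1.0, 6.0):
        finished, elapsed = run_with_timeout(slow_cmd, limit)
        print(f"timeout={limit}s finished={finished} elapsed={elapsed:.2f}s")
```

With the shorter limit the call is cut off just before it would have completed, which matches the reported symptom. The longer limit lets the same command finish with plenty of margin.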
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:3139
Created attachment 1500752 [details]
host used to deploy the lab env

Description of problem:
Installing OpenShift on a lab server with network latency caused the prerequisites.yml playbook to fail. I ran the failed command[1] manually and it took 10.5 seconds to respond, while the defined timeout is 10 seconds.

Version-Release number of selected component (if applicable):
openshift-ansible-3.11.16-1.git.0.4ac6f81.el7.noarch

Steps to Reproduce:
1. Install openshift-ansible
2. Run the prerequisites.yml playbook
3. Expect the "Create credentials for oreg_url" task to fail

Actual results:

Expected results:

Additional info: