Bug 1669194
| Summary: | Sanity Check in upgrade and prerequisite playbook is slow and removed vars check does not work | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Matthew Robson <mrobson> |
| Component: | Installer | Assignee: | Michael Gugino <mgugino> |
| Installer sub component: | openshift-ansible | QA Contact: | Weihua Meng <wmeng> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | gpei, hongkliu, mgugino, mifiedle, wmeng |
| Version: | 3.11.0 | | |
| Target Milestone: | --- | | |
| Target Release: | 3.11.z | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-02-20 14:11:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

PR with a fix: https://github.com/openshift/openshift-ansible/pull/11061

A quick test without and with the fix shows a speedup of more than 2x.
Without fix - 7m 37s
2019-01-23 12:46:39,467 p=39217 u=root | TASK [Run variable sanity checks] **********************************************
2019-01-23 12:54:16,036 p=39217 u=root | ok: [nodename] => {
"changed": false,
"msg": "New Sanity Checks passed"
}
With Fix - 3m 17s
2019-01-23 13:14:57,100 p=71065 u=root | TASK [Run variable sanity checks] **********************************************
2019-01-23 13:18:14,905 p=71065 u=root | ok: [nodename] => {
"changed": false,
"msg": "New Sanity Checks passed"
}
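
The two durations quoted above can be recomputed from the log timestamps. This is only a small sketch for checking the arithmetic; the helper name `elapsed_seconds` is made up for illustration and is not part of openshift-ansible.

```python
from datetime import datetime

# Format of the timestamps in the ansible log lines above,
# e.g. "2019-01-23 12:46:39,467" (the part after the comma is milliseconds).
FMT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Start and end of "Run variable sanity checks", copied from the logs above.
without_fix = elapsed_seconds("2019-01-23 12:46:39,467", "2019-01-23 12:54:16,036")
with_fix = elapsed_seconds("2019-01-23 13:14:57,100", "2019-01-23 13:18:14,905")

print(f"without fix: {without_fix:.1f}s")
print(f"with fix:    {with_fix:.1f}s")
print(f"speedup:     {without_fix / with_fix:.2f}x")
```

This gives roughly 457 seconds without the fix and 198 seconds with it, which matches the 7m 37s and 3m 17s figures and a speedup a little over 2.3x.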

Hi Mike,

I tested with a cluster of 6 glusterfs nodes (3 for the docker registry). For upgrade time, there is no difference between
openshift-ansible-3.11.59-1.git.0.ba8e948.el7.noarch
and
openshift-ansible-3.11.82-1.git.0.f29227a.el7.noarch

Could you help? Thanks.

How many devices / volumes / PVCs do you have? Where we see this issue, there are around 700 volumes in use.

Moving to verified according to comment 10. Thanks for the help, Matthew and Mike.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0326

Description of problem:
The sanity check is taking well over 60 minutes to run.

10:52:07,767 p=127636 u=root | TASK [Run variable sanity checks] *******************************************************************************************
2019-01-21 10:52:07,767 p=127636 u=root | task path: /usr/share/ansible/openshift-ansible/playbooks/init/sanity_checks.yml:14
2019-01-21 12:08:16,698 p=127636 u=root | ok: [nodename] => { "changed": false, "msg": "Sanity Checks passed"}

Additional debugging shows that the OCS nodes account for the majority of the time, and that this time is spent inside check_for_removed_vars.

Version-Release number of the following components:
3.11.59

How reproducible:
Always

Steps to Reproduce:
1. Run upgrade or check, especially with large OCS nodes.

Actual results:
Very slow versus 3.9

Expected results:
Quick execution
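
For readers unfamiliar with what a removed-variables check does, here is a minimal sketch of the idea. It is not the openshift-ansible implementation of check_for_removed_vars and says nothing about why that code was slow. The function name `find_removed_vars` and the variable names in the example are hypothetical.

```python
from typing import Iterable, List, Mapping


def find_removed_vars(host_vars: Mapping[str, object],
                      removed: Iterable[str]) -> List[str]:
    """Return the removed variable names that are still set for a host.

    Each name is tested with a single key lookup, so the cost depends on
    the number of removed names rather than on the size of host_vars.
    """
    return sorted(name for name in removed if name in host_vars)


# Hypothetical inventory variables and removed-variable list.
example_host_vars = {"some_current_var": "3.11", "some_old_var": True}
example_removed = ["some_old_var", "another_old_var"]

still_set = find_removed_vars(example_host_vars, example_removed)
if still_set:
    print("Removed variables still set:", ", ".join(still_set))
```

A real check would typically fail the play with a message listing the offending variables instead of printing them.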