Bug 1749406
Summary: | FFWD ansible is slow due to gather facts | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Lukas Bezdicka <lbezdick> |
Component: | openstack-tripleo-common | Assignee: | mathieu bultel <mbultel> |
Status: | CLOSED ERRATA | QA Contact: | Ronnie Rasouli <rrasouli> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | | |
Version: | 13.0 (Queens) | CC: | augol, ccamacho, fsoppels, jfrancoa, johfulto, jpretori, jschluet, ltamagno, mbollo, mbracho, mbultel, mburns, morazi, nchandek, owalsh, sclewis, shdunne, slinaber |
Target Milestone: | z9 | Keywords: | ABIAssurance, Reopened, Triaged, ZStream |
Target Release: | 13.0 (Queens) | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | Triaged | | |
Fixed In Version: | openstack-tripleo-common-8.7.1-2.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1761395 | Environment: | |
Last Closed: | 2019-12-05 10:08:11 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1761395 | | |
Description
Lukas Bezdicka
2019-09-05 14:36:12 UTC
*** Bug 1728215 has been marked as a duplicate of this bug. ***

FYI, implementing fact caching in my testing has improved the performance of the update/upgrade/ffwd-upgrade playbooks. As such, I've proposed https://review.opendev.org/682855 upstream. If it's accepted and merged, it may be a good candidate for backporting. I'm working out some other OSP10->OSP13 ffwd-upgrade issues in my environment; once they're resolved, I'll post some timings.

To validate this, I implemented the following on the undercloud host. It has the same effect and may be useful as a workaround if the patch is not suitable as a backport:

    (undercloud) [stack@undercloud-0 ~]$ sudo mv /etc/ansible/ansible.cfg /etc/ansible/ansible.org.cfg
    (undercloud) [stack@undercloud-0 ~]$ sudo tee /etc/ansible/ansible.cfg <<EOF
    [defaults]
    roles_path = /etc/ansible/roles:/usr/share/ansible/roles
    # improve fact gathering performance
    gathering = smart
    fact_caching = jsonfile
    fact_caching_connection = /var/tmp/ansible_fact_cache
    # two hours timeout
    fact_caching_timeout = 7200
    [inventory]
    [privilege_escalation]
    [paramiko_connection]
    [ssh_connection]
    [persistent_connection]
    [accelerate]
    [selinux]
    [colors]
    [diff]
    EOF

For workaround documentation:

Starting with running openstack overcloud ffwd-upgrade prepare .... :

1) When the prepare finishes (I still suggest running prepare after you restore the Undercloud), run the config download, which saves the config to a tripleo-config-XXXX directory:

    openstack overcloud config download

2) To speed up running each playbook, extract the inventory and store it in a temporary location. It is important that the environment does not change between this extraction and the completion of the process, i.e. do not add or remove nodes until the upgrade is complete.

    ansible-inventory -i /usr/bin/tripleo-ansible-inventory --list --yaml > /tmp/ansible_inventory.yaml

3) Now apply the following configuration to Ansible by changing ansible.cfg:

    [defaults]
    # increased forks for better performance
    forks = 100
    # implement fact caching and a smaller subset of facts gathered for improved performance
    gathering = smart
    gather_subset = !hardware,!facter,!ohai
    fact_caching_connection = /tmp/ansible_fact_cache
    fact_caching = jsonfile
    # expire the fact cache after 2 hours
    fact_caching_timeout = 7200
    # work around ensuring the right modules are found
    library = /usr/share/ansible-modules
    # set the inventory to the extracted inventory location
    inventory = /tmp/ansible_inventory.yaml

    [ssh_connection]
    ssh_extra_args = -o Compression=no -o TCPKeepAlive=yes -o VerifyHostKeyDNS=no -o ForwardX11=no -o ForwardAgent=yes -T

4) Tell Ansible which ansible.cfg file to use:

    export ANSIBLE_CONFIG=<path to ansible.cfg from config-download>

5) Run the fast_forward_upgrade_playbook.yaml playbook from the generated config:

    ansible-playbook -b fast_forward_upgrade_playbook.yaml

At this step your databases are updated and the OpenStack services are stopped and disabled.

6) Run the upgrade for the Controllers:

    ansible-playbook --skip-tags=validation -b upgrade_steps_playbook.yaml deploy_steps_playbook.yaml post_upgrade_steps_playbook.yaml --limit Controller

7) Run the upgrade for one compute. Note that Compute[0] selects the first node in the inventory, which might not be your compute-0. If you want to upgrade a specific node, either find its index number or specify the hostname.

    ansible-playbook --skip-tags=validation -b upgrade_steps_playbook.yaml deploy_steps_playbook.yaml post_upgrade_steps_playbook.yaml --limit Compute[0]

Consider this the last point from which you can return. If this took too long, I suggest reverting; otherwise, continue with the rest of the computes.

8) Upgrade Computes in batches.
The Ansible host pattern [1:10] selects the second through eleventh nodes of the group in the inventory. You can verify which hosts will be targeted by running:

    ansible -m ping --list-hosts Compute[1:10]

Then run the upgrade for that batch:

    ansible-playbook --skip-tags=validation -b upgrade_steps_playbook.yaml deploy_steps_playbook.yaml post_upgrade_steps_playbook.yaml --limit Compute[1:10]

9) Revert ansible.cfg to its original state, unset the ANSIBLE_CONFIG environment variable, and continue with converge.

The package is ready. If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text. If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3794

*** Bug 1761395 has been marked as a duplicate of this bug. ***

*** Bug 1775869 has been marked as a duplicate of this bug. ***
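
For anyone scripting the batching from step 8 of the workaround: a minimal sketch of a helper that prints the host patterns for the remaining computes, assuming (as step 8 describes) that a slice such as Compute[1:10] includes both its first and last index. The function name batch_patterns, the node count of 25 and the batch size of 10 are made-up values for illustration, not part of the workaround or of tripleo. Index 0 is skipped because step 7 already upgraded it.

```shell
#!/bin/sh
# batch_patterns GROUP TOTAL SIZE
# Hypothetical helper: prints one Ansible host pattern per line, covering
# indexes 1 .. TOTAL-1 of GROUP in batches of at most SIZE hosts.
# Assumes both ends of a [start:end] slice are included in the selection.
batch_patterns() {
    group=$1
    total=$2
    size=$3
    start=1
    while [ "$start" -lt "$total" ]; do
        end=$((start + size - 1))
        # Clamp the last batch to the final index of the group.
        if [ "$end" -ge "$total" ]; then
            end=$((total - 1))
        fi
        echo "${group}[${start}:${end}]"
        start=$((end + 1))
    done
}

# Example use (not run here): list the hosts each batch would target.
# The pattern is quoted so the shell does not treat the brackets as a glob.
#   batch_patterns Compute 25 10 | while read -r pattern; do
#       ansible -m ping --list-hosts "$pattern" </dev/null
#   done
```

With 25 computes and a batch size of 10 this prints Compute[1:10], Compute[11:20] and Compute[21:24]. Checking each pattern with --list-hosts before passing it to --limit keeps the verification from step 8 in place.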