Bug 1857451 - Ansible forks value should have an upper limit and Current Calculation needs to change
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: OSP Team
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-07-15 21:11 UTC by Sai Sindhur Malleni
Modified: 2022-08-10 16:00 UTC

Fixed In Version: python-tripleoclient-12.4.1-2.20210214005004.106048a.el8ost.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-09-15 07:08:45 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 741471 0 None ABANDONED Adjust the usage of Ansible forks 2021-02-08 13:32:37 UTC
OpenStack gerrit 748718 0 None MERGED Expose --ansible-forks 2021-02-08 13:32:37 UTC
OpenStack gerrit 751097 0 None MERGED Adjust Ansible forks caculations 2021-02-08 13:32:38 UTC
OpenStack gerrit 765855 0 None MERGED Expose --ansible-forks 2021-02-08 13:32:37 UTC
Red Hat Issue Tracker OSP-6787 0 None None None 2022-08-10 16:00:09 UTC
Red Hat Product Errata RHEA-2021:3483 0 None None None 2021-09-15 07:09:09 UTC

Description Sai Sindhur Malleni 2020-07-15 21:11:03 UTC
Description of problem:
Currently, 10 * CPU_COUNT forks are configured by default in ansible.cfg. On a 64-core undercloud this sets forks to 640, and when the user does not pass the --limit option to Ansible, the playbook runs against all existing nodes. At large scale (600+ nodes), we have seen Ansible consume more than 230 GB of RSS memory.

Link to ansible memory usage with so many forks: https://snapshot.raintank.io/dashboard/snapshot/zKg6pZnP1m6zHqHYDQdpXwRRS01zF4fc?orgId=2

The peak occurs when Ansible runs against 630 overcloud nodes.

We need to:
1. Change the default calculation we currently have to reduce the number of forks by default
2. Place an upper limit on the number of forks, irrespective of the number of cores on the undercloud
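The capped calculation requested above could look like the following sketch. The scaling factor and the MAX_FORKS cap are illustrative assumptions, not the exact values merged upstream:

```python
import multiprocessing

# Assumed upper bound on forks, independent of undercloud core count.
MAX_FORKS = 100


def calculate_forks(requested=None):
    """Return an Ansible forks value that never exceeds MAX_FORKS.

    If the operator requests an explicit value, honor it up to the cap;
    otherwise scale with CPU count (10 * cores, as in the old default)
    but clamp the result.
    """
    if requested:
        return min(requested, MAX_FORKS)
    return min(multiprocessing.cpu_count() * 10, MAX_FORKS)
```

With this approach a 64-core undercloud would get at most 100 forks instead of 640, bounding Ansible's per-fork memory footprint.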


Version-Release number of selected component (if applicable):
16.1

How reproducible:
100% at large scale

Steps to Reproduce:
1. Have enough overcloud nodes and an undercloud node with a lot of CPUs
2. Run the config-download ansible playbooks with default ansible.cfg
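As a workaround until the fix lands (the linked gerrit changes also expose an --ansible-forks option), the computed default can be overridden by setting forks explicitly in ansible.cfg. The value 25 below is illustrative; `forks` is a standard `[defaults]` key:

```ini
[defaults]
forks = 25
```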

Actual results:
Ansible consumes almost all the memory on the undercloud

Expected results:
Ansible shouldn't consume so many resources

Additional info:

Comment 13 errata-xmlrpc 2021-09-15 07:08:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:3483

