Bug 2224177 - [RFE]overcloud node provision doesn't execute network configuration in parallel [NEEDINFO]
Summary: [RFE]overcloud node provision doesn't execute network configuration in parallel
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 17.1 (Wallaby)
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: OSP Team
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks: 2222869
Reported: 2023-07-20 06:57 UTC by Keigo Noha
Modified: 2023-08-16 08:58 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:
hjensas: needinfo? (knoha)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-26759 0 None None None 2023-07-20 07:01:33 UTC

Description Keigo Noha 2023-07-20 06:57:55 UTC
Description of problem:
The current python-tripleoclient runs the OS deployment in parallel, according to the concurrency option.
~~~
class ProvisionNode(command.Command):
:
    def take_action(self, parsed_args):
:
        extra_vars = {
            "stack_name": parsed_args.stack,
            "baremetal_deployment": roles,
            "baremetal_deployed_path": output_path,
            "ssh_public_keys": ssh_key,
            "ssh_private_key_file": key,
            "ssh_user_name": parsed_args.overcloud_ssh_user,
            "node_timeout": parsed_args.timeout,
            "concurrency": parsed_args.concurrency,
            "manage_network_ports": True,
            "configure_networking": parsed_args.network_config,
            "working_dir": working_dir,
            "templates": parsed_args.templates,
            "overwrite": overwrite,
        }

        with oooutils.TempDirs() as tmp:
            oooutils.run_ansible_playbook(
                playbook='cli-overcloud-node-provision.yaml',
                inventory='localhost,',
                workdir=tmp,
                playbook_dir=constants.ANSIBLE_TRIPLEO_PLAYBOOKS,
                verbosity=oooutils.playbook_verbosity(self=self),
                extra_vars=extra_vars,
            )
~~~

However, the later code, which configures networking, is run per role.
~~~
        oooutils.run_role_playbooks(self, working_dir, roles_file_dir,
                                    roles, parsed_args.network_config)
~~~

A spine-leaf environment will have many custom roles for the leaves.
With this implementation, the execution time grows in direct proportion to the number of roles.

To reduce node provisioning time, can we run this process in parallel, with a limit on the number of parallel executions at a time to prevent resource starvation?
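
The request can be pictured as a bounded worker pool with one task per role. This is only a minimal sketch of the idea, not tripleoclient code: `run_for_role` is a hypothetical stand-in for whatever per-role work is done, the role names are made up, and `concurrency` is assumed to act as the cap on simultaneous runs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_for_role(role_name):
    # Hypothetical placeholder for the per-role playbook run.
    return "configured {}".format(role_name)


def run_roles_in_parallel(role_names, concurrency):
    # max_workers caps how many roles are processed at the same time,
    # which is what prevents resource starvation.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() yields results in the same order as role_names.
        return list(pool.map(run_for_role, role_names))


# Example role names are illustrative only.
results = run_roles_in_parallel(
    ["ComputeLeaf1", "ComputeLeaf2", "ComputeLeaf3"], concurrency=2)
print(results)
```

With `concurrency=2`, at most two roles are handled at once, and the third starts as soon as a worker is free.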

Comment 2 Harald Jensås 2023-08-15 09:18:21 UTC
I think this bug is invalid?
The 'cli-overcloud-node-network-config.yaml' playbook runs outside the role loop without any limit on which roles it executes on, so Ansible will run it in parallel on all nodes in all roles.

However, the growvols play and any extra playbooks added by the operator do run per role, with per-role "extra_vars".


https://opendev.org/openstack/python-tripleoclient/src/branch/stable/wallaby/tripleoclient/utils.py#L2837-L2872

2837 def run_role_playbooks(self, working_dir, roles_file_dir, roles,
2838                        network_config=True):
2839     inventory_file = os.path.join(working_dir,
2840                                   'tripleo-ansible-inventory.yaml')
2841     with open(inventory_file, 'r') as f:
2842         inventory = yaml.safe_load(f.read())
2843         
2844     growvols_play = 'cli-overcloud-node-growvols.yaml'
2845     growvols_path = rel_or_abs_path_role_playbook(
2846         constants.ANSIBLE_TRIPLEO_PLAYBOOKS, growvols_play)
2847         
2848     # Pre-Network Config
2849     for role in roles:
2850         if role.get('count', 1) == 0:
2851             continue
2852         
2853         role_playbooks = []
2854                     
2855         for x in role.get('ansible_playbooks', []):
2856             role_playbooks.append(x['playbook'])
2857         
2858             run_role_playbook(self, inventory, roles_file_dir, x['playbook'],
2859                               limit_hosts=role['name'],
2860                               extra_vars=x.get('extra_vars', {}))
2861                 
2862         if growvols_path not in role_playbooks:
2863             # growvols was not run with custom extra_vars, run it with defaults
2864             run_role_playbook(self, inventory,
2865                               constants.ANSIBLE_TRIPLEO_PLAYBOOKS,
2866                               growvols_play,
2867                               limit_hosts=role['name'])
2868 
2869     if network_config:
2870         # Network Config
2871         run_role_playbook(self, inventory, constants.ANSIBLE_TRIPLEO_PLAYBOOKS,
2872                           'cli-overcloud-node-network-config.yaml')
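
To make the control flow above easier to follow, here is a simplified model that only records which playbook would run against which host limit. It is a sketch under assumptions, not the real function: `planned_runs` is a hypothetical name, it compares playbook names directly instead of resolving paths with `rel_or_abs_path_role_playbook`, and the sample roles are invented.

```python
GROWVOLS_PLAY = 'cli-overcloud-node-growvols.yaml'
NETWORK_CONFIG_PLAY = 'cli-overcloud-node-network-config.yaml'


def planned_runs(roles, network_config=True):
    """Return (playbook, limit_hosts) pairs in the order they would run."""
    runs = []
    for role in roles:
        # Roles with a count of zero are skipped entirely.
        if role.get('count', 1) == 0:
            continue
        playbooks = [x['playbook'] for x in role.get('ansible_playbooks', [])]
        # Operator-supplied playbooks run once per role, limited to that role.
        for playbook in playbooks:
            runs.append((playbook, role['name']))
        # growvols runs with defaults unless the operator already listed it.
        if GROWVOLS_PLAY not in playbooks:
            runs.append((GROWVOLS_PLAY, role['name']))
    if network_config:
        # A single run with no host limit, covering all roles at once.
        runs.append((NETWORK_CONFIG_PLAY, None))
    return runs


# Invented sample roles for illustration.
roles = [
    {'name': 'Leaf1'},
    {'name': 'Leaf2', 'count': 0},
    {'name': 'Leaf3', 'ansible_playbooks': [{'playbook': 'custom.yaml'}]},
]
runs = planned_runs(roles)
for playbook, limit in runs:
    print(playbook, limit)
```

In this model the network config playbook appears exactly once, with no limit, while growvols and the operator playbooks appear once per active role.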

