Bug 2251191 - Heat stack fails while scaling from 850 to 950 nodes with TimeoutError: resources[192].resources.NovaCompute: QueuePool limit of size 5 overflow 64 reached, connection timed out
Summary: Heat stack fails while scaling from 850 to 950 nodes with TimeoutError: reso...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-heat
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Rabi Mishra
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-11-23 11:28 UTC by Asma Syed Hameed
Modified: 2024-12-16 19:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-12-16 19:20:36 UTC
Target Upstream Version:
Embargoed:


Links
System                 ID         Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  OSP-30543  0        None      None    None     2023-11-23 11:31:57 UTC

Description Asma Syed Hameed 2023-11-23 11:28:31 UTC
Description of problem:

While scaling out from 850 to 950 virtual compute nodes (with a bare-metal undercloud, 3 bare-metal controllers, and 7 bare-metal Ceph nodes), the stack creation fails with the following errors:

2023-11-23 09:23:34.833 23 INFO heat.engine.stack [req-03d92752-90b6-46c8-954f-69258a45ca36 admin admin - - -] Stack UPDATE FAILED (overcloud-ComputeR4-lcuju6fhmd2p): Resource CREATE failed: TimeoutError: resources[192].resources.NovaCompute: QueuePool limit of size 5 overflow 64 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/13/3o7r)
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource Traceback (most recent call last):
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource   File "/usr/lib/python3.9/site-packages/heat/engine/resource.py", line 916, in _action_recorder
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource     yield
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource   File "/usr/lib/python3.9/site-packages/heat/engine/resource.py", line 1028, in _do_action
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource     yield from self.action_handler_task(action, args=handler_args)
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource   File "/usr/lib/python3.9/site-packages/heat/engine/resource.py", line 978, in action_handler_task
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource     done = check(handler_data)
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource   File "/usr/lib/python3.9/site-packages/heat/engine/resources/openstack/heat/resource_group.py", line 433, in check_create_complete
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource     if not checker.step():
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource   File "/usr/lib/python3.9/site-packages/heat/engine/scheduler.py", line 210, in step
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource     poll_period = next(self._runner)
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource   File "/usr/lib/python3.9/site-packages/heat/engine/resources/openstack/heat/resource_group.py", line 441, in _run_to_completion
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource     while not super(ResourceGroup,
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource   File "/usr/lib/python3.9/site-packages/heat/engine/resources/stack_resource.py", line 545, in check_update_complete
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource     return self._check_status_complete(target_action,
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource   File "/usr/lib/python3.9/site-packages/heat/engine/resources/stack_resource.py", line 450, in _check_status_complete
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource     raise exception.ResourceFailure(status_reason, self,
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource heat.common.exception.ResourceFailure: resources.ComputeR4: Resource CREATE failed: TimeoutError: resources[192].resources.NovaCompute: QueuePool limit of size 5 overflow 64 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/13/3o7r)
2023-11-23 09:23:36.945 9 ERROR heat.engine.resource
2023-11-23 09:23:38.723 9 INFO heat.engine.stack [req-03d92752-90b6-46c8-954f-69258a45ca36 admin admin - - -] Stack CREATE FAILED (overcloud): Resource CREATE failed: resources.ComputeR4: Resource CREATE failed: TimeoutError: resources[192].resources.NovaCompute: QueuePool limit of size 5 overflow 64 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/13/3o7r)
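
For readers unfamiliar with the error text: in SQLAlchemy's QueuePool, "size" is the number of persistent connections and "overflow" is the number of extra connections allowed on top of that, so a single pool can hold at most size + overflow connections at once. "timeout" is how many seconds a caller waits for a free connection before the TimeoutError is raised. In oslo.db based services these values usually come from the [database] options max_pool_size, max_overflow, and pool_timeout, although this report does not show the configuration in use. The sketch below only does the arithmetic on the message from the log; parse_pool_error is a hypothetical helper, not part of Heat or SQLAlchemy.

```python
import re

# Error text copied from the log above.
ERROR = ("QueuePool limit of size 5 overflow 64 reached, "
         "connection timed out, timeout 30")

# Matches the three numbers in the message; the timeout may be printed
# as an integer or as a decimal depending on the SQLAlchemy version.
PATTERN = re.compile(
    r"QueuePool limit of size (?P<size>\d+) overflow (?P<overflow>\d+) "
    r"reached, connection timed out, timeout (?P<timeout>\d+(?:\.\d+)?)"
)


def parse_pool_error(message):
    """Hypothetical helper: extract pool limits from a QueuePool timeout message."""
    match = PATTERN.search(message)
    if match is None:
        return None
    size = int(match["size"])
    overflow = int(match["overflow"])
    return {
        "pool_size": size,
        "max_overflow": overflow,
        "timeout_s": float(match["timeout"]),
        # Upper bound on simultaneously checked-out connections per pool.
        "max_connections": size + overflow,
    }


info = parse_pool_error(ERROR)
print(info["max_connections"])  # 5 + 64 = 69
```

In other words, the failing process already had 69 connections checked out and a further request waited 30 seconds without any of them being returned.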


Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230802.n.1

How reproducible:
100%

Steps to Reproduce:
1. Deploy 800+ nodes
2. Scale out to 950 nodes


Actual results:
Stack creation failed

Expected results:
Stack creation successful
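
As a rough illustration of why a scale-out can tip a bounded pool into this failure, the toy model below uses only the Python standard library. It is not SQLAlchemy's QueuePool and makes no claim about how Heat actually issues database requests; the worker counts and the short 0.05 second timeout (instead of the 30 seconds in the log) are assumptions chosen so the example runs quickly. Each worker tries to take one of size + overflow slots and keeps it until every worker has either obtained a slot or timed out.

```python
import threading


def count_timeouts(workers, size=5, overflow=64, timeout=0.05):
    """Toy model: how many of `workers` concurrent callers fail to get a slot."""
    slots = threading.BoundedSemaphore(size + overflow)  # pool capacity
    settled = threading.Semaphore(0)  # released once per worker outcome
    done = threading.Event()          # tells slot holders to release
    lock = threading.Lock()
    timed_out = 0

    def worker():
        nonlocal timed_out
        if slots.acquire(timeout=timeout):
            settled.release()
            done.wait()      # hold the "connection" until everyone has settled
            slots.release()
        else:
            with lock:
                timed_out += 1
            settled.release()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for _ in range(workers):
        settled.acquire()    # wait until every worker has a slot or has given up
    done.set()
    for thread in threads:
        thread.join()
    return timed_out


print(count_timeouts(69), count_timeouts(100))  # prints "0 31"
```

With 69 concurrent holders the pool is exactly full and nobody fails; with 100, the 31 callers beyond capacity time out. If the real workload behaves similarly, either fewer concurrent database users or a larger pool would be needed, but confirming that requires data beyond what this report contains.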

