Bug 2209391
| Summary: | After updating to 16.2.5, heat stack-show on the undercloud takes close to 7 minutes and breaks everything | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | David Hill <dhill> |
| Component: | openstack-heat | Assignee: | OSP Team <rhos-maint> |
| Status: | CLOSED NOTABUG | QA Contact: | David Rosenfeld <drosenfe> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 16.2 (Train) | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-05-23 20:45:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
After updating to 16.2.5, heat stack-show/environment-show on the undercloud takes close to 7 minutes and breaks everything, starting with Ansible inventory generation: tripleoclient calls heatclient and times out after 30 s.

Command executed (from ps output; it times out after 30 seconds):

~~~
stack   642186  0.2  0.0 6753600 111472 pts/2  S+  17:43  0:06  | \_ /usr/bin/python3 /usr/bin/openstack --debug --verbose overcloud external-update run --stack overcloud --tags container_image_prepare
~~~

I had to modify heatclient to time out after 600 s:

~~~
    def get(self, stack_id, resolve_outputs=True):
        """Get the metadata for a specific stack.

        :param stack_id: Stack ID or name to lookup
        :param resolve_outputs: If True, then outputs for this stack
                                will be resolved
        """
        kwargs = {}
        if not resolve_outputs:
            kwargs['params'] = {"resolve_outputs": False}
        resp = self.client.get('/stacks/%s' % stack_id, **kwargs, timeout=600)
        body = utils.get_response_body(resp)
        return Stack(self, body.get('stack'), loaded=True)
~~~

heat stack-show itself takes a long time to return the stack; that is not normal either:

~~~
$ time /usr/bin/python3 -s /usr/bin/tripleo-ansible-inventory --debug --os-cloud undercloud --stack overcloud --undercloud-key-file /var/lib/mistral/.ssh/tripleo-admin-rsa --ansible_ssh_user tripleo-admin --undercloud-connection ssh --static-yaml-inventory /home/stack/tripleo-ansible-inventory.yaml

real    6m20.432s
user    0m1.688s
sys     0m0.143s
~~~

config_download took ~40 minutes. Now node_update is being executed:

~~~
(undercloud) [stack@director:~]$ mistral task-list | grep -v SUCCESS
+--------------------------------------+------------------------+----------------------------------------------+--------------------+--------------------------------------+---------+------------------------------+---------------------+---------------------+
| ID                                   | Name                   | Workflow name                                | Workflow namespace | Workflow Execution ID                | State   | State info                   | Created at          | Updated at          |
+--------------------------------------+------------------------+----------------------------------------------+--------------------+--------------------------------------+---------+------------------------------+---------------------+---------------------+
| da761abd-04e4-46cc-b75e-959228000e6a | get_deployment_status  | tripleo.deployment.v1.get_deployment_status  |                    | d442c32d-a743-4429-a40b-992bb15a41df | ERROR   | Failed to handle action c... | 2023-05-23 08:43:19 | 2023-05-23 09:48:38 |
| 3e4f1e48-047d-48a4-92ac-b2ee30f24afc | node_update            | tripleo.package_update.v1.update_nodes       |                    | cbd0fd17-b496-4c6f-a38d-8470b0142ec2 | RUNNING | None                         | 2023-05-23 18:29:24 | 2023-05-23 18:29:24 |
+--------------------------------------+------------------------+----------------------------------------------+--------------------+--------------------------------------+---------+------------------------------+---------------------+---------------------+
~~~

This one completed in 30 seconds. Even in debug, heat-engine isn't generating much output about what it's doing/executing. The MySQL database is 9 GB (which isn't the end of the world either) and the environment has ~220 computes. Is this slowness expected?
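For anyone triaging the same symptom, here is a minimal sketch (not from the original report) of timing the raw Heat API call with and without output resolution, to see whether resolving stack outputs is what dominates the ~7 minutes. It assumes python-heatclient and keystoneauth1 are installed on the undercloud; the auth values are placeholders to be filled in from the local stackrc, and the loop/variable names are illustrative only:

~~~
# Sketch: compare heat.stacks.get() latency with and without
# resolve_outputs. Assumes python-heatclient + keystoneauth1;
# auth values below are placeholders, not from the report.
import time

from keystoneauth1.identity import v3
from keystoneauth1 import session
from heatclient import client as heat_client

auth = v3.Password(
    auth_url='https://undercloud.example:13000',  # placeholder
    username='admin',                             # placeholder
    password='REPLACE_ME',                        # placeholder
    project_name='admin',
    user_domain_name='Default',
    project_domain_name='Default',
)
sess = session.Session(auth=auth)
heat = heat_client.Client('1', session=sess)

for resolve in (True, False):
    start = time.monotonic()
    heat.stacks.get('overcloud', resolve_outputs=resolve)
    elapsed = time.monotonic() - start
    print('resolve_outputs=%s took %.1fs' % (resolve, elapsed))
~~~

If the call is roughly as slow with resolve_outputs=False, the latency is in heat-engine/the database rather than in output resolution; the resolve_outputs query parameter set via kwargs in the patched client method above is the same switch being exercised here.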