Bug 2244772 - OSP 16.2 z6 | Minor update from 16.1 to 16.2.6 occasionally fails on paunch not being able to start container
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z7
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: mgeary
QA Contact: RHOS Documentation Team
URL:
Whiteboard: DFG: Upgrade
Depends On:
Blocks:
Reported: 2023-10-18 07:12 UTC by Leonid Natapov
Modified: 2024-04-08 11:42 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-04-08 11:42:08 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-29827 0 None None None 2023-10-18 07:14:08 UTC

Description Leonid Natapov 2023-10-18 07:12:41 UTC
Description of problem:

A minor update from OSP 16.1 to OSP 16.2.6 (16.2 z6) occasionally fails in paunch.
The failure occurs in the Overcloud Update stage, while the Controller nodes are being updated.


2023-10-17 18:29:52 | 2023-10-17 18:29:41.003725 | 52540010-1760-51e9-ef94-000000000179 |     TIMING | Start containers for step {{ step }} using paunch | ctrl-2-16-1 | 2:08:03.810767 | 0.92s
2023-10-17 18:29:52 | 2023-10-17 18:29:41.030836 | 52540010-1760-51e9-ef94-00000000017a |       TASK | Wait for containers to start for step 1 using paunch
2023-10-17 18:29:52 | 2023-10-17 18:29:41.599594 | 52540010-1760-51e9-ef94-00000000017a |    WAITING | Wait for containers to start for step 1 using paunch | ctrl-2-16-1 | 360 retries left
2023-10-17 18:29:52 | 2023-10-17 18:29:51.956164 | 52540010-1760-51e9-ef94-00000000017a |    WAITING | Wait for containers to start for step 1 using paunch | ctrl-2-16-1 | 359 retries left
2023-10-17 18:30:03 | 
2023-10-17 18:30:03 | 2023-10-17 18:30:02.876441 | 52540010-1760-51e9-ef94-00000000017a |      FATAL | Wait for containers to start for step 1 using paunch | ctrl-2-16-1 | error={"ansible_job_id": "55440942512.555016", "attempts": 3, "changed": false, "cmd": "/home/tripleo-admin/.ansible/tmp/ansible-tmp-1697567380.2127461-8950-110259694849210/AnsiballZ_paunch.py", "data": "", "finished": 1, "msg": "Traceback (most recent call last):\n  File \"/tmp/ansible_async_wrapper_payload_kjf528ot/ansible_async_wrapper_payload.zip/ansible/modules/utilities/logic/async_wrapper.py\", line 166, in _run_module\n  File 


remove_container", "    systemd.service_delete(container=container, log=self.log)", "  File \"/usr/lib/python3.6/site-packages/paunch/utils/systemd.py\", line 153, in service_delete", "    systemctl.stop(sysd_f)", "  File \"/usr/lib/python3.6/site-packages/paunch/utils/systemctl.py\", line 42, in stop", "    systemctl(['stop', service], log)", "  File \"/usr/lib/python3.6/site-packages/paunch/utils/systemctl.py\", line 34, in systemctl", "    raise SystemctlException(str(err))", "paunch.utils.systemctl.SystemctlException: Command '['systemctl', 'stop', 'tripleo_metrics_qdr.service']' returned non-zero exit status 1."]}
2023-10-17 18:30:06 | 
2023-10-17 18:30:06 | 2023-10-17 18:30:02.898738 | 52540010-1760-51e9-ef94-00000000017a |     TIMING | Wait for containers to start for step {{ step }} using paunch | ctrl-2-16-1 | 2:08:25.705744 | 21.87s
2023-10-17 18:30:06 | 
2023-10-17 18:30:06 | NO MORE HOSTS LEFT 
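
For triage, the excerpt above can be reduced to the failing host, task, and systemd unit. The following is a minimal sketch, not part of the original report. It assumes the pipe-separated layout seen in the excerpt (log time, task time, task UUID, level, task name, host, details); the function names are hypothetical, and the FATAL sample is shortened from the excerpt with the elided traceback written as "...".

```python
import re

# Levels that appear in the excerpt; other levels are ignored (assumption).
LEVELS = {"TASK", "TIMING", "WAITING", "FATAL"}

# Matches the paunch SystemctlException text quoted in the excerpt.
SYSTEMCTL_FAILURE = re.compile(
    r"Command '\['systemctl', 'stop', '([^']+)'\]' "
    r"returned non-zero exit status (\d+)"
)


def parse_task_line(line):
    """Return a dict for a task-log line, or None if the layout does not match."""
    parts = [p.strip() for p in line.split("|", 6)]
    if len(parts) < 5 or parts[3] not in LEVELS:
        return None
    return {
        "logged_at": parts[0],
        "level": parts[3],
        "task": parts[4],
        "host": parts[5] if len(parts) > 5 else None,
        "detail": parts[6] if len(parts) > 6 else "",
    }


def summarize_failures(lines):
    """Collect host, task, and failing unit (if any) for every FATAL line."""
    failures = []
    for line in lines:
        rec = parse_task_line(line)
        if rec is None or rec["level"] != "FATAL":
            continue
        match = SYSTEMCTL_FAILURE.search(rec["detail"])
        failures.append({
            "host": rec["host"],
            "task": rec["task"],
            "unit": match.group(1) if match else None,
            "exit_status": int(match.group(2)) if match else None,
        })
    return failures


WAITING_SAMPLE = (
    "2023-10-17 18:29:52 | 2023-10-17 18:29:51.956164 | "
    "52540010-1760-51e9-ef94-00000000017a |    WAITING | "
    "Wait for containers to start for step 1 using paunch | ctrl-2-16-1 | "
    "359 retries left"
)

# Shortened from the excerpt: "..." stands for the elided traceback.
FATAL_SAMPLE = (
    "2023-10-17 18:30:03 | 2023-10-17 18:30:02.876441 | "
    "52540010-1760-51e9-ef94-00000000017a |      FATAL | "
    "Wait for containers to start for step 1 using paunch | ctrl-2-16-1 | "
    "error={... \"paunch.utils.systemctl.SystemctlException: Command "
    "'['systemctl', 'stop', 'tripleo_metrics_qdr.service']' "
    "returned non-zero exit status 1.\"]}"
)

if __name__ == "__main__":
    for failure in summarize_failures([WAITING_SAMPLE, FATAL_SAMPLE]):
        print(failure)
```

Run against the full update log, this kind of filter points at the unit that paunch could not stop, which is the first thing to inspect on the affected node.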



Version-Release number of selected component (if applicable):


How reproducible:

Sometimes.

Steps to Reproduce:
1. Run a minor update from OSP 16.1 to OSP 16.2.6.

Actual results:

Overcloud update fails

Expected results:

Overcloud update succeeds 

Additional info:

Comment 2 Leonid Natapov 2023-10-18 12:08:37 UTC
The issue is related to collectd-sensubility running out of memory.
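
One way to check a node for this condition is to scan its kernel log for OOM kills. The following is a rough sketch, not taken from this report. It assumes the usual Linux kernel wording "Killed process <pid> (<name>)" and that the process name is truncated to 15 characters, so collectd-sensubility would appear as "collectd-sensub"; the function name and the prefix argument are hypothetical.

```python
import re

# Common kernel wording for an OOM kill (assumption, not from this report).
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(lines, name_prefix=""):
    """Return (pid, process_name) for OOM kills whose name starts with name_prefix."""
    hits = []
    for line in lines:
        match = OOM_KILL.search(line)
        if match and match.group(2).startswith(name_prefix):
            hits.append((int(match.group(1)), match.group(2)))
    return hits
```

The input can be the lines of kernel log output collected on the affected Controller node, for example from `journalctl -k`, with `name_prefix="collectd-sensub"`.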

Comment 14 mgeary 2024-04-08 11:41:46 UTC
The content is available at
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/keeping_red_hat_openstack_platform_updated/index#assembly_preparing-for-a-minor-update_keeping-updated
in the section "Known issues that might block an update": Minor update from 16.1 to 16.2.6 occasionally fails on paunch not being able to start container.

