
Bug 2244772

Summary: OSP 16.2 z6 | Minor update from 16.1 to 16.2.6 occasionally fails on paunch not being able to start container
Product: Red Hat OpenStack
Reporter: Leonid Natapov <lnatapov>
Component: documentation
Assignee: mgeary <mgeary>
Status: CLOSED CURRENTRELEASE
QA Contact: RHOS Documentation Team <rhos-docs>
Severity: high
Docs Contact:
Priority: high
Version: 16.2 (Train)
CC: jveiraca, lmadsen, mariel, mciecier, mgeary, mrunge
Target Milestone: z7
Keywords: Triaged, ZStream
Target Release: 16.2 (Train on RHEL 8.4)
Hardware: Unspecified
OS: Unspecified
Whiteboard: DFG: Upgrade
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2024-04-08 11:42:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Leonid Natapov 2023-10-18 07:12:41 UTC
Description of problem:

A minor update from OSP 16.1 to OSP 16.2.6 (16.2 z6) occasionally fails in paunch.
The failure occurs in the Overcloud Update stage, while the Controller nodes are being updated.


2023-10-17 18:29:52 | 2023-10-17 18:29:41.003725 | 52540010-1760-51e9-ef94-000000000179 |     TIMING | Start containers for step {{ step }} using paunch | ctrl-2-16-1 | 2:08:03.810767 | 0.92s
2023-10-17 18:29:52 | 2023-10-17 18:29:41.030836 | 52540010-1760-51e9-ef94-00000000017a |       TASK | Wait for containers to start for step 1 using paunch
2023-10-17 18:29:52 | 2023-10-17 18:29:41.599594 | 52540010-1760-51e9-ef94-00000000017a |    WAITING | Wait for containers to start for step 1 using paunch | ctrl-2-16-1 | 360 retries left
2023-10-17 18:29:52 | 2023-10-17 18:29:51.956164 | 52540010-1760-51e9-ef94-00000000017a |    WAITING | Wait for containers to start for step 1 using paunch | ctrl-2-16-1 | 359 retries left
2023-10-17 18:30:03 | 
2023-10-17 18:30:03 | 2023-10-17 18:30:02.876441 | 52540010-1760-51e9-ef94-00000000017a |      FATAL | Wait for containers to start for step 1 using paunch | ctrl-2-16-1 | error={"ansible_job_id": "55440942512.555016", "attempts": 3, "changed": false, "cmd": "/home/tripleo-admin/.ansible/tmp/ansible-tmp-1697567380.2127461-8950-110259694849210/AnsiballZ_paunch.py", "data": "", "finished": 1, "msg": "Traceback (most recent call last):\n  File \"/tmp/ansible_async_wrapper_payload_kjf528ot/ansible_async_wrapper_payload.zip/ansible/modules/utilities/logic/async_wrapper.py\", line 166, in _run_module\n  File 


remove_container", "    systemd.service_delete(container=container, log=self.log)", "  File \"/usr/lib/python3.6/site-packages/paunch/utils/systemd.py\", line 153, in service_delete", "    systemctl.stop(sysd_f)", "  File \"/usr/lib/python3.6/site-packages/paunch/utils/systemctl.py\", line 42, in stop", "    systemctl(['stop', service], log)", "  File \"/usr/lib/python3.6/site-packages/paunch/utils/systemctl.py\", line 34, in systemctl", "    raise SystemctlException(str(err))", "paunch.utils.systemctl.SystemctlException: Command '['systemctl', 'stop', 'tripleo_metrics_qdr.service']' returned non-zero exit status 1."]}
2023-10-17 18:30:06 | 
2023-10-17 18:30:06 | 2023-10-17 18:30:02.898738 | 52540010-1760-51e9-ef94-00000000017a |     TIMING | Wait for containers to start for step {{ step }} using paunch | ctrl-2-16-1 | 2:08:25.705744 | 21.87s
2023-10-17 18:30:06 | 
2023-10-17 18:30:06 | NO MORE HOSTS LEFT 
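
The last line of the traceback above matches the message format that Python's subprocess.CalledProcessError produces when a command exits with a non-zero status, and the frames show that text being re-raised as SystemctlException. The sketch below illustrates that wrapping pattern only; it is not paunch's actual code. The function name run_systemctl and its executable parameter are hypothetical, and the parameter exists solely so the example can run on a host without systemd.

```python
import subprocess
import sys


class SystemctlException(Exception):
    """Raised when the wrapped command exits with a non-zero status."""


def run_systemctl(args, executable=("systemctl",)):
    """Run `systemctl <args>` and convert a failure into SystemctlException.

    `executable` is a hypothetical parameter (an assumption for this
    sketch) so the wrapper can be exercised without systemd.
    """
    cmd = list(executable) + list(args)
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as err:
        # str(err) reads: Command '[...]' returned non-zero exit status N.
        raise SystemctlException(str(err))


def demo():
    """Trigger the failure path with a stand-in command that exits 1."""
    failing = (sys.executable, "-c", "import sys; sys.exit(1)")
    try:
        run_systemctl([], executable=failing)
    except SystemctlException as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(demo())
```

The exception text carries only the command and its exit status, not the reason the stop failed, so the cause has to be looked up on the node itself (for example with `systemctl status tripleo_metrics_qdr.service`).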



Version-Release number of selected component (if applicable):


How reproducible:

Sometimes.

Steps to Reproduce:
1. Run a minor update from OSP 16.1 to OSP 16.2.6.

Actual results:

The overcloud update fails.

Expected results:

The overcloud update succeeds.

Additional info:

Comment 2 Leonid Natapov 2023-10-18 12:08:37 UTC
The issue is related to collectd-sensubility running out of memory.

Comment 14 mgeary 2024-04-08 11:41:46 UTC
Content is available at
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/keeping_red_hat_openstack_platform_updated/index#assembly_preparing-for-a-minor-update_keeping-updated
in the section "Known issues that might block an update": Minor update from 16.1 to 16.2.6 occasionally fails on paunch not being able to start container.