Bug 2032139

Summary: paunch doesn't handle containers left in an incomplete status because of timeout
Product: Red Hat OpenStack
Reporter: Takashi Kajinami <tkajinam>
Component: python-paunch
Assignee: Takashi Kajinami <tkajinam>
Status: CLOSED ERRATA
QA Contact: nlevinki <nlevinki>
Severity: medium
Priority: medium
Version: 16.1 (Train)
CC: astupnik, bdobreli, cjeanner, drosenfe, knoha
Target Milestone: z9
Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: python-paunch-5.3.3-1.20220715123744.ed2c015.el8ost
Type: Bug
Clones: 2092726 (view as bug list)
Bug Depends On: 2092726
Last Closed: 2022-12-07 20:25:32 UTC

Description Takashi Kajinami 2021-12-14 06:51:47 UTC
Description of problem:

Currently paunch looks up the existing container and compares its image id and CONFIG_HASH
(and some additional environment variables if needed) to determine whether it should recreate the container.
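
For illustration only, a minimal sketch of that kind of check (this is not paunch's actual code; the function name and parameters are hypothetical, and it assumes podman with the desired image id and CONFIG_HASH passed in by the caller):

# Sketch: decide whether a container needs to be recreated by comparing
# its image id and its CONFIG_HASH environment variable against the
# desired values. Hypothetical names, not paunch's real implementation.
import json
import subprocess

def needs_recreate(name, wanted_image_id, wanted_config_hash):
    proc = subprocess.run(['podman', 'inspect', name],
                          capture_output=True, text=True)
    if proc.returncode != 0:
        return True  # no such container: it has to be created
    info = json.loads(proc.stdout)[0]
    if info.get('Image') != wanted_image_id:
        return True  # image was updated
    env = info.get('Config', {}).get('Env') or []
    running_hash = next((e.split('=', 1)[1] for e in env
                         if e.startswith('CONFIG_HASH=')), None)
    return running_hash != wanted_config_hash  # config changed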

When a deployment times out while starting containers, paunch can leave containers in an incomplete state, as described below:
 - The container was recreated and registered with systemd, but the service was not enabled or started.
 - The container was recreated, registered with systemd, and enabled, but was not started.
 - The container was recreated, but was not registered with systemd.

In such a situation paunch can't detect these incomplete containers, so the next
deployment runs without any errors and leaves the containers unrepaired.
We should ensure paunch registers the systemd service and starts these containers
properly; a sketch of such a repair follows below.
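
A minimal sketch of what that repair could look like, again for illustration only (assumptions: containers are managed through systemd units named tripleo_<name>.service, as in TripleO, and write_unit_file is a hypothetical helper): check each systemd step a timed-out run may have skipped and repair just that step.

# Sketch: even when the container itself looks up to date, verify that
# its systemd unit is registered, enabled, and active, and fix whichever
# step the interrupted deployment skipped. Hypothetical code, not the
# actual patch.
import subprocess

def systemctl(*args):
    # Thin wrapper; real code would handle errors more carefully.
    return subprocess.run(['systemctl'] + list(args),
                          capture_output=True, text=True)

def write_unit_file(unit):
    # Hypothetical stand-in: real code would render and install the
    # unit file for the container before reloading systemd.
    raise NotImplementedError(unit)

def ensure_systemd_service(container_name):
    unit = 'tripleo_%s.service' % container_name
    # 1. Unit file registered with systemd?
    if systemctl('cat', unit).returncode != 0:
        write_unit_file(unit)
        systemctl('daemon-reload')
    # 2. Unit enabled?
    if systemctl('is-enabled', '--quiet', unit).returncode != 0:
        systemctl('enable', unit)
    # 3. Unit running?
    if systemctl('is-active', '--quiet', unit).returncode != 0:
        systemctl('start', unit)

Running these checks on every deployment makes the systemd registration idempotent, so a previously interrupted run is healed rather than silently skipped.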


Version-Release number of selected component (if applicable):
python3-paunch-5.3.3-1.20210412123423

Comment 1 Keigo Noha 2022-06-02 06:12:00 UTC
Hi Kajinami-san,

It looks like upstream merged the change into the ussuri and train branches.
Is it possible to open a downstream gerrit to proceed with the backport to RHOSP16.x?

Best Regards,
Keigo Noha

Comment 3 Takashi Kajinami 2022-06-02 06:27:02 UTC
(In reply to Keigo Noha from comment #1)
> Hi Kajinami-san,
> 
> It looks like upstream merged the change into the ussuri and train branches.
> Is it possible to open a downstream gerrit to proceed with the backport to
> RHOSP16.x?
> 
> Best Regards,
> Keigo Noha

Hi

The fix was merged to upstream stable/train but has not yet been imported into RHOSP16.2.
Once it is imported into RHOSP16.2, we can consider backporting the fix to RHOSP16.1,
depending on the requirement.

I'll use this bug to track the fix in RHOSP16.2, which is currently targeted to z4.

Comment 5 Takashi Kajinami 2022-06-02 06:45:15 UTC
> I'll use this bug to track the fix in RHOSP16.2, which is currently targeted to z4.

Instead, I've kept this bug for RHOSP16.1 and cloned it for RHOSP16.2.

Comment 15 errata-xmlrpc 2022-12-07 20:25:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.9 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8795