Bug 2231120

Summary: [OSP 16] Cannot force pacemaker containers to specific version when newer subversion exists in registry.
Product: Red Hat OpenStack
Reporter: Matt Flusche <mflusche>
Component: tripleo-ansible
Assignee: OSP Team <rhos-maint>
Status: CLOSED NOTABUG
QA Contact: Joe H. Rahme <jhakimra>
Severity: high
Docs Contact:
Priority: medium
Version: 16.2 (Train)
CC: bshephar, imahmed, lmiccini
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-08-11 14:07:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Matt Flusche 2023-08-10 16:16:54 UTC
Description of problem:

This issue is described using just two containers, but it seems to affect all pacemaker containers. All non-pacemaker containers are correctly forced to the version defined in images.yaml.

The registry contains the following container images:

openstack-ovn-northd:16.2.4-17
openstack-ovn-northd:16.2.4-17.1679573635
openstack-ovn-controller:16.2.4-17
openstack-ovn-controller:16.2.4-17.1679573638

However, the goal is to force version 16.2.4-17 with the following images.yaml in the deployment:

parameter_defaults:
  [...]
  ContainerOvnDbsImage: registry.local:443/rhosp-rhel8/openstack-ovn-northd:16.2.4-17
  ContainerOvnControllerImage: registry.local:443/rhosp-rhel8/openstack-ovn-controller:16.2.4-17

With this config, ovn-controller (not managed by pacemaker) will be deployed with the desired 16.2.4-17 version; however, openstack-ovn-northd (tagged as cluster.common.tag/openstack-ovn-northd:pcmklatest) will be updated to the more recent version, 16.2.4-17.1679573635.

This seems to be related to how the container is tagged and set up for pacemaker, perhaps by the tripleo-container-tag Ansible playbook used for this.


Version-Release number of selected component (if applicable):
rhosp 16.2.4


How reproducible:
See above

Comment 1 Brendan Shephard 2023-08-11 01:40:27 UTC
Pretty sure this is intentional to ensure stability. Those containers will only be changed during a minor update:
https://github.com/openstack/tripleo-heat-templates/blob/1393d39be367db3acb02508e0e858395a4e4fefa/deployment/ovn/ovn-dbs-pacemaker-puppet.yaml#L17-L24

They are tagged during update_tasks here:
https://github.com/openstack/tripleo-heat-templates/blob/1393d39be367db3acb02508e0e858395a4e4fefa/deployment/ovn/ovn-dbs-pacemaker-puppet.yaml#L315-L320


This won't happen outside of an update, so the only way to achieve this without updating would be to manually pull the images, tag them, and restart the pacemaker services. Whether pidone would support such a thing is another question. Adding pidone to review.
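
For illustration only, the manual route (pull, tag, restart) could look roughly like the sketch below. It only builds and prints the commands and does not run anything. The helper name pcmklatest_retag_commands and the bundle name ovn-dbs-bundle are assumptions made for this example and are not confirmed in this bug; the image reference and the cluster.common.tag/...:pcmklatest target are taken from the description. Presumably the pull and tag steps would be needed on every node that hosts the bundle, while the restart is issued once from a cluster node, but that should be checked with pidone before trying it.

```python
import shlex


def pcmklatest_retag_commands(image, bundle,
                              common_prefix="cluster.common.tag",
                              common_tag="pcmklatest"):
    """Return the pull, tag, and restart commands as argument lists.

    Hypothetical helper: nothing is executed here, so the caller decides
    where (and whether) to run the commands.
    """
    # "registry.local:443/rhosp-rhel8/openstack-ovn-northd:16.2.4-17"
    #   -> "openstack-ovn-northd"
    name = image.rsplit("/", 1)[-1].split(":", 1)[0]
    target = f"{common_prefix}/{name}:{common_tag}"
    return [
        ["podman", "pull", image],                # fetch the pinned version
        ["podman", "tag", image, target],         # re-point the common tag
        ["pcs", "resource", "restart", bundle],   # restart the pacemaker bundle
    ]


# Image reference from images.yaml in the description; the bundle name is assumed.
commands = pcmklatest_retag_commands(
    "registry.local:443/rhosp-rhel8/openstack-ovn-northd:16.2.4-17",
    "ovn-dbs-bundle",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Printing the commands first makes it possible to review them before running anything on a production controller; a minor update afterwards would still be needed to bring the deployment back in line.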

Comment 2 Luca Miccini 2023-08-11 06:12:37 UTC
Hey Matt, from our discussion on slack I thought we were going to use the minor update workflow? 
Brendan is correct, I don't think you can change bundle images during a stack update nowadays if you use the cluster common tag (https://github.com/openstack/tripleo-heat-templates/commit/8b8f103906843ab25e5b97a76253b6120aea1de3).
I would +1 the manual tagging of the images in case of emergency (production down, fix it quickly and run the minor update later).

Cheers,
Luca

Comment 3 Luca Miccini 2023-08-11 12:54:07 UTC
Probably a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2215283

Comment 4 Matt Flusche 2023-08-11 14:07:38 UTC
(In reply to Luca Miccini from comment #2)
> Hey Matt, from our discussion on slack I thought we were going to use the
> minor update workflow? 
> Brendan is correct, I don't think you can change bundle images during a
> stack update nowadays if you use the cluster common tag
> (https://github.com/openstack/tripleo-heat-templates/commit/
> 8b8f103906843ab25e5b97a76253b6120aea1de3).
> I would +1 the manual tagging of the images in case of emergency (production
> down, fix it quickly and run the minor update later).
> 
> Cheers,
> Luca

Thanks for the clarification. I guess I didn't understand the full situation during the Slack discussion, namely that the minor update workflow (or manually tagging the images) was necessary.

Sorry for being slow to understand; I'll close this one.