Bug 1965124

Summary: 16.1 to 16.2 upgrade with TLS-e fails due to container creation conflict with redis_tls_proxy
Product: Red Hat OpenStack
Reporter: James Parker <jparker>
Component: openstack-tripleo-heat-templates
Assignee: Michele Baldessari <michele>
Status: CLOSED ERRATA
QA Contact: Joe H. Rahme <jhakimra>
Severity: high
Docs Contact:
Priority: high
Version: 16.2 (Train)
CC: aschultz, dciabrin, dvd, jhajyahy, kchamart, lmiccini, mburns, michele, mschuppe, sathlang, shrjoshi, smooney, spower
Target Milestone: rc
Keywords: Triaged, UpgradeBlocker
Target Release: 16.2 (Train on RHEL 8.4)
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.5.1-2.20210603174816.el8ost.4
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-09-15 07:15:18 UTC
Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1978779    
Bug Blocks: 1945760    
Attachments:
Description Flags
Controller Package Update Logs none

Description James Parker 2021-05-26 22:23:03 UTC
Created attachment 1787387: Controller Package Update Logs

Description of problem:
When upgrading a 16.1 deployment with TLS-e to 16.2, the upgrade fails while updating the controller. Although the redis_tls_proxy container is already running on the controller, the upgrade procedure still attempts to create a new container with the same name, and this appears to be the cause of the failure.
# Currently running on the controller host:
[heat-admin@controller-0 ~]$ sudo podman ps -a --filter label=container_name=redis_tls_proxy
CONTAINER ID  IMAGE                                                                                    COMMAND      CREATED      STATUS          PORTS  NAMES
556f965b47b0  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-redis:16.1_20210430.1  kolla_start  6 hours ago  Up 6 hours ago         redis_tls_proxy
[heat-admin@controller-0 ~]$ sudo podman inspect --format='{{.Config.Labels.config_id}}' redis_tls_proxy 
tripleo_step2
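
For reference, the manual label check above could be scripted roughly as follows. This is only a sketch: the helper names `current_config_id` and `config_id_matches` are made up for illustration, and it assumes podman is on the PATH and that the caller has the required privileges (the transcript above used sudo). The podman invocation is the same `podman inspect --format` call shown above.

```python
import subprocess


def current_config_id(name, run=subprocess.check_output):
    """Return the config_id label of an existing container.

    Hypothetical helper: wraps the same `podman inspect` call used in the
    transcript above. `run` is injectable so the function can be exercised
    without podman installed.
    """
    cmd = ["podman", "inspect", "--format={{.Config.Labels.config_id}}", name]
    return run(cmd).decode().strip()


def config_id_matches(name, expected, run=subprocess.check_output):
    """True if the container's config_id label equals the expected step."""
    return current_config_id(name, run) == expected
```

On the controller in this report, `config_id_matches("redis_tls_proxy", "tripleo_step1")` would be expected to return False, since the running container is labeled tripleo_step2.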

# During the upgrade procedure
Running container: redis_tls_proxy
$ podman ps -a --filter label=container_name=redis_tls_proxy --filter label=config_id=tripleo_step1 --format {{.Names}}

Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=redis_tls_proxy', '--filter', 'label=config_id=tripleo_step1', '--format', '{{.Names}}']\" - retrying without config_id
$ podman ps -a --filter label=container_name=redis_tls_proxy --format {{.Names}}
b'redis_tls_proxy
Start container redis_tls_proxy as redis_tls_proxy.
$ podman create --name redis_tls_proxy --label config_id=tripleo_step1 --label container_name=redis_tls_proxy

....

Error: error creating container storage: the container name \"redis_tls_proxy\" is already in use by \"556f965b47b044fa48ef8194f2cf0adaad46e8145f1e4afdb0af4d70977ad561\". You have to remove that container to be able to reuse that name.: that name is already in use\

I believe this could be related to the fix https://review.opendev.org/c/openstack/tripleo-heat-templates/+/777549.
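
To make the sequence in the log easier to follow, here is a simplified in-memory model of it. It is not the code of the deployment tooling: the functions `find_container`, `create_container` and `run_container` and the `NameInUse` exception are hypothetical and only mimic what the log shows (a lookup filtered by config_id, a retry without config_id, then a create under the same name).

```python
class NameInUse(Exception):
    """Raised when a container name is already taken (models the podman error)."""


def find_container(containers, name, config_id=None):
    """Model of `podman ps -a --filter label=...`: return matching names."""
    return [
        c["name"]
        for c in containers
        if c["labels"].get("container_name") == name
        and (config_id is None or c["labels"].get("config_id") == config_id)
    ]


def create_container(containers, name, config_id):
    """Model of `podman create --name ...`: names must be unique."""
    if any(c["name"] == name for c in containers):
        raise NameInUse(f'the container name "{name}" is already in use')
    containers.append(
        {"name": name, "labels": {"container_name": name, "config_id": config_id}}
    )


def run_container(containers, name, config_id):
    """Follow the steps seen in the log, in the same order."""
    if find_container(containers, name, config_id):
        return "found"
    # Retry without config_id, as the log does. In the log the container is
    # found here, but a create with the same name is still attempted.
    find_container(containers, name)
    create_container(containers, name, config_id)
    return "created"


if __name__ == "__main__":
    # State on the controller before the upgrade: labeled tripleo_step2.
    existing = [{"name": "redis_tls_proxy",
                 "labels": {"container_name": "redis_tls_proxy",
                            "config_id": "tripleo_step2"}}]
    try:
        run_container(existing, "redis_tls_proxy", "tripleo_step1")
    except NameInUse as err:
        print("create failed:", err)
```

In this model the conflict arises only because the existing container carries a different config_id than the one being applied; with matching labels the first lookup succeeds and no create is attempted.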

Version-Release number of selected component (if applicable):
Deployed from downstream puddle RHOS-16.1-RHEL-8-20210506.n.1; the upgrade target was RHOS-16.2-RHEL-8-20210514.n.0

How reproducible:
100% of the time with a TLS-e deployment

Steps to Reproduce:
1. Deploy a 16.1 TLS-e environment
2. Upgrade the environment to 16.2
3. The deployment fails during the upgrade of the controller

Actual results:
Deployment fails to upgrade

Expected results:
Deployment upgrades to 16.2

Additional info:
Test bed can be provided upon request and package update logs have been attached.

The detailed steps taken for the upgrade procedure are listed below; for brevity, the command output is not included.

sudo dnf install -y crudini
. stackrc
tripleo-ansible-inventory --static-yaml-inventory ansible-inventory.yaml
export RELEASE=16.2
sudo rhos-release $RELEASE -P -p passed_phase2
ansible -i ansible-inventory.yaml -m shell -ba "mv /etc/yum.repos.d/ /etc/16.1.yum.repos.d/;mkdir /etc/yum.repos.d/" Compute
ansible -i ansible-inventory.yaml -m shell -ba "mv /etc/yum.repos.d/ /etc/16.1.yum.repos.d/;mkdir /etc/yum.repos.d/" Controller
ansible -i ansible-inventory.yaml -m synchronize -ba "src=/etc/yum.repos.d/ dest=/etc/yum.repos.d/"  Controller
ansible -i ansible-inventory.yaml -m synchronize -ba "src=/etc/yum.repos.d/ dest=/etc/yum.repos.d/"  Compute
ansible -i ansible-inventory.yaml -m shell -ba "dnf module disable -y container-tools:rhel8;dnf module enable -y container-tools:2.0" all
sudo dnf update -y python3-tripleoclient* openstack-tripleo-common openstack-tripleo-heat-templates tripleo-ansible ansible
cat << EOF > /home/stack/virt/custom-undercloud-params.yaml
parameter_defaults:
    SkipRhelEnforcement: true
    DnfStreams: [{'module':'container-tools', 'stream':'2.0'}]
EOF
echo -e "    SkipRhelEnforcement: true\n    DnfStreams: [{'module':'container-tools', 'stream':'2.0'}]" >> /home/stack/virt/config_heat.yaml
crudini --set /home/stack/undercloud.conf DEFAULT custom_env_files /home/stack/virt/custom-undercloud-params.yaml
cp -p containers-prepare-parameter.yaml{,.$(date +%F_%H%M%S)}
# Update tag of original containers-prepare-parameter.yaml to 16.2_20210514.1
openstack undercloud upgrade -y
sudo reboot

# Upgrade controller/computes
source stackrc
mkdir -p ~/images/
rm -rf ~/images/*
for i in /usr/share/rhosp-director-images/overcloud-full-latest-16.2.tar /usr/share/rhosp-director-images/ironic-python-agent-latest-16.2.tar ; do tar -C ~/images/ -xvf $i; done
openstack overcloud image upload --update-existing --image-path /home/stack/images/
openstack overcloud node configure $(openstack baremetal node list -c UUID -f value)
sed 's/overcloud deploy/overcloud update prepare -y/' overcloud_deploy.sh >  overcloud_update_prepare.sh
bash ./overcloud_update_prepare.sh
openstack overcloud external-update run -y --tags container_image_prepare
openstack overcloud update run -y --stack overcloud --limit Controller --playbook all

Comment 12 errata-xmlrpc 2021-09-15 07:15:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:3483

Comment 13 Red Hat Bugzilla 2023-09-15 01:08:41 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days