Bug 2051376 - [OSP16.2 -> OSP17.1] Undercloud upgrade failing due to remaining deprecated services
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: beta
Target Release: 17.1
Assignee: Sergii Golovatiuk
QA Contact: Archana Singh
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-02-07 07:08 UTC by Jose Luis Franco
Modified: 2023-08-16 01:11 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20220116004912.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-16 01:10:52 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Launchpad 1959662 0 None None None 2022-02-07 13:49:29 UTC
OpenStack gerrit 827260 0 None MERGED Remove Nova from undercloud during upgrades 2023-01-06 16:59:48 UTC
OpenStack gerrit 827647 0 None MERGED Deprecate and unwire enable_nova UC option 2023-01-06 16:59:47 UTC
OpenStack gerrit 827968 0 None MERGED Drop Nova and NovaJoin services from UC role data 2023-01-06 16:59:49 UTC
OpenStack gerrit 828212 0 None MERGED Remove old per step startup configs 2023-01-06 16:59:50 UTC
Red Hat Issue Tracker OSP-12494 0 None None None 2022-02-07 07:23:19 UTC
Red Hat Issue Tracker UPG-4973 0 None None None 2022-02-07 07:23:22 UTC
Red Hat Product Errata RHEA-2023:4577 0 None None None 2023-08-16 01:11:32 UTC

Description Jose Luis Franco 2022-02-07 07:08:01 UTC
Description of problem:

The undercloud upgrade fails at step 5 of the deployment tasks with the following error:

2022-02-05 22:08:36 | 2022-02-05 22:08:36.321034 | 52540022-e1f2-fc14-5813-0000000045d2 |       TASK | Delete orphan containers managed by Podman for /var/lib/tripleo-config/container-startup-config/step_5
2022-02-05 22:08:36 | 2022-02-05 22:08:36.370107 | 52540022-e1f2-fc14-5813-0000000045d2 |     TIMING | Delete orphan containers managed by Podman for /var/lib/tripleo-config/container-startup-config/step_5 | undercloud-0 | 0:20:11.949398 | 0.04s
2022-02-05 22:08:36 | 2022-02-05 22:08:36.515514 | 52540022-e1f2-fc14-5813-00000000461f |     TIMING | tripleo_container_rm : include_tasks | undercloud-0 | 0:20:12.094821 | 0.10s
2022-02-05 22:08:36 | 2022-02-05 22:08:36.540016 | 52540022-e1f2-fc14-5813-000000004546 |       TASK | Create containers from /var/lib/tripleo-config/container-startup-config/step_5
2022-02-05 22:08:36 | 2022-02-05 22:08:36.567910 | 52540022-e1f2-fc14-5813-000000004546 |     TIMING | tripleo_container_manage : Create containers from /var/lib/tripleo-config/container-startup-config/step_5 | undercloud-0 | 0:20:12.147202 | 0.03s
2022-02-05 22:08:36 | 2022-02-05 22:08:36.580700 | a73ad766-66ef-4462-8b33-e7a14f124897 |   INCLUDED | /usr/share/ansible/roles/tripleo_container_manage/tasks/create.yml | undercloud-0
2022-02-05 22:08:36 | 2022-02-05 22:08:36.610576 | 52540022-e1f2-fc14-5813-000000004649 |       TASK | Create containers managed by Podman for /var/lib/tripleo-config/container-startup-config/step_5
2022-02-05 22:18:53 | 2022-02-05 22:18:53.287277 |                                      |    WARNING | ERROR: Can't run container nova_wait_for_compute_service
2022-02-05 22:18:53 | stderr: + command -v python3
2022-02-05 22:18:53 | + python3 /container-config-scripts/nova_wait_for_compute_service.py
2022-02-05 22:18:53 | 2022-02-05 22:18:53.290911 | 52540022-e1f2-fc14-5813-000000004649 |      FATAL | Create containers managed by Podman for /var/lib/tripleo-config/container-startup-config/step_5 | undercloud-0 | error={"changed": false, "msg": "Failed containers: nova_wait_for_compute_service"}
2022-02-05 22:18:53 | 2022-02-05 22:18:53.292697 | 52540022-e1f2-fc14-5813-000000004649 |     TIMING | tripleo_container_manage : Create containers managed by Podman for /var/lib/tripleo-config/container-startup-config/step_5 | undercloud-0 | 0:30:28.871975 | 616.68s
2022-02-05 22:18:53 | 

Log: http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-17.0-from-16.2-passed_phase1-3cont_2comp_3ceph_1ipa-ipv4-ovn_dvr/2/undercloud-0/home/stack/ffu_undercloud_upgrade.log.gz

Listing the containers after the failure still shows three old OSP16.2 containers, belonging to services that were deprecated in OSP17:

3ea4e5b9f5db  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-mistral-api:16.2_20220124.1                /usr/bin/bootstra...  24 minutes ago  Exited (0) 24 minutes ago          mistral_db_populate
a4e33ff8c0da  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-compute-ironic:16.2_20220124.1        kolla_start           24 minutes ago  Up 24 minutes ago                  nova_compute
924b048cf4b3  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-compute-ironic:16.2_20220124.1        /container-config...  24 minutes ago  Exited (1) 14 minutes ago          nova_wait_for_compute_service

Log: http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-17.0-from-16.2-passed_phase1-3cont_2comp_3ceph_1ipa-ipv4-ovn_dvr/2/undercloud-0/var/log/extra/podman/podman_allinfo.log.gz
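A quick way to spot such leftovers is to filter the container listing by image name. The following is only a minimal sketch, assuming the input is `podman ps -a` style text like the excerpt above (ID in the first column, image in the second, container name in the last). The function name `leftover_containers` and the `rhosp16-` marker are illustrative choices, not part of any TripleO tooling.

```python
def leftover_containers(ps_lines, old_marker="rhosp16-"):
    """Return names of containers whose image still comes from the old release.

    ps_lines: iterable of text lines in `podman ps -a` style.
    old_marker: substring identifying old-release images (assumption).
    """
    leftovers = []
    for line in ps_lines:
        fields = line.split()
        if len(fields) < 3:
            # Skip blank or malformed lines.
            continue
        image, name = fields[1], fields[-1]
        if old_marker in image:
            leftovers.append(name)
    return leftovers
```

Fed the three lines above, this returns the names of the Mistral and Nova containers, which is the set that should have been removed during the upgrade.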


The TripleO container configs confirm this: the configs belonging to those services are still present in the /var/lib/tripleo-config/container-startup-config/step_5/ directory:

http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-17.0-from-16.2-passed_phase1-3cont_2comp_3ceph_1ipa-ipv4-ovn_dvr/2/undercloud-0/var/lib/tripleo-config/container-startup-config/step_5/
Version-Release number of selected component (if applicable):


How reproducible:
Run CI job: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/upgrades/view/ffu/job/DFG-upgrades-ffu-17.0-from-16.2-passed_phase1-3cont_2comp_3ceph_1ipa-ipv4-ovn_dvr/

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Jose Luis Franco 2022-02-07 13:42:59 UTC
The issue seems to be in this line: https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/common/common-container-setup-tasks.yaml#L76. The Undercloud/docker_config.yaml (attached to this BZ, taken from the failing CI job) does not contain a step_5, so the configurations previously stored in that folder are never cleaned up and remain there: https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/ansible_plugins/modules/container_startup_config.py#L93-L102
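The condition described in this comment can be checked with a small script. This is a hedged sketch, not the logic of the container_startup_config module: it assumes docker_config is available as a mapping keyed by step name (for example `step_1`, `step_4`) and reports any `step_N` directory on disk that the mapping no longer defines. The name `stale_step_dirs` is hypothetical.

```python
import os
import re


def stale_step_dirs(config_root, docker_config):
    """List step_N directories under config_root with no entry in docker_config.

    config_root: e.g. /var/lib/tripleo-config/container-startup-config
    docker_config: mapping keyed by step name (assumed shape).
    """
    defined = set(docker_config)
    stale = []
    for entry in sorted(os.listdir(config_root)):
        path = os.path.join(config_root, entry)
        is_step_dir = os.path.isdir(path) and re.fullmatch(r"step_\d+", entry)
        if is_step_dir and entry not in defined:
            # Nothing in the new config refers to this step, so whatever
            # the previous release wrote here would be left untouched.
            stale.append(entry)
    return stale
```

On the failing undercloud, a check like this would be expected to report `step_5`, since the new docker_config has no such step while the directory still holds the OSP16.2 startup configs.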

Comment 21 errata-xmlrpc 2023-08-16 01:10:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577

