Bug 2026654
Summary: | Undercloud fails to provide correct images to overcloud when updating tags from 16.1.x to 16.1 | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Eric Nothen <enothen> |
Component: | tripleo-ansible | Assignee: | OSP Team <rhos-maint> |
Status: | CLOSED ERRATA | QA Contact: | Joe H. Rahme <jhakimra> |
Severity: | high | Docs Contact: | |
Priority: | high | | |
Version: | 16.1 (Train) | CC: | drosenfe, enothen, mburns, morazi, slinaber |
Target Milestone: | --- | Keywords: | Reopened, Triaged |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | tripleo-ansible-0.5.1-1.20211220033343.902c3c8.el8ost | Doc Type: | No Doc Update |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 2028962 | Environment: | |
Last Closed: | 2022-03-24 11:02:18 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 2028962 | | |
Description
Eric Nothen
2021-11-25 12:25:27 UTC
Workaround: Before pulling the new images, remove the old ones from the undercloud registry so that when the overcloud nodes pull images, only the ones with tag 16.1 are available.

    $ openstack tripleo container image list -c "Image Name" -f value |awk '/16.1.4$/' | while read image ;do echo "Deleting ${image}..." ;sudo openstack tripleo container image delete -y $image ;done

    [root@controller1-161 ~]# podman images |grep aodh
    director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-aodh-evaluator  16.1.4  d6480804c4d2  8 months ago  743 MB
    [root@controller1-161 ~]#
    [root@controller1-161 ~]# podman pull director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-aodh-evaluator:16.1
    Trying to pull director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-aodh-evaluator:16.1...
    Getting image source signatures
    Copying blob 17cb0a75ad3b skipped: already exists
    Copying blob a2070ce838f3 skipped: already exists
    Copying blob 2cae32376095 skipped: already exists
    Copying blob c30189fd3718 skipped: already exists
    Copying blob 328bf4e7fd60 done
    Copying blob 20ac4e1e3488 done
    Copying config cef243fa5a done
    Writing manifest to image destination
    Storing signatures
    cef243fa5a06744cc03c27c7f74ea7238540dd328586b327546ea6c722c9529c
    [root@controller1-161 ~]#
    [root@controller1-161 ~]# podman images |grep aodh
    director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-aodh-evaluator  16.1    cef243fa5a06  6 weeks ago   743 MB
    director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-aodh-evaluator  16.1.4  d6480804c4d2  8 months ago  743 MB
    [root@controller1-161 ~]#

Also worth mentioning that while the undercloud registry is providing the wrong image, the overcloud upgrade fails. After cleaning the registry with the workaround provided above, the overcloud update completes as expected.

Please provide the container image prepare logs (run with --debug). It should be noted that podman images does not list the contents of the image-serve registry. You would need to use `openstack tripleo container image list` to view the contents.

Attached the requested logs.

This is a bug in z4 that is fixed in z6. Please upgrade. See Bug 1941412

*** This bug has been marked as a duplicate of bug 1941412 ***

(In reply to Alex Schultz from comment #9)
> This is a bug in z4 that is fixed in z6. Please upgrade. See Bug 1941412
>
> *** This bug has been marked as a duplicate of bug 1941412 ***

I don't see how the problem described here is related to the bug for which this BZ was marked as duplicate. To start with, my test undercloud is already at 16.1.6, and I'm using the latest packages available to deploy the overcloud:

    [root@director ~]# cat /etc/rhosp-release
    Red Hat OpenStack Platform release 16.1.6 GA (Train)
    [root@director ~]#
    [root@director ~]# yum check-update
    Updating Subscription Management repositories.
    /usr/lib/python3.6/site-packages/dateutil/parser/_parser.py:70: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
      instream = instream.decode()
    Advanced Virtualization for RHEL 8 x86_64 (RPMs)                                             23 kB/s | 2.8 kB  00:00
    Red Hat Enterprise Linux 8 for x86_64 - BaseOS - Extended Update Support (RPMs)              20 kB/s | 2.4 kB  00:00
    Fast Datapath for RHEL 8 x86_64 (RPMs)                                                       21 kB/s | 2.4 kB  00:00
    Red Hat Enterprise Linux 8 for x86_64 - AppStream - Extended Update Support (RPMs)           25 kB/s | 2.8 kB  00:00
    Red Hat Ansible Engine 2.9 for RHEL 8 x86_64 (RPMs)                                          22 kB/s | 2.3 kB  00:00
    Red Hat Enterprise Linux 8 for x86_64 - High Availability - Extended Update Support (RPMs)   22 kB/s | 2.4 kB  00:00
    Red Hat OpenStack Platform 16.1 for RHEL 8 x86_64 (RPMs)                                     19 kB/s | 2.4 kB  00:00
    [root@director ~]#

On top of that, I don't even need to deploy the overcloud to notice that there's going to be a problem when doing that:

    [root@director ~]# ll /var/lib/image-serve/v2/rhosp-rhel8/openstack-aodh-listener/manifests/*.type-map
    -rw-r--r--. 1 root root 169 Nov 25 21:35 /var/lib/image-serve/v2/rhosp-rhel8/openstack-aodh-listener/manifests/16.1.4.type-map
    -rw-r--r--. 1 root root 167 Nov 25 21:58 /var/lib/image-serve/v2/rhosp-rhel8/openstack-aodh-listener/manifests/16.1.type-map
    [root@director ~]#
    [root@director ~]# curl -s http://192.168.24.1:8787/v2/rhosp-rhel8/openstack-aodh-listener/manifests/16.1 | jq .config.digest
    "sha256:6d6130ebb9f4c6b0ad2b1d9a29c9d2fa84df24afc9b4744e5232f75346cd4273"
    [root@director ~]# curl -s http://192.168.24.1:8787/v2/rhosp-rhel8/openstack-aodh-listener/manifests/16.1.4 | jq .config.digest
    "sha256:6d6130ebb9f4c6b0ad2b1d9a29c9d2fa84df24afc9b4744e5232f75346cd4273"
    [root@director ~]#
    [root@director ~]#
    [root@director ~]# curl -s http://192.168.24.1:8787/v2/rhosp-rhel8/openstack-aodh-listener/manifests/16.1.type-map | jq .config.digest
    "sha256:ea548ddc17674629c97f7472cb5fe62f5495208bcf2f27d5918bd9a8e7c9833f"
    [root@director ~]# curl -s http://192.168.24.1:8787/v2/rhosp-rhel8/openstack-aodh-listener/manifests/16.1.4.type-map | jq .config.digest
    "sha256:6d6130ebb9f4c6b0ad2b1d9a29c9d2fa84df24afc9b4744e5232f75346cd4273"
    [root@director ~]#

That shows that if I were to deploy an overcloud now with tag 16.1 (not even an update, but a fresh install), it would still pull a 16.1.4 image instead of 16.1.6.

Ok, I'll reopen and look when I get a chance. There was an issue with the tags not being properly managed because the metadata wasn't correctly fetched every time. I'll try to duplicate this. That being said, it's not really recommended to use 16.1 unless you are following the latest of 16.1 always.

The referenced bz is missing references to changes around image id comparisons which were handled via https://review.opendev.org/q/topic:%22bug%252F1895974-stable%252Ftrain%22+(status:open%20OR%20status:merged)

(In reply to Alex Schultz from comment #11)
> it's not really recommended to use 16.1 unless you are following the latest of 16.1 always.
>

Yes, that is why my customer is changing their container image prepare file to use tag 16.1 instead of 16.1.x. They ran into this issue now as part of the change, but from now on they will keep the 16.1 tag and just pull the latest.

I am unable to reproduce this. I just set up a 16.1.4 undercloud. Then I proceeded to run openstack tripleo container image prepare with a switch to tag: '16.1' from tag: '16.1.4'. I then looked at the type-map for openstack-cron, which was updated to a different container.

    [cloud-user@undercloud manifests]$ cat 16.1.type-map
    URI: 16.1
    Content-Type: application/vnd.docker.distribution.manifest.v2+json
    URI: sha256:de32ea21c4637013c63b95e7289dcf531e0c315f20ace5e0802fcfcb00017470/index.json
    [cloud-user@undercloud manifests]$ cat 16.1.4.type-map
    URI: 16.1.4
    Content-Type: application/vnd.docker.distribution.manifest.v2+json

I even tried copying the 16.1.4.type-map over to 16.1.type-map and rerunning to see if the file doesn't get updated. It was updated with the same content.

Actually I think I've reproduced it. I'll continue to dig deeper.

So the content is being updated and technically the files on disk are correct. What appears to be happening is that the way the tag URLs are being interpreted by Apache is causing the 16.1.4 metadata to be provided for the 16.1 tag. I'm looking into how we can address this mismatch. For now the workaround would be to make sure that you don't have 16.1.x tags when using 16.1.

(In reply to Alex Schultz from comment #15)
> So the content is being updated and technically the files on disk are
> correct. What appears to be happening is that the way the tag urls are
> being interpreted by apache is causing the 16.1.4 metadata to be provided
> for the 16.1 tag. I'm looking into how we can address this mismatch. For
> now the workaround would be to make sure that you don't have 16.1.x tags
> when using 16.1.

I agree. That's why I said the issue is with the Apache MultiViews handler. Applying the workaround on comment #1 _before_ starting the overcloud update/deploy is enough for the job to complete successfully afterwards.

Is it too early to ask which z-stream this fix is going to be included in?

Procedure used was:

    deploy undercloud with puddle that has fix
    openstack tripleo container image prepare with a tag of 16.1.7
    podman pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-aodh-evaluator
    openstack tripleo container image prepare with a tag of 16.1
    podman pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-aodh-evaluator

Saw that 16.1 and 16.1.7 ids were different:

    (undercloud) [stack@undercloud-0 ~]$ sudo podman images | grep aodh
    undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-aodh-evaluator  16.1    969a82fce921  5 hours ago   743 MB
    undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-aodh-evaluator  16.1.7  b21f7139c745  2 months ago  743 MB

Is this bug also affecting 16.2? I don't have one to verify myself right now.

Yes, it impacts 16.2. Bug 2028962 is for 16.2.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.8 bug fix and enhancement advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0986
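
The mismatch diagnosed in the comments above (a request for tag 16.1 being answered with 16.1.4 metadata) can be illustrated with a small Python sketch. It is only a simplified model: it assumes that content negotiation considers every file named `<tag>.*` in the manifests directory when resolving a request for `<tag>`, and it does not reproduce Apache's actual selection logic or the fix that was shipped. The helper names `negotiation_candidates` and `shadowing_tags` are hypothetical.

```python
def negotiation_candidates(requested_tag, filenames):
    """Return the files that a '<tag>.*' name match would consider.

    Assumption: a request for '<tag>' is matched against files whose
    names start with '<tag>.'; which candidate wins is not modeled.
    """
    prefix = requested_tag + "."
    return sorted(name for name in filenames if name.startswith(prefix))


def shadowing_tags(tag, all_tags):
    """List longer tags (for example 16.1.4) that start with '<tag>.'."""
    prefix = tag + "."
    return sorted(t for t in all_tags if t.startswith(prefix))


# File names taken from the manifests directory listing in this report.
manifests = ["16.1.4.type-map", "16.1.type-map"]

# A request for 16.1 has two candidates, so it is ambiguous.
print(negotiation_candidates("16.1", manifests))

# A request for 16.1.4 has exactly one candidate.
print(negotiation_candidates("16.1.4", manifests))

# Tags that would have to be removed before relying on the 16.1 tag.
print(shadowing_tags("16.1", ["16.1", "16.1.4"]))
```

Under this model, a non-empty result from `shadowing_tags` for the tag you intend to deploy suggests removing the longer tags from the undercloud registry first, as in the workaround from comment #1, at least on versions that do not contain the fix.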