Bug 1853417 - container image prepare assumes default tag of 16.0 always exists even when it doesn't in a filtered content view
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z2
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: James Slagle
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks: 1853419
Reported: 2020-07-02 15:37 UTC by James Slagle
Modified: 2023-12-15 18:22 UTC (History)

Fixed In Version: openstack-tripleo-common-11.4.1-0.20200727073831.96886e8.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-28 15:38:12 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 741622 | 0 | None | MERGED | Don't assume default tag exists in container repo | 2021-01-26 13:55:27 UTC
OpenStack gerrit | 750109 | 0 | None | MERGED | Fix handling of default_tag | 2021-01-26 13:55:27 UTC
Red Hat Product Errata | RHEA-2020:4284 | 0 | None | None | None | 2020-10-28 15:38:33 UTC

Comment 1 Brendan Shephard 2020-07-23 23:55:47 UTC
Hey James,

Trying to test this out myself. So I have applied the patch from: https://review.opendev.org/#/c/741622/

I changed my container-images-prepare.yaml to include default_tag: false:

  ContainerImagePrepare:
  - push_destination: true
    set:
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_tag: 4.1
      ceph_grafana_image: ose-grafana
      ceph_grafana_namespace: registry.redhat.io/openshift4
      ceph_grafana_tag: 4.1
      ceph_image: rhceph-4-rhel8
      ceph_namespace: registry.redhat.io/rhceph
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_tag: v4.1
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_tag: 4.1
      ceph_tag: latest
      name_prefix: osp-apac-library-osp16-cv-osp16_containers
      name_suffix: ''
      namespace: satellite.pek2.apac.lab:5000
      neutron_driver: ovn
      default_tag: false
      rhel_containers: false
    tag_from_label: '{version}-{release}'


It seems as though it's still trying to get the manifest for 16.0 here though:
https://satellite.pek2.apac.lab:5000 "GET /v2/osp-apac-library-osp16-cv-osp16_containers-cinder-api/manifests/16.0 HTTP/1.1" 302 445  

And then I ultimately end up with:
Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 725, in _inspect
    timeout=30
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 281, in get
    **kwargs)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 260, in _action
    request=req)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 245, in check_status
    request.raise_for_status()
  File "/usr/lib/python3.6/site-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://satellite.pek2.apac.lab/pulp/docker/v2/3-osp16-cv-library-04aa59f9-3e14-4ca2-b673-4793b4e6a033/manifests/1/16.0


And:
[...]
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 319, in iter
    return fut.result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 361, in call
    result = fn(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 730, in _inspect
    image_url.geturl())
tripleo_common.image.exception.ImageNotFoundException: Not found image: docker://satellite.pek2.apac.lab:5000/osp-apac-library-osp16-cv-osp16_containers-cinder-api:16.0


Any idea if I'm missing something here?

Comment 2 Brendan Shephard 2020-07-24 03:29:27 UTC
I think I might have misunderstood the intention of default_tag there. So I unset it in my container-images-prepare.yaml, and let it default to true as set in container-images/container_image_prepare_defaults.yaml by that patch.

Gave it another shot and still getting the same thing unfortunately. I probably jumped the gun a bit here anyway since it's still ON_DEV.

Comment 3 James Slagle 2020-07-28 18:09:30 UTC
(In reply to Brendan Shephard from comment #2)
> I think I might have misunderstood the intention of default_tag there. So I
> unset it in my container-images-prepare.yaml, and let it default to true as
> set in container-images/container_image_prepare_defaults.yaml by that patch.
> 
> Gave it another shot and still getting the same thing unfortunately. I
> probably jumped the gun a bit here anyway since it's still ON_DEV.

As you realized, default_tag is not something that you explicitly set in the values for ContainerImagePrepare.

When you apply the patch, it should cause the latest tag available in the CV to be used, as long as you have not also specified an explicit "tag" key in ContainerImagePrepare.
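
The rule described here (an explicit tag wins, otherwise the latest available tag is used) can be sketched roughly as follows. This is not the tripleo-common implementation: `choose_tag`, `_sort_key` and the version-sorting heuristic are hypothetical names and assumptions made for illustration, and only tags of the form `<version>-<release>` are considered when looking for the latest one.

```python
import re

# Matches tags such as "16.0-93" (version "16.0", release "93").
# Floating tags like "16.1" and suffixed tags like "16.1-50-source"
# are deliberately ignored by this sketch.
_TAG_RE = re.compile(r"^(\d+(?:\.\d+)*)-(\d+)$")


def _sort_key(tag):
    """Return a sortable tuple for a versioned tag, or None if it does not match."""
    match = _TAG_RE.match(tag)
    if match is None:
        return None
    version = tuple(int(part) for part in match.group(1).split("."))
    return version + (int(match.group(2)),)


def choose_tag(available_tags, explicit_tag=None):
    """Pick a tag from the tags the repository actually exposes.

    An explicit tag is used as-is and must exist; otherwise the highest
    <version>-<release> tag is chosen.
    """
    if explicit_tag is not None:
        if explicit_tag not in available_tags:
            raise LookupError("tag not found: %s" % explicit_tag)
        return explicit_tag
    candidates = [tag for tag in available_tags if _sort_key(tag) is not None]
    if not candidates:
        raise LookupError("no versioned tag available")
    return max(candidates, key=_sort_key)
```

Under these assumptions, a repository that exposes a single versioned tag resolves to that tag, while asking explicitly for a tag the repository does not expose fails instead of silently falling back.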

You may still run into issues if, say, the CV has a filter for 16.0 and that tag points to a version that is not itself available in the CV. That's just how Satellite works, based on my understanding.

I can help you verify that the patch does the right thing, but would need to know a few things:

How was it applied?
How are you running openstack tripleo container image prepare? I need to see the command and all environment files.
How is the CV set up in Satellite, and what are the filters? And what is the CV container repo URL?

Comment 4 James Slagle 2020-07-28 18:17:12 UTC
One thing to note: if the container image prepare process is just running as part of the normal openstack overcloud deploy, then you will need to make sure the patch is applied to tripleo-common in the mistral_executor container, since Mistral executes Ansible in a container with its own filesystem and its own copy of tripleo-common (not the same as what is on the host).

Based on the line numbers in the traceback, I'm not sure the patch is actually being used.
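
One way to check which copy of the module an interpreter would load is to ask Python where it resolves the module from. The sketch below uses only the standard library; `module_path` is a hypothetical helper, and the module name is taken from the paths in the traceback. Running it in each environment (on the host and inside the container) would show whether they point at the same file.

```python
import importlib.util


def module_path(module_name):
    """Return the file a module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when a parent package of a dotted name is missing.
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Module name as it appears in the traceback paths.
    print(module_path("tripleo_common.image.image_uploader"))
```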

Comment 5 Brendan Shephard 2020-07-28 22:24:55 UTC
Hey man,

If you're happy that it works, then I'm happy. I initially tried patching the various files from your patch here:
https://review.opendev.org/changes/741622/revisions/589ac8ac0d5a14ce4aa831892d3b0a4a31e29674/patch?zip

But for whatever reason I couldn't get the patch command to apply it. I ended up just copying the raw files from your change into their various directories. Then to test it, I was just using:
sudo openstack tripleo container image prepare -e ~/containers-prepare-parameter.yaml --debug

From the Satellite site, I was able to use podman to pull the container from the Content View. So I'm fairly confident that should have been working. The path is:
name_prefix: osp-apac-library-osp16-cv-osp16_containers

Where osp-apac is the org. I didn't have any environments set up (Production, Development, etc.), so Library is the environment, osp16-cv is the Content View, and osp16_containers is the product.
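
The way that prefix is put together can be sketched as below. `satellite_name_prefix` is a hypothetical helper, and the lowercasing is an assumption inferred only from "Library" appearing as "library" in the prefix; Satellite may normalize names in other ways that this sketch does not cover.

```python
def satellite_name_prefix(org, environment, content_view, product):
    """Join org, lifecycle environment, Content View and product with hyphens."""
    parts = [org, environment, content_view, product]
    # Assumption: every component is lowercased in the published repo name.
    return "-".join(part.lower() for part in parts)


prefix = satellite_name_prefix("osp-apac", "Library", "osp16-cv", "osp16_containers")
print(prefix)
```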


I didn't restart any of the services or modify any of the containers after I added your changed scripts, but I did add some print statements to make sure I was using the right image_uploader.py file. In ContainerImagePrepare, I removed tag: 16.0 and left tag_from_label in place. I did still have the default_tag: argument there, which I have since realized can't be set in that section, but I guess it shouldn't have made a difference in that case?


Regardless, as I said, if you're happy with it, then I'm confident that it is fine and that it was just something I missed while testing.

Comment 6 James Slagle 2020-09-03 19:12:29 UTC
Testplan:
Set up a Satellite server with a Content View with container repositories.
Add a tag filter on the Content View so that only certain tags are included. 
You can try with different tag filters, e.g., the latest, not the latest, the latest symlinked version (16.1).

Set up your ContainerImagePrepare parameter per the director docs,
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/director_installation_and_usage/index#preparing-a-satellite-server-for-container-images

Do *not* specify an explicit tag that you used for the Content View, or the default tag for the container images in ContainerImagePrepare. With this fix, that should now be discovered automatically during the container image prepare step.

To try a negative test, you could specify an explicit tag in ContainerImagePrepare that you know does not exist in the Content View filter, but perhaps does exist in the upstream repo. It should be filtered out in the Content View, and then container image prepare should fail with a 404 or not found.
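
The negative test can be approximated outside of container image prepare with a small script. This is a sketch under assumptions, not what tripleo-common does: `ImageNotFound`, `manifest_url` and `check_tag_exists` are hypothetical names, the URL follows the `/v2/<name>/manifests/<tag>` pattern seen in the request log earlier in this bug, and a real registry may also require authentication or specific Accept headers that are omitted here.

```python
import urllib.error
import urllib.request


class ImageNotFound(Exception):
    """Hypothetical stand-in for a 'not found image' error."""


def manifest_url(registry, repository, tag):
    """Build the manifest URL for one tag of one repository."""
    return "https://%s/v2/%s/manifests/%s" % (registry, repository, tag)


def check_tag_exists(url, opener=urllib.request.urlopen):
    """Return True on HTTP 200; turn a 404 into ImageNotFound.

    The opener argument exists so the HTTP call can be replaced in tests.
    """
    try:
        with opener(url) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            raise ImageNotFound("Not found image: %s" % url) from err
        raise
```

For a tag that the Content View filter excludes, the expectation is that `check_tag_exists` raises `ImageNotFound` rather than returning.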


I have tested this behavior on Satellite 6.7 with curl.

curl the fully sync'd product repository for openstack-aodh-api from the satellite:

(undercloud) [cloud-user@osp16 ~]$ curl -L https://tripleo-03.tripleo.lab.eng.bos.redhat.com:5000/v2/org1-openstack_16-rhosp-rhel8_openstack-aodh-api/tags/list | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   429  100   429    0     0   4125      0 --:--:-- --:--:-- --:--:--  4125
100   205  100   205    0     0   1045      0 --:--:-- --:--:-- --:--:--  3253
{
  "name": "org1-openstack_16-rhosp-rhel8_openstack-aodh-api",
  "tags": [
    "16.0-97",
    "16.0-93",
    "16.0-95",
    "16.1-50-source",
    "16.1",
    "16.0",
    "16.0-105",
    "16.0-104",
    "16.1-50",
    "16.0-82",
    "16.1-45",
    "16.0-79"
  ]
}

Notice that all tags are returned.

Now, curl the content view repository where I've added an include filter on only the 16.0-93 tag. The environment is called "library", and the content view is called "osp16":

(undercloud) [cloud-user@osp16 ~]$ curl -L https://tripleo-03.tripleo.lab.eng.bos.redhat.com:5000/v2/org1-library-osp16-openstack_16-rhosp-rhel8_openstack-aodh-api/tags/list | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   461  100   461    0     0   4564      0 --:--:-- --:--:-- --:--:--  4564
100    95  100    95    0     0    482      0 --:--:-- --:--:-- --:--:--   482
{
  "name": "org1-library-osp16-openstack_16-rhosp-rhel8_openstack-aodh-api",
  "tags": [
    "16.0-93"
  ]
}

Only the 16.0-93 tag is returned as expected.

With the patch from this bz, tripleo-common should now discover that only the 16.0-93 tag exists in the repository and use that, as long as you didn't explicitly ask for a different version.
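
The comparison done by hand with curl and jq above can also be scripted. The sketch below only parses `tags/list` response bodies and reports which tags the Content View filter hides; `parse_tags` and `filtered_out` are hypothetical helpers, and fetching the bodies from the registry is left out.

```python
import json


def parse_tags(body):
    """Extract the tag list from a /v2/<name>/tags/list JSON response body."""
    return list(json.loads(body).get("tags") or [])


def filtered_out(product_tags, content_view_tags):
    """Return tags present in the product repository but absent from the CV."""
    visible = set(content_view_tags)
    return [tag for tag in product_tags if tag not in visible]


# Response bodies abbreviated from the curl output above.
product_body = '{"name": "product-repo", "tags": ["16.0-97", "16.0-93", "16.1", "16.0"]}'
cv_body = '{"name": "cv-repo", "tags": ["16.0-93"]}'

hidden = filtered_out(parse_tags(product_body), parse_tags(cv_body))
print(hidden)
```

Note that with this filter the floating `16.0` tag is among the hidden ones, which is the situation this bug is about.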

Comment 10 Jad Haj Yahya 2020-09-17 06:49:16 UTC
Run satellite deployment without tag parameter and default_tag set to false in container image prepare yaml

Comment 11 Jad Haj Yahya 2020-09-17 13:56:44 UTC
The verification above was a mistake; also, the compose is not ready yet.

Comment 13 Jad Haj Yahya 2020-10-02 05:33:39 UTC
Deployed undercloud and overcloud using satellite filtered content view

Comment 18 errata-xmlrpc 2020-10-28 15:38:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4284

