Bug 2235621 - [FFU 16.2 to 17.1] openstack overcloud upgrade fails when pulling images from registry registry.redhat.io
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z2
Target Release: 17.1
Assignee: Sergii Golovatiuk
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks: 2263226
Reported: 2023-08-29 10:30 UTC by Pedro Navarro
Modified: 2024-02-07 17:30 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-14.3.1-17.1.20230921160833.d36821f.el9ost
Doc Type: Bug Fix
Doc Text:
Before this update, the RHOSP upgrade from 16.2 to 17.1 failed when pulling images from `registry.redhat.io` because the upgrade playbook did not include the Podman registry login task. This issue is resolved in RHOSP 17.1.2.
Clone Of:
: 2263226 (view as bug list)
Environment:
Last Closed: 2024-01-16 14:30:46 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 893430 | 0 | None | MERGED | Run podman install block in upgrade tasks | 2023-12-14 12:42:00 UTC
Red Hat Issue Tracker | OSP-27925 | 0 | None | None | None | 2023-08-29 11:26:30 UTC
Red Hat Product Errata | RHBA-2024:0209 | 0 | None | None | None | 2024-01-16 14:30:48 UTC

Description Pedro Navarro 2023-08-29 10:30:26 UTC
Description of problem:

I'm getting authentication errors when executing:
openstack overcloud upgrade run --yes --stack overcloud --debug --limit allovercloud,undercloud --playbook all

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. FFU from 16.2 to 17.1
2. Configure containers-prepare-parameter.yaml to use registry.redhat.io (more info in the Additional info section)
3. Run openstack overcloud upgrade run --yes --stack overcloud --debug --limit allovercloud,undercloud --playbook all

Actual results:

2023-08-29 04:09:55.422345 | 2cc26024-0e20-2113-4ab1-000000001018 |     TIMING | check if libvirt is installed | overcloud-novacompute-0 | 0:02:24.892518 | 0.03s
2023-08-29 04:09:55.423416 | 2cc26024-0e20-2113-4ab1-00000000116d |      FATAL | Pull registry.redhat.io/rhosp-rhel9/openstack-cinder-volume:17.1 image | overcloud-controller-2 | error={"changed": true, "cmd": "podman pull registry.redhat.io/rhosp-rhel9/openstack-cinder-volume:17.1", "delta": "0:00:00.228850", "end": "2023-08-29 08:09:55.401755", "msg": "non-zero return code", "rc": 125, "start": "2023-08-29 08:09:55.172905", "stderr": "Trying to pull registry.redhat.io/rhosp-rhel9/openstack-cinder-volume:17.1...\nError: Error initializing source docker://registry.redhat.io/rhosp-rhel9/openstack-cinder-volume:17.1: unable to retrieve auth token: invalid username/password: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/RegistryAuthentication", "stderr_lines": ["Trying to pull registry.redhat.io/rhosp-rhel9/openstack-cinder-volume:17.1...", "Error: Error initializing source docker://registry.redhat.io/rhosp-rhel9/openstack-cinder-volume:17.1: unable to retrieve auth token: invalid username/password: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/RegistryAuthentication"], "stdout": "", "stdout_lines": []}
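
For triage, a FATAL line like the one above can be checked mechanically for a registry authentication failure. The following is a minimal Python sketch, not part of TripleO: the names `parse_fatal_line` and `is_registry_auth_failure` are hypothetical, and it assumes the `error=` payload is valid JSON, as it is in the line quoted above.

```python
import json
import re

# Matches "| FATAL | <task> | <host> | error={...}" at the end of a log line.
# The column layout is taken from the log line in this report (an assumption
# about other releases).
FATAL_RE = re.compile(
    r"\|\s+FATAL\s+\|\s*(?P<task>.*?)\s*\|\s*(?P<host>\S+)\s*\|\s*"
    r"error=(?P<payload>\{.*\})\s*$"
)


def parse_fatal_line(line):
    """Return (task, host, payload dict) for a FATAL line, or None otherwise."""
    match = FATAL_RE.search(line)
    if match is None:
        return None
    payload = json.loads(match.group("payload"))
    return match.group("task"), match.group("host"), payload


def is_registry_auth_failure(payload):
    """True if stderr carries the 'unauthorized' message seen in this report."""
    stderr = payload.get("stderr", "")
    return "unable to retrieve auth token" in stderr and "unauthorized" in stderr
```

Applied to the log above, this would report the failing task name and `overcloud-controller-2` as the host, which helps confirm that the failure is on an overcloud node pulling directly from the registry.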

Expected results:

Images are pulled correctly.

Additional info:

cat containers-prepare-parameter.yaml
# Generated with the following on 2023-08-28T12:47:32.463507
#
parameter_defaults:
  ContainerImagePrepare:
  - tag_from_label: '{version}-{release}'
    set:
      namespace: registry.redhat.io/rhosp-rhel9
      name_prefix: openstack-
      name_suffix: ''
      tag: '17.1'
      rhel_containers: false
      neutron_driver: ovn
      ceph_namespace: registry.redhat.io/rhceph
      ceph_image: rhceph-6-rhel9
      ceph_tag: latest
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_tag: v4.6
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_tag: v4.6
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_tag: v4.6
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_image: rhceph-6-dashboard-rhel9
      ceph_grafana_tag: latest
  MultiRhelRoleContainerImagePrepare: &id001
  - tag_from_label: '{version}-{release}'
    set:
      namespace: registry.redhat.io/rhosp-rhel9
      name_prefix: openstack-
      name_suffix: ''
      tag: '17.1'
      rhel_containers: false
      neutron_driver: ovn
      ceph_namespace: registry.redhat.io/rhceph
      ceph_image: rhceph-6-rhel9
      ceph_tag: latest
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_tag: v4.6
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_tag: v4.6
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_tag: v4.6
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_image: rhceph-6-dashboard-rhel9
      ceph_grafana_tag: latest
    excludes:
    - collectd
    - nova-libvirt
  - tag_from_label: '{version}-{release}'
    set:
      namespace: registry.redhat.io/rhosp-rhel9
      name_prefix: openstack-
      name_suffix: ''
      tag: '17.1'
      rhel_containers: false
      neutron_driver: ovn
      ceph_namespace: registry.redhat.io/rhceph
      ceph_image: rhceph-6-rhel9
      ceph_tag: latest
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_tag: v4.6
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_tag: v4.6
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_tag: v4.6
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_image: rhceph-6-dashboard-rhel9
      ceph_grafana_tag: latest
    includes:
    - collectd
    - nova-libvirt
  ComputeContainerImagePrepare: *id001
  ControllerContainerImagePrepare: *id001

Comment 1 Takashi Kajinami 2023-08-29 11:57:28 UTC
The content shared in the additional info section does not contain the ContainerImageRegistryCredentials parameter
which defines credentials for registry authentication.

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.1/html-single/installing_and_managing_red_hat_openstack_platform_with_director/index#ref_obtaining-container-images-from-private-registries_preparing-for-director-installation

Can you retry the command with that parameter set?

Comment 2 Takashi Kajinami 2023-08-29 12:06:20 UTC
Because you don't set push_destination: true, we expect all overcloud nodes to pull container images
directly from the source registry, which means all of your overcloud nodes must log in to our public registry.

So you may also need to set
ContainerImageRegistryLogin: true
as well.

If the problem is reproduced with these set, we may also have to review the full output of `overcloud upgrade run`
to check whether the podman login step runs properly during the task.
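
The checks requested in Comments 1 and 2 can be sketched as a small function over the already-parsed `parameter_defaults` mapping. This is an illustrative sketch only: `registry_login_gaps` is a hypothetical helper, it is not how TripleO validates its input, and it inspects only `ContainerImagePrepare`, not the per-role lists. It also says nothing about whether the upgrade playbook actually runs the login step.

```python
def registry_login_gaps(parameter_defaults):
    """List the settings from Comments 1 and 2 that appear to be missing.

    parameter_defaults is the dict found under 'parameter_defaults' in
    containers-prepare-parameter.yaml, already loaded by a YAML parser.
    """
    gaps = []

    # Comment 1: credentials for registry authentication must be defined.
    if not parameter_defaults.get("ContainerImageRegistryCredentials"):
        gaps.append("ContainerImageRegistryCredentials")

    # Comment 2: an entry without push_destination: true means overcloud
    # nodes pull directly from the source registry, so they need to log in.
    entries = parameter_defaults.get("ContainerImagePrepare", [])
    direct_pull = any(entry.get("push_destination") is not True for entry in entries)
    if direct_pull and parameter_defaults.get("ContainerImageRegistryLogin") is not True:
        gaps.append("ContainerImageRegistryLogin")

    return gaps
```

For the file shown in the description, which has neither parameter and no `push_destination`, the function would list both settings as missing.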

Comment 4 Pedro Navarro 2023-08-29 13:35:35 UTC
Indeed, I forgot to add 2 missing lines:

cat containers-prepare-parameter.yaml
# Generated with the following on 2023-08-28T12:47:32.463507
#
parameter_defaults:
  ContainerImagePrepare:
  - tag_from_label: '{version}-{release}'
    set:
      namespace: registry.redhat.io/rhosp-rhel9
      name_prefix: openstack-
      name_suffix: ''
      tag: '17.1'
      rhel_containers: false
      neutron_driver: ovn
      ceph_namespace: registry.redhat.io/rhceph
      ceph_image: rhceph-6-rhel9
      ceph_tag: latest
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_tag: v4.6
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_tag: v4.6
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_tag: v4.6
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_image: rhceph-6-dashboard-rhel9
      ceph_grafana_tag: latest
  MultiRhelRoleContainerImagePrepare: &id001
  - tag_from_label: '{version}-{release}'
    set:
      namespace: registry.redhat.io/rhosp-rhel9
      name_prefix: openstack-
      name_suffix: ''
      tag: '17.1'
      rhel_containers: false
      neutron_driver: ovn
      ceph_namespace: registry.redhat.io/rhceph
      ceph_image: rhceph-6-rhel9
      ceph_tag: latest
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_tag: v4.6
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_tag: v4.6
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_tag: v4.6
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_image: rhceph-6-dashboard-rhel9
      ceph_grafana_tag: latest
    excludes:
    - collectd
    - nova-libvirt
  - tag_from_label: '{version}-{release}'
    set:
      namespace: registry.redhat.io/rhosp-rhel9
      name_prefix: openstack-
      name_suffix: ''
      tag: '17.1'
      rhel_containers: false
      neutron_driver: ovn
      ceph_namespace: registry.redhat.io/rhceph
      ceph_image: rhceph-6-rhel9
      ceph_tag: latest
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_tag: v4.6
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_tag: v4.6
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_tag: v4.6
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_image: rhceph-6-dashboard-rhel9
      ceph_grafana_tag: latest
    includes:
    - collectd
    - nova-libvirt
  ComputeContainerImagePrepare: *id001
  ControllerContainerImagePrepare: *id001
  ContainerImageRegistryLogin: true
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      1979710|test-rhosp16.1: PASSWORD

The YAML above is the result of executing:
python3 /usr/share/openstack-tripleo-heat-templates/tools/multi-rhel-container-image-prepare.py \
     ${COMPUTE_ROLES} \
     ${CONTROL_PLANE_ROLES} \
     --enable-multi-rhel \
     --excludes collectd \
     --excludes nova-libvirt \
     --minor-override "{${EL8_TAGS}${EL8_NAMESPACE}${CEPH_TAGS}${NEUTRON_DRIVER}\"no_tag\":\"not_used\"}" \
     --major-override "{${EL9_TAGS}${NAMESPACE}${CEPH_TAGS}${NEUTRON_DRIVER}\"no_tag\":\"not_used\"}" \
     --output-env-file \
    /home/stack/containers-prepare-parameter.yaml
Reference: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.1/html-single/framework_for_upgrades_16.2_to_17.1/index#running-the-overcloud-upgrade-preparation_overcloud-adoption

and adding:

  ContainerImageRegistryLogin: true
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      1979710|test-rhosp16.1: PASSWORD

Comment 5 Takashi Kajinami 2023-08-30 01:06:12 UTC
A few questions.

1. What registry was used during the initial deployment of RHOSP 16.2? Did you use `registry.redhat.io` or a different internal registry?

2. May I review sosreports from the undercloud and one of the controller nodes? Please take these when the issue is reproduced.

Comment 7 Takashi Kajinami 2023-08-31 01:46:46 UTC
I can't find the generated ansible playbooks in the undercloud sosreport because the sosreport plugin in RHEL 8
does not capture the new directory path for RHOSP 17, and the link does not provide a sosreport of
the controller node.

My rough guess is that we update the podman version when we switch the container-tools stream, and
that this invalidates the podman login information. podman login is then not executed before that task runs.

Can you check whether you still have valid login information after that command?

Because the registry login was working in RHOSP 16, and we haven't heard of any problem with that logic
in RHOSP 17, the issue seems to be specific to some workflows in the upgrade.

Although DF can help with any specific problem with that login task, the timing of when the login credential
is invalidated (and when we need to log in again) should be investigated from Upgrade's PoV, so I'll reassign
this to upgrades.
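
One way to answer the question about remaining login information is to ask podman on the affected node. The sketch below is a hedged example, not part of the upgrade tooling: `logged_in_user` is a hypothetical helper, and it assumes `podman login --get-login` is available in the installed podman version. That option only reports credentials stored for the registry; it does not prove the registry still accepts them, and it should be run as the same user that runs the failing pull.

```python
import subprocess


def logged_in_user(registry, run=subprocess.run):
    """Return the user podman has stored for the registry, or None if none.

    The 'run' parameter exists only so the command can be stubbed in tests.
    """
    result = run(
        ["podman", "login", "--get-login", registry],
        capture_output=True,
        text=True,
        check=False,
    )
    # podman exits non-zero when no login is stored for the registry.
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    user = logged_in_user("registry.redhat.io")
    print(user if user else "no stored login for registry.redhat.io")
```

Running this before and after the container-tools stream switch would show whether the stored login disappears at that point.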

Comment 9 Sergii Golovatiuk 2023-08-31 10:57:10 UTC
Closing as not a bug, as the issue was in containers-prepare-parameter.yaml and how ContainerImageRegistryLogin and ContainerImageRegistryCredentials were specified there.

Comment 10 Pedro Navarro 2023-08-31 12:05:37 UTC
I confirm that after adding push_destination: true, which uses the UC as the registry, the error doesn't occur.

parameter_defaults:
  ContainerImagePrepare:
  - tag_from_label: '{version}-{release}'
    set:
      namespace: registry.redhat.io/rhosp-rhel9
      name_prefix: openstack-
      name_suffix: ''
      tag: '17.1'
      rhel_containers: false
      neutron_driver: ovn
      ceph_namespace: registry.redhat.io/rhceph
      ceph_image: rhceph-6-rhel9
      ceph_tag: latest
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_tag: v4.6
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_tag: v4.6
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_tag: v4.6
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_image: rhceph-6-dashboard-rhel9
      ceph_grafana_tag: latest
    push_destination: true
  MultiRhelRoleContainerImagePrepare: &id001
  - tag_from_label: '{version}-{release}'
    set:
      namespace: registry.redhat.io/rhosp-rhel9
      name_prefix: openstack-
      name_suffix: ''
      tag: '17.1'
      rhel_containers: false
      neutron_driver: ovn
      ceph_namespace: registry.redhat.io/rhceph
      ceph_image: rhceph-6-rhel9
      ceph_tag: latest
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_tag: v4.6
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_tag: v4.6
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_tag: v4.6
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_image: rhceph-6-dashboard-rhel9
      ceph_grafana_tag: latest
    excludes:
    - collectd
    - nova-libvirt
    push_destination: true
  - tag_from_label: '{version}-{release}'
    set:
      namespace: registry.redhat.io/rhosp-rhel9
      name_prefix: openstack-
      name_suffix: ''
      tag: '17.1'
      rhel_containers: false
      neutron_driver: ovn
      ceph_namespace: registry.redhat.io/rhceph
      ceph_image: rhceph-6-rhel9
      ceph_tag: latest
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_tag: v4.6
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_tag: v4.6
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_tag: v4.6
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_image: rhceph-6-dashboard-rhel9
      ceph_grafana_tag: latest
    includes:
    - collectd
    - nova-libvirt
    push_destination: true
  ComputeContainerImagePrepare: *id001
  ControllerContainerImagePrepare: *id001
  ContainerImageRegistryLogin: true
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      1979710|test-rhosp16.1: PASSWORD

However, if we don't use push_destination: true, the upgrade fails when trying to pull the images from the registry.

Comment 33 errata-xmlrpc 2024-01-16 14:30:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 17.1.2 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:0209

