Bug 2221198

Summary: [OSP 16.1] Deployment failing with error Image registry.redhat.io/rhosp-rhel8/openstack-cinder-scheduler has no tag 16.1.
Product: Red Hat OpenStack Reporter: Flavio Piccioni <fpiccion>
Component: openstack-tripleo-common Assignee: Nobody <nobody>
Status: CLOSED EOL QA Contact: David Rosenfeld <drosenfe>
Severity: high Docs Contact:
Priority: high    
Version: 16.1 (Train) CC: mburns, slinaber
Target Milestone: --- Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-09-18 19:22:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Flavio Piccioni 2023-07-07 12:47:49 UTC
Description of problem:
While trying to scale out the platform, the customer gets the following error:

Deploying templates in the directory /tmp/tripleoclient-m_q6mtat/tripleo-heat-templates
Initializing overcloud plan deployment
{'deployment_status': 'DEPLOY_FAILED',
 'execution_id': '44719ced-7a76-4365-8e75-aed8262f928b',
 'message': 'Image registry.redhat.io/rhosp-rhel8/openstack-cinder-scheduler '
            'has no tag 16.1.\n'
            'Available tags: 16.1.7-9.1646285769, 16.1.5-2.1619641818, 16.1.2, '
            '16.2.0-60, 16.1.3, 16.2.3-10, 16.1.4-10, 16.1.7, 16.0-78, '
            '16.1.5-1.1618379274-source, 16.1.4-10-source, 16.1-43, 16.1.5, '
            '16.1.4, 16.1.3-6-source, 16.2.0-60.1638437527, 16.1.8-10, '
            '16.1-49, 16.1.5-1-source, 16.1.3-4-source, 16.1.8-7-source, '
            '16.1.5-2.1619641818-source, 16.1.8-9.1651483896-source, '
            '16.2.3-11, 16.1.6-7-source, 16.1.3-4, 16.1.6, 16.1.8-9-source, '
            '16.1.3-5.1611701648, 16.2.1-6-source, 16.2.0, 16.2.2-12-source, '
            '16.2.2-15, 16.2.2, 16.1.3-6.1614769698-source, 16.0-104, '
            '16.2.1-6, 16.1.3-6.1614769698, 16.2.2-10, 16.1.3-6, 16.1.6-4, '
            '16.2.2-14, 16.2.2-14-source, 16.2.2-15-source, '
            '16.2.2-12.1651565461-source, 16.2.2-12, 16.0-103, '
            '16.1.6-4-source, 16.1.3-5, 16.2.0-60.1638437527-source',
 'plan_name': 'overcloud',
 'root_execution_id': None,
 'status': 'FAILED'}
Image registry.redhat.io/rhosp-rhel8/openstack-cinder-scheduler has no tag 16.1.
Available tags: 16.1.7-9.1646285769, 16.1.5-2.1619641818, 16.1.2, 16.2.0-60, 16.1.3, 16.2.3-10, 16.1.4-10, 16.1.7, 16.0-78, 16.1.5-1.1618379274-source, 16.1.4-10-source, 16.1-43, 16.1.5, 16.1.4, 16.1.3-6-source, 16.2.0-60.1638437527, 16.1.8-10, 16.1-49, 16.1.5-1-source, 16.1.3-4-source, 16.1.8-7-source, 16.1.5-2.1619641818-source, 16.1.8-9.1651483896-source, 16.2.3-11, 16.1.6-7-source, 16.1.3-4, 16.1.6, 16.1.8-9-source, 16.1.3-5.1611701648, 16.2.1-6-source, 16.2.0, 16.2.2-12-source, 16.2.2-15, 16.2.2, 16.1.3-6.1614769698-source, 16.0-104, 16.2.1-6, 16.1.3-6.1614769698, 16.2.2-10, 16.1.3-6, 16.1.6-4, 16.2.2-14, 16.2.2-14-source, 16.2.2-15-source, 16.2.2-12.1651565461-source, 16.2.2-12, 16.0-103, 16.1.6-4-source, 16.1.3-5, 16.2.0-60.1638437527-source
Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 32, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 1041, in take_action
    self._deploy_tripleo_heat_templates_tmpdir(stack, parsed_args)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 415, in _deploy_tripleo_heat_templates_tmpdir
    new_tht_root, tht_root)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 532, in _deploy_tripleo_heat_templates
    deployment_options=deployment_options)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 551, in _try_overcloud_deploy_with_compat_yaml
    deployment_options=deployment_options)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 288, in _heat_deploy
    deployment_options=deployment_options)
  File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/deployment.py", line 87, in deploy_and_wait
    deploy(log, clients, **workflow_input)
  File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/deployment.py", line 69, in deploy
    % (payload['status'], wf_name))
ValueError: Unexpected status FAILED for tripleo.deployment.v1.deploy_plan
Unexpected status FAILED for tripleo.deployment.v1.deploy_plan
END return value: 1


Version-Release number of selected component (if applicable):
RHOSP 16.1.8
openstack-tripleo-common.noarch                       11.4.1-1.20211201113403.75bd92a.el8ost             @openstack-16.1-for-rhel-8-x86_64-rpms


How reproducible:


Steps to Reproduce:
1. Run a deployment that tries to add Compute nodes:
openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates/ \
  -e /home/stack/templates/node-info.yaml \
  -e /home/stack/containers-prepare-parameter.yaml \
  -r /home/stack/roles_data.yaml \
  -e /home/stack/lanzascript.yaml \
  -n /home/stack/network_data.yaml \
  -e /home/stack/openstack-tripleo-heat-templates-rendered/environments/network-isolation.yaml \
  -e /home/stack/openstack-tripleo-heat-templates-rendered/environments/network-environment.yaml \
  -e /home/stack/templates/custom-network-configuration.yaml \
  -e /home/stack/templates/pci_passthrough_controller.yaml \
  -e /home/stack/templates/pci_passthrough_compute.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/cinder-dellemc-powerstore-config.yaml \
  --ntp-server 192.168.30.1 -v


containers-prepare-parameter.yaml in use: 

# Generated with the following on 2020-07-29T10:18:23.925060 #
#   openstack tripleo container image prepare default --local-push-destination --output-env-file containers-prepare-parameter.yaml
#

parameter_defaults:
  ContainerImagePrepare:
  - push_destination: 192.168.30.1:8787
    set:
      ceph_alertmanager_image: ose-prometheus-alertmanager
      ceph_alertmanager_namespace: registry.redhat.io/openshift4
      ceph_alertmanager_tag: 4.1
      ceph_grafana_image: rhceph-3-dashboard-rhel7
      ceph_grafana_namespace: registry.redhat.io/rhceph
      ceph_grafana_tag: 3
      ceph_image: rhceph-4-rhel8
      ceph_namespace: registry.redhat.io/rhceph
      ceph_node_exporter_image: ose-prometheus-node-exporter
      ceph_node_exporter_namespace: registry.redhat.io/openshift4
      ceph_node_exporter_tag: v4.1
      ceph_prometheus_image: ose-prometheus
      ceph_prometheus_namespace: registry.redhat.io/openshift4
      ceph_prometheus_tag: 4.1
      ceph_tag: latest
      name_prefix: openstack-
      name_suffix: ''
      namespace: registry.redhat.io/rhosp-rhel8
      neutron_driver: ovn
      rhel_containers: false
    tag_from_label: '{version}-{release}'
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      'XXXXXXX' : 'XXXXXXX'
  ContainerImageRegistryLogin: true

Actual results:
The deployment fails at a very early stage with the error: 'message': 'Image registry.redhat.io/rhosp-rhel8/openstack-cinder-scheduler has no tag 16.1.\n'

Expected results:
The deployment succeeds.

Additional info:
Not sure whether this could be related to BZ 2213672 [0].

[0] https://bugzilla.redhat.com/show_bug.cgi?id=2213672

Comment 1 Takashi Kajinami 2023-07-10 01:47:31 UTC
I don't think BZ 2213672 is related, because that issue concerns the image data in the undercloud registry.

This sounds like a problem with our public container registry. We probably need to get some debug output from
`openstack tripleo container image prepare`, which executes a similar process.

However, before we dig deeper into that, the customer should be aware that the given template pulls the latest
container images. The bug report says the customer is using RHOSP 16.1.8, so the customer should not use
tag_from_label but should set the tag to 16.1.8 to avoid pulling 16.1.9 container images. Or are we sure the
customer has already followed [1] and generated the environment file that pins their container images?

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/director_installation_and_usage/index#assembly_performing-advanced-container-image-management
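
As a rough sketch of the pinning suggested in this comment, the ContainerImagePrepare entry could drop tag_from_label and set an explicit tag instead. This assumes that `tag` under `set` applies one fixed tag to all images, as described in the container image management guide linked above. The value 16.1.8 is taken from the comment and has not been verified against the registry: the "Available tags" list in the error output shows 16.1.8-10 but no bare 16.1.8, so the exact tag should be checked first. Only the keys relevant to the OpenStack images are shown; the ceph_* keys and registry credentials from the customer's file are omitted here, not removed.

```yaml
parameter_defaults:
  ContainerImagePrepare:
  - push_destination: 192.168.30.1:8787
    set:
      name_prefix: openstack-
      name_suffix: ''
      namespace: registry.redhat.io/rhosp-rhel8
      neutron_driver: ovn
      # Assumption: pin every image to one explicit tag instead of
      # discovering the newest build. Confirm the tag exists first.
      tag: '16.1.8'
    # tag_from_label is intentionally left out so the pinned tag is used as-is:
    # tag_from_label: '{version}-{release}'
```

With a pinned tag, a scale-out should keep using the same image versions as the existing nodes rather than picking up a newer minor release.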