Bug 2321715
| Summary: | openstack undercloud install on 16.2-latest aborts with keyerror and exception (Late oct 2024) | |||
|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Vincent S. Cojot <vcojot> | |
| Component: | openstack-tripleo-common | Assignee: | Rabi Mishra <ramishra> | |
| Status: | CLOSED MIGRATED | QA Contact: | David Rosenfeld <drosenfe> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 16.2 (Train) | CC: | dhughes, kgilliga, mburns, pweeks, ramishra, slinaber, toneata | |
| Target Milestone: | async | Keywords: | Regression, Triaged | |
| Target Release: | 16.2 (Train on RHEL 8.4) | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | openstack-tripleo-common-11.7.1-2.20241105125025.e189622.el8ost | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2322348 2322349 2323788 | Environment: | |
| Last Closed: | 2024-12-19 18:26:13 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2322348, 2322349, 2323788 | |||
sosreport too big.

$ cat undercloud.conf
[DEFAULT]
image_path = /home/stack/images/
container_images_file = /home/stack/OSP/osp16.2/containers-prepare-parameter.yaml
enabled_hardware_types = ipmi,redfish,idrac
enabled_boot_interfaces = pxe
enabled_deploy_interfaces = iscsi,direct
enabled_management_interfaces = ipmitool
enabled_power_interfaces = ipmitool
discovery_default_driver = ipmi
undercloud_enable_selinux = true
undercloud_debug = false
# Local site
# undercloud_hostname = osp16h.lasthome.solace.krynn
overcloud_domain_name = lasthome.solace.krynn
subnets = ctlplane-subnet
local_subnet = ctlplane-subnet
undercloud_nameservers = 10.0.128.234,10.0.128.236,10.0.128.254
undercloud_ntp_servers = 10.20.0.1,10.0.128.234,10.0.128.236,10.0.128.254
hieradata_override = /home/stack/OSP/osp16.2/undercloud-override.yaml
# Certificates
generate_service_certificate = true
certificate_generation_ca = local
# undercloud_service_certificate = /etc/pki/instack-certs/undercloud.pem
# Network
local_ip = 10.20.0.2/24
# undercloud_public_host = 10.0.129.42
undercloud_public_host = 10.20.0.3
undercloud_admin_host = 10.20.0.4
local_interface = bond0
inspection_interface = br-ctlplane
inspection_iprange = 10.20.0.200,10.20.0.240
# Features
enable_mistral = true
enable_zaqar = true
enable_telemetry = false
enable_tempest = true
inspection_extras = true
enable_validations = true
clean_nodes = false
enable_node_discovery = true
ipxe_enabled = true
# inspection_enable_uefi = true
[ctlplane-subnet]
cidr = 10.20.0.0/24
dhcp_start = 10.20.0.101
dhcp_end = 10.20.0.164
inspection_iprange = 10.20.0.200,10.20.0.240
gateway = 10.20.0.1
masquerade = true

$ cat /home/stack/OSP/osp16.2/undercloud-override.yaml
# HAProxy timeouts
tripleo::haproxy::ssl_cipher_suite: "!SSLv2:kEECDH:kRSA:kEDH:kPSK:+3DES:!aNULL:!eNULL:!MD5:!EXP:!RC4:!SEED:!IDEA:!DES:!MEDIUM"
tripleo::haproxy::ssl_options: 'no-sslv3 no-tls-tickets'
tripleo::haproxy::haproxy_global_maxconn: 32768
tripleo::haproxy::haproxy_default_maxconn: 8192
# Keystone token expiry (for updates on large clouds):
keystone::token_expiration: 14400
# MySQL
mysql_max_connections: '8192'
# nova::compute::ironic::max_concurrent_builds: 8
# Ironic cleaning
ironic::conductor::cleaning_disk_erase: metadata
ironic::conductor::cleaning_network: ctlplane

With that change it no longer aborts:
$ diff -u /home/stack/OSP/osp16.2/containers-prepare-parameter.yaml.orig /home/stack/OSP/osp16.2/containers-prepare-parameter.yaml
--- /home/stack/OSP/osp16.2/containers-prepare-parameter.yaml.orig 2024-10-30 17:40:09.990000000 -0400
+++ /home/stack/OSP/osp16.2/containers-prepare-parameter.yaml 2024-11-04 08:32:55.365000000 -0500
@@ -1,7 +1,6 @@
 parameter_defaults:
   ContainerImagePrepare:
-  - tag_from_label: "{version}-{release}"
-    push_destination: false
+  - push_destination: false
     set:
       ceph_alertmanager_image: krynn_rhosp-osp16_containers-ose-prometheus-alertmanager
       ceph_alertmanager_namespace: sat6.lasthome.solace.krynn
@@ -22,7 +21,7 @@
       name_suffix: ''
       namespace: sat6.lasthome.solace.krynn
       neutron_driver: ovs
-      #tag: 16.2
+      tag: 16.2
       rhel_containers: false
     excludes:
     - ose-prometheus
The problem is that many Telco customers use tag_from_label, not tag.
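For context, the difference between the two settings can be sketched as follows. With `tag`, the image tag is fixed in the parameter file; with `tag_from_label`, the tag is derived from labels read from each image in the registry, which is why that path has to inspect the image first (the `discover_tag_from_inspect` frame in the tracebacks). This is a minimal sketch of the template expansion only, not the tripleo-common implementation; the function name and the label values are hypothetical.

```python
def expand_tag_from_label(template, labels):
    """Build an image tag from container labels.

    A template containing braces is filled in with str.format();
    otherwise the template is treated as the name of a single label.
    """
    if "{" in template:
        return template.format(**labels)
    return labels[template]


# Hypothetical label values, for illustration only.
example_labels = {"version": "16.2.6", "release": "1"}
print(expand_tag_from_label("{version}-{release}", example_labels))
```

Setting a fixed `tag` skips this label lookup entirely, which would be consistent with the workaround in the diff above avoiding the failing code path.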
The satellite:
[root@sat6 ~]# rpm -q satellite
satellite-6.14.4.3-1.el8sat.noarch

Copied the patch from my Linux box and it applied just fine, darn Windows.
[root@osp16d /usr/lib/python3.6/site-packages]# patch -p1 < ~stack/p3.txt
patching file tripleo_common/image/image_uploader.py

With the patch in place, I got this:
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./network/ports/external_from_pool_v6.yaml
jinja2 rendering network template port_v6.network.j2.yaml
jinja2 rendering networks External
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./network/ports/external_v6.yaml
jinja2 rendering role template role.role.j2.yaml
jinja2 rendering roles Undercloud
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./puppet/undercloud-role.yaml
Exception: 404 Client Error: Not Found for url: https://sat6.lasthome.solace.krynn/v2/krynn_rhosp-osp16_containers-openstack-cron/blobs/sha256:c4ae84d99c69a4001389c3ea2cceb774c987e553fe41b3c95a047277ef5b1c61
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1297, in _standalone_deploy
    parsed_args)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 814, in _deploy_tripleo_heat_templates
    self._prepare_container_images(env, roles_data)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 759, in _prepare_container_images
    env, roles_data, dry_run=True)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/kolla_builder.py", line 228, in container_images_prepare_multi
    lock=lock
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/kolla_builder.py", line 357, in container_images_prepare
    images, tag_from_label, default_tag)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 1227, in discover_image_tags
    discover_args):
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib64/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 2795, in discover_tag_from_inspect
    i = self._inspect(image_url, session=session, default_tag=default_tag)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 2622, in _inspect
    image_url, session=session, default_tag=default_tag)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 292, in wrapped_f
    return self.call(f, *args, **kw)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 358, in call
    do = self.iter(retry_state=retry_state)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 331, in iter
    raise retry_exc.reraise()
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 167, in reraise
    raise self.last_attempt.result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 361, in call
    result = fn(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 1055, in _inspect
    allow_redirects=False
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 442, in get
    **kwargs)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 292, in wrapped_f
    return self.call(f, *args, **kw)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 358, in call
    do = self.iter(retry_state=retry_state)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 319, in iter
    return fut.result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 361, in call
    result = fn(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 421, in _action
    request=req)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 262, in check_status
    request.raise_for_status()
  File "/usr/lib/python3.6/site-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://sat6.lasthome.solace.krynn/v2/krynn_rhosp-osp16_containers-openstack-cron/blobs/sha256:c4ae84d99c69a4001389c3ea2cceb774c987e553fe41b3c95a047277ef5b1c61
None
Install artifact is located at /home/stack/undercloud-install-20241104164033.tar.bzip2
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Deployment Failed!
ERROR: Heat log files: /var/log/heat-launcher/undercloud_deploy-xuwhq2mx

This file used to work:
parameter_defaults:
  ContainerImagePrepare:
  - tag_from_label: "{version}-{release}"
    push_destination: false
    set:
      ceph_alertmanager_image: krynn_rhosp-osp16_containers-ose-prometheus-alertmanager
      ceph_alertmanager_namespace: sat6.lasthome.solace.krynn
      ceph_alertmanager_tag: 4.1
      ceph_grafana_image: krynn_rhosp-osp16_containers-ose-grafana
      ceph_grafana_namespace: sat6.lasthome.solace.krynn
      ceph_grafana_tag: 4.1
      ceph_node_exporter_image: krynn_rhosp-osp16_containers-ose-prometheus-node-exporter
      ceph_node_exporter_namespace: sat6.lasthome.solace.krynn
      ceph_node_exporter_tag: v4.1
      ceph_prometheus_image: krynn_rhosp-osp16_containers-ose-prometheus
      ceph_prometheus_namespace: sat6.lasthome.solace.krynn
      ceph_prometheus_tag: 4.1
      ceph_image: krynn_rhosp-osp16_containers-rhceph-4-rhel8
      ceph_namespace: sat6.lasthome.solace.krynn
      ceph_tag: latest
      name_prefix: krynn_rhosp-osp16_containers-openstack-
      name_suffix: ''
      namespace: sat6.lasthome.solace.krynn
      neutron_driver: ovs
      #tag: 16.2
      rhel_containers: false
    excludes:
    - ose-prometheus
    - ose-prometheus-alertmanager
    - ose-prometheus-node-exporter
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      6340056|vcojot-rhosp: eyJXXXXXXXXXXXXXXXXXXXXXX
  ContainerImageRegistryLogin: true
I can pull the images by tag with or without auth:
[stack@osp16d ~]$ podman login sat6.lasthome.solace.krynn
Authenticating with existing credentials...
Existing credentials are valid. Already logged in to sat6.lasthome.solace.krynn
[stack@osp16d ~]$ podman pull sat6.lasthome.solace.krynn/krynn_rhosp-osp16_containers-openstack-cron:16.2.6
Trying to pull sat6.lasthome.solace.krynn/krynn_rhosp-osp16_containers-openstack-cron:16.2.6...
Getting image source signatures
Copying blob bbace1e08f3e skipped: already exists
Copying blob e0d0fb6418b1 skipped: already exists
Copying blob e627f7a5737e [--------------------------------------] 0.0b / 0.0b
Copying config 7ed88ede9c done
Writing manifest to image destination
Storing signatures
7ed88ede9c7ed549c930b70bba242217385984ca369e0badbe4edffa5588ff2b
I can list tags:
[root@osp16d ~]# skopeo list-tags docker://sat6.lasthome.solace.krynn/krynn_rhosp-osp16_containers-openstack-cron|grep '16.2'|wc -l
47
[root@osp16d ~]# skopeo list-tags docker://sat6.lasthome.solace.krynn/krynn_rhosp-osp16_containers-openstack-cron|grep 'sha256'|wc -l
53
[root@osp16d ~]#
I can pull (unauth) by digest:
[root@osp16d ~]# podman logout sat6.lasthome.solace.krynn
Error: Not logged into sat6.lasthome.solace.krynn
[root@osp16d ~]# podman pull docker://sat6.lasthome.solace.krynn/krynn_rhosp-osp16_containers-openstack-cron@sha256:10ab52e021b436d222cdba881825e6e1b3d7f11b96ff169ad36cb3ad412102d2
Trying to pull sat6.lasthome.solace.krynn/krynn_rhosp-osp16_containers-openstack-cron@sha256:10ab52e021b436d222cdba881825e6e1b3d7f11b96ff169ad36cb3ad412102d2...
Getting image source signatures
Copying blob b3b93d72e803 skipped: already exists
Copying blob c3b2eb007e57 [--------------------------------------] 0.0b / 0.0b
Copying blob 90d946b542da [--------------------------------------] 0.0b / 0.0b
Copying config 7a79f2246a done
Writing manifest to image destination
Storing signatures
7a79f2246a1686eafec699ba5ba34ca5ac78adad154623d80ed27321cac36bc2
I can even pull the digest that tripleo was trying to use:
[root@osp16d ~]# podman pull docker://sat6.lasthome.solace.krynn/krynn_rhosp-osp16_containers-openstack-cron@sha256:c4ae84d99c69a4001389c3ea2cceb774c987e553fe41b3c95a047277ef5b1c61
Trying to pull sat6.lasthome.solace.krynn/krynn_rhosp-osp16_containers-openstack-cron@sha256:c4ae84d99c69a4001389c3ea2cceb774c987e553fe41b3c95a047277ef5b1c61...
Getting image source signatures
Copying blob e627f7a5737e done
Copying blob bbace1e08f3e done
Copying blob e0d0fb6418b1 done
Copying config 7ed88ede9c done
Writing manifest to image destination
Storing signatures
7ed88ede9c7ed549c930b70bba242217385984ca369e0badbe4edffa5588ff2b
[root@osp16d ~]#
so this one works:
sat6.lasthome.solace.krynn/krynn_rhosp-osp16_containers-openstack-cron@sha256:c4ae84d99c69a4001389c3ea2cceb774c987e553fe41b3c95a047277ef5b1c61
but tripleo failed on:
sat6.lasthome.solace.krynn/v2/krynn_rhosp-osp16_containers-openstack-cron/blobs/sha256:c4ae84d99c69a4001389c3ea2cceb774c987e553fe41b3c95a047277ef5b1c61
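The two references above map to different endpoints of the container registry HTTP API (v2): a pull by `image@sha256:...` resolves the digest through the `manifests` endpoint, whereas the failing request in the traceback went to the `blobs` endpoint. The sketch below only builds the two URLs for comparison; the helper names are hypothetical, and it assumes (the report does not confirm this) that the digest identifies a manifest rather than a layer or config blob, in which case a 404 on the `blobs` path would be expected.

```python
REGISTRY = "sat6.lasthome.solace.krynn"
IMAGE = "krynn_rhosp-osp16_containers-openstack-cron"
DIGEST = "sha256:c4ae84d99c69a4001389c3ea2cceb774c987e553fe41b3c95a047277ef5b1c61"


def manifest_url(registry, image, reference):
    """URL for fetching a manifest by tag or digest (registry API v2)."""
    return f"https://{registry}/v2/{image}/manifests/{reference}"


def blob_url(registry, image, digest):
    """URL for fetching a layer or config blob by digest (registry API v2)."""
    return f"https://{registry}/v2/{image}/blobs/{digest}"


# The same digest yields two different resources depending on the endpoint.
print(manifest_url(REGISTRY, IMAGE, DIGEST))
print(blob_url(REGISTRY, IMAGE, DIGEST))
```

The second URL printed is the one that appears in the 404 error, which suggests the patched code treated a manifest digest as a blob digest.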
https://paste.openstack.org/raw/bYjcoaNlNkNvhXusHP2j/

I can confirm that with the latest patch above the install completes without issues.
Install artifact is located at /home/stack/undercloud-install-20241105082119.tar.bzip2
########################################################
Deployment successful!
########################################################
]$ cat /home/stack/OSP/osp16.2/containers-prepare-parameter.yaml
parameter_defaults:
  ContainerImagePrepare:
  - tag_from_label: "{version}-{release}"
    push_destination: false
    set:
      ceph_alertmanager_image: krynn_rhosp-osp16_containers-ose-prometheus-alertmanager
      ceph_alertmanager_namespace: sat6.lasthome.solace.krynn
      ceph_alertmanager_tag: 4.1
      ceph_grafana_image: krynn_rhosp-osp16_containers-ose-grafana
      ceph_grafana_namespace: sat6.lasthome.solace.krynn
      ceph_grafana_tag: 4.1
      ceph_node_exporter_image: krynn_rhosp-osp16_containers-ose-prometheus-node-exporter
      ceph_node_exporter_namespace: sat6.lasthome.solace.krynn
      ceph_node_exporter_tag: v4.1
      ceph_prometheus_image: krynn_rhosp-osp16_containers-ose-prometheus
      ceph_prometheus_namespace: sat6.lasthome.solace.krynn
      ceph_prometheus_tag: 4.1
      ceph_image: krynn_rhosp-osp16_containers-rhceph-4-rhel8
      ceph_namespace: sat6.lasthome.solace.krynn
      ceph_tag: latest
      name_prefix: krynn_rhosp-osp16_containers-openstack-
      name_suffix: ''
      namespace: sat6.lasthome.solace.krynn
      neutron_driver: ovs
      rhel_containers: false
    excludes:
    - ose-prometheus
    - ose-prometheus-alertmanager
    - ose-prometheus-node-exporter
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      6340056|vcojot-rhosp: eyJhbGciO............ |
Description of problem:
'openstack undercloud install' last worked fine a month ago (Sept 11th 2024). Now, with the latest rpms, I get a python error. Config files have not changed in several years.

Sept 11th 2024:
** Handling template files **
jinja2 rendering normal template net-config-bond.j2.yaml
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./net-config-bond.yaml
[..]
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./extraconfig/all_nodes/swap.yaml
jinja2 rendering role template role.role.j2.yaml
jinja2 rendering roles Undercloud
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./extraconfig/nova_metadata/krb-service-principals/undercloud-role.yaml
jinja2 rendering network template network.network.j2.yaml
jinja2 rendering networks External
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./network/external.yaml

I now (2024/10/24) get this:
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./network/ports/external_from_pool_v6.yaml
jinja2 rendering network template port_v6.network.j2.yaml
jinja2 rendering networks External
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./network/ports/external_v6.yaml
jinja2 rendering role template role.role.j2.yaml
jinja2 rendering roles Undercloud
rendering j2 template to file: /home/stack/tripleo-heat-installer-templates/./puppet/undercloud-role.yaml
Exception: 'layers'
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1297, in _standalone_deploy
    parsed_args)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 814, in _deploy_tripleo_heat_templates
    self._prepare_container_images(env, roles_data)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 759, in _prepare_container_images
    env, roles_data, dry_run=True)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/kolla_builder.py", line 228, in container_images_prepare_multi
    lock=lock
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/kolla_builder.py", line 357, in container_images_prepare
    images, tag_from_label, default_tag)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 1142, in discover_image_tags
    discover_args):
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib64/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 2779, in discover_tag_from_inspect
    i = self._inspect(image_url, session=session, default_tag=default_tag)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 2606, in _inspect
    image_url, session=session, default_tag=default_tag)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 292, in wrapped_f
    return self.call(f, *args, **kw)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 358, in call
    do = self.iter(retry_state=retry_state)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 319, in iter
    return fut.result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 361, in call
    result = fn(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 956, in _inspect
    layers = [x['digest'] for x in manifest['layers']]
KeyError: 'layers'
None
Install artifact is located at /home/stack/undercloud-install-20241025085006.tar.bzip2
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Deployment Failed!
ERROR: Heat log files: /var/log/heat-launcher/undercloud_deploy-395n1egj
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Deployment failed.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
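The `KeyError: 'layers'` comes from indexing `manifest['layers']` on a registry response that has no such key. The report does not say what the registry actually returned. One response shape that lacks `layers` is a manifest list (or OCI image index), which carries a `manifests` array instead; that is an assumption here, not a confirmed root cause. The sketch below shows a defensive version of the lookup; the function name and the sample documents are hypothetical and are not taken from tripleo-common.

```python
def layer_digests(manifest):
    """Return layer digests, or raise a descriptive error if there are none.

    Unlike a bare manifest['layers'] lookup, this reports which kind of
    document was received instead of failing with KeyError.
    """
    layers = manifest.get("layers")
    if layers is None:
        kind = manifest.get("mediaType", "unknown media type")
        raise ValueError(f"manifest has no 'layers' key ({kind})")
    return [layer["digest"] for layer in layers]


# Hypothetical single-image manifest: has 'layers'.
image_manifest = {"layers": [{"digest": "sha256:aaaa"}, {"digest": "sha256:bbbb"}]}
print(layer_digests(image_manifest))

# Hypothetical manifest list: has 'manifests' but no 'layers'.
manifest_list = {
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [{"digest": "sha256:cccc"}],
}
try:
    layer_digests(manifest_list)
except ValueError as exc:
    print(exc)
```

If the assumption holds, a full fix would also need to pick one entry from `manifests` and fetch that manifest before reading its layers; the sketch stops at reporting the mismatch.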