Bug 2130516 - registries.conf format changed in the middle of rhosp16, minor update is broken during converge step
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: OSP Team
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-09-28 13:09 UTC by Maciej Relewicz
Modified: 2022-12-12 14:45 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-12-12 14:45:18 UTC
Target Upstream Version:
Embargoed:


Links
System                 ID         Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  OSP-18972  0        None      None    None     2022-09-28 13:19:38 UTC

Description Maciej Relewicz 2022-09-28 13:09:34 UTC
Description of problem:

I'm updating from Red Hat OpenStack Platform release 16.1.6 GA (Train) and Contrail 2011.L3 to Red Hat OpenStack Platform release 16.2.3 (Train) with Contrail 21.4. During the converge step I hit an error:

--privileged -u root -v /dev:/dev -v /usr/bin:/host/usr/bin -v /etc/sysctl.d:/etc/sysctl.d -v /var/crashes:/var/crashes -v /var/log/containers/contrail:/var/log/contrail -v /var/run:/var/run -v /tmp/:/tmp/ -v /etc/contrail/ssl:/etc/contrail/ssl:z   -e  INSTALL_PUPPET=True
-e  CONTRAIL_STATUS_IMAGE=192.168.213.2:9798/contrail/contrail-status:21.4.0.133  -e  SSL_ENABLE=True  -e  SERVER_KEYFILE=/etc/contrail/ssl/private/server-privkey.pem  -e  SERVER_CERTFILE=/etc/contrail/ssl/certs/server.pem  -e  SERVER_CA_CERTFILE=/etc/contrail/ssl/certs/ca-cert.pem
-e  SERVER_CA_KEYFILE=/etc/contrail/ssl/private/ca-key.pem 192.168.213.2:9798/contrail/contrail-node-init:21.4.0.133\\\\\\\\n\", \"delta\": \"0:00:00.142654\", \"end\": \"2022-09-27 06:56:05.913498\", \"msg\": \"non-zero return code\", \"rc\": 125,
\"start\": \"2022-09-27 06:56:05.770844\", \"stderr\": \"Error: error getting default registries to try: error loading registries configuration \\\\\\\\\"/etc/containers/registries.conf\\\\\\\\\": mixing sysregistry v1/v2 is not supported\", \"stderr_lines\":
[\"Error: error getting default registries to try: error loading registries configuration \\\\\\\\\"/etc/containers/registries.conf\\\\\\\\\": mixing sysregistry v1/v2 is not supported\"], \"stdout

The problem is with the registries.conf file.

cat /etc/containers/registries.conf
# This file is managed via tripleo-ansible

unqualified-search-registries = ["registry.redhat.io", "registry.access.redhat.com", "registry.fedoraproject.org", "registry.centos.org", "docker.io"]

# registries configuration

# insecure registry list
[[registry]]
prefix = "192.168.213.2:9798"
insecure = true
location = "192.168.213.2:9798"
[[registry]]
prefix = "192.168.213.2:9797"
insecure = true
location = "192.168.213.2:9797"
[[registry]]
prefix = "undercloud.ctlplane.localdomain"
insecure = true
location = "undercloud.ctlplane.localdomain"
[[registry]]
prefix = "undercloud.ctlplane.5a4s9.englab.juniper.net:8787"
insecure = true
location = "undercloud.ctlplane.5a4s9.englab.juniper.net:8787"
[registries.insecure]
registries = ['192.168.213.2:9798/contrail']

It looks like the registries.conf format was changed from v1 to v2 in the middle of RHOSP 16; see [1] and [2].

1. https://opendev.org/openstack/tripleo-ansible/commit/27fe9d9ce471dd25a56d8a8b9e4b6480e763b07f

2. https://review.opendev.org/c/openstack/tripleo-ansible/+/719584


Version-Release number of selected component (if applicable):


How reproducible:
Install OpenStack with Contrail, then try to update.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:
Don't change the configuration format during 

Additional info:

Comment 2 Jesse Pretorius 2022-09-28 15:43:19 UTC
Based on https://github.com/containers/image/blob/main/docs/containers-registries.conf.5.md

--- snip start ---

[registries.insecure]
registries = ['192.168.213.2:9798/contrail']

--- snip end ---

This is telling podman that the registry is v1, but everything else is telling podman the registry is v2.

The v1 API was deprecated in 2019: https://engineering.docker.com/2019/03/registry-v1-api-deprecation/

It is not clear how the 'registries.insecure' section got there, but it should be removed so that podman speaks to the registry via the v2 API instead. As noted in [a], trying to mix v1 and v2 configuration results in 'mixing sysregistry v1/v2 is not supported'.

[a] https://williamlieurance.com/insecure-podman-registry/
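
One way to spot the condition described in this comment before podman trips over it is to scan the file for markers of both formats. The following is a minimal Python sketch, not part of TripleO or podman: the function name `detect_formats` and the marker patterns are assumptions made for illustration, and the line-based heuristic only looks for the v1 `[registries.search]`, `[registries.insecure]` and `[registries.block]` tables and for the v2 `[[registry]]` and `unqualified-search-registries` constructs seen in the file from the description.

```python
import re

# Table headers that belong to the v1 registries.conf format.
V1_HEADER = re.compile(r"^\s*\[registries\.(search|insecure|block)\]\s*$")
# Constructs that belong to the v2 format (only the two seen in this bug).
V2_MARKER = re.compile(r"^\s*(\[\[registry\]\]|unqualified-search-registries\s*=)")


def detect_formats(text):
    """Return the set of format versions ("v1", "v2") found in the text."""
    found = set()
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # comments cannot select a format
        if V1_HEADER.match(line):
            found.add("v1")
        elif V2_MARKER.match(line):
            found.add("v2")
    return found


# Shortened copy of the file from the description (hypothetical test input).
SAMPLE = """\
unqualified-search-registries = ["registry.redhat.io"]
[[registry]]
prefix = "192.168.213.2:9798"
insecure = true
location = "192.168.213.2:9798"
[registries.insecure]
registries = ['192.168.213.2:9798/contrail']
"""

if __name__ == "__main__":
    if detect_formats(SAMPLE) == {"v1", "v2"}:
        print("registries.conf mixes v1 and v2 sections")
```

A check like this could run before any task that appends to the file, so that a v1 section is never added to a file that already uses v2 syntax.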

Comment 3 Brendan Shephard 2022-09-29 02:37:10 UTC
What is being used to create the registries.insecure entry? Our interface for handling insecure registries hasn't changed. It has been via the DockerInsecureRegistryAddress Heat parameter for a long time. Are you saying that you were using that interface and it didn't update your registry format from v1 to v2?

Comment 4 Maciej Relewicz 2022-09-29 10:21:21 UTC
We are using our own task [1] because DockerInsecureRegistryAddress is applied too late for us, after the step in which the node-init container is run. We also configure our Contrail registry as part of DockerInsecureRegistryAddress, but that only takes effect for later steps. A clean deployment with 16.2.3 works correctly: your task runs second, and we do not modify the file afterwards. The problem occurs during an update, at the converge step.

1. https://github.com/tungstenfabric/tf-tripleo-heat-templates/blob/stable/train/extraconfig/pre_deploy/contrail/contrail_ansible_node_init.yaml#L115

Comment 7 Jesse Pretorius 2022-09-30 08:11:54 UTC
(In reply to Maciej Relewicz from comment #4)
> We are using our own task [1] because DockerInsecureRegistryAddress is
> applied too late for us, after the step when node-init container is run. We
> are also configure our contrail registry as a part of
> DockerInsecureRegistryAddress but for later steps. Using Clean deployment
> with 16.2.3 working correctly, your task is run as second, and later file is
> not modified by us. Problem occurs during update and converge step.
> 
> 1.
> https://github.com/tungstenfabric/tf-tripleo-heat-templates/blob/stable/
> train/extraconfig/pre_deploy/contrail/contrail_ansible_node_init.yaml#L115

This needs to change to implement the format for the v2 interface, e.g.:

[[registry]]
prefix = "192.168.213.2:9798/contrail"
insecure = true

But really this needs to be changed to not edit the file once Director has replaced it.
Also, the THT interface for implementing this configuration in the templated file should be used so that the configuration persists in the proper format.
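
The v2 entry shown in this comment could be generated rather than hand-written. The following is a minimal Python sketch, assuming a hypothetical helper named `render_v2_insecure`. It emits `location` alongside `prefix` to mirror the entries in the tripleo-managed file quoted in the description; that is a choice of this sketch, not something the comment requires.

```python
def render_v2_insecure(registries):
    """Render one v2 [[registry]] table per registry, marked insecure."""
    blocks = []
    for reg in registries:
        blocks.append(
            "[[registry]]\n"
            f'prefix = "{reg}"\n'
            "insecure = true\n"
            f'location = "{reg}"\n'
        )
    return "".join(blocks)


if __name__ == "__main__":
    # Registry taken from the v1 section in the description.
    print(render_v2_insecure(["192.168.213.2:9798/contrail"]), end="")
```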

Comment 9 Takashi Kajinami 2022-10-18 14:19:20 UTC
It's quite hard for us to support any injection by external tooling, especially when the file is actively managed as part of the deployment tooling.
From our point of view, we don't even know which files require injection.

Our recommendations here would be:

 - Do not always inject the record; skip the task if the file has already been modified by tripleo

 - You can even use the tripleo-podman role from your tasks to manipulate the file, as long as you
   pass consistent variables. This might be redundant but can be more robust.
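
The first recommendation can be sketched as a guard that runs before the injection. This is a minimal Python illustration under stated assumptions: it treats the header comment `# This file is managed via tripleo-ansible` (copied from the file in the description) as the sign that tripleo has written the file, and the function names `is_tripleo_managed` and `should_inject` are hypothetical. The presence test for the registry is a plain substring check, which is enough for a sketch but not a real parser.

```python
# Header line written at the top of the file shown in the description.
TRIPLEO_MARKER = "# This file is managed via tripleo-ansible"


def is_tripleo_managed(text):
    """Return True if the tripleo-ansible header comment is present."""
    return any(line.strip() == TRIPLEO_MARKER for line in text.splitlines())


def should_inject(text, registry):
    """Decide whether an external task should add its own registry entry."""
    if is_tripleo_managed(text):
        return False  # leave files written by tripleo untouched
    return registry not in text  # skip if the registry is already listed


if __name__ == "__main__":
    managed = TRIPLEO_MARKER + "\n\n[[registry]]\n"
    print(should_inject(managed, "192.168.213.2:9798/contrail"))
```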

Comment 10 Maciej Relewicz 2022-10-27 18:06:03 UTC
Hi,

We will check which version is in use and add the proper schema entry. You can close this issue.

--
Maciej

