Created attachment 1616774 [details] generated customization.yaml from deploy

Description of problem:
When running "openstack overcloud deploy", the playbook tries to pull container images from registry.redhat.io. This server requires a login in order to access images. However, even after running 'podman login registry.redhat.io' on the director node before running 'openstack overcloud deploy', the nodes being set up are not logged in to the container registry and therefore cannot retrieve containers ("invalid username/password" error).

Version-Release number of selected component (if applicable):
RC-0.9 and above

How reproducible:
always

Steps to Reproduce:
1. Successfully deploy an OSP15 undercloud.
2. Work through the steps to configure images and baremetal nodes (including introspection) for an overcloud, up to and just before the "openstack overcloud deploy". This particular configuration was minimal: one Controller, one ppc64le Compute.
3. (Optionally) run "podman login -u <RH access username> registry.redhat.io".
4. Run "openstack overcloud deploy ..." with its appropriate parameters (our details shown below).

Actual results:
As soon as the overcloud node attempts to run a "podman pull ..." of the containers, it fails with:

"unable to pull registry.redhat.io/rhosp15-rhel8/openstack-cinder-volume:15.0: unable to pull image: Error determining manifest MIME type for docker://registry.redhat.io/rhosp15-rhel8/openstack-cinder-volume:15.0: unable to retrieve auth token: invalid username/password",

Expected results:
Podman will be able to retrieve and install all containers needed for an overcloud deploy.

Additional info:
Our configuration on this test installation:
* x86_64 host for 2 VMs. Ansible playbooks for the deploy are run from here (until we re-try the deploy from the director node). Runs VirtBMC to provide out-of-band management emulation to the Controller node VM.
* Director node running as a VM on the x86_64 host above.
* Controller node running as a VM on the x86_64 host above. Uses VirtBMC on the hosting server to emulate its BMC.
* Power8 system (physical, not VM) as Compute (ppc64le) node.

Deploy command run:

openstack overcloud deploy --templates -e /home/stack/templates/node-info.yml -e /home/stack/templates/overcloud_images.yaml -e /home/stack/templates/tripleo-overcloud-passwords.yaml -e /home/stack/templates/customization.yaml --disable-validations -r /home/stack/templates/roles_data.yaml --ntp-server clock.corp.redhat.com
Created attachment 1616776 [details] node-info.yaml used in deploy
Created attachment 1616777 [details] overcloud_images.yaml used for deploy
Created attachment 1616778 [details] roles_data.yaml used for deploy
Created attachment 1616779 [details] tripleo-overcloud-passwords.yaml used for deploy
Some clarification on the steps I tested: On the host for my VMs, from which I was running our own playbooks to deploy the undercloud and overcloud, I looked for command-line settings to forward the authentication to the playbooks, and found nothing. After I found I could run the deploy up *to* the "openstack overcloud deploy" step, I tried running the "podman login ..." command from there, to see whether the expectation was that a customer would authenticate on a deployed undercloud and the overcloud deploy would then carry over the authentication. This was not successful either. Authenticating on the individual nodes won't work because the overcloud deploy wipes out any installation on those nodes. The important question is what the expected customer procedure is, and whether it is properly documented on the public site.
You can specify credentials for a registry using the ContainerImageRegistryCredentials parameter. Example:

parameter_defaults:
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      myuser: mypassword

See also https://bugzilla.redhat.com/show_bug.cgi?id=1716627#c3

This should be referenced in the documentation, https://bugzilla.redhat.com/show_bug.cgi?id=1723969
Actually I found it in the docs. https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15-beta/html/director_installation_and_usage/preparing-for-director-installation#container-image-preparation-parameters Did you have this specified and it didn't work?
(In reply to Alex Schultz from comment #7)
> Actually I found it in the docs.
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15-beta/html/director_installation_and_usage/preparing-for-director-installation#container-image-preparation-parameters
>
> Did you have this specified and it didn't work?

Tested this a couple of ways: from a host system running an ansible playbook to do all the overcloud deploy scripts, and by running "openstack overcloud deploy" from the Director node itself. Even though the overcloud should have pulled in the login information when it ran, the overcloud deploy is not finding it, or is not passing it on to the nodes. Neither the controller nor the compute has had any containers installed.
I just tested pulling the container manually on the controller and the compute nodes (x86_64 and ppc64le respectively). Logged into the node as heat-admin, then ran "sudo podman login -u <username> registry.redhat.io". Gave it my password at the prompt, at which point it said "Login Succeeded!". Then pulled the image the deploy wanted to download, with the same command and path the deploy failed at: "sudo podman pull registry.redhat.io/rhosp15-rhel8/openstack-cinder-volume:15.0". This pulled down successfully, and a "sudo podman image list" shows:

REPOSITORY                                                 TAG    IMAGE ID       CREATED       SIZE
registry.redhat.io/rhosp15-rhel8/openstack-cinder-volume   15.0   90498b64119f   10 days ago   1.16 GB
(for the x86_64 controller)

REPOSITORY                                                 TAG    IMAGE ID       CREATED       SIZE
registry.redhat.io/rhosp15-rhel8/openstack-cinder-volume   15.0   3e7c1eec0e27   10 days ago   1.29 GB
(for the Power8 compute)

I would presume this means my login does have proper access for the container.

Does the /home/stack/containers-prepare-parameter.yaml file need to get called by the overcloud deploy? (The format of it doesn't look right for the overcloud parameters.)
Yes, you need to pass -e /home/stack/containers-prepare-parameter.yaml as part of the deployment.
Setting ContainerImageRegistryCredentials and setting "push_destination: true" in the containers-prepare-parameter.yaml file lets the director download the images and store them on the director. If push_destination is not set, the login credentials are not used on the nodes unless "ContainerImageRegistryLogin: true" is also set. But when that parameter is set, you get the following error instead:

fatal: [overcloud-controller-0]: FAILED! => {"msg": "The conditional check 'container_registry_logins_json | length) > 0' failed. The error was: template error while templating string: unexpected ')'. String: {% if container_registry_logins_json | length) > 0 %} True {% else %} False {% endif %}\n\nThe error appears to be in '/var/lib/mistral/overcloud/Controller/host_prep_tasks.yaml': line 737, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n kdo: password\n - name: Convert logins json to dict\n ^ here\n"}
This issue is caused by a typo. An upstream review to resolve it has been committed and can be tracked here: [ https://review.opendev.org/685185 ].
With the missing parenthesis added, you end up with the following error:

TASK [Convert logins json to dict] *********************************************
Friday 27 September 2019 10:05:36 -0400 (0:00:00.052) 0:02:36.839 ******
fatal: [overcloud-controller-0]: FAILED! => {"msg": "Unexpected templating type error occurred on ({{ container_registry_logins_json | from_json }}): the JSON object must be str, bytes or bytearray, not 'dict'"}
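For readers unfamiliar with this failure mode, here is a minimal Python sketch of what the error message describes: a JSON parser being handed a value that is already a dict. This is only an illustration under assumptions, not the code from the upstream review. The helper name logins_to_dict and the sample credentials are hypothetical.

```python
import json


def logins_to_dict(value):
    """Return registry logins as a dict, whether given as JSON text or a mapping.

    Hypothetical helper for illustration only; not the actual upstream fix.
    """
    if isinstance(value, (str, bytes, bytearray)):
        # JSON text: parse it.
        return json.loads(value)
    # Already a mapping (the case that triggers the error): copy it as-is.
    return dict(value)


# Hypothetical credentials, shaped like ContainerImageRegistryCredentials.
logins = {"registry.redhat.io": {"myuser": "mypassword"}}

# Parsing an already-parsed dict raises a TypeError similar to the one above.
try:
    json.loads(logins)
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
print(logins_to_dict(logins) == logins_to_dict(json.dumps(logins)))
```

The point of the sketch is only that a value may arrive either as JSON text or as an already-parsed mapping, and that a conversion step has to tolerate both.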
Can you test again with the following review [ https://review.opendev.org/#/c/685469 ]? We believe this should ensure the data is handled correctly no matter how it is provided to the heat template.
Yes, that allowed the nodes to log in to the container registry.
Nope, tried that patch on my system, it still fails to retrieve the containers. I have gone so far as to do a "podman login" to make sure the Director will recognize the login. I can do a "podman pull" of a container manually on the Director, so I know the account is valid, and that the container is also a valid one.
Please provide logs of the most recent issue as we believe we have resolved the ansible error.
Created attachment 1633084 [details] Overcloud deploy test run, 2019-Nov-05

Command used to run deploy:

openstack overcloud deploy \
  --templates \
  -e "/home/stack/templates/node-info.yml" \
  -e "/home/stack/templates/overcloud_images.yaml" \
  -e "/home/stack/templates/tripleo-overcloud-passwords.yaml" \
  -e "/home/stack/templates/customization.yaml" \
  -e "/home/stack/containers-prepare-parameter.yaml" \
  --disable-validations \
  -r /home/stack/templates/roles_data.yaml \
  --ntp-server clock.corp.redhat.com
I thought maybe the problem was in the podman-baremetal-ansible.yaml file, where "ContainerImageRegistryLogin" is set to "default: false" and should have been "true". I tried editing that in the file that the prior failed setup put there (a hard-coded edit before hunting down the configuration location), and I also put the login information into containers-prepare-parameter.yaml and passed it to the overcloud deploy command (... -e /home/stack/containers-prepare-parameter.yaml ). I had run a "podman login" to registry.redhat.io before trying a deploy, and verified that "podman pull registry.redhat.io/rhosp15-rhel8/openstack-cinder-volume:15.0" was able to pull down the container. Within that same terminal, right after the manual pull, I ran the overcloud deploy again, and it still fails with "unable to retrieve auth token: invalid username/password". I will attach the logs and a redacted containers-prepare-parameter.yaml file after this comment.
Created attachment 1635100 [details] Overcloud deploy log, Nov 11

Command line:

openstack overcloud deploy --templates -e "/home/stack/containers-prepare-parameter.yaml" -e "/home/stack/templates/node-info.yml" -e"/home/stack/templates/overcloud_images.yaml" -e"/home/stack/templates/tripleo-overcloud-passwords.yaml" -e"/home/stack/templates/customization.yaml" --disable-validations -r /home/stack/templates/roles_data.yaml --ntp-server clock.corp.redhat.com
Created attachment 1635102 [details] containers-prepare-parameter file for overcloud deploy
You wouldn't modify the podman-baremetal-ansible.yaml. ContainerImageRegistryLogin is a parameter that should be set to true in the file that contains the credentials.

Example (similar to the docs):

parameter_defaults:
  ContainerImageRegistryLogin: true
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      myuser: 'p@55w0rd!'
    registry.internalsite.com:
      myuser2: '0th3rp@55w0rd!'
    '192.0.2.1:8787':
      myuser3: '@n0th3rp@55w0rd!'

I still haven't had a chance to reproduce this, but it sounds like we might need to improve the docs.
Using openstack-tripleo-heat-templates-10.6.3-0.20191218080442.6978a62.el8ost.noarch with the following:

parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      ceph_image: rhceph-4.0-rhel8
      ceph_namespace: docker-registry.upshift.redhat.com/ceph
      ceph_tag: latest
      name_prefix: rhosp15-openstack-
      name_suffix: ''
      namespace: rhos-qe-mirror-tlv.usersys.redhat.com:5002/rh-osbs
      neutron_driver: ovn
      tag: 20200115.1

parameter_defaults:
  NeutronMechanismDrivers: ovn
  ContainerImagePrepare:
  - set:
      name_prefix: openstack-
      namespace: registry.redhat.io/rhosp15-rhel8
      tag: latest
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      user : 'pass'

and hit the same error as above:

unable to pull registry.redhat.io/rhosp15-rhel8/openstack-cinder-volume:latest: unable to pull image: Error initializing source docker://registry.redhat.io/rhosp15-rhel8/openstack-cinder-volume:latest: unable to retrieve auth token: invalid username/password"]

Am I missing something?
You also need to include ContainerImageRegistryLogin: true if push_destination is not included. See Bug 1792486
Deployed OC using:

parameter_defaults:
  NeutronMechanismDrivers: ovn
  ContainerImagePrepare:
  - set:
      name_prefix: openstack-
      namespace: registry.redhat.io/rhosp15-rhel8
      tag: latest
  ContainerImageRegistryLogin: true
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      user: 'pass'
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text. If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.
Documentation has been updated as part of Bug 1792486
I am presuming this had already been set, as the "requires_doc_text" flag is already "-". I looked over the OSP16 documentation at https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html-single/director_installation_and_usage/preparing-for-director-installation#container-image-preparation-parameters and it looks OK. Hopefully that's not merely because I understand it better now, but I think it's adequately explaining the configuration of the login settings.
Yes that's the updated documentation per Bug 1792486. There will likely be additional updates per the use case presented as part of Bug 1805117. It probably needs to be clarified that you only need to set ContainerImageRegistryLogin: true if you will be fetching containers on the overcloud systems from a remote registry that requires authentication. Using push_destination: true does not require this to be set to true.
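The rule stated above can be sketched as a small check over an already-parsed environment file. This is a rough illustration under assumptions, not part of TripleO: the function name needs_registry_login is hypothetical, and it treats the part of the namespace before the first "/" as the registry host and any truthy push_destination as "the director serves the images".

```python
def needs_registry_login(parameter_defaults):
    """Guess whether ContainerImageRegistryLogin: true is needed.

    Hypothetical helper: returns True when some ContainerImagePrepare entry
    has no push_destination (so overcloud nodes pull directly) and its
    registry has an entry in ContainerImageRegistryCredentials.
    """
    credentials = parameter_defaults.get("ContainerImageRegistryCredentials", {})
    for entry in parameter_defaults.get("ContainerImagePrepare", []):
        if entry.get("push_destination"):
            # The director pulls the images and serves them to the nodes.
            continue
        namespace = entry.get("set", {}).get("namespace", "")
        # Assumption: the registry host is everything before the first "/".
        registry = namespace.split("/", 1)[0]
        if registry in credentials:
            return True
    return False


# Shaped like the configuration that deployed successfully in this report.
direct_pull = {
    "ContainerImagePrepare": [
        {"set": {"namespace": "registry.redhat.io/rhosp15-rhel8", "tag": "latest"}}
    ],
    "ContainerImageRegistryCredentials": {"registry.redhat.io": {"user": "pass"}},
}

# Same registry, but the director stores the images first.
via_director = {
    "ContainerImagePrepare": [
        {
            "push_destination": True,
            "set": {"namespace": "registry.redhat.io/rhosp15-rhel8", "tag": "latest"},
        }
    ],
    "ContainerImageRegistryCredentials": {"registry.redhat.io": {"user": "pass"}},
}

print(needs_registry_login(direct_pull))
print(needs_registry_login(via_director))
```

A check like this can only flag the common case discussed in this bug. It does not know whether a registry actually requires authentication.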
I would still like to test that "push_destination" setting with Power though. I had tried it before but had bad parameters elsewhere. For 16.1 I want to verify whether that variation needs special settings (limited to the one small configuration to validate on for now).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0643