Bug 1811798
| Summary: | [osp16] Random error to registry.redhat.io -> requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://registry.redhat.io/v2/rhosp-rhel8/openstack- | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Chris Janiszewski <cjanisze> |
| Component: | openstack-tripleo-common | Assignee: | Adriano Petrich <apetrich> |
| Status: | CLOSED DUPLICATE | QA Contact: | David Rosenfeld <drosenfe> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 16.0 (Train) | CC: | aschultz, dsorrent, mburns, msecaur, rrubins, slinaber |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-06-08 19:16:10 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

Description
Chris Janiszewski
2020-03-09 19:44:05 UTC
You'll have to raise an issue with the owners of registry.redhat.io. We already attempt multiple retries for requests. This seems to point to issues with the authentication mechanism and not necessarily anything in the code.

I understand the problem is not necessarily with the OSP bits. I don't recall us ever having these issues prior to moving to registry.redhat.io, since the other registry did not require authentication. This has been a huge issue for the field team deploying OSP16. Is there any way we could increase the timeout/retries? Alternatively, if there is a way to detach the container upload process from the rest of the installation, that would be helpful as well. At least we could fail early on and have an easier way to identify the cause. The process for creating local registries with podman is not really described anywhere, and the steps that used to work with docker do not apply anymore. Please advise.

You can run the prepare process prior to running the overcloud deployment. You can run `openstack tripleo container image prepare` by hand to populate the registry on the undercloud.

1) openstack tripleo container image prepare default --local-push-destination --output-env-file containers.yaml
2) edit containers.yaml to add credentials
3) sudo openstack tripleo container image prepare -e containers.yaml
4) perform deploy and include -e containers.yaml

Additionally, if you're having issues with registry.redhat.io, a satellite server (which we do document/recommend) would be another option for a local source. Moving the bz over to the release delivery folks, who may be able to provide additional information on how to raise issues with registry.redhat.io. We already do provide multiple retries as part of the process, but it's not going to solve this problem.

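For anyone who wants the prepare step to run as its own stage and fail early, before the deploy starts, here is a minimal Python sketch. It only wraps step 3 from the list above and assumes steps 1 and 2 were already done by hand. The function name `prepare_images`, the `runner` parameter, and the default of two attempts are illustrative assumptions and not part of tripleo-common.

```python
import shlex
import subprocess

# Step 3 from the list above. Steps 1 and 2 (generating containers.yaml and
# adding credentials to it) are assumed to have been done by hand already.
PREPARE_COMMAND = "sudo openstack tripleo container image prepare -e containers.yaml"


def prepare_images(command=PREPARE_COMMAND, attempts=2, runner=subprocess.call):
    """Run the prepare command up to `attempts` times.

    Returns the 1-based attempt number that succeeded, or raises RuntimeError
    so that a calling script stops before the overcloud deploy starts.
    `runner` is a hypothetical injection point for testing; it must accept an
    argument list and return an exit code, like subprocess.call does.
    """
    argv = shlex.split(command)
    last_exit_code = None
    for attempt in range(1, attempts + 1):
        last_exit_code = runner(argv)
        if last_exit_code == 0:
            return attempt
    raise RuntimeError(
        f"image prepare failed {attempts} time(s); last exit code {last_exit_code}"
    )
```

Calling `prepare_images()` from a wrapper script before the deploy command would surface a registry failure as a single clear error instead of a failure in the middle of the installation. Extra attempts only work around intermittent failures; they do not address the authentication problem itself.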
Alex,

If we wanted to avoid calling out to registry.redhat.io during the Overcloud deployment, wouldn't the process be this instead:

1) openstack tripleo container image prepare default --local-push-destination --output-env-file containers.yaml
2) edit containers.yaml to add credentials
3) sudo openstack tripleo container image prepare -e containers.yaml | tee -a ~stack/templates/local_container_images.yaml
4) perform deploy and include -e ~stack/templates/local_container_images.yaml in place of the container.yaml or container-images-prepare.yaml (from the docs)

If we include the containers.yaml as it is in the overcloud deployment, with the push-destination set to true, won't it still attempt to update the local container images?

Here is another example of this issue:

(undercloud) [stack@undercloud-osp16 rebuild_image]$ sudo buildah bud -t undercloud-osp16.ctlplane.home.lab:8787/rhceph-4-rhel8-custom
STEP 1: FROM registry.redhat.io/rhceph/rhceph-4-rhel8
error creating build container: Error initializing source docker://registry.redhat.io/rhceph/rhceph-4-rhel8:latest: unable to retrieve auth token: invalid username/password: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/RegistryAuthentication
(undercloud) [stack@undercloud-osp16 rebuild_image]$ sudo podman login registry.redhat.io
Authenticating with existing credentials...
Existing credentials are valid.
Already logged in to registry.redhat.io

(In reply to Darin Sorrentino from comment #4)
> Alex,
> If we wanted to avoid calling out to registry.redhat.io during the
> Overcloud deployment, wouldn't the process be this instead:
>
> 1) openstack tripleo container image prepare default
> --local-push-destination --output-env-file containers.yaml
> 2) edit containers.yaml to add credentials
> 3) sudo openstack tripleo container image prepare -e containers.yaml | tee
> -a ~stack/templates/local_container_images.yaml
> 4) perform deploy and include -e
> ~stack/templates/local_container_images.yaml in place of the container.yaml
> or container-images-prepare.yaml (from the docs)
>
> If we include the containers.yaml as it is in the overcloud deployment, with
> the push-destination set to true, won't it still attempt to update the local
> container images?

It'll check the versions but it won't update because they already exist. It really depends. For many customers this check is not a problem; the only way to truly disconnect an environment is to use a satellite server infrastructure.

(In reply to Chris Janiszewski from comment #5)
> Here is another example of this issue:
> (undercloud) [stack@undercloud-osp16 rebuild_image]$ sudo buildah bud -t
> undercloud-osp16.ctlplane.home.lab:8787/rhceph-4-rhel8-custom
> STEP 1: FROM registry.redhat.io/rhceph/rhceph-4-rhel8
> error creating build container: Error initializing source
> docker://registry.redhat.io/rhceph/rhceph-4-rhel8:latest: unable to retrieve
> auth token: invalid username/password: unauthorized: Please login to the Red
> Hat Registry using your Customer Portal credentials. Further instructions
> can be found here: https://access.redhat.com/RegistryAuthentication
> (undercloud) [stack@undercloud-osp16 rebuild_image]$ sudo podman login
> registry.redhat.io
> Authenticating with existing credentials...
> Existing credentials are valid. Already logged in to registry.redhat.io

It's likely you need to use `buildah login`.
This would be a bug against buildah if it still doesn't work.

Hi, Alex,

The problem with the solution you've provided in Comment #3 is that this error can still happen during your Step #1 (i.e. openstack tripleo container image prepare default --local-push-destination --output-env-file containers.yaml). I see this happen a lot in a lab environment in Pune inside the Red Hat network. If a single container (or maybe just a blob) takes more than 5 minutes to download, then subsequent requests will fail. I also managed to re-create this issue by driving the load on the server up very high before running the 'openstack tripleo container image prepare' command (which is just another way of making the transfer take longer than 5 minutes).

In my Ansible playbooks that deploy OSP16 in our lab, I ended up just running the 'openstack tripleo container image prepare' twice since it almost always fails the first time. On the second run, most of the containers are downloaded already, so it typically works fine then. This continues to be a problem when we are using registry.redhat.io.

Yes, the 5 minute blob issue has been fixed via https://review.opendev.org/#/c/713923/ and should be out in a future minor release. I will have to track down a bz for this issue.

*** This bug has been marked as a duplicate of bug 1813520 ***
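
For readers trying to picture the five-minute blob failure discussed in this thread, the sketch below shows one general way a client can cope with a token that expires during a long transfer: on a 401, fetch a fresh token and retry once. This is not the patch from the linked review and not the tripleo-common code. `RegistrySession`, `fetch_token`, and `send` are hypothetical names used only for illustration, and the real registry calls are replaced by injected callables.

```python
from typing import Callable, Tuple

# Hypothetical callables standing in for real HTTP calls:
#   fetch_token() -> str                  obtains a fresh bearer token
#   send(url, token) -> (status, body)    performs one authenticated request
FetchToken = Callable[[], str]
Send = Callable[[str, str], Tuple[int, bytes]]


class RegistrySession:
    """Caches a bearer token and re-authenticates once on a 401 response."""

    def __init__(self, fetch_token: FetchToken, send: Send) -> None:
        self._fetch_token = fetch_token
        self._send = send
        self._token = fetch_token()

    def get(self, url: str) -> bytes:
        status, body = self._send(url, self._token)
        if status == 401:
            # The cached token may have expired during a long transfer;
            # obtain a new one and retry exactly once.
            self._token = self._fetch_token()
            status, body = self._send(url, self._token)
        if status != 200:
            raise RuntimeError(f"GET {url} failed with HTTP {status}")
        return body
```

Retrying only once keeps a genuinely bad credential from looping forever: if the second request also returns 401, the error is raised to the caller.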