We've seen 401s from the registry itself (we do retry), but unfortunately that is outside of our control. If you rerun, is it failing on the same layer each time?
I'm currently trying to reproduce the issue, but I don't appear to be able to. Is there a proxy being used in this environment?
Oh, I just noticed the undercloud.conf. The following options are in the wrong section:

  custom_env_files = /home/stack/templates/custom-undercloud-params.yaml
  container_images_file = /home/stack/containers-prepare-parameter.yaml

They are under [ctlplane-subnet] and not [DEFAULT], so they aren't being picked up and no auth is being used. Please try moving them to the correct ini section.
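A quick way to check for this kind of misplacement is sketched below. It uses the standard-library configparser as a stand-in for the real config loader, and the helper name find_misplaced and the sample file contents (apart from the two options quoted above) are made up for illustration.

```python
import configparser

# The two options named above that are expected under [DEFAULT].
DEFAULT_ONLY = ("custom_env_files", "container_images_file")


def find_misplaced(conf_text, expected="DEFAULT", options=DEFAULT_ONLY):
    """Return (option, section) pairs for options found outside `expected`."""
    # default_section is set to a dummy name so that [DEFAULT] is parsed as an
    # ordinary section and its keys are not inherited by every other section.
    parser = configparser.ConfigParser(
        default_section="__unused__", interpolation=None, strict=False)
    parser.read_string(conf_text)
    misplaced = []
    for section in parser.sections():
        if section == expected:
            continue
        for option in options:
            if parser.has_option(section, option):
                misplaced.append((option, section))
    return misplaced


# Hypothetical file contents reproducing the mistake described above.
SAMPLE = """
[DEFAULT]
local_interface = eth1

[ctlplane-subnet]
cidr = 192.168.24.0/24
custom_env_files = /home/stack/templates/custom-undercloud-params.yaml
container_images_file = /home/stack/containers-prepare-parameter.yaml
"""

if __name__ == "__main__":
    for option, section in find_misplaced(SAMPLE):
        print("%s is under [%s], expected [DEFAULT]" % (option, section))
```

Against a real file, the text would come from reading undercloud.conf; an empty result means both options are either under [DEFAULT] or not set at all.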
I've replicated the issue. If the network throughput causes the layer fetching to exceed the lifetime of the authentication token, it'll fail with something like:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/cliff/app.py", line 401, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 32, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/container_image.py", line 965, in take_action
    cleanup=parsed_args.cleanup, lock=lock)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/kolla_builder.py", line 235, in container_images_prepare_multi
    uploader.upload()
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 272, in upload
    uploader.run_tasks()
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 2252, in run_tasks
    for result in p.map(upload_task, self.upload_tasks):
  File "/usr/lib64/python3.6/concurrent/futures/process.py", line 366, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
OSError: [rhosp-rhel8/openstack-swift-account] Write Failure: 401 Client Error: Unauthorized for url: https://registry.redhat.io/v2/rhosp-rhel8/openstack-swift-account/blobs/sha256:01d76065b50cd19077e1b2d2aafb0dd332ecfd6f8d02088dd87242de97e72a43

I was able to replicate this by using iproute-tc to limit the ingress to ~4 Mbit/s down and ~1 Mbit/s up (using the script from https://wiki.gentoo.org/wiki/Traffic_shaping). I'll need to figure out the correct place to work around this issue.
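One possible shape for such a workaround is to request a fresh token whenever a layer fetch comes back with a 401 and then retry the fetch. The sketch below only illustrates that idea and is not the tripleo-common code: Unauthorized, fetch_with_reauth, fetch_layer and get_token are all hypothetical names, and the two callables are injected so the logic can be exercised without a registry.

```python
class Unauthorized(Exception):
    """Stand-in for an HTTP 401 response from the registry."""


def fetch_with_reauth(fetch_layer, get_token, max_reauth=1):
    """Call fetch_layer(token); on Unauthorized, get a new token and retry.

    fetch_layer and get_token are caller-supplied callables (hypothetical
    here). At most max_reauth re-authentications are attempted before the
    last Unauthorized error is re-raised.
    """
    token = get_token()
    for attempt in range(max_reauth + 1):
        try:
            return fetch_layer(token)
        except Unauthorized:
            if attempt == max_reauth:
                raise
            # The old token is assumed to have expired; ask for a new one.
            token = get_token()
```

Bounding the number of re-authentications keeps a genuinely bad credential from looping forever, which is a different failure from an expired token.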
Right now the workaround would be to not use "push_destination: true" when installing the undercloud. That should allow you to at least get an undercloud installed. From there you could fetch the containers manually and push them to the registry. Unfortunately, it seems that the transfer from registry.redhat.io is taking longer than 5 minutes for the various layers, so the token expires and the process fails.
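To put rough numbers on that, here is a back-of-the-envelope calculation using the ~4 Mbit/s rate from the reproduction and the 5 minute token lifetime. It assumes a single layer gets the whole link (parallel fetches would each get less), and the 200 MB layer size is a made-up example, not a measured value.

```python
TOKEN_LIFETIME_S = 5 * 60   # token lifetime mentioned above, in seconds
LINK_MBIT_PER_S = 4         # throttled download rate from the reproduction


def max_bytes_within(lifetime_s, mbit_per_s):
    """Largest transfer (in bytes) that completes before the token expires."""
    return lifetime_s * mbit_per_s * 1_000_000 // 8


def transfer_seconds(size_bytes, mbit_per_s):
    """Seconds needed to move size_bytes at the given link rate."""
    return size_bytes * 8 / (mbit_per_s * 1_000_000)


if __name__ == "__main__":
    limit = max_bytes_within(TOKEN_LIFETIME_S, LINK_MBIT_PER_S)
    print("max layer size within token lifetime: %d bytes" % limit)
    # Hypothetical 200 MB layer, for illustration only.
    print("200 MB layer takes %.0f s" % transfer_seconds(200_000_000, LINK_MBIT_PER_S))
```

Under these assumptions anything over roughly 150 MB cannot finish inside one token lifetime, so a 200 MB layer (about 400 seconds) would outlive its token.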
*** Bug 1821490 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2114
*** Bug 1811798 has been marked as a duplicate of this bug. ***