Bug 1813520
| Summary: | [RHOSP-16] Undercloud deployment is failing with HTTPError: 401 Client Error: Unauthorized for url | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Nilesh <nchandek> |
| Component: | openstack-tripleo-common | Assignee: | Alex Schultz <aschultz> |
| Status: | CLOSED ERRATA | QA Contact: | David Rosenfeld <drosenfe> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 16.0 (Train) | CC: | aschultz, bdobreli, cjanisze, ctrinh, emacchi, jamsmith, jhajyahy, kecarter, mburns, slinaber |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-common-11.3.3-0.20200321092338.da2cc62.el8ost | Doc Type: | Bug Fix |
| Doc Text: |
This update fixes authentication timeouts caused by slow transfer of container images.
Previously, undercloud and overcloud pulls against container sources that require authentication could fail, and generate a 401 error, if the image transfer exceeded five minutes.
Now, if the container fetching process exceeds 5 minutes, the code attempts to re-authenticate, preventing the timeout.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-05-14 12:16:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Comment 2
Alex Schultz
2020-03-16 23:10:29 UTC
I'm currently trying to reproduce the issue, but I don't appear to be able to. Is there a proxy being used in this environment?

Oh, I just noticed the undercloud.conf. The following options are in the wrong section:

custom_env_files = /home/stack/templates/custom-undercloud-params.yaml
container_images_file = /home/stack/containers-prepare-parameter.yaml

They are under [ctlplane-subnet] and not [DEFAULT], so they aren't being picked up and no auth is being used. Please try moving them to the correct ini section.

I've replicated the issue. If the network throughput causes the layer fetching to exceed the lifetime of the authentication token, it'll fail with something like:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/cliff/app.py", line 401, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 32, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/container_image.py", line 965, in take_action
    cleanup=parsed_args.cleanup, lock=lock)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/kolla_builder.py", line 235, in container_images_prepare_multi
    uploader.upload()
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 272, in upload
    uploader.run_tasks()
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 2252, in run_tasks
    for result in p.map(upload_task, self.upload_tasks):
  File "/usr/lib64/python3.6/concurrent/futures/process.py", line 366, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
OSError: [rhosp-rhel8/openstack-swift-account] Write Failure: 401 Client Error: Unauthorized for url: https://registry.redhat.io/v2/rhosp-rhel8/openstack-swift-account/blobs/sha256:01d76065b50cd19077e1b2d2aafb0dd332ecfd6f8d02088dd87242de97e72a43
I was able to replicate this by using iproute-tc to limit the ingress to ~4mbit down and ~1mbit up (using script from https://wiki.gentoo.org/wiki/Traffic_shaping).
I'll need to figure out the correct place to work around this issue.
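To illustrate the general shape such a workaround could take, here is a minimal sketch of token renewal around a slow layer fetch. It is not the tripleo-common code: `RefreshingToken`, `fetch_blob`, the `fetch_token` and `download` callables, and the 240-second threshold are all hypothetical, and `PermissionError` merely stands in for an HTTP 401 response. It assumes the token lives for roughly five minutes, as observed in this report.

```python
import time


class RefreshingToken:
    """Hypothetical helper: caches a bearer token and renews it before expiry."""

    def __init__(self, fetch_token, max_age=240.0, clock=time.monotonic):
        self._fetch_token = fetch_token  # callable returning a fresh token string
        self._max_age = max_age          # seconds; assumed to be under the token lifetime
        self._clock = clock              # injectable so the logic can be tested
        self._token = None
        self._issued_at = None

    def get(self):
        """Return the cached token, fetching a new one if missing or too old."""
        now = self._clock()
        if self._token is None or now - self._issued_at >= self._max_age:
            self._token = self._fetch_token()
            self._issued_at = now
        return self._token

    def invalidate(self):
        """Drop the cached token so the next get() re-authenticates."""
        self._token = None


def fetch_blob(download, token_source):
    """Call download(token); on a 401-style failure, re-authenticate once and retry.

    PermissionError is a stand-in for an HTTP 401 raised by the real client.
    """
    try:
        return download(token_source.get())
    except PermissionError:
        token_source.invalidate()
        return download(token_source.get())
```

The proactive age check avoids starting a request with a token that is about to expire, and the single retry covers the case where the token expires anyway while a layer is in flight.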
Right now the workaround would be to not use "push_destination: true" when installing the undercloud. That should allow you to at least get an undercloud installed. From there you could fetch the containers manually and push them to the registry. Unfortunately, it seems that the transfer from registry.redhat.io is taking longer than 5 minutes for the various layers, resulting in the token expiring and causing the process to fail.

*** Bug 1821490 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2114

*** Bug 1811798 has been marked as a duplicate of this bug. ***
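
As a footnote to the undercloud.conf problem from the first comment (options placed under [ctlplane-subnet] instead of [DEFAULT]), a quick check along these lines might catch the mistake before a deployment. This is only a sketch using Python's standard `configparser`; the function name `misplaced_options` and the list of option names are assumptions for illustration, not part of any TripleO tool.

```python
import configparser

# Options from this report that are only read from [DEFAULT].
DEFAULT_ONLY = ("custom_env_files", "container_images_file")


def misplaced_options(conf_text, names=DEFAULT_ONLY):
    """Return {option: section} for listed options set outside [DEFAULT]."""
    # Rename configparser's special default section so that [DEFAULT] is
    # parsed as an ordinary section and its values are not merged into
    # every other section.
    parser = configparser.ConfigParser(
        default_section="__unused__", interpolation=None, strict=False
    )
    parser.read_string(conf_text)

    found = {}
    for section in parser.sections():
        if section == "DEFAULT":
            continue
        for name in names:
            if parser.has_option(section, name):
                found[name] = section
    return found
```

Calling `misplaced_options(open("undercloud.conf").read())` returns an empty dict when both options sit in [DEFAULT], and otherwise maps each misplaced option to the section it was found in.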