Description of problem:
openstack overcloud container image upload --config-file ./container_images.yaml
takes more than 10 minutes on a high core count system,
even with extremely fast I/O (SSD, >= 10GbE network).
Version-Release number of selected component (if applicable):
The original problem that `overcloud container image upload` was intended to solve is having a user-controllable registry where the user can make adjustments to the container images when really necessary.
However, if no modification is needed, a registry in proxy mode would be more efficient.
The issue with proxy registries is that they try to forward pushes to the origin registry, which is not necessarily what the user wants.
The `overcloud container image upload` command does the following:
- pull from the origin registry, unpack, and store locally
- push (compress) to the user registry (on the undercloud)
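The two steps above can be sketched as a per-image loop. The registry hosts and image names below are placeholders, not the actual endpoints; `echo` keeps this a dry run (drop it to execute for real):

```shell
# Serial per-image transfer, as the docker-based upload effectively does:
# pull into the local cache, re-tag, then push to the undercloud registry.
SRC=docker.io/tripleoupstream
DST=192.168.24.1:8787/tripleoupstream

upload_serial() {
    for image in centos-binary-nova-api centos-binary-keystone \
                 centos-binary-neutron-server; do
        echo docker pull "${SRC}/${image}:latest"
        echo docker tag "${SRC}/${image}:latest" "${DST}/${image}:latest"
        echo docker push "${DST}/${image}:latest"
    done
}
upload_serial
```

Each image goes through the full pull/tag/push cycle before the next one starts, which is why only a core or two stays busy.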
Docker does minimal parallelism in these steps:
for example, if an image has multiple new layers it operates on them in parallel, but after the first images are transferred there are usually only 1-2 new layers, so only 1-2 cores are utilized. The operation is clearly CPU intensive; I/O is not the bottleneck.
Populating the local docker image cache is just a side effect;
I wonder whether there is a better way to bypass this step and upload the downloaded layers directly to the undercloud registry.
The task can be made more efficient by using parallel loops; this easily
makes the operation 3 times faster, even on just a 4-core system.
The issue with simply executing things in parallel is that we will try to upload
the common layers multiple times on the first iteration; however, the end result
was better in every measured case.
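A minimal sketch of the parallel variant, using plain background jobs with a concurrency cap (image names and registries are placeholders, and echo keeps it a dry run):

```shell
# Parallel variant of the per-image transfer loop: run up to 4 transfers
# at once as background jobs. On the first batch, images sharing base
# layers may push the same layer more than once, as noted above.
SRC=docker.io/tripleoupstream
DST=192.168.24.1:8787/tripleoupstream

transfer_one() {
    echo docker pull "${SRC}/$1:latest"
    echo docker tag "${SRC}/$1:latest" "${DST}/$1:latest"
    echo docker push "${DST}/$1:latest"
}

upload_parallel() {
    n=0
    for image in centos-binary-nova-api centos-binary-keystone \
                 centos-binary-neutron-server centos-binary-glance-api; do
        transfer_one "$image" &      # run each transfer in the background
        n=$((n + 1))
        if [ "$n" -ge 4 ]; then      # cap concurrency at 4 workers
            wait
            n=0
        fi
    done
    wait                             # let the last batch finish
}
upload_parallel
```

Batching with `wait` is the simplest cap; a real implementation would use a proper worker pool so a single slow image does not stall its whole batch.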
Steve, can it be improved by removing containers that are not going to be used during the overcloud deployment?
Omri, sure; using the --service-environment-file and --roles-file arguments is always recommended when calling "openstack overcloud container image prepare" to limit the uploads to the containers actually used.
However, this change is more about parallelising the uploads and avoiding the unnecessary transfer to the local docker cache.
I have a change upstream to switch to skopeo to do the transfers, which avoids the local docker cache.
I'll use this bz to track skopeo landing, and also adding some concurrency to the transfers.
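With skopeo, each transfer becomes a single registry-to-registry copy, with no intermediate docker image cache. A sketch (endpoints and image names are placeholders; echo keeps it a dry run):

```shell
# skopeo copies layers directly from the source registry to the
# destination registry, skipping the local docker image store entirely.
copy_with_skopeo() {
    for image in centos-binary-nova-api centos-binary-keystone; do
        echo skopeo copy \
            "docker://docker.io/tripleoupstream/${image}:latest" \
            "docker://192.168.24.1:8787/tripleoupstream/${image}:latest"
    done
}
copy_with_skopeo
```

Because no layers are unpacked or recompressed locally, this also removes most of the CPU cost described above.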
The use of skopeo seems to improve the processing times; please re-open if you discover other inefficiencies.
And using -e in the prepare command ensures that only the
containers that will actually be used are downloaded.
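For reference, such a prepare invocation might look like the following. The file paths are hypothetical, and the --output-images-file option is an assumption here rather than something stated in this report; echo keeps it a dry run:

```shell
# Restrict the image list to the services actually deployed by passing
# the same environment and roles files used for the deployment.
prepare_images() {
    echo openstack overcloud container image prepare \
        -e /home/stack/templates/my-environment.yaml \
        --roles-file /home/stack/templates/roles_data.yaml \
        --output-images-file /home/stack/container_images.yaml
}
prepare_images
```

The resulting container_images.yaml is then what gets passed to the upload command shown at the top of this report.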
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.