Bug 1481693 - overcloud container image upload is inefficient
Summary: overcloud container image upload is inefficient
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: beta
Target Release: 13.0 (Queens)
Assignee: Steve Baker
QA Contact: Artem Hrechanychenko
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-08-15 13:15 UTC by Attila Fazekas
Modified: 2018-06-27 13:34 UTC (History)

Fixed In Version: openstack-tripleo-common-8.3.1-0.20180123050219.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:33:53 UTC
Target Upstream Version:



Links
System ID Priority Status Summary Last Updated
Launchpad 1710992 None None None 2017-08-16 13:19:28 UTC
Launchpad 1733740 None None None 2017-11-22 02:42:38 UTC
OpenStack gerrit 522055 None MERGED Concurrent upload container images 2020-06-29 04:18:14 UTC
Red Hat Product Errata RHEA-2018:2086 None None None 2018-06-27 13:34:45 UTC

Description Attila Fazekas 2017-08-15 13:15:01 UTC
Description of problem:

openstack overcloud container image upload --config-file ./container_images.yaml

takes more than 10 minutes on a high-core-count system,
even with extremely fast I/O (SSD, >= 10GbE network).

Version-Release number of selected component (if applicable):
docker-rhel-push-plugin-1.12.6-48.git0fdc778.el7.x86_64
docker-client-1.12.6-48.git0fdc778.el7.x86_64
python-docker-py-1.10.6-1.el7.noarch
docker-1.12.6-48.git0fdc778.el7.x86_64
python-docker-pycreds-1.10.6-1.el7.noarch
docker-distribution-2.6.1-1.1.gita25b9ef.el7.x86_64
docker-common-1.12.6-48.git0fdc778.el7.x86_64

openstack-tripleo-common-7.4.1-0.20170807001945.8c46306.el7ost.noarch
python-tripleoclient-7.2.1-0.20170807222309.a731597.el7ost.noarch


The original problem that `overcloud container image upload` was intended to solve is providing a user-controllable registry where the user can adjust the container images when that is really necessary.

However, if no modification is needed, a registry in proxy mode would be more efficient to use.

The issue with proxy registries is that they would try to forward pushes to the origin registry, which is not necessarily what the user wants.

The `overcloud container image upload` command does:
foreach {image_to_manage}:
  pull from origin registry, unpack and store locally
  tag locally
  push (compress) to user registry (undercloud)

Docker does minimal parallelism in these steps.
For example, if an image has multiple new layers it tries to process them in parallel, but once the first images are in place there are usually just 1~2 new layers per image, so only 1~2 cores are utilized. The operation is clearly CPU intensive; I/O is not the bottleneck.

Populating the local docker image store is just a side effect.
I wonder whether there is a better way that bypasses this step and uploads the downloaded layers directly to the undercloud registry.

The task can be made more efficient by using parallel loops; this can easily
make the operation 3 times faster, even on just a 4-core system.

The issue with simply executing things in parallel is that we will try to upload
the common layers multiple times in the first iteration; however, the end result
was better in all measured cases.
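
The parallel loop suggested above could look roughly like the following. This is only a minimal sketch, not how tripleo-common implements the upload: the function names (`transfer_commands`, `transfer_image`, `transfer_all`), the default of 4 workers, and the injectable `run` parameter are all assumptions made for illustration. It simply runs the existing pull, tag, push sequence for several images at once.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def transfer_commands(source, target):
    """Return the pull, tag and push commands for one image (hypothetical helper)."""
    return [
        ["docker", "pull", source],
        ["docker", "tag", source, target],
        ["docker", "push", target],
    ]


def transfer_image(source, target, run=subprocess.check_call):
    """Run pull, tag and push for a single image, strictly in that order."""
    for command in transfer_commands(source, target):
        run(command)
    return target


def transfer_all(images, workers=4, run=subprocess.check_call):
    """Transfer (source, target) pairs using a pool of worker threads.

    Threads are sufficient here because the CPU-heavy work happens in the
    docker processes, not in Python itself.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(transfer_image, s, t, run) for s, t in images]
        # result() re-raises any failure that happened in a worker thread.
        return [future.result() for future in futures]
```

As noted above, images that share layers may push those common layers more than once when they are processed at the same time; the sketch makes no attempt to avoid that.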

Comment 1 Omri Hochman 2017-08-16 13:18:54 UTC
Steve, can it be improved by removing containers that are not going to be used during the overcloud deployment?

Comment 2 Steve Baker 2017-10-29 21:56:47 UTC
Omri, sure, using the --service-environment-file and --roles-file arguments is always recommended when calling "openstack overcloud container image prepare" to minimise uploads to the containers actually used.

However this change is more about parallelising the uploads, and avoiding the unnecessary transfer to the local docker cache.

I have a change upstream to switch to skopeo to do the transfers, which avoids the local docker cache.

I'll use this bz to track skopeo landing, and also adding some concurrency to the transfers.
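
As a rough illustration of the idea (not the actual upstream change), a registry-to-registry transfer with `skopeo copy` plus some concurrency might be sketched like this. The helper names (`skopeo_copy_command`, `copy_images`), the worker count, and the injectable `run` parameter are assumptions for the example; only the basic `skopeo copy docker://SOURCE docker://TARGET` form is used, and any TLS or authentication options a real registry needs are left out.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def skopeo_copy_command(source, target):
    """Build a skopeo command that copies one image between two registries.

    No local docker image store is involved in this transfer.
    """
    return ["skopeo", "copy", "docker://" + source, "docker://" + target]


def copy_images(images, workers=4, run=subprocess.check_call):
    """Copy (source, target) pairs concurrently; returns targets in input order."""
    def copy_one(pair):
        source, target = pair
        run(skopeo_copy_command(source, target))
        return target

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the input order and re-raises a worker error
        # when the corresponding result is consumed.
        return list(pool.map(copy_one, images))
```

Compared with the pull, tag, push sequence, each image here needs a single command, so there is nothing to unpack into or clean up from the local docker cache.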

Comment 9 Omri Hochman 2018-06-06 15:01:45 UTC
The use of skopeo seems to improve the process times; please re-open if you discover other inefficiencies.

Also, using -e in the prepare command ensures that only the
'to be used' containers are downloaded.

Comment 11 errata-xmlrpc 2018-06-27 13:33:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

