1481693 – overcloud container image upload is inefficient

Bug 1481693 - overcloud container image upload is inefficient

Summary: overcloud container image upload is inefficient

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-tripleo-common
Sub Component:
Version:	13.0 (Queens)
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	beta
Target Release:	13.0 (Queens)
Assignee:	Steve Baker
QA Contact:	Artem Hrechanychenko
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-08-15 13:15 UTC by Attila Fazekas
Modified:	2018-06-27 13:34 UTC
CC List:	11 users
Fixed In Version:	openstack-tripleo-common-8.3.1-0.20180123050219.el7ost
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-06-27 13:33:53 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Launchpad	1710992	None	None	None	2017-08-16 13:19:28 UTC
Launchpad	1733740	None	None	None	2017-11-22 02:42:38 UTC
OpenStack gerrit	522055	None	MERGED	Concurrent upload container images	2021-02-01 23:47:01 UTC
Red Hat Product Errata	RHEA-2018:2086	None	None	None	2018-06-27 13:34:45 UTC

Description Attila Fazekas 2017-08-15 13:15:01 UTC

Description of problem:

openstack overcloud container image upload --config-file ./container_images.yaml

takes more than 10 minutes on a high-core-count system,
even with extremely fast I/O (SSD, >= 10GbE network).

Version-Release number of selected component (if applicable):
docker-rhel-push-plugin-1.12.6-48.git0fdc778.el7.x86_64
docker-client-1.12.6-48.git0fdc778.el7.x86_64
python-docker-py-1.10.6-1.el7.noarch
docker-1.12.6-48.git0fdc778.el7.x86_64
python-docker-pycreds-1.10.6-1.el7.noarch
docker-distribution-2.6.1-1.1.gita25b9ef.el7.x86_64
docker-common-1.12.6-48.git0fdc778.el7.x86_64

openstack-tripleo-common-7.4.1-0.20170807001945.8c46306.el7ost.noarch
python-tripleoclient-7.2.1-0.20170807222309.a731597.el7ost.noarch


The original problem that `overcloud container image upload` was intended to solve is providing a user-controllable registry where the user can adjust the container images when it is really necessary.

However, if no modification is needed, a registry in proxy mode would be more efficient to use.

The issue with proxy registries is that they would try to forward pushes to the origin registry, which is not necessarily what the user wants.

The `overcloud container image upload` command does the following:
foreach {image_to_manage}:
  pull from origin registry, unpack and store locally
  tag locally
  push (compress) to user registry (undercloud)
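
The loop above can be sketched roughly as follows. This is an illustration only: it builds the equivalent `docker` CLI invocations instead of reproducing what tripleo-common actually does, and the registry hosts and image name are hypothetical placeholders.

```python
def upload_steps(image, origin, undercloud):
    """Return the docker CLI calls for one image, in the order described above.

    `origin` and `undercloud` are registry host[:port] strings; the values
    used below are hypothetical placeholders, not taken from the bug report.
    """
    source = f"{origin}/{image}"
    target = f"{undercloud}/{image}"
    return [
        ["docker", "pull", source],         # pull, unpack and store locally
        ["docker", "tag", source, target],  # tag locally
        ["docker", "push", target],         # compress and push to the undercloud
    ]


def sequential_plan(images, origin, undercloud):
    """One image at a time: each step waits for the previous one to finish."""
    plan = []
    for image in images:
        plan.extend(upload_steps(image, origin, undercloud))
    return plan


plan = sequential_plan(
    ["rhosp13/openstack-nova-api:latest"],  # hypothetical image name
    "registry.example.com",                 # hypothetical origin registry
    "undercloud.example.com:8787",          # hypothetical undercloud registry
)
for argv in plan:
    print(" ".join(argv))
```

Nothing is executed here; the sketch only lists the commands, which makes it clear that every image goes through three strictly serial steps.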

Docker does minimal parallelism in these steps.
For example, if an image has multiple new layers it tries to process them in parallel, but once the first images are in place we usually have just 1~2 new layers per image, so only 1~2 cores are utilized. The operation is clearly CPU-bound; I/O is not the bottleneck.

Populating the local docker image store is just a side effect.
I wonder whether there is a better way to bypass this step and upload the downloaded layers directly to the undercloud registry.

The task can be made more efficient by using parallel loops; this can easily
make the operation 3 times faster, even on just a 4-core system.
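
A minimal sketch of such a parallel loop, assuming the per-image work is wrapped in a callable. `upload_all`, `fake_transfer`, and the default of 4 workers are illustrative names and values, not part of tripleo-common.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_all(images, transfer, max_workers=4):
    """Run `transfer(image)` for every image using a pool of worker threads.

    `transfer` is a caller-supplied callable (hypothetical here) that would
    do the pull/tag/push for one image. Threads are enough when the heavy
    work happens in external processes such as the docker daemon.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and re-raises the first worker error.
        return list(pool.map(transfer, images))


def fake_transfer(image):
    # Stand-in for the real per-image work; returns a status string.
    return f"uploaded {image}"


results = upload_all(["image-a", "image-b", "image-c"], fake_transfer)
print(results)
```

The worker count is a tuning knob; the value that gives the best throughput depends on the core count and on how many layers the images share.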

The issue with simply executing things in parallel is that we will try to upload
the common layers multiple times in the first iteration; however, the end result
was better in all measured cases.
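
The duplicated work can be illustrated with a toy model. It assumes that every image in a parallel batch uploads each layer the registry did not hold when the batch started; real registries and clients may skip some of these uploads, so treat the numbers as an upper bound for this made-up data set, not as a measurement.

```python
def count_layer_uploads(images, workers):
    """Count layer uploads when `workers` images are pushed at the same time.

    Simplified model (an assumption, not measured behaviour): every image in
    a batch uploads each of its layers that the registry did not already
    hold when the batch started, so layers shared within a batch are
    uploaded once per image that uses them.
    """
    in_registry = set()
    uploads = 0
    for start in range(0, len(images), workers):
        batch = images[start:start + workers]
        for layers in batch:
            uploads += sum(1 for layer in layers if layer not in in_registry)
        for layers in batch:
            in_registry.update(layers)
    return uploads


# Hypothetical images: each has the same base layer plus one unique layer.
images = [["base", "a"], ["base", "b"], ["base", "c"], ["base", "d"]]

print(count_layer_uploads(images, workers=1))  # sequential
print(count_layer_uploads(images, workers=4))  # all four at once
```

In this model the extra uploads only occur in the first batch; once the shared layers are in the registry, later batches upload only their unique layers.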

Comment 1 Omri Hochman 2017-08-16 13:18:54 UTC

Steve, can it be improved by removing containers that are not going to be used during the overcloud deployment?

Comment 2 Steve Baker 2017-10-29 21:56:47 UTC

Omri, sure, using the --service-environment-file and --roles-file arguments is always recommended when calling "openstack overcloud container image prepare" to minimise uploads to the containers actually used.

However this change is more about parallelising the uploads, and avoiding the unnecessary transfer to the local docker cache.

I have a change upstream to switch to skopeo to do the transfers, which avoids the local docker cache.

I'll use this bz to track skopeo landing, and also adding some concurrency to the transfers.
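
As a rough illustration of a registry-to-registry transfer that skips the local docker cache, the sketch below builds a `skopeo copy` command line. It is not the upstream change itself; the helper name, registry hosts, and image name are hypothetical, and disabling TLS verification for the destination is only an assumption about how a given undercloud registry might be configured.

```python
def skopeo_copy_argv(image, origin, undercloud, dest_tls_verify=True):
    """Build a `skopeo copy` call that copies one image between registries.

    The image is streamed from `origin` to `undercloud` without being
    stored in the local docker image cache. All argument values used
    below are hypothetical placeholders.
    """
    argv = ["skopeo", "copy"]
    if not dest_tls_verify:
        # Assumption: the destination registry is served without valid TLS.
        argv.append("--dest-tls-verify=false")
    argv += [
        f"docker://{origin}/{image}",
        f"docker://{undercloud}/{image}",
    ]
    return argv


argv = skopeo_copy_argv(
    "rhosp13/openstack-nova-api:latest",  # hypothetical image name
    "registry.example.com",               # hypothetical origin registry
    "undercloud.example.com:8787",        # hypothetical undercloud registry
    dest_tls_verify=False,
)
print(" ".join(argv))
```

A builder like this can serve as the per-image step of a concurrent loop, since each copy is an independent external process.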

Comment 9 Omri Hochman 2018-06-06 15:01:45 UTC

The use of skopeo seems to improve the process times; please re-open if you discover other inefficiencies.

Also, using -e in the prepare command ensures that only the containers
that will actually be used are downloaded.

Comment 11 errata-xmlrpc 2018-06-27 13:33:53 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086
