Bug 1316422 - [platformmanagement_public_426]The image size not calculated accurately
Product: OpenShift Origin
Classification: Red Hat
Component: Image Registry
Unspecified Unspecified
Priority: medium, Severity: medium
Assigned To: Michal Minar
Wei Sun
Reported: 2016-03-10 03:12 EST by zhou ying
Modified: 2016-05-12 13:14 EDT

Doc Type: Bug Fix
Last Closed: 2016-05-12 13:14:20 EDT
Type: Bug

Description zhou ying 2016-03-10 03:12:04 EST
Description of problem:
After pushing an image to the docker-registry, the quota usage shown in the ImageStream info does not match the image size reported by docker.

Version-Release number of selected component (if applicable):
openshift v1.1.3-553-g19dbf2a
kubernetes v1.2.0-alpha.7-703-gbc4550d
etcd 2.2.5

How reproducible:

Steps to Reproduce:
1. Start OpenShift and log in;
2. As cluster-admin create quota for user project:
[root@ip-172-18-12-26 amd64]# more quota.json
{
  "apiVersion": "v1",
  "kind": "ResourceQuota",
  "metadata": {
    "name": "test-quota"
  },
  "spec": {
    "hard": {
      "memory": "512Mi",
      "cpu": "200m",
      "openshift.io/imagesize": "500Mi",
      "openshift.io/imagestreamsize": "500Mi",
      "openshift.io/projectimagessize": "800Mi",
      "pods": "3",
      "services": "3",
      "replicationcontrollers": "3",
      "resourcequotas": "1"
    }
  }
}

3. Use the user token to log in to the docker-registry:
  `docker login -u name -p token -e email`
4. Tag and push image to the docker-registry:
  `docker tag docker.io/zhouying7780/origin-ruby-sample`
  `docker push`
5. Check the image size from the ImageStream info.

Actual results:
The image size is not calculated accurately.
[root@ip-172-18-12-26 amd64]# docker images |grep origin-ruby-sample
docker.io/zhouying7780/origin-ruby-sample      latest              afda8e0d6f57        4 months ago        410.8 MB
                                               latest              afda8e0d6f57        4 months ago        410.8 MB
[root@ip-172-18-12-26 amd64]# oc get is
NAME                 DOCKER REPO                                    TAGS      UPDATED
origin-ruby-sample   latest    13 minutes ago
test                        latest    17 minutes ago
[root@ip-172-18-12-26 amd64]# oc describe is origin-ruby-sample
Name:            origin-ruby-sample
Created:        13 minutes ago
Labels:            <none>
Annotations:        <none>
Docker Pull Spec:
Quota Usage:        391.77MiB / 500MiB

Tag    Spec        Created        PullSpec                                Image
latest    <pushed>    13 minutes ago    <same>
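
When comparing the two figures above, it may help to express them in the same unit first. The sketch below is illustrative arithmetic only, assuming that `docker images` prints decimal megabytes (10^6 bytes) while the quota usage is printed in mebibytes (2^20 bytes); the helper name `mb_to_mib` is hypothetical and not part of any docker or OpenShift tooling.

```python
def mb_to_mib(megabytes: float) -> float:
    """Convert decimal megabytes (10**6 bytes) to mebibytes (2**20 bytes)."""
    return megabytes * 10**6 / 2**20


# Size printed by `docker images` in this report.
docker_size_mb = 410.8
print(f"{docker_size_mb} MB = {mb_to_mib(docker_size_mb):.2f} MiB")
```

Under that assumption, 410.8 MB comes out at roughly 391.77 MiB.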

Expected results:
The image size is calculated accurately.

Additional info:
Comment 1 Paul Weil 2016-03-10 08:20:39 EST
Michal, is this due to shared layers?
Comment 2 Michal Minar 2016-03-10 09:56:16 EST
That's because of compression. Docker shows the size of the data stored in a layer. What you see in the image stream description is the sum of the sizes of the layers stored in the registry. Layer blobs are compressed tars. Thanks to the compression, the resulting size is a bit lower than the size shown by docker.

To verify the sizes, go to the docker storage and look up all the layer blobs referenced in the image manifest. The sum of their sizes should match the value shown. Also make sure to count each layer just once (the manifest may refer to the same layer, usually an empty one, multiple times).
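
The verification described above could be sketched in Python roughly as follows. This is a minimal sketch, not the registry's own logic: it assumes a schema 1 manifest (an `fsLayers` list whose entries carry a `blobSum`) and a filesystem blob store laid out as `sha256/<first two hex characters>/<digest>/data`. The function names are hypothetical.

```python
import json
import os


def unique_layer_digests(manifest: dict) -> set:
    """Return the distinct layer digests referenced by a schema 1 manifest."""
    return {layer["blobSum"] for layer in manifest.get("fsLayers", [])}


def blob_path(blobs_dir: str, digest: str) -> str:
    """Map 'sha256:<hex>' to '<blobs_dir>/sha256/<hex[:2]>/<hex>/data'."""
    algorithm, hex_digest = digest.split(":", 1)
    return os.path.join(blobs_dir, algorithm, hex_digest[:2], hex_digest, "data")


def total_layer_size(blobs_dir: str, manifest_path: str) -> int:
    """Sum the on-disk sizes of the layer blobs, counting each digest once."""
    with open(manifest_path) as f:
        manifest = json.load(f)
    return sum(
        os.path.getsize(blob_path(blobs_dir, digest))
        for digest in unique_layer_digests(manifest)
    )
```

Collecting the digests into a set is what ensures that a layer referenced several times in the manifest is counted only once.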
Comment 3 zhou ying 2016-03-10 21:58:21 EST
Hi Michal,

Thanks for your response. Yesterday I tested on a multi-node env with the image docker.io/zhouying7780/origin-ruby-sample, which has a size of 432M, but the describe output shows only 140M. Is this due to 'shared layers'?
Comment 4 Michal Minar 2016-03-11 03:01:44 EST
No. That's because of compression as I said earlier. See this:

    $ docker push
    The push refers to a repository []
    latest: digest: sha256:53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88 size: 6941

    $ docker images
    REPOSITORY                                       TAG                 IMAGE ID            CREATED             SIZE
                                                     latest              b3e628177e9d        4 months ago        410.8 MB

    # change to the registry's blob store
    $ cd /var/lib/docker-registry@2/docker/registry/v2/blobs

    # grep the manifest to see its layers
    $ cat sha256/53/53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88/data | grep blobSum
         "blobSum": "sha256:a66cad0cc265ff3be4aed5a3b4affbeee9c7abf65b454b7d3235815abad49c44"
         "blobSum": "sha256:0be08759f213fb88c0f1758c0aa2ffe29356e1f287140b2b727d6f717e7d8b9e"
         "blobSum": "sha256:0c98b5a2e188fa2cbfea93cb12dcdc051f8c770e83ab55a2ec386527fae3efca"
         "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
         "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
         "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
         "blobSum": "sha256:05346a7282e1073e197c35de5953e2dc4d59df661f1bd72cefe712da699dfb74"
         "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"

    # get their sizes in bytes (note the `sort -u` which removes duplicates)
    $ cat sha256/53/53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88/data | sed -n 's,.*blobSum.*sha256:\(\(..\)[^"]\+\).*,sha256/\2/\1/data,p' | sort -u | xargs stat --printf='%s\n'

    # sum them
    $ cat sha256/53/53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88/data | sed -n 's,.*blobSum.*sha256:\(\(..\)[^"]\+\).*,sha256/\2/\1/data,p' | sort -u | xargs stat --printf='%s\n' | paste -s -d+ - | bc

    # which is approx. 141.6 MB

During this example I was using docker 1.10 and upstream registry v2.2.1.
Different docker versions will give you different size results.

With OSE registry, you'd have to parse an image object stored in etcd. So the fourth command would look like this:

    $ oc get -o yaml image sha256:53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88 | grep blobSum
Comment 5 zhou ying 2016-03-11 10:20:54 EST
Confirmed with ami devenv_rhel7_3695; the result is correct, thanks.
[root@ip-172-18-2-116 amd64]# openshift version
openshift v1.1.3-610-g10db3d4
kubernetes v1.2.0-origin-41-g91d3e75
etcd 2.2.5

[root@ip-172-18-2-116 amd64]# docker push
The push refers to a repository [] (len: 1)
afda8e0d6f57: Pushed 
635d272834c7: Pushed 
13a92d1249a7: Pushed 
5880f4b9ebad: Pushed 
1e7126b9a44b: Pushed 
c7449ee13099: Pushed 
728835507ee6: Pushed 
ac354ed931ab: Pushed 
latest: digest: sha256:7c8fa744a597399b736b41fb5b8bdc795d9cb6026b25471fae5324c4166f4d12 size: 17077

oc get image -o yaml sha256:7c8fa744a597399b736b41fb5b8bdc795d9cb6026b25471fae5324c4166f4d12 | grep size | awk '{print $2}' | paste -s -d+ - | bc

[root@ip-172-18-2-116 amd64]# oc describe quota
Name:				test-quota
Namespace:			zhouy
Resource			Used		Hard
--------			----		----
cpu				0		200m
memory				0		512Mi
openshift.io/imagesize		0		500Mi
openshift.io/imagestreamsize	0		500Mi
openshift.io/projectimagessize	410781756	800Mi
pods				0		3
replicationcontrollers		0		3
resourcequotas			1		1
services			0		3
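
To read the quota table, the raw byte count in the Used column can be compared with the Hard limit. The following is a rough sketch: `parse_quantity` is a hypothetical helper that handles only plain integers and the binary suffixes Ki, Mi and Gi, whereas real Kubernetes quantities support more forms.

```python
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_quantity(text: str) -> int:
    """Parse a plain integer or an integer with a Ki/Mi/Gi suffix into bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


used = parse_quantity("410781756")  # openshift.io/projectimagessize, Used
hard = parse_quantity("800Mi")      # openshift.io/projectimagessize, Hard
print(f"used {used / 2**20:.2f} MiB of {hard / 2**20:.0f} MiB ({used / hard:.1%})")
```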
