Bug 1316422 - [platformmanagement_public_426]The image size not calculated accurately
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Image Registry
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Michal Minar
QA Contact: Wei Sun
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-03-10 08:12 UTC by zhou ying
Modified: 2016-05-12 17:14 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-12 17:14:20 UTC
Target Upstream Version:
Embargoed:


Description zhou ying 2016-03-10 08:12:04 UTC
Description of problem:
After pushing an image to the docker-registry and checking the ImageStream info, the reported usage does not match the image size.

Version-Release number of selected component (if applicable):
openshift v1.1.3-553-g19dbf2a
kubernetes v1.2.0-alpha.7-703-gbc4550d
etcd 2.2.5

How reproducible:
Always

Steps to Reproduce:
1. Start OpenShift and login;
2. As cluster-admin create quota for user project:
[root@ip-172-18-12-26 amd64]# more quota.json 
{
  "apiVersion": "v1",
  "kind": "ResourceQuota",
  "metadata": {
    "name": "test-quota"
  },
  "spec": {
    "hard": {
      "memory": "512Mi",
      "cpu": "200m",
      "openshift.io/imagesize": "500Mi",
      "openshift.io/imagestreamsize": "500Mi",
      "openshift.io/projectimagessize": "800Mi",
      "pods": "3",
      "services": "3",
      "replicationcontrollers": "3",
      "resourcequotas": "1"
    }
  }
}

3. Use the user token to log in to the docker-registry:
  `docker login -u name -p token -e email 172.30.228.175:5000`
4. Tag and push image to the docker-registry:
  `docker tag docker.io/zhouying7780/origin-ruby-sample 172.30.228.175:5000/zhouy/origin-ruby-sample`
  `docker push 172.30.228.175:5000/zhouy/origin-ruby-sample`
5. Check the image size from the ImageStream info.



Actual results:
The image size is not calculated accurately.
[root@ip-172-18-12-26 amd64]# docker images |grep origin-ruby-sample
docker.io/zhouying7780/origin-ruby-sample      latest              afda8e0d6f57        4 months ago        410.8 MB
172.30.228.175:5000/zhouy/origin-ruby-sample   latest              afda8e0d6f57        4 months ago        410.8 MB
[root@ip-172-18-12-26 amd64]# oc get is
NAME                 DOCKER REPO                                    TAGS      UPDATED
origin-ruby-sample   172.30.228.175:5000/zhouy/origin-ruby-sample   latest    13 minutes ago
test                 172.30.228.175:5000/zhouy/test                 latest    17 minutes ago
[root@ip-172-18-12-26 amd64]# oc describe is origin-ruby-sample
Name:            origin-ruby-sample
Created:        13 minutes ago
Labels:            <none>
Annotations:        <none>
Docker Pull Spec:    172.30.228.175:5000/zhouy/origin-ruby-sample
Quota Usage:        391.77MiB / 500MiB

Tag    Spec        Created        PullSpec                                Image
latest    <pushed>    13 minutes ago    172.30.228.175:5000/zhouy/origin-ruby-sample@sha256:c4b2703d5ad397...    <same>

Expected results:
The image size is calculated accurately.

Additional info:

Comment 1 Paul Weil 2016-03-10 13:20:39 UTC
Michal, is this due to shared layers?

Comment 2 Michal Minar 2016-03-10 14:56:16 UTC
That's because of compression. Docker shows the size of the data stored in a layer. What you see in the image stream description is the sum of the sizes of the layers stored in the registry. Layer blobs are compressed tars, so the resulting size is somewhat lower than the size shown by docker.

To verify the sizes, go to the docker registry storage and look up all the layer blobs referenced in the image manifest. The sum of their sizes should match the value shown. Make sure to count each layer just once (a manifest may refer to the same layer, usually an empty one, multiple times).
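
The counting rule can be sketched as follows. This is only an illustration, not the registry's actual implementation: it assumes you already have one `digest size` pair per layer reference in the manifest, and the function name `sum_unique_layers` and the sample digests are hypothetical.

```shell
# Sum layer sizes, counting each digest only once.
# Input (stdin): one "digest size_in_bytes" pair per line, in manifest order.
sum_unique_layers() {
    awk '!seen[$1]++ { total += $2 } END { printf "%.0f\n", total }'
}

# Hypothetical manifest: the same small layer (sha256:eeee) is referenced
# three times but must contribute its 32 bytes only once.
printf '%s\n' \
    'sha256:aaaa 1000' \
    'sha256:eeee 32' \
    'sha256:eeee 32' \
    'sha256:bbbb 2000' \
    'sha256:eeee 32' \
    | sum_unique_layers
```

Summing every reference instead would count the repeated layer three times and overstate the usage.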

Comment 3 zhou ying 2016-03-11 02:58:21 UTC
Hi Michal,

Thanks for your response. Yesterday I tested on a multi-node env using the image docker.io/zhouying7780/origin-ruby-sample, which has a size of 432M, but the describe output shows only 140M. Is this due to 'shared layers'?

Comment 4 Michal Minar 2016-03-11 08:01:44 UTC
No. That's because of compression as I said earlier. See this:

    $ docker push 127.0.0.1:5002/zhouying7780/origin-ruby-sample
    The push refers to a repository [127.0.0.1:5002/zhouying7780/origin-ruby-sample]
    ...
    latest: digest: sha256:53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88 size: 6941

    $ docker images 127.0.0.1:5002/zhouying7780/origin-ruby-sample
    REPOSITORY                                       TAG                 IMAGE ID            CREATED             SIZE
    127.0.0.1:5002/zhouying7780/origin-ruby-sample   latest              b3e628177e9d        4 months ago        410.8 MB

    # change to the registry's blob store
    $ cd /var/lib/docker-registry@2/docker/registry/v2/blobs

    # grep the manifest to see its layers
    $ cat sha256/53/53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88/data | grep blobSum
         "blobSum": "sha256:a66cad0cc265ff3be4aed5a3b4affbeee9c7abf65b454b7d3235815abad49c44"
         "blobSum": "sha256:0be08759f213fb88c0f1758c0aa2ffe29356e1f287140b2b727d6f717e7d8b9e"
         "blobSum": "sha256:0c98b5a2e188fa2cbfea93cb12dcdc051f8c770e83ab55a2ec386527fae3efca"
         "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
         "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
         "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
         "blobSum": "sha256:05346a7282e1073e197c35de5953e2dc4d59df661f1bd72cefe712da699dfb74"
         "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"

    # get their sizes in bytes (note the `sort -u` which removes duplicates)
    $ cat sha256/53/53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88/data | sed -n 's,.*blobSum.*sha256:\(\(..\)[^"]\+\).*,sha256/\2/\1/data,p' | sort -u | xargs stat --printf='%s\n'
    62906508
    17241097
    63850073
    32
    4529085

    # sum them
    $ cat sha256/53/53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88/data | sed -n 's,.*blobSum.*sha256:\(\(..\)[^"]\+\).*,sha256/\2/\1/data,p' | sort -u | xargs stat --printf='%s\n' | paste -s -d+ - | bc
    148526795

    # which is approx. 141.6 MiB (about 148.5 MB in decimal units)

For this example I was using docker 1.10 and upstream registry v2.2.1.
Different docker versions will give you different size results.
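
The arithmetic in the example above can be rechecked without access to the blob store. The sketch below only re-adds the five byte counts printed by `stat` and converts the total to both binary and decimal units; the function name `summarize_sizes` is made up for this illustration.

```shell
# Sum byte counts (one per line on stdin) and print the total in bytes,
# MiB (2^20 bytes) and MB (10^6 bytes).
summarize_sizes() {
    awk '
        { total += $1 }
        END {
            printf "bytes: %.0f\n", total
            printf "MiB: %.1f\n", total / (1024 * 1024)
            printf "MB: %.1f\n", total / 1000000
        }'
}

# Blob sizes copied from the stat output above.
printf '%s\n' 62906508 17241097 63850073 32 4529085 | summarize_sizes
```

Printing both units makes it easier to compare against tools that may report sizes in either convention.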

With the OSE registry, you'd have to parse the image object stored in etcd instead. So the fourth command would look like this:

    $ oc get -o yaml image sha256:53843bad162d94e554641fe688dee4e271a69b9f645b5e79f96e453dac5afc88 | grep blobSum

Comment 5 zhou ying 2016-03-11 15:20:54 UTC
Confirmed with AMI devenv_rhel7_3695; the result is correct, thanks.
[root@ip-172-18-2-116 amd64]# openshift version
openshift v1.1.3-610-g10db3d4
kubernetes v1.2.0-origin-41-g91d3e75
etcd 2.2.5

[root@ip-172-18-2-116 amd64]# docker push 172.30.54.223:5000/zhouy/test
The push refers to a repository [172.30.54.223:5000/zhouy/test] (len: 1)
afda8e0d6f57: Pushed 
635d272834c7: Pushed 
13a92d1249a7: Pushed 
5880f4b9ebad: Pushed 
1e7126b9a44b: Pushed 
c7449ee13099: Pushed 
728835507ee6: Pushed 
ac354ed931ab: Pushed 
latest: digest: sha256:7c8fa744a597399b736b41fb5b8bdc795d9cb6026b25471fae5324c4166f4d12 size: 17077


oc get image -o yaml sha256:7c8fa744a597399b736b41fb5b8bdc795d9cb6026b25471fae5324c4166f4d12 | grep size | awk '{print $2}' | paste -s -d+ - | bc
410781756

[root@ip-172-18-2-116 amd64]# oc describe quota
Name:				test-quota
Namespace:			zhouy
Resource			Used		Hard
--------			----		----
cpu				0		200m
memory				0		512Mi
openshift.io/imagesize		0		500Mi
openshift.io/imagestreamsize	0		500Mi
openshift.io/projectimagessize	410781756	800Mi
pods				0		3
replicationcontrollers		0		3
resourcequotas			1		1
services			0		3
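
To relate the `Used` byte count to the `Hard` limit in the table above, the limit has to be converted to bytes first. A rough sketch, assuming only the `Mi` suffix needs handling (quota quantities can use other suffixes, which this hypothetical `mi_to_bytes` helper ignores):

```shell
# Convert a quantity with an "Mi" suffix (e.g. 800Mi) to bytes.
# Only "Mi" is handled here; this is a simplification for illustration.
mi_to_bytes() {
    echo $(( ${1%Mi} * 1024 * 1024 ))
}

used=410781756              # "Used" for openshift.io/projectimagessize above
hard=$(mi_to_bytes 800Mi)   # "Hard" limit from the same row

echo "hard limit: $hard bytes"
echo "remaining:  $(( hard - used )) bytes"
```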

