Description of problem:
When the new command "openshift ex dockergc" calculates space usage, it uses the docker root dir. If docker uses devicemapper as the storage driver, the command cannot clean up the docker data space, because the image data lives in the devicemapper thin pool rather than under the root dir.
https://github.com/openshift/origin/blob/master/pkg/oc/experimental/dockergc/dockergc.go#L121

Version-Release number of selected component (if applicable):
openshift v3.7.4
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8

How reproducible:
Always

Steps to Reproduce:
1. Check docker info
[root@qe-pod-node-registry-router-1 ~]# docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 9
Server Version: 1.12.6
Storage Driver: devicemapper
 Pool Name: rhel-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 8.82 GB
 Data Space Total: 15.26 GB
 Data Space Available: 6.443 GB
 Metadata Space Used: 2.064 MB
 Metadata Space Total: 33.55 MB
 Metadata Space Available: 31.49 MB
 Thin Pool Minimum Free Space: 1.526 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.140-RHEL7 (2017-05-03)
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge host null overlay
 Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Security Options: seccomp selinux
Kernel Version: 3.10.0-693.5.2.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.4 (Maipo)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 3
CPUs: 1
Total Memory: 3.456 GiB
Name: qe-pod-node-registry-router-1
ID: BOCP:4GXH:F4AE:A7FU:3JJ5:6JBL:6FLX:HGAG:XQ7R:F6MZ:FB7X:44KP
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://registry.reg-aws.openshift.com:443/v1/
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Insecure Registries:
 asb-registry.usersys.redhat.com:5000
 brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888
 registry.reg-aws.openshift.com:443
 virt-openshift-05.lab.eng.nay.redhat.com:5000
 virt-openshift-05.lab.eng.nay.redhat.com:5001
 127.0.0.0/8
Registries: registry.reg-aws.openshift.com:443 (insecure), registry.access.redhat.com (secure), registry.access.redhat.com (secure), docker.io (secure)

[root@qe-pod-node-registry-router-1 ~]# docker images
REPOSITORY                                                          TAG      IMAGE ID       CREATED         SIZE
docker.io/openshift/origin-deployer                                 latest   732b7f9481a8   2 hours ago     1.099 GB
registry.reg-aws.openshift.com:443/openshift3/ose-service-catalog   v3.7     44bbd6bcec32   5 hours ago     265.4 MB
docker.io/openshift/jenkins-2-centos7                               latest   086e26565757   8 hours ago     2.021 GB
docker.io/openshift/origin-metrics-cassandra                        latest   d77a710bd9f0   7 days ago      780 MB
docker.io/openshift/origin-metrics-hawkular-metrics                 latest   67c1503b2ae2   7 days ago      914.4 MB
docker.io/openshift/origin-metrics-heapster                         latest   93f72c7c2f46   7 days ago      819.9 MB
docker.io/openshift/wildfly-101-centos7                             latest   b0948ecacc39   2 weeks ago     943.6 MB
docker.io/openshift/php-55-centos7                                  latest   089abeb67362   10 months ago   539.3 MB
docker.io/openshift/mysql-55-centos7                                latest   968db52211da   11 months ago   384.6 MB

[root@qe-pod-node-registry-router-1 ~]# lvs
  LV          VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool rhel twi-a-t--- 14.21g             57.78  6.15
  root        rhel -wi-ao---- 15.00g

[root@qe-pod-master-etcd-1 ~]# lsblk
NAME                          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                             8:0    0   30G  0 disk
├─sda1                          8:1    0  500M  0 part /boot
└─sda2                          8:2    0 29.5G  0 part
  ├─rhel-root                 253:0    0   15G  0 lvm  /
  ├─rhel-docker--pool_tmeta   253:1    0   32M  0 lvm
  │ └─rhel-docker--pool       253:3    0  8.2G  0 lvm
  └─rhel-docker--pool_tdata   253:2    0  8.2G  0 lvm
    └─rhel-docker--pool       253:3    0  8.2G  0 lvm

2. Run the dockergc command to clean up image space.
[root@qe-pod-node-registry-router-1 ~]# openshift ex dockergc --image-gc-high-threshold=40 --image-gc-low-threshold=30 --dry-run=true
docker build garbage collection daemon
MinimumGCAge: {1h0m0s}, ImageGCHighThresholdPercent: 40, ImageGCLowThresholdPercent: 30
gathering disk usage data
usage is under high threshold (4774MB < 6140MB)

Actual results:
No images are selected for cleanup; dockergc reports that usage is under the high threshold (4774MB < 6140MB).

Expected results:
Step 2 should clean up images, since the data space usage (8.82 GB of 15.26 GB) is above 40%.

Additional info:
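The mismatch in the report can be checked with a little arithmetic. The sketch below is illustrative only and is not the dockergc code: the helper names are hypothetical, the root filesystem capacity is inferred from the printed threshold (6140MB being 40% of it), and the ">=" comparison is an assumption based on the "usage is under high threshold" message. The root dir figures come to roughly 31%, which is under the threshold, while the thin pool figures come to roughly 58%, which is over it.

```go
package main

import "fmt"

// usagePercent returns used/total as a percentage. Both arguments must be
// in the same unit.
func usagePercent(used, total float64) float64 {
	return used / total * 100
}

// overHighThreshold is a hypothetical helper: it assumes garbage collection
// is only considered once usage reaches the high threshold.
func overHighThreshold(used, total, highPercent float64) bool {
	return usagePercent(used, total) >= highPercent
}

func main() {
	// Root filesystem figures from the dockergc output: 4774MB used, and a
	// 40% threshold of 6140MB implies a capacity of about 15350MB.
	rootUsed, rootTotal := 4774.0, 6140.0/0.40

	// Thin pool figures from "docker info": 8.82 GB used of 15.26 GB.
	poolUsed, poolTotal := 8.82, 15.26

	fmt.Printf("root dir:  %.1f%% used, over 40%% threshold: %v\n",
		usagePercent(rootUsed, rootTotal), overHighThreshold(rootUsed, rootTotal, 40))
	fmt.Printf("thin pool: %.1f%% used, over 40%% threshold: %v\n",
		usagePercent(poolUsed, poolTotal), overHighThreshold(poolUsed, poolTotal, 40))
}
```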
I believe this was a limitation that we all agreed was acceptable, since this feature is targeted at Online deployments where cri-o is the runtime. In that case we would be running docker with overlay. Derek, is this correct?

That being said, it might not be too difficult to make it work for devicemapper as well, if we can extract the usage data directly from docker. Do we want to try that?
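As a rough illustration of what extracting the usage data directly from docker could look like, here is a minimal sketch. It assumes the driver status is available as key/value pairs (the Docker API's info response exposes a DriverStatus list of string pairs, which is what "docker info" prints under "Storage Driver") and that sizes use the decimal units seen in the output above. The helpers parseSize and thinPoolUsage are hypothetical names, not part of dockergc or of any docker library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize converts a human-readable size such as "8.82 GB" into bytes.
// Assumption: decimal units, as printed by "docker info" in this report.
func parseSize(s string) (float64, error) {
	fields := strings.Fields(s)
	if len(fields) != 2 {
		return 0, fmt.Errorf("unexpected size format %q", s)
	}
	value, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return 0, err
	}
	multipliers := map[string]float64{"B": 1, "kB": 1e3, "MB": 1e6, "GB": 1e9, "TB": 1e12}
	m, ok := multipliers[fields[1]]
	if !ok {
		return 0, fmt.Errorf("unknown unit %q", fields[1])
	}
	return value * m, nil
}

// thinPoolUsage (hypothetical) reads the devicemapper data space figures
// out of driver status pairs and returns them in bytes.
func thinPoolUsage(status [][2]string) (used, total float64, err error) {
	values := map[string]string{}
	for _, kv := range status {
		values[kv[0]] = kv[1]
	}
	for _, key := range []string{"Data Space Used", "Data Space Total"} {
		if _, ok := values[key]; !ok {
			return 0, 0, fmt.Errorf("driver status has no %q entry", key)
		}
	}
	if used, err = parseSize(values["Data Space Used"]); err != nil {
		return 0, 0, err
	}
	if total, err = parseSize(values["Data Space Total"]); err != nil {
		return 0, 0, err
	}
	return used, total, nil
}

func main() {
	// Values copied from the "docker info" output in the description.
	status := [][2]string{
		{"Pool Name", "rhel-docker--pool"},
		{"Data Space Used", "8.82 GB"},
		{"Data Space Total", "15.26 GB"},
	}
	used, total, err := thinPoolUsage(status)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("thin pool usage: %.1f%%\n", used/total*100)
}
```

A real implementation would presumably obtain the pairs from the docker client rather than a literal, and would need to handle drivers that do not report these keys.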
Not supporting devicemapper is fine ATM. Let's explicitly document the set of storage drivers that the experimental command supports.
Origin PR: https://github.com/openshift/origin/pull/17327
Verified on ocp-3.9

# oc version
oc v3.9.0-0.16.0
kubernetes v1.9.0-beta1
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://xxxx:8443
openshift v3.9.0-0.16.0
kubernetes v1.9.0-beta1

The text "Only the overlay2 docker storage driver is supported at this time." now appears in 'oc ex dockergc -h'.

//detail:
# oc ex dockergc -h
Perform garbage collection to free space in docker storage

If the OpenShift node is configured to use a container runtime other than docker, docker will still be used to do builds. However OpenShift itself may not manage the docker storage since it is not the container runtime for pods.

This utility allows garbage collection to do be done on the docker storage.

Only the overlay2 docker storage driver is supported at this time.

Usage:
  oc ex dockergc [NAME] [options]

Examples:
  # Perform garbage collection with the default settings
  ocex dockergc

Options:
      --dry-run=false: If true, show the result of the operation without performing it.
      --image-gc-high-threshold=80: The percent of disk usage after which image garbage collection is always run.
      --image-gc-low-threshold=60: The percent of disk usage before which image garbage collection is never run. Lowest disk usage to garbage collect to.
      --minimum-ttl-duration=1h0m0s: Minimum age for a container or unused image before it is garbage collected. Examples: '300ms', '10s' or '2h45m'.
  -o, --output='': Output results as yaml or json instead of executing, or use name for succint output (resource/name).
  -a, --show-all=false: When printing, show all resources (default hide terminated pods.)
      --show-labels=false: When printing, show all labels as the last column (default hide labels column)
      --sort-by='': If non-empty, sort list types using this field specification. The field specification is expressed as a JSONPath expression (e.g. '{.metadata.name}'). The field in the API resource specified by this JSONPath expression must be an integer or a string.
      --template='': Template string or path to template file to use when -o=go-template, -o=go-template-file. The template format is golang templates [http://golang.org/pkg/text/template/#pkg-overview].

Use "oc options" for a list of global command-line options (applies to all commands).
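The two threshold options in the help text can be read as: do nothing while usage is below the high threshold, and once it is reached, free enough to bring usage down to the low threshold. The sketch below shows that reading only; amountToFree is a hypothetical helper, and the actual dockergc logic may differ in detail. The figures in main are the thin pool numbers from the description, in MB.

```go
package main

import "fmt"

// amountToFree is a hypothetical helper. It returns 0 while usage is below
// the high threshold; otherwise it returns how much must be freed to bring
// usage down to the low threshold. used and capacity share the same unit.
func amountToFree(used, capacity, highPercent, lowPercent int64) int64 {
	if used*100 < capacity*highPercent {
		return 0
	}
	target := capacity * lowPercent / 100
	return used - target
}

func main() {
	// Thresholds used in the reproduction steps (40/30).
	fmt.Println("thresholds 40/30:", amountToFree(8820, 15260, 40, 30), "MB to free")

	// Default thresholds from the help text (80/60).
	fmt.Println("thresholds 80/60:", amountToFree(8820, 15260, 80, 60), "MB to free")
}
```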
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489