Red Hat Bugzilla – Bug 1287035
Some images are not deleted successfully by garbage collector
Last modified: 2015-12-01 07:33:38 EST
Description of problem:
Image garbage collection was triggered by creating big files until disk usage reached image-gc-high-threshold. The node log reports that an image was removed, but 'docker images' shows that the image still exists.
See log: http://pastebin.test.redhat.com/331780
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. In node-config.yaml, set image-gc-high-threshold to '30'
2. Start node
3. Create big files on the node to make the disk usage grow over 30%.
4. Monitor the node log:
I1201 10:41:25.455733 1517 image_manager.go:254] [ImageManager]: Removing image "33a68611c7343d78bdca68f0e680c28ca70bf871fdc8d4427191f1abcede64b5" to free 239376801 bytes
5. On node, run docker images to verify the image is removed.
After step 5, the image has not been removed:
[fedora@ip-172-18-14-28 ~]$ docker images --no-trunc|grep 33a68611c7343d78bdca68f0e680c28ca70bf871fdc8d4427191f1abcede64b5
openshift/origin-base latest 33a68611c7343d78bdca68f0e680c28ca70bf871fdc8d4427191f1abcede64b5 10 hours ago 239.4 MB
If you look at http://pastebin.test.redhat.com/331780, you will find that the removal of image 33a68611c7343d78bdca68f0e680c28ca70bf871fdc8d4427191f1abcede64b5 was attempted more than once, which indicates that the first deletion did not succeed.
The image should be removed.
First, GC determines how much disk space has to be freed. Then it sorts the images and removes them one by one until enough space has been freed. If an image cannot be deleted (for any reason), it is skipped and GC behaves as if the image had never existed, i.e. its size is not counted towards the freed space.
Thus, in practice, the required amount of space is still freed. Some older images are simply kept even though they were supposed to be deleted.
For example, an image is not deleted if there is a running container that was started from it. Most likely, for each image that was not removed, you will find a corresponding container.
If several image removals fail, only the last error is reported. What we could do is report each such error, or join all errors into one. However, I do not think this is an issue. GC removes as many images as it can and reports the last error. From that error you can tell that something failed, but the number of deleted images stays the same either way. What can be deleted is deleted. If that is not enough, GC will try again on its next run and may succeed then.