Bug 1543511

Summary: CRI-O runtime - image pruning fails
Product: OpenShift Container Platform
Reporter: Vikas Laad <vlaad>
Component: Image Registry
Assignee: Alexey Gladkov <agladkov>
Status: CLOSED ERRATA
QA Contact: Dongbo Yan <dyan>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.9.0
CC: amurdaca, aos-bugs, bparees, mpatel
Target Milestone: ---   
Target Release: 3.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Make the image pruner tolerate an empty docker image reference. Reason: In the new version of kube, a single space is used to indicate that the field is intentionally empty while still being filled in. Result: The image pruner ignores a field that contains only a space.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-28 14:27:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Vikas Laad 2018-02-08 15:36:59 UTC
Description of problem:
Image pruning fails during a reliability test run. These tests run over a period of 2 weeks and keep creating, building, scaling, and accessing quickstart apps.

root@ip-172-31-9-217: ~ # oc adm prune images --keep-tag-revisions=3 --keep-younger-than=60m --confirm
Failed to build graph!

The following objects have invalid references:

  DeploymentConfig[nodejs-mongodb-u-820-294-837-301/nodejs-mongodb-example]: invalid docker image reference " ": invalid reference format

Either fix the references or delete the objects to make the pruner proceed.
error: failed to build graph - no changes made

I am trying to prune as a user that has the following role:
system:image-pruner   /system:image-pruner   redhat   default/registry

I also see the following warnings on the nodes:
  Warning  ImageGCFailed          8m (x93 over 7h)   kubelet, ip-172-31-11-219.us-west-2.compute.internal  (combined from similar events): wanted to free 6071304192 bytes, but freed 0 bytes space with errors in image deletion: [rpc error: code = Unknown desc = Image used by 6c28f3ff59333616c4d54998e9db5ca60e8b0b92600202bcbfe980916dd398dd: image is in use by a container, rpc error: code = Unknown desc = Image used by b2f15b6484145363112b005f5597f4cf826cc09dd8ad3178495241e84dfa351c: image is in use by a container, rpc error: code = Unknown desc = Image used by a37256b6db66f6e8351d97633f6cf6c3478c742e0312e44020a06c3d5757cfc0: image is in use by a container, rpc error: code = Unknown desc = Image used by 552f10aa6cd9e06c1940263b7f70d7d673e52808acefb11448f84a1112457346: image is in use by a container, rpc error: code = Unknown desc = Image used by e9baeaf58fc0285ba1c82f8e3fbc2ebc9d193cb0c245c28a362d2c02466eb716: image is in use by a container, rpc error: code = Unknown desc = Image used by 23627270dc49a342ce298dff6f84d96b8fd922f01cfe2b371ef73505ce261544: image is in use by a container, rpc error: code = Unknown desc = Image used by 6c56ad0acb0dd0064386c6485f686dee0ad62968e89bc918577a4fbd0b6201f2: image is in use by a container, rpc error: code = Unknown desc = Image used by 22b668ee3a26f2efff038a02f059b3263501a2a61726c92c166d1f8c8c003cdc: image is in use by a container, rpc error: code = Unknown desc = Image used by 1a82132c2a9d7846130844ea3841cb4f6a20acdddb2b7422ca3abea090af0df7: image is in use by a container, rpc error: code = Unknown desc = Image used by 4e811f0ea557fa207c3ad184f9aa7e4809f2d32f0194cf120f3cc5e6c8441980: image is in use by a container, rpc error: code = Unknown desc = Image used by 711ad77cb87b4d3c0826b439ccd6486457b9b4262daeb9f14bfdd1ecc1c973a8: image is in use by a container, rpc error: code = Unknown desc = Image used by 3bf7967be8ef8f90ad877acfcf604cc8f559c8b079e496a096560051326617fe: image is in use by a container]


Version-Release number of selected component (if applicable):
openshift v3.9.0-0.36.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8

How reproducible:


Steps to Reproduce:
1. Create/build/delete quickstart apps for a long time
2. Grant the image-pruner role to a user and log in as that user
3. oc adm prune images --keep-tag-revisions=3 --keep-younger-than=60m --confirm

Actual results:
Errors

Expected results:
Pruning should work

Additional info:
Attaching master and node logs.

Comment 4 Ben Parees 2018-02-08 15:48:10 UTC
ImageGC warnings are unrelated to registry pruning, please open a separate issue for that.

The registry pruning is unrelated to CRIO.  The failure you're getting:

DeploymentConfig[nodejs-mongodb-u-820-294-837-301/nodejs-mongodb-example]: invalid docker image reference " ": invalid reference format

Is something I guess we need to make our pruner tolerate since we explicitly tell people to use a docker image reference value of " " in order to work around some limitations of the API.  I'm surprised you're the first person to hit this.

Comment 5 Alexey Gladkov 2018-02-12 13:35:35 UTC
Can you test this fix https://github.com/openshift/origin/pull/18570 ?

Comment 6 Vikas Laad 2018-02-12 14:20:20 UTC
Alexey, I won't be able to test it until it's available in a puddle.

Comment 8 Vikas Laad 2018-02-26 20:05:08 UTC
Started the reliability tests run.

Comment 9 Vikas Laad 2018-03-06 14:00:31 UTC
Verified on the following version:

openshift v3.9.0-0.53.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8

Comment 12 errata-xmlrpc 2018-03-28 14:27:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489