Bug 1985192

Summary:	ImagePruner fails with "Job has reached the specified backoff limit", which causes the image registry to become degraded
Product:	OpenShift Container Platform	Reporter:	XiuJuan Wang <xiuwang>
Component:	Image Registry	Assignee:	Oleg Bulatov <obulatov>
Status:	CLOSED DUPLICATE	QA Contact:	XiuJuan Wang <xiuwang>
Severity:	low	Docs Contact:
Priority:	low
Version:	4.7	CC:	aos-bugs
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Last Closed:	2021-10-11 14:34:41 UTC	Type:	---

Description XiuJuan Wang 2021-07-23 06:46:13 UTC

This bug was initially created as a copy of Bug #1887010

I am copying this bug because: 

Description of problem:
An ImagePruner error causes the image registry to become degraded.


Version-Release number of selected component (if applicable):

4.7.20-x86_64

How reproducible:
10%?

Steps to Reproduce:
1. Set up a cluster

Actual results:
The image registry is degraded with "ImagePrunerDegraded: Job has reached the specified backoff limit".

Pruner pod spec (excerpt):
          spec:
            affinity: {}
            containers:
            - args:
              - adm
              - prune
              - images
              - --confirm=true
              - --certificate-authority=/var/run/configmaps/serviceca/service-ca.crt
              - --keep-tag-revisions=3
              - --keep-younger-than=60m
              - --ignore-invalid-refs=true
              - --loglevel=1
              - --prune-registry=true
              - --registry-url=https://image-registry.openshift-image-registry.svc:5000
              command:
              - oc
              image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3a555f211c96700bfe652a6919b8ce0bb1e8b939cd58bb4e874ee11eba183eee
              imagePullPolicy: IfNotPresent


Status conditions from the CI log:
[2021-07-22T06:29:45.114Z] Spec:
[2021-07-22T06:29:45.114Z] Status:
[2021-07-22T06:29:45.114Z]   Conditions:
[2021-07-22T06:29:45.114Z]     Last Transition Time:  2021-07-22T00:04:05Z
[2021-07-22T06:29:45.114Z]     Message:               Available: The registry is ready
[2021-07-22T06:29:45.114Z] ImagePrunerAvailable: Pruner CronJob has been created
[2021-07-22T06:29:45.114Z]     Reason:                Ready
[2021-07-22T06:29:45.114Z]     Status:                True
[2021-07-22T06:29:45.114Z]     Type:                  Available
[2021-07-22T06:29:45.114Z]     Last Transition Time:  2021-07-22T05:57:18Z
[2021-07-22T06:29:45.114Z]     Message:               Progressing: The registry is ready
[2021-07-22T06:29:45.114Z]     Reason:                Ready
[2021-07-22T06:29:45.114Z]     Status:                False
[2021-07-22T06:29:45.114Z]     Type:                  Progressing
[2021-07-22T06:29:45.114Z]     Last Transition Time:  2021-07-21T23:55:32Z
[2021-07-22T06:29:45.114Z]     Message:               ImagePrunerDegraded: Job has reached the specified backoff limit
[2021-07-22T06:29:45.114Z]     Reason:                ImagePrunerJobFailed
[2021-07-22T06:29:45.114Z]     Status:                True
[2021-07-22T06:29:45.114Z]     Type:                  Degraded
[2021-07-22T06:29:45.114Z]   Extension:               <nil>
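
For triage, the Degraded condition can be pulled out of a CI log like the one above with a small script. This is only a minimal sketch: it assumes the log keeps the `[timestamp]` prefix and the `Key: value` layout shown here, and that each condition starts with `Last Transition Time`. The names `parse_conditions`, `degraded`, and `SAMPLE` are hypothetical helpers for illustration, not part of any OpenShift tooling.

```python
import re

# Strips the "[2021-07-22T06:29:45.114Z] " prefix the CI log puts on each line.
PREFIX = re.compile(r"^\[[^\]]+\]\s?")
# Fields that appear for each entry under "Conditions:" in this output.
FIELD = re.compile(r"^\s*(Last Transition Time|Message|Reason|Status|Type):\s*(.*)$")

# Shortened copy of the log from this report, used as example input.
SAMPLE = """\
[2021-07-22T06:29:45.114Z] Status:
[2021-07-22T06:29:45.114Z]   Conditions:
[2021-07-22T06:29:45.114Z]     Last Transition Time:  2021-07-22T00:04:05Z
[2021-07-22T06:29:45.114Z]     Message:               Available: The registry is ready
[2021-07-22T06:29:45.114Z] ImagePrunerAvailable: Pruner CronJob has been created
[2021-07-22T06:29:45.114Z]     Reason:                Ready
[2021-07-22T06:29:45.114Z]     Status:                True
[2021-07-22T06:29:45.114Z]     Type:                  Available
[2021-07-22T06:29:45.114Z]     Last Transition Time:  2021-07-21T23:55:32Z
[2021-07-22T06:29:45.114Z]     Message:               ImagePrunerDegraded: Job has reached the specified backoff limit
[2021-07-22T06:29:45.114Z]     Reason:                ImagePrunerJobFailed
[2021-07-22T06:29:45.114Z]     Status:                True
[2021-07-22T06:29:45.114Z]     Type:                  Degraded
[2021-07-22T06:29:45.114Z]   Extension:               <nil>
"""


def parse_conditions(log_text):
    """Return one dict per condition found under "Conditions:"."""
    conditions = []
    current = None       # condition currently being filled in
    last_key = None      # last field seen, used for multi-line messages
    in_conditions = False
    for raw in log_text.splitlines():
        line = PREFIX.sub("", raw)
        if line.strip() == "Conditions:":
            in_conditions = True
            continue
        if not in_conditions or not line.strip():
            continue
        match = FIELD.match(line)
        if match:
            key, value = match.group(1), match.group(2).strip()
            if key == "Last Transition Time":
                # Assumption: this field opens every condition entry.
                current = {}
                conditions.append(current)
            if current is not None:
                current[key] = value
                last_key = key
        elif not line.startswith(" ") and current is not None and last_key == "Message":
            # Unindented line: continuation of a multi-line Message.
            current["Message"] += "\n" + line.strip()
        else:
            # Any other key (for example "Extension:") ends the block.
            in_conditions = False
    return conditions


def degraded(conditions):
    """Return the conditions that report Degraded=True."""
    return [c for c in conditions
            if c.get("Type") == "Degraded" and c.get("Status") == "True"]


if __name__ == "__main__":
    for cond in degraded(parse_conditions(SAMPLE)):
        print(cond["Reason"], "-", cond["Message"])
```

Feeding the full CI log text to `parse_conditions` instead of `SAMPLE` should work the same way as long as the layout assumptions above hold.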


Expected results:
The image registry should not report this error.

Additional info:

Comment 6 XiuJuan Wang 2021-09-06 02:12:44 UTC

We hit this issue several times in CI jobs, but the CI must-gather logs are not sufficient to debug it.
I couldn't reproduce it manually.
I will trigger more jobs and keep the cluster alive when it reproduces.
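
Once a cluster is kept alive, the failed pruner jobs could be located roughly as follows. This is a sketch under stated assumptions, not a verified procedure: it assumes the pruner jobs run in the `openshift-image-registry` namespace, that `oc` is logged in to the cluster, and that the Job controller marks the job with a `Failed` condition whose reason is `BackoffLimitExceeded` (the condition that carries the "Job has reached the specified backoff limit" message). The helper names `failed_jobs`, `oc_json`, and `collect` are hypothetical.

```python
import json
import subprocess

# Assumption: namespace where the image pruner jobs are created.
NAMESPACE = "openshift-image-registry"


def failed_jobs(job_list):
    """Return names of jobs that failed because the backoff limit was reached.

    job_list is the parsed output of `oc get jobs -o json`.
    """
    names = []
    for job in job_list.get("items", []):
        for cond in job.get("status", {}).get("conditions", []):
            if (cond.get("type") == "Failed"
                    and cond.get("status") == "True"
                    and cond.get("reason") == "BackoffLimitExceeded"):
                names.append(job["metadata"]["name"])
                break
    return names


def oc_json(*args):
    """Run an oc command with JSON output and return the parsed result."""
    result = subprocess.run(
        ["oc", *args, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def collect():
    """Print the logs of every failed job in NAMESPACE (needs a live cluster)."""
    jobs = oc_json("get", "jobs", "-n", NAMESPACE)
    for name in failed_jobs(jobs):
        print("=== " + name + " ===")
        # Logs are only available while the job's pods still exist.
        subprocess.run(["oc", "logs", "-n", NAMESPACE, "job/" + name], check=False)
```

Calling `collect()` against the live cluster prints the pruner logs for each failed job, which is the information the must-gather archive was missing.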

Comment 8 Oleg Bulatov 2021-10-11 14:34:41 UTC


*** This bug has been marked as a duplicate of bug 1990125 ***