Created attachment 1346634 [details]
Current running configuration of our docker registry

Description of problem:
We have a docker registry running on a 3.5 cluster that keeps losing or deleting some image tags. Every time we upload the tags again, they end up disappearing within a day or so.

Version-Release number of selected component (if applicable):
oc v3.5.5.26
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://internal.api.reg-aws.openshift.com:443
openshift v3.5.5.26
kubernetes v1.5.2+43a9be4

image: registry.access.redhat.com/openshift3/ose-docker-registry:v3.5.5.26

How reproducible:
Every time.

Steps to Reproduce:
1. Add the tag 'v3.7.0' to an image (see the sketch under Additional info).
2. Wait 12-24 hours.
3. Check to see if the tag is still there: 'oc get is -n openshift3 ose -o yaml'.

Actual results:
Tag v3.7.0 has gone missing from the imagestream yaml.

Expected results:
Tag v3.7.0 should still exist, and should have new entries prepended to it as we push that tag a few times per week (similar to the 'latest' tag).

Additional info:
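Roughly how the tag gets added (a sketch; the exact job differs, and the local image name is just a placeholder):

docker tag <local-image> registry.reg-aws.openshift.com/openshift3/ose:v3.7.0
docker push registry.reg-aws.openshift.com/openshift3/ose:v3.7.0

or, equivalently, by retagging within the imagestream:

oc tag openshift3/ose:latest openshift3/ose:v3.7.0 -n openshift3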
Created attachment 1346636 [details] master-config
Created attachment 1346637 [details] node-config
Imagestream tags are not the same as what is in your registry.

I suspect you are updating (replacing) your imagestreams with one that does not have the 3.7.0 tag.

Either that, or you are running pruning and the 3.7.0 tag is not referenced by anything.
(In reply to Ben Parees from comment #4)
> imagestream tags are not the same as what is in your registry.
>
> I suspect you are updating(replacing) your imagestreams with one that does
> not have the 3.7.0 tag.
>
> either that or you are running pruning and the 3.7.0 tag is not referenced
> by anything.

Ok, it sounds like I'm checking for the existence of this tag incorrectly then. That's good to know, thanks. But I can also confirm that the tag is missing by doing this:

[root@online-int-master-05114 ~]# curl -sH "Authorization: Bearer $(oc --config=/root/.kube/reg-aws whoami -t)" https://registry.reg-aws.openshift.com/v2/openshift3/ose/tags/list | python -m json.tool | grep v3.7.0\"
[root@online-int-master-05114 ~]#

Whereas the same command tells me that tag v3.7 exists:

[root@online-int-master-05114 ~]# curl -sH "Authorization: Bearer $(oc --config=/root/.kube/reg-aws whoami -t)" https://registry.reg-aws.openshift.com/v2/openshift3/ose/tags/list | python -m json.tool | grep v3.7\"
    "v3.7",

I checked for the presence of a pruning cron job on the Ops side, but it appears to be gone. I had disabled that job over a week ago, so it makes sense that the cron job is gone now. Is there maybe somewhere else I can check for the presence of a pruning job? Maybe something internal to openshift? Would it help if I posted the master audit logs?
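For completeness, this is how I understand the imagestream tag itself (as opposed to the registry's tag list) can be checked; a sketch using the same imagestream as above:

oc get istag ose:v3.7.0 -n openshift3
oc get is ose -n openshift3 -o jsonpath='{.status.tags[*].tag}'

The first should return the tag object if the imagestream still references it; the second just lists every tag recorded on the imagestream's status.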
The only way to run pruning is via oadm prune (or oc adm prune), but it can be run from anywhere that has admin credentials. I'm not aware of any other "normal" mechanism that would just delete tags out of the registry.
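For reference, a typical pruning invocation looks something like this (the retention flags here are only illustrative, not what the Ops cron job actually ran):

oadm prune images --keep-tag-revisions=3 --keep-younger-than=60m --confirm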
I spoke w/ Stefanie and she's going to set loglevel 3 on the master-api and master-controllers so we can catch the DELETE api calls (assuming they are happening).

In theory we should never see a delete event on this cluster, since no tags should ever be deleted, so if we do see any in the logs, that indicates someone is explicitly deleting them.

(Assuming we do see the DELETE api call, I'm still not sure how we track down who is doing it... we'll get some client information; hopefully that will be enough.)
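As a sketch of what to grep for once the loglevel is bumped (the journalctl unit name assumes the split api/controllers master services on this cluster; adjust to wherever the apiserver logs actually land):

journalctl -u atomic-openshift-master-api --since "24 hours ago" | grep DELETE | grep -i imagestream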
Have any more tags disappeared since we turned on logging?
Created attachment 1357016 [details] Example of destroying a tag in real time
In the attached example, pushing "3.7.9" destroys the "v3.7.9" tag. Pushing "3.7.9-1" destroys the "v3.7.9-1" tag.
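In other words, a minimal version of what the attachment shows (names here are placeholders, not the exact job):

docker tag <local-image> registry.reg-aws.openshift.com/openshift3/ose:3.7.9
docker push registry.reg-aws.openshift.com/openshift3/ose:3.7.9

# after the push, the pre-existing v-prefixed tag is gone from the imagestream:
oc get istag ose:v3.7.9 -n openshift3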
Wow, that's really interesting. I can reproduce it locally on a 3.7 cluster. Debugging now. Thanks for the reproducer, Justin!
Sorry, false alarm. My reproducer was buggy. Trying to reproduce once more.
I switched to the latest 3.5 release and am happy to report that I can reproduce it there.
That is fantastic news, thank you Michal. Justin, in the meantime you're going to want to make very sure you don't run your job with the wrong tag being pushed, since that seems to be the definitive cause of the "good" tags being lost.
Fix: https://github.com/openshift/ose/pull/932
origin master: https://github.com/openshift/origin/pull/17430
ocp 3.5: https://github.com/openshift/ose/pull/932
ocp 3.6: https://github.com/openshift/ose/pull/934
ocp 3.7: https://github.com/openshift/ose/pull/935
Verified on:
oc v3.9.0-0.9.0
kubernetes v1.8.1+0d5291c
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server
openshift v3.9.0-0.9.0
kubernetes v1.8.1+0d5291c

1. Push an image to imagestreamTag nodejs-mongodb-example:v3.9, then check that the image exists.
2. Push an image to imagestreamTag nodejs-mongodb-example:3.9, then check that the v3.9 image still exists.

Both the v3.9 and 3.9 images exist, so this can move to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489