1658898 – Imagestreams and templates can't be recreated/updated under Managed status

Summary: Imagestreams and templates can't be recreated/updated under Managed status

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	ImageStreams
Sub Component:
Version:	4.1.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.1.0
Assignee:	Gabe Montero
QA Contact:	XiuJuan Wang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-12-13 06:34 UTC by XiuJuan Wang
Modified:	2019-06-04 10:41 UTC
CC List:	6 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-04 10:41:14 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:0758	0	None	None	None	2019-06-04 10:41:20 UTC

Description XiuJuan Wang 2018-12-13 06:34:20 UTC

Description of problem:
Imagestreams and templates can't be recreated/updated under Managed status

Version-Release number of selected component (if applicable):

version   0.0.1-2018-12-08-172651   True        False         33m       Cluster version is 0.0.1-2018-12-08-172651

quay.io/openshift-release-dev/ocp-v4.0@sha256:d46de909247d8002e92e7d1ad6e3f32d3b4e439e44ebe4ae9bd63a39bf9d4276

How reproducible:
always

Steps to Reproduce:
1. Make sure imagestreams and templates have been created in the project.
   Make sure the samplesresource shows "Management State:  Managed".

$ oc get is jenkins dotnet -o yaml -n openshift | grep operator
    samplesoperator.config.openshift.io/version: v0.0.1
    samplesoperator.config.openshift.io/managed: "true"
    samplesoperator.config.openshift.io/version: v0.0.1
    samplesoperator.config.openshift.io/managed: "true"

2. Delete the dotnet imagestream, and tag the jenkins imagestream to point to another registry.
3. Check the imagestreams later.
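
Step 2 could be carried out roughly as follows. This is only a sketch: the report does not give the exact commands, so the helper name `reproduce_step2`, the source image passed to it, and the `test` tag name are all assumptions made for illustration.

```shell
# Hypothetical helper sketching step 2; the exact commands are not in the report.
# $1: source image in another registry (an assumed placeholder, not from the report).
reproduce_step2() {
  src_image="$1"
  # Delete the dotnet imagestream that the samples operator manages.
  oc delete is dotnet -n openshift
  # Point a jenkins tag at an image in another registry ("test" is an assumed tag name).
  oc tag --source=docker "$src_image" jenkins:test -n openshift
}
```

It might be called as `reproduce_step2 registry.example.com/demo/jenkins:latest`, where the image reference is a made-up placeholder.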

Actual results:

The dotnet imagestream is not recreated and the jenkins imagestream is not recovered.

$ oc logs -f cluster-samples-operator-5cb877f4fb-hpzlb

time="2018-12-13T05:48:40Z" level=error msg="error syncing key (openshift/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again"
time="2018-12-13T05:48:43Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace"
time="2018-12-13T05:49:36Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:49:39Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:49:42Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:49:45Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:49:48Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:49:51Z" level=info msg="retry count for imagestream event for jenkins exceeded; ignoring this change"
time="2018-12-13T05:49:54Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:49:57Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:50:00Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:50:03Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:50:06Z" level=error msg="error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:50:10Z" level=info msg="retry count for imagestream event for jenkins exceeded; ignoring this change"
time="2018-12-13T05:52:48Z" level=error msg="error syncing key (openshift/mysql): retry imagestream mysql because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:52:51Z" level=error msg="error syncing key (openshift/ruby): retry imagestream ruby because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:52:55Z" level=error msg="error syncing key (openshift/httpd): retry imagestream httpd because in progress is not yet true and retryCount is not exceeded"
time="2018-12-13T05:52:58Z" level=error msg="error syncing key (openshift/mariadb): retry imagestream mariadb because in progress is not yet true and retryCount is not exceeded

Expected results:


Additional info:
$ oc get samplesresources -o yaml 
apiVersion: v1
items:
- apiVersion: samplesoperator.config.openshift.io/v1alpha1
  kind: SamplesResource
  metadata:
    creationTimestamp: 2018-12-13T05:16:16Z
    finalizers:
    - samplesoperator.config.openshift.io/finalizer
    generation: 1
    name: openshift-samples
    namespace: ""
    resourceVersion: "41006"
    selfLink: /apis/samplesoperator.config.openshift.io/v1alpha1/samplesresources/openshift-samples
    uid: 376716ef-fe96-11e8-8c40-0aaa93c407fa
  spec:
    architectures:
    - x86_64
    installType: centos
    managementState: Managed
  status:
    architectures:
    - x86_64
    conditions:
    - lastTransitionTime: 2018-12-13T05:16:43Z
      lastUpdateTime: 2018-12-13T05:16:43Z
      status: "True"
      type: SamplesExist
    - lastTransitionTime: 2018-12-13T05:48:34Z
      lastUpdateTime: 2018-12-13T05:48:34Z
      status: "True"
      type: ImportCredentialsExists
    - lastTransitionTime: 2018-12-13T05:16:13Z
      lastUpdateTime: 2018-12-13T05:16:13Z
      status: "True"
      type: ConfigurationValid
    - lastTransitionTime: 2018-12-13T05:18:27Z
      lastUpdateTime: 2018-12-13T05:18:27Z
      status: "False"
      type: ChangesInProgress
    - lastTransitionTime: 2018-12-13T05:16:13Z
      lastUpdateTime: 2018-12-13T05:16:13Z
      status: "False"
      type: PendingRemove
    installType: centos
    managementState: Managed
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
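
To watch a single condition from the output above (for example ChangesInProgress, which the retry errors refer to as "in progress"), a jsonpath query along these lines may help. It is a sketch: the helper name `samples_condition` is hypothetical, and it assumes the resource is still named openshift-samples as shown in the YAML.

```shell
# Print the status of one condition on the openshift-samples resource.
# $1: condition type as it appears in the YAML above, e.g. ChangesInProgress.
# samples_condition is a hypothetical helper name, not part of oc.
samples_condition() {
  oc get samplesresources openshift-samples \
    -o jsonpath="{.status.conditions[?(@.type==\"$1\")].status}"
}
```

For example, `samples_condition ChangesInProgress` would print the current status string of that condition.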

Comment 1 Gabe Montero 2018-12-13 16:22:38 UTC

Hmm ... conflict churn.

The "retry imagestream mysql because in progress is not yet true and retryCount is not exceeded" messages on the
imagestreams you did not change @XiuJuan could indicate a problem with setting in progress to true in the first place.

I'll try to recreate, but if you can supply the entire pod log instead of the snippet you posted, that could help
just in case this is a timing issue.
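
To see which imagestreams hit the retry error, and how often, the operator log can be tallied with standard text tools. The following is a minimal sketch that assumes the message format shown in the description; `retry_counts` is a hypothetical helper name.

```shell
# Tally "retry imagestream" errors per imagestream from operator log text on stdin.
# Assumes the message format quoted in the description; prints "<imagestream> <count>".
retry_counts() {
  grep -o 'error syncing key (openshift/[^)]*): retry imagestream' \
    | sed -e 's|.*(openshift/||' -e 's|).*||' \
    | sort | uniq -c | awk '{print $2, $1}'
}
```

It could be fed from the pod log, for example `oc logs cluster-samples-operator-5cb877f4fb-hpzlb | retry_counts`, which makes it easier to separate the imagestreams that were changed from those that were not.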

Comment 2 XiuJuan Wang 2018-12-14 03:47:24 UTC

This kind of error, "error syncing key (openshift/jenkins): retry imagestream jenkins because in progress is not yet true and retryCount is not exceeded", always occurs, whether or not the imagestream is modified.

http://pastebin.test.redhat.com/683878

One hour after deleting/retagging the imagestreams, they still have not been recreated/updated.

Comment 3 XiuJuan Wang 2018-12-14 07:46:58 UTC

I could reproduce this issue with the latest origin image.
docker.io/openshift/origin-cluster-samples-operator              latest              55cb422f7826        2 hours ago         281 MB

Comment 4 Gabe Montero 2018-12-14 17:26:26 UTC

OK the complete log was helpful.  I see where the bug is now.  Starting on a fix.

Comment 5 Gabe Montero 2018-12-14 18:00:36 UTC

Also, the recovery of the jenkins imagestream after you tag in an invalid registry is not something the operator does @XiuJuan.

Ben and I have discussed such a notion, and the decision is that it is not "meets min" for 4.0.  If the admin corrupts the imagestream
in such a way, he can delete the samplesresource or mark it Removed to recover.

But the deleted dotnet imagestream from your description should be recovered.
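
The two recovery options mentioned above might look like the following. This is a sketch under assumptions: the helper names are hypothetical, and the resource name openshift-samples and the `spec.managementState` field are taken from the YAML in the description rather than from any confirmed procedure.

```shell
# Hypothetical helpers for the two recovery options mentioned in the comment.
# Option 1: mark the samples resource Removed via a merge patch.
samples_mark_removed() {
  oc patch samplesresources openshift-samples --type=merge \
    -p '{"spec":{"managementState":"Removed"}}'
}

# Option 2: delete the samples resource.
samples_delete() {
  oc delete samplesresources openshift-samples
}
```

Whether the resource then has to be set back to Managed, or is recreated automatically, is not stated in this report.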

Comment 6 Gabe Montero 2018-12-14 21:47:58 UTC

PR https://github.com/openshift/cluster-samples-operator/pull/67

Comment 7 Gabe Montero 2018-12-15 22:23:28 UTC

PR has merged

Comment 8 wewang 2018-12-17 07:39:25 UTC

Tested: after deleting dotnet it is recreated, and jenkins can be retagged.
Tested in the next_gen_installer env
v0.7.0-master-4-ga4e426ee762c20019bbb90fe35d33c9b26d23393
and latest origin image
docker.io/openshift/origin-cluster-samples-operator   eb7db30d28bc        27 hours ago        281 MB

$ oc delete is dotnet -n openshift
imagestream.image.openshift.io "dotnet" deleted
$ oc get is dotnet -n openshift
NAME      IMAGE REPOSITORY                                                    TAGS             UPDATED
dotnet    image-registry.openshift-image-registry.svc:5000/openshift/dotnet   2.0,2.1,latest   6 seconds ago
$ oc get is jenkins -n openshift
NAME      IMAGE REPOSITORY                                                     TAGS              UPDATED
jenkins   image-registry.openshift-image-registry.svc:5000/openshift/jenkins   test,1,2,latest   11 seconds ago

Comment 9 Gabe Montero 2019-01-08 16:02:47 UTC

This is a 4.0 specific bug fixed in 4.0 dev cycle  ... no doc update needed

Comment 12 errata-xmlrpc 2019-06-04 10:41:14 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
