Description of problem:
When some imagestream imports fail, ImageChangesInProgress stays true indefinitely. If managementState is then set to 'Removed', status.managementState stays 'Managed' forever. As a result, the imagestreams and templates are not removed.

Version-Release number of selected component (if applicable):
$ oc get clusterversion version
NAME      VERSION                           AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.alpha-2019-01-03-031244   True        False         2h      Cluster version is 4.0.0-0.alpha-2019-01-03-031244

How reproducible:
always

Steps to Reproduce:
1. Switch installType to rhel in samplesresources and create the credentials for registry.redhat.io.
2. Wait a few minutes, then check the imagestreams under the openshift project. The jenkins imagestream import fails because the image is not found.
3. Set managementState to Removed.
4. Check samplesresources.
5. Check imagestreams and templates.

Actual results:
step4: ImageChangesInProgress has stayed true for 20+ minutes, and status.managementState is still Managed.
$ oc get samplesresources -o yaml
apiVersion: v1
items:
- apiVersion: samplesoperator.config.openshift.io/v1alpha1
  kind: SamplesResource
  metadata:
    creationTimestamp: 2019-01-04T05:35:26Z
    finalizers:
    - samplesoperator.config.openshift.io/finalizer
    generation: 1
    name: openshift-samples
    namespace: ""
    resourceVersion: "217615"
    selfLink: /apis/samplesoperator.config.openshift.io/v1alpha1/samplesresources/openshift-samples
    uid: 8a41e914-0fe2-11e9-a8af-029e36a3ae62
  spec:
    architectures:
    - x86_64
    installType: rhel
    managementState: Removed
    version: 4.0.0-alpha1-85ee5a974
  status:
    architectures:
    - x86_64
    conditions:
    - lastTransitionTime: 2019-01-04T07:41:26Z
      lastUpdateTime: 2019-01-04T07:41:26Z
      status: "True"
      type: SamplesExist
    - lastTransitionTime: 2019-01-04T07:37:40Z
      lastUpdateTime: 2019-01-04T07:37:40Z
      status: "True"
      type: ImportCredentialsExists
    - lastTransitionTime: 2019-01-04T05:47:51Z
      lastUpdateTime: 2019-01-04T05:47:51Z
      status: "True"
      type: ConfigurationValid
    - lastTransitionTime: 2019-01-04T08:13:52Z
      lastUpdateTime: 2019-01-04T08:13:52Z
      reason: 'jenkins dotnet-runtime '
      status: "True"
      type: ImageChangesInProgress
    - lastTransitionTime: 2019-01-04T07:49:20Z
      lastUpdateTime: 2019-01-04T07:49:20Z
      status: "True"
      type: PendingRemove
    - lastTransitionTime: 2019-01-04T05:35:23Z
      lastUpdateTime: 2019-01-04T05:35:23Z
      status: "False"
      type: MigrationInProgress
    - lastTransitionTime: 2019-01-04T08:13:52Z
      lastUpdateTime: 2019-01-04T08:13:52Z
      status: "False"
      type: ImportImageErrorsExist
    installType: rhel
    managementState: Managed
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
step5: imagestreams and templates both still exist.

Expected results:

Additional info:
The operator log after changing managementState to Removed:
time="2019-01-04T07:41:31Z" level=error msg="error syncing key (openshift/jboss-datagrid72-openshift): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T07:46:29Z" level=info msg="watch event cli not part of operators inventory" time="2019-01-04T07:47:00Z" level=info msg="processing secret watch event while in Managed state; deletion event: false" time="2019-01-04T07:47:00Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T07:47:05Z" level=error msg="error syncing key (openshift/jboss-webserver31-tomcat8-openshift): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" ERROR: logging before flag.Parse: W0104 07:47:06.118674 1 reflector.go:341] github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: watch of *unstructured.Unstructured ended with: very short watch:
github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: Unexpected watch close - watch lasted less than a second and no items received ERROR: logging before flag.Parse: W0104 07:47:09.729494 1 reflector.go:341] github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: watch of *unstructured.Unstructured ended with: very short watch: github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: Unexpected watch close - watch lasted less than a second and no items received time="2019-01-04T07:49:24Z" level=error msg="error syncing key (openshift/jboss-fuse70-karaf-openshift): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T07:49:37Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T07:49:37Z" level=info msg="creation of credential in openshift namespace recognized" time="2019-01-04T07:50:21Z" level=info msg="watch event cli not part of operators inventory" ERROR: logging before flag.Parse: W0104 07:55:52.248677 1 reflector.go:341] github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: watch of *unstructured.Unstructured ended with: very short watch: github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: Unexpected watch close - watch lasted less than a second and no items received time="2019-01-04T07:56:08Z" level=info msg="watch event cli not part of operators inventory" time="2019-01-04T07:57:00Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T07:57:00Z" level=info 
msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T07:57:05Z" level=error msg="error syncing key (openshift/fuse7-karaf-openshift): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T07:59:37Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T07:59:37Z" level=info msg="creation of credential in openshift namespace recognized" ERROR: logging before flag.Parse: W0104 08:02:05.033031 1 reflector.go:341] github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: watch of *unstructured.Unstructured ended with: very short watch: github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: Unexpected watch close - watch lasted less than a second and no items received time="2019-01-04T08:07:00Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:00Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T08:07:03Z" level=error msg="error syncing key (openshift-cluster-samples-operator/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T08:07:06Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:06Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T08:07:09Z" level=error msg="error syncing key 
(openshift-cluster-samples-operator/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T08:07:12Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:12Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T08:07:15Z" level=error msg="error syncing key (openshift-cluster-samples-operator/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T08:07:18Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:18Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T08:07:21Z" level=error msg="error syncing key (openshift-cluster-samples-operator/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T08:07:24Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:24Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T08:07:27Z" level=error msg="error syncing key (openshift-cluster-samples-operator/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try 
again" time="2019-01-04T08:07:30Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:30Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T08:07:33Z" level=error msg="error syncing key (openshift-cluster-samples-operator/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T08:07:37Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:37Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T08:07:40Z" level=error msg="error syncing key (openshift-cluster-samples-operator/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T08:07:43Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:43Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T08:07:46Z" level=error msg="error syncing key (openshift-cluster-samples-operator/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T08:07:50Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:50Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" 
time="2019-01-04T08:07:53Z" level=error msg="error syncing key (openshift-cluster-samples-operator/samples-registry-credentials): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T08:07:57Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:07:57Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace" time="2019-01-04T08:08:02Z" level=error msg="error syncing key (openshift/rhdm71-optaweb-employee-rostering-openshift): Operation cannot be fulfilled on samplesresources.samplesoperator.config.openshift.io \"openshift-samples\": the object has been modified; please apply your changes to the latest version and try again" time="2019-01-04T08:09:37Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:09:37Z" level=info msg="creation of credential in openshift namespace recognized" ERROR: logging before flag.Parse: W0104 08:10:38.378999 1 reflector.go:341] github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: watch of *unstructured.Unstructured ended with: very short watch: github.com/openshift/cluster-samples-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/informer.go:84: Unexpected watch close - watch lasted less than a second and no items received time="2019-01-04T08:11:46Z" level=info msg="watch event cli not part of operators inventory" time="2019-01-04T08:17:00Z" level=info msg="processing secret watch event while in Removed state; deletion event: false" time="2019-01-04T08:17:00Z" level=info msg="updating dockerconfig secret samples-registry-credentials in openshift namespace"
CVO status:
Every 2.0s: oc describe clusteroperator openshift-cluster-samples-operator    dhcp-140-96.nay.redhat.com: Fri Jan 4 16:31:25 2019

Name:  openshift-cluster-samples-operator
Namespace:
Labels:  <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:  ClusterOperator
Metadata:
  Creation Timestamp:  2019-01-04T05:35:26Z
  Generation:  1
  Resource Version:  240219
  Self Link:  /apis/config.openshift.io/v1/clusteroperators/openshift-cluster-samples-operator
  UID:  8a55e802-0fe2-11e9-a8af-029e36a3ae62
Spec:
Status:
  Conditions:
    Last Transition Time:  2019-01-04T08:31:23Z
    Status:  False
    Type:  Available
    Last Transition Time:  2019-01-04T07:41:30Z
    Message:  Samples moving to 4.0.0-alpha1-85ee5a974
    Status:  True
    Type:  Progressing
    Last Transition Time:  2019-01-04T08:31:23Z
    Status:  False
    Type:  Failing
  Extension:  <nil>
  Version:
Events:  <none>
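The stuck state reported above (spec asks for Removed, status still says Managed, ImageChangesInProgress is true) can be checked mechanically. A minimal Python sketch, assuming the resource was fetched with `oc get samplesresources openshift-samples -o json` and parsed into a dict. The helper name `stuck_removal` is hypothetical and is not part of the operator; the field names follow the YAML in this report.

```python
def stuck_removal(resource):
    """Return imagestream names whose in-progress import is holding up a requested removal.

    Hypothetical helper: `resource` is one SamplesResource parsed from JSON.
    Returns [] when removal was not requested, has completed, or no import is in progress.
    """
    spec_state = resource.get("spec", {}).get("managementState")
    status = resource.get("status", {})
    # Only interesting when spec asks for Removed but status has not caught up.
    if spec_state != "Removed" or status.get("managementState") == "Removed":
        return []
    for cond in status.get("conditions", []):
        if cond.get("type") == "ImageChangesInProgress" and cond.get("status") == "True":
            # In this report the reason field holds a space-separated list of names.
            return cond.get("reason", "").split()
    return []
```

Run against the resource shown in this report, the helper would list the imagestreams named in the ImageChangesInProgress reason field.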
Hey @XiuJuan, Would you have the portion of the pod logs that include the "Image import for imagestream %s failed with reason %s and detailed message %s" when the jenkins imagestream import failed?
In case it was not obvious, the image stream name goes in place of the %s of the message I noted in my previous comment.
Also provide the yaml for the jenkins image stream when the import error occurs.
I might have a clue as to what is going on, based on re-checking the image API in openshift/origin. The jenkins imagestream yaml from when you get the particular import failure you are producing should be key here. And if it is what I'm suspecting, I bet there is *NOT* a message anywhere in the pod logs like "Image import for imagestream %s failed with reason %s and detailed message %s".
I wasn't able to install a new cluster with the next-gen installer today. As far as I remember, there was no log message about the jenkins imagestream import failure in the samples operator pod. I will paste more info after I install a new cluster.
Hey @XiuJuan Wang, OK, if there is no log, then as I was suspecting in https://bugzilla.redhat.com/show_bug.cgi?id=1663406#c5 the samples operator did *NOT* properly detect the import failure. Given what the ImageChangesInProgress condition reports, that is what is happening. That is where running "oc get is jenkins -n openshift -o yaml" after the import failure, as I noted in https://bugzilla.redhat.com/show_bug.cgi?id=1663406#c4, is key. Based on how the error you got is represented in the jenkins imagestream yaml, I can adjust the error detection logic accordingly.
Confirmed that there are no logs about the import failure in the operator pod.
Here is the jenkins import error, caused by registry.redhat.io/openshift/jenkins-2-rhel7:v4.0 not existing.
$ oc get is jenkins -o yaml -n openshift
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  annotations:
    openshift.io/display-name: Jenkins
    openshift.io/image.dockerRepositoryCheck: 2019-01-08T06:33:20Z
    samplesoperator.config.openshift.io/version: 4.0.0-alpha1-85ee5a974
  creationTimestamp: 2019-01-08T06:32:42Z
  generation: 2
  labels:
    samplesoperator.config.openshift.io/managed: "true"
  name: jenkins
  namespace: openshift
  resourceVersion: "224166"
  selfLink: /apis/image.openshift.io/v1/namespaces/openshift/imagestreams/jenkins
  uid: 33933ac2-130f-11e9-b397-0a580a800010
spec:
  lookupPolicy:
    local: false
  tags:
  - annotations:
      description: Provides a Jenkins 1.X server on RHEL 7. For more information about using this container image, including OpenShift considerations, see https://github.com/openshift/jenkins/blob/master/README.md.
      iconClass: icon-jenkins
      openshift.io/display-name: Jenkins 1.X
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: hidden,jenkins
      version: 1.x
    from:
      kind: DockerImage
      name: registry.redhat.io/openshift3/jenkins-1-rhel7:latest
    generation: 2
    importPolicy: {}
    name: "1"
    referencePolicy:
      type: Local
  - annotations:
      description: Provides a Jenkins 2.X server on RHEL 7. For more information about using this container image, including OpenShift considerations, see https://github.com/openshift/jenkins/blob/master/README.md.
      iconClass: icon-jenkins
      openshift.io/display-name: Jenkins 2.X
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: jenkins
      version: 2.x
    from:
      kind: DockerImage
      name: registry.redhat.io/openshift/jenkins-2-rhel7:v4.0
    generation: 2
    importPolicy: {}
    name: "2"
    referencePolicy:
      type: Local
  - annotations:
      description: |-
        Provides a Jenkins server on RHEL 7. For more information about using this container image, including OpenShift considerations, see https://github.com/openshift/jenkins/blob/master/README.md.
        WARNING: By selecting this tag, your application will automatically update to use the latest version of Jenkins available on OpenShift, including major versions updates.
      iconClass: icon-jenkins
      openshift.io/display-name: Jenkins (Latest)
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: jenkins
    from:
      kind: ImageStreamTag
      name: "2"
    generation: 1
    importPolicy: {}
    name: latest
    referencePolicy:
      type: Local
status:
  dockerImageRepository: image-registry.openshift-image-registry.svc:5000/openshift/jenkins
  tags:
  - items:
    - created: 2019-01-08T06:33:20Z
      dockerImageReference: registry.redhat.io/openshift3/jenkins-1-rhel7@sha256:3ae2a9ea40f6dab95ce85febe7eaf36807dda14c8698d93afb6431a5077ed09b
      generation: 2
      image: sha256:3ae2a9ea40f6dab95ce85febe7eaf36807dda14c8698d93afb6431a5077ed09b
    tag: "1"
  - conditions:
    - generation: 2
      lastTransitionTime: 2019-01-08T06:33:20Z
      message: 'Internal error occurred: unknown: Not Found'
      reason: InternalError
      status: "False"
      type: ImportSuccess
    items: null
    tag: "2"
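The failure in the imagestream above is recorded as an ImportSuccess condition with status "False" on the affected tag. A minimal Python sketch of reading that out of the object, assuming it was fetched with `oc get is jenkins -n openshift -o json` and parsed into a dict. The function name `failed_import_tags` is hypothetical; this illustrates the data shape only and is not the operator's actual detection logic.

```python
def failed_import_tags(imagestream):
    """Map tag name -> failure message for every tag whose import failed.

    Hypothetical helper: a tag counts as failed when it carries a condition of
    type ImportSuccess with status "False", as in the YAML in this report.
    """
    failures = {}
    # status.tags and per-tag conditions may be missing or null, so default to [].
    for tag in imagestream.get("status", {}).get("tags") or []:
        for cond in tag.get("conditions") or []:
            if cond.get("type") == "ImportSuccess" and cond.get("status") == "False":
                failures[tag.get("tag")] = cond.get("message", "")
    return failures
```

Note that the check looks only at the condition itself and does not compare generations, since in this report the failed tag has the same generation in spec and in the status condition.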
Another way to reproduce the issue of ImageChangesInProgress staying true:
step1: With installType set to centos, set managementState to Removed.
step2: After the imagestreams and templates are deleted, change managementState to Managed and installType to rhel.
During this change, samplesresources status.installType is slow to update and still reports centos, while some imagestreams have already been created. After step2 is saved, the imagestreams being processed fail to import with the error "Internal error occurred: Get https://registry.redhat.io/v2/rhscl/*****/manifests/latest: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/articles/3399531". ImageChangesInProgress then gets stuck on the failed imagestreams, and spec.installType no longer matches status.installType. Two errors are reported: "Cannot create rhel imagestreams to registry.redhat.io without the credentials being available" and "cannot change installtype from centos to rhel".
To resolve the import error: delete the imagestream; the recreated one imports successfully....
Spec:
  Architectures:
    x86_64
  Install Type:  rhel
  Management State:  Managed
  Skipped Imagestreams:
    jenkins
  Version:  4.0.0-alpha1-85ee5a974
Status:
  Architectures:
    x86_64
  Conditions:
    Last Transition Time:  2019-01-08T07:02:16Z
    Last Update Time:  2019-01-08T07:02:16Z
    Status:  False
    Type:  SamplesExist
    Last Transition Time:  2019-01-08T07:02:32Z
    Last Update Time:  2019-01-08T07:02:32Z
    Message:  Cannot create rhel imagestreams to registry.redhat.io without the credentials being available
    Status:  False
    Type:  ImportCredentialsExists
    Last Transition Time:  2019-01-08T07:02:39Z
    Last Update Time:  2019-01-08T07:02:39Z
    Message:  cannot change installtype from centos to rhel
    Status:  False
    Type:  ConfigurationValid
    Last Transition Time:  2019-01-08T07:08:02Z
    Last Update Time:  2019-01-08T07:08:02Z
    Reason:  nginx
    Status:  True
    Type:  ImageChangesInProgress
    Last Transition Time:  2019-01-08T06:32:25Z
    Last Update Time:  2019-01-08T06:32:25Z
    Status:  False
    Type:  PendingRemove
    Last Transition Time:  2019-01-08T06:32:25Z
    Last Update Time:  2019-01-08T06:32:25Z
    Status:  False
    Type:  MigrationInProgress
    Last Transition Time:  2019-01-08T07:08:02Z
    Last Update Time:  2019-01-08T07:08:02Z
    Status:  False
    Type:  ImportImageErrorsExist
  Install Type:  centos
  Management State:  Managed
  Skipped Imagestreams:
    jenkins
Events:  <none>

More info: http://pastebin.test.redhat.com/692111
OK, the yaml in https://bugzilla.redhat.com/show_bug.cgi?id=1663406#c8 is what I needed. The *difference* between this imagestream yaml and the errors I produced during testing is that the status generation is getting updated even with errors. One explanation could be that the import was initially OK and then started having issues, say on a scheduled import. Or there has been a behavior change. Or something went amiss during my original testing. In any event, the change to the import error logic to address either situation is pretty straightforward. Should have a PR up fairly soon. The same basic thing occurred in the imagestream with the scenario from https://bugzilla.redhat.com/show_bug.cgi?id=1663406#c9. Also, per comment #c9, samplesresources status.installType is delayed because it won't get updated until all the image-in-progress work is complete.
OK I've got commit https://github.com/gabemontero/cluster-samples-operator/commit/b6ef465f4262c0aa4d3c2ea7cd962a76244cdd0e pushed that should address the import errors noted in this bugzilla. It is based on the current state of https://github.com/openshift/cluster-samples-operator/pull/71 (migrating off of the operator SDK), but that PR is still under review, so some underlying things may change, so I'm not going to create a PR for this fix just yet. When PR 71 merges, I'll rebase the branch and get a PR up for this fix.
PR https://github.com/openshift/cluster-samples-operator/pull/72 has been created
PR merged
Not fixed in registry.svc.ci.openshift.org/openshift/origin-release:4.0.0-0.alpha-2019-01-11-075335
# oc get clusterversion
NAME      VERSION                           AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.alpha-2019-01-11-075335   True        False         1h      Cluster version is 4.0.0-0.alpha-2019-01-11-075335

$ oc describe configs.samples.operator.openshift.io instance
Name:  instance
Namespace:
Labels:  <none>
Annotations:  <none>
API Version:  samples.operator.openshift.io/v1
Kind:  Config
Metadata:
  Creation Timestamp:  2019-01-11T10:14:21Z
  Finalizers:
    samples.operator.openshift.io/finalizer
  Generation:  1
  Resource Version:  88637
  Self Link:  /apis/samples.operator.openshift.io/v1/configs/instance
  UID:  a9a48155-1589-11e9-9931-0258eced7de4
Spec:
  Architectures:
    x86_64
  Install Type:  rhel
  Management State:  Managed
  Version:  4.0.0-alpha1-137b53463
Status:
  Architectures:
    x86_64
  Conditions:
    Last Transition Time:  2019-01-11T11:57:32Z
    Last Update Time:  2019-01-11T11:57:32Z
    Status:  True
    Type:  SamplesExist
    Last Transition Time:  2019-01-11T11:55:42Z
    Last Update Time:  2019-01-11T11:55:42Z
    Status:  True
    Type:  ImportCredentialsExist
    Last Transition Time:  2019-01-11T10:14:17Z
    Last Update Time:  2019-01-11T10:14:17Z
    Status:  True
    Type:  ConfigurationValid
    Last Transition Time:  2019-01-11T12:10:38Z
    Last Update Time:  2019-01-11T12:10:38Z
    Reason:  jenkins
    Status:  True
    Type:  ImageChangesInProgress
    Last Transition Time:  2019-01-11T10:14:17Z
    Last Update Time:  2019-01-11T10:14:17Z
    Status:  False
    Type:  RemovePending
    Last Transition Time:  2019-01-11T10:14:17Z
    Last Update Time:  2019-01-11T10:14:17Z
    Status:  False
    Type:  MigrationInProgress
    Last Transition Time:  2019-01-11T12:10:38Z
    Last Update Time:  2019-01-11T12:10:38Z
    Status:  False
    Type:  ImportImageErrorsExist
  Install Type:  rhel
  Management State:  Managed
Events:  <none>

$ oc describe clusteroperator openshift-cluster-samples-operator    dhcp-140-96.nay.redhat.com: Fri Jan 11 20:11:01 2019
Name:  openshift-cluster-samples-operator
Namespace:
Labels:  <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:  ClusterOperator
Metadata:
  Creation Timestamp:  2019-01-11T10:14:21Z
  Generation:  1
  Resource Version:  88837
  Self Link:  /apis/config.openshift.io/v1/clusteroperators/openshift-cluster-samples-operator
  UID:  a9fb4fa4-1589-11e9-9931-0258eced7de4
Spec:
Status:
  Conditions:
    Last Transition Time:  2019-01-11T12:10:59Z
    Status:  False
    Type:  Available
    Last Transition Time:  2019-01-11T11:57:32Z
    Message:  Samples moving to 4.0.0-alpha1-137b53463
    Status:  True
    Type:  Progressing
    Last Transition Time:  2019-01-11T12:10:59Z
    Status:  False
    Type:  Failing
  Extension:  <nil>
  Version:
Events:  <none>

$ oc describe is jenkins -n openshift
Name:  jenkins
Namespace:  openshift
Created:  14 minutes ago
Labels:  samples.operator.openshift.io/managed=true
Annotations:  openshift.io/display-name=Jenkins
  openshift.io/image.dockerRepositoryCheck=2019-01-11T11:56:59Z
  samples.operator.openshift.io/version=4.0.0-alpha1-137b53463
Image Repository:  image-registry.openshift-image-registry.svc:5000/openshift/jenkins
Image Lookup:  local=false
Unique Images:  1
Tags:  3

1
  tagged from registry.redhat.io/openshift3/jenkins-1-rhel7:latest
    prefer registry pullthrough when referencing this tag

  Provides a Jenkins 1.X server on RHEL 7. For more information about using this container image, including OpenShift considerations, see https://github.com/openshift/jenkins/blob/master/README.md.
  Tags: hidden, jenkins

  * registry.redhat.io/openshift3/jenkins-1-rhel7@sha256:3ae2a9ea40f6dab95ce85febe7eaf36807dda14c8698d93afb6431a5077ed09b
      14 minutes ago

2 (latest)
  tagged from registry.redhat.io/openshift/jenkins-2-rhel7:v4.0
    prefer registry pullthrough when referencing this tag

  Provides a Jenkins 2.X server on RHEL 7. For more information about using this container image, including OpenShift considerations, see https://github.com/openshift/jenkins/blob/master/README.md.
  Tags: jenkins

  ! error: Import failed (InternalError): Internal error occurred: unknown: Not Found
      14 minutes ago
I've cc:ed Ben Parees. Ben - I'm having trouble finding the instructions you sent out last year for internal redhatters to get an actual set of credentials to the TBR. Could you help refresh my memory? It's possible I need to re-run the precise flow here to reproduce, as my unit tests and other error producing scenarios (like accessing the TBR *WITHOUT* any credentials) are not hitting this.
OK I've obtained TBR credentials and have reproduced this latest incarnation/form .... though why it is occurring is not obvious to me at all at first blush. The jenkins v4.0 tag is missing as expected. The other samples imagestreams successfully imported from the TBR. Debugging has started.
Debugging successful ....
PR https://github.com/openshift/cluster-samples-operator/pull/79 is up
PR has merged
Yes, the jenkins v4.0 tag is missing as expected. I have checked the latest fix:
$ oc get clusterversion
NAME      VERSION                           AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.alpha-2019-01-14-015843   True        False         37m     Cluster version is 4.0.0-0.alpha-2019-01-14-015843
configs.samples.operator.openshift.io 4.0.0-alpha1-f76f4f23b

This fix didn't resolve all issues. A failed imagestream import no longer blocks the status.managementState sync. But ImportImageErrorsExist will be set true, and processing of cvo status always keep true with error.

$ oc describe configs.samples.operator.openshift.io instance
Status:
  Architectures:
    x86_64
  Conditions:
    Last Transition Time:  2019-01-14T06:05:49Z
    Last Update Time:  2019-01-14T06:05:49Z
    Status:  True
    Type:  ConfigurationValid
    Last Transition Time:  2019-01-14T06:40:01Z
    Last Update Time:  2019-01-14T06:40:01Z
    Status:  False
    Type:  ImageChangesInProgress
    Last Transition Time:  2019-01-14T06:33:06Z
    Last Update Time:  2019-01-14T06:33:06Z
    Status:  True
    Type:  SamplesExist
    Last Transition Time:  2019-01-14T06:31:50Z
    Last Update Time:  2019-01-14T06:31:50Z
    Status:  True
    Type:  ImportCredentialsExist
    Last Transition Time:  2019-01-14T06:05:49Z
    Last Update Time:  2019-01-14T06:05:49Z
    Status:  False
    Type:  RemovePending
    Last Transition Time:  2019-01-14T06:05:49Z
    Last Update Time:  2019-01-14T06:05:49Z
    Status:  False
    Type:  MigrationInProgress
    Last Transition Time:  2019-01-14T06:40:01Z
    Last Update Time:  2019-01-14T06:40:01Z
    Message:  imagestream/jenkins: Internal error occurred: unknown: Not Found;
    Reason:  jenkins
    Status:  True
    Type:  ImportImageErrorsExist
  Install Type:  rhel
  Management State:  Managed
Events:  <none>

$ oc describe clusteroperator openshift-cluster-samples-operator    dhcp-140-96.nay.redhat.com: Mon Jan 14 15:59:04 2019
Name:  openshift-cluster-samples-operator
Namespace:
Labels:  <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:  ClusterOperator
Metadata:
  Creation Timestamp:  2019-01-14T06:05:36Z
  Generation:  1
  Resource Version:  79094
  Self Link:  /apis/config.openshift.io/v1/clusteroperators/openshift-cluster-samples-operator
  UID:  68ca591b-17c2-11e9-af4d-023542499316
Spec:
Status:
  Conditions:
    Last Transition Time:  2019-01-14T07:58:26Z
    Status:  False
    Type:  Available
    Last Transition Time:  2019-01-14T06:33:53Z
    Message:  Samples installation in error at 4.0.0-alpha1-f76f4f23b: image import problem
    Status:  True
    Type:  Progressing
    Last Transition Time:  2019-01-14T07:58:26Z
    Message:  Samples installation in error at 4.0.0-alpha1-f76f4f23b: imagestream/jenkins: Internal error occurred: unknown: Not Found;
    Status:  True
    Type:  Failing
  Extension:  <nil>

Even after I added the jenkins imagestream to the skipped list, ImportImageErrorsExist is still true. Meanwhile, Progressing in the CVO status stays true with an error.
$ oc describe is jenkins -n openshift | grep operator
Labels:  samples.operator.openshift.io/managed=false
  samples.operator.openshift.io/version=4.0.0-alpha1-f76f4f23b
"But ImportImageErrorsExist will be set true, and processing of cvo status always keep true with error." *IS* the intended behavior per https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/clusteroperator.md#conditions and https://godoc.org/github.com/openshift/api/config/v1#ClusterStatusConditionType In particular, see the "If an error blocks reaching 4.0.1, the conditions might be:" portion of https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/clusteroperator.md#conditions Please reset your test expectations @XiuJuan Wang for those points. Your point on adding jenkins to the skipped list as a means to bypassing the failure and getting to an available state is interesting. I'm inclined to agree on that point. @Ben - what do you think on the notion of adding to the skip list after the import failure occurs as a means to ignoring the error and moving on?
skiplist question addressed in chat: https://coreos.slack.com/archives/CE2HALN2W/p1547478309327400 tldr: images in the skiplist should not be reported as import errors/should not block operator availability, even if they are added to the skiplist after the error occurs.
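The skiplist rule agreed above can be pictured as a filter applied every time import errors are evaluated, so that it also covers imagestreams added to the skiplist after their import already failed. A minimal Python sketch of that idea; the function name `blocking_import_errors` and the dict shape (imagestream name mapped to error message) are assumptions made for illustration, not the operator's code.

```python
def blocking_import_errors(failed_imports, skipped_imagestreams):
    """Return only the import failures that should still be reported.

    Hypothetical helper: `failed_imports` maps imagestream name -> error message,
    `skipped_imagestreams` is the current skip list. Failures for skipped
    imagestreams are dropped no matter when they were added to the list.
    """
    skipped = set(skipped_imagestreams)
    return {name: msg for name, msg in failed_imports.items() if name not in skipped}
```

Under this rule an empty result means import errors no longer stand in the way of the operator reporting itself as available.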
OK, so using the skipped list to ignore the error is being worked on. What is the expected result after fixing the jenkins imagestream error manually? Right now, the jenkins and perl imagestreams have been imported manually without error, but Progressing is still true with an error. See http://pastebin.test.redhat.com/695464 My questions are: when will Progressing be set to false after an import error occurs? Or only when the samples operator fixes the error automatically?
Hmm, I misunderstood the skipped list point; I had just thought you developers would not take the skipped list into account... So, to correct my questions in comment #24: how does Progressing behave after fixing the jenkins imagestream error manually?
With the changes I am working on, once the import error condition is clean (false, no imagestreams listed) as a result of fixing jenkins or any other failed imports, Progressing will go to false and Available to true, with both having messages stating that the samples are at the given level, i.e. "the steady state" if you will. My changes will include an update to the readme on the possible corrective actions in addition to the code changes.
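The relationship between the samples config conditions and the ClusterOperator conditions discussed in this thread can be summarised as a small mapping. A minimal Python sketch, pieced together only from the outputs and statements in this bug (in-progress imports, import errors, and the steady state); the function name and boolean inputs are hypothetical, and the real operator may take further conditions into account.

```python
def cluster_operator_conditions(import_in_progress, import_errors_exist):
    """Derive Available/Progressing/Failing from two samples config conditions.

    Hypothetical sketch based on the states reported in this bug:
    - import errors present -> not available, progressing, failing
    - imports still running -> not available, progressing, not failing
    - otherwise (steady)    -> available, not progressing, not failing
    """
    if import_errors_exist:
        return {"Available": False, "Progressing": True, "Failing": True}
    if import_in_progress:
        return {"Available": False, "Progressing": True, "Failing": False}
    return {"Available": True, "Progressing": False, "Failing": False}
```

The ordering puts import errors first because, in the output from the 2019-01-14 build, ImageChangesInProgress was already false while the ClusterOperator still reported Progressing and Failing as true.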
PR https://github.com/openshift/cluster-samples-operator/pull/80 is up
After hitting an imagestream import error, the imagestream(s) can be added to skippedImagestreams, and the CVO status then gets past the error. Finally, Progressing becomes false, Available true, and Failing false. Moving this bug to VERIFIED.
# oc get clusterversion
NAME      VERSION                           AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.alpha-2019-01-17-070151   True        False         1h      Cluster version is 4.0.0-0.alpha-2019-01-17-070151
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758