Bug 1782683
Summary: [Disconnected] openshift-samples operator setting management state to Removed does not complete while Progressing==true
Product: OpenShift Container Platform
Reporter: Johnny Liu <jialiu>
Component: Samples
Assignee: Gabe Montero <gmontero>
Status: CLOSED ERRATA
QA Contact: XiuJuan Wang <xiuwang>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.2.z
CC: adam.kaplan, bparees, gmontero, jialiu, wzheng, xiuwang
Target Milestone: ---
Keywords: Regression, Reopened
Target Release: 4.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The samples operator delayed moving to the Removed management state while imagestream imports were in progress.
Consequence: If those imagestream imports were doomed to fail and retry forever, for example because of a lack of connectivity to the source registry, the imports remained in progress for a very long time and prevented Removed processing from occurring.
Fix: The samples operator was changed so that it no longer gates the move to the Removed state on imagestream imports still being in progress.
Result: Administrators can now switch the samples operator to Removed quickly in cases where sample imagestream imports are doomed to fail, such as when there is no connectivity to the source registry.
Story Points: ---
Clone Of: 1772178
: 1805615 1805815
Environment:
Last Closed: 2020-05-13 21:54:56 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1805615, 1805815
Attachments:
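The Cause and Fix entries in the Doc Text describe a change to a gating decision. The following is a minimal sketch of that decision, not the operator's actual code (the operator itself is not written in Python). The names `SamplesConfig`, `can_process_removed`, and `gate_on_imports` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class SamplesConfig:
    """Hypothetical, simplified view of the samples Config resource."""
    spec_management_state: str       # desired state, e.g. "Managed" or "Removed"
    image_changes_in_progress: bool  # True while imagestream imports are pending


def can_process_removed(cfg: SamplesConfig, gate_on_imports: bool) -> bool:
    """Return True if Removed processing may start now.

    gate_on_imports=True models the behaviour described under "Cause";
    gate_on_imports=False models the behaviour described under "Fix".
    """
    if cfg.spec_management_state != "Removed":
        return False
    if gate_on_imports and cfg.image_changes_in_progress:
        # Old behaviour: wait for imports, which may retry forever
        # when the source registry is unreachable.
        return False
    return True


# A disconnected cluster: Removed was requested, imports never finish.
stuck = SamplesConfig("Removed", image_changes_in_progress=True)
print(can_process_removed(stuck, gate_on_imports=True))
print(can_process_removed(stuck, gate_on_imports=False))
```

With the gate enabled, the stuck configuration never reaches Removed processing; with the gate disabled, it proceeds even though imports are still pending.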
Comment 1
Gabe Montero
2019-12-12 14:49:25 UTC
QE's CI job is NOT on https://openshift-release.svc.ci.openshift.org/releasestream/4.2.0-0.nightly/release/4.2.0-0.nightly-2019-12-11-171302.

I just tried to reproduce this bug with the same payload image and could not reproduce it; maybe it was just a flake. I will keep an eye on it.

OK, thanks for the update. Yeah, let's keep this open for a bit and see what happens. If it happens again, get me the pod logs along with the samples config YAML.

Adjusting severity given the intermittent nature.

Created attachment 1649330
cluster-samples-operator.log

Created attachment 1652647
samples config logs
Created attachment 1652648
samples operator pod log

I did not hit the installation being blocked by a samples operator failure this time, and the removal processing did not take as long as in comment #11; it took around 10 minutes. If the log is not enough, I will paste more logs when I hit an installation blocked by a samples operator failure.

Set the samples operator to Removed while Progressing was True. Tried ten times, and all attempts succeeded on 4.4.0-0.nightly-2020-02-03-021633. The longest duration was 5 minutes, which is acceptable.

$ oc get co openshift-samples -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2020-02-03T06:57:08Z"
  generation: 1
  name: openshift-samples
  resourceVersion: "32088"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/openshift-samples
  uid: 9fe06cb4-d4ba-4739-b8e0-a1340b0ceb68
spec: {}
status:
  conditions:
  - lastTransitionTime: "2020-02-03T07:28:52Z"
    message: Samples processing to 4.4.0-0.nightly-2020-02-03-021633
    status: "True"
    type: Progressing
  - lastTransitionTime: "2020-02-03T07:03:46Z"
    status: "False"
    type: Degraded
  - lastTransitionTime: "2020-02-03T07:03:50Z"
    message: Samples installation successful at 4.4.0-0.nightly-2020-02-03-021633
    status: "True"
    type: Available
  extension: null
  relatedObjects:
  - group: samples.operator.openshift.io
    name: cluster
    resource: configs
  - group: ""
    name: openshift-cluster-samples-operator
    resource: namespaces
  - group: ""
    name: openshift
    resource: namespaces
  versions:
  - name: operator
    version: 4.4.0-0.nightly-2020-02-03-021633

$ oc get config.samples -o yaml
apiVersion: v1
items:
- apiVersion: samples.operator.openshift.io/v1
  kind: Config
  metadata:
    creationTimestamp: "2020-02-03T06:57:08Z"
    finalizers:
    - samples.operator.openshift.io/finalizer
    generation: 4
    name: cluster
    resourceVersion: "32307"
    selfLink: /apis/samples.operator.openshift.io/v1/configs/cluster
    uid: e07adbca-edab-4347-bfec-b527b4eaf9a0
  spec:
    architectures:
    - x86_64
    managementState: Removed
  status:
    architectures:
    - x86_64
    conditions:
    - lastTransitionTime: "2020-02-03T06:57:14Z"
      lastUpdateTime: "2020-02-03T06:57:14Z"
      status: "False"
      type: RemovePending
    - lastTransitionTime: "2020-02-03T06:57:10Z"
      lastUpdateTime: "2020-02-03T06:57:10Z"
      status: "True"
      type: ImportCredentialsExist
    - lastTransitionTime: "2020-02-03T07:03:46Z"
      lastUpdateTime: "2020-02-03T07:03:46Z"
      status: "True"
      type: SamplesExist
    - lastTransitionTime: "2020-02-03T07:28:49Z"
      lastUpdateTime: "2020-02-03T07:28:49Z"
      reason: 'jboss-eap72-openshift apicurito-ui jboss-webserver31-tomcat7-openshift jboss-eap70-openshift jenkins-agent-nodejs rhpam-kieserver-rhel8 ruby jboss-webserver30-tomcat8-openshift fis-karaf-openshift jboss-fuse70-java-openshift redhat-openjdk18-openshift postgresql rhpam-businesscentral-monitoring-rhel8 fis-java-openshift openjdk-8-rhel8 redis rhpam-smartrouter-rhel8 jboss-webserver50-tomcat9-openshift jboss-datagrid71-openshift java mongodb redhat-sso73-openshift jboss-amq-63 jboss-datavirt64-openshift jboss-fuse70-console redhat-sso72-openshift jboss-webserver31-tomcat8-openshift dotnet-runtime eap-cd-openshift mariadb mysql jboss-processserver64-openshift openjdk-11-rhel7 openjdk-11-rhel8 jenkins jboss-datavirt64-driver-openshift jboss-eap71-openshift fuse7-eap-openshift fuse7-java-openshift jboss-fuse70-karaf-openshift perl apicast-gateway jboss-datagrid73-openshift nodejs jboss-eap64-openshift golang redhat-sso71-openshift jenkins-agent-maven modern-webapp rhpam-businesscentral-rhel8 jboss-datagrid65-client-openshift jboss-datagrid71-client-openshift jboss-datagrid72-openshift jboss-decisionserver64-openshift httpd redhat-sso70-openshift jboss-webserver30-tomcat7-openshift python dotnet rhdm-optaweb-employee-rostering-rhel8 jboss-amq-62 jboss-datagrid65-openshift jboss-fuse70-eap-openshift rhdm-decisioncentral-rhel8 rhdm-kieserver-rhel8 fuse-apicurito-generator fuse7-console fuse7-karaf-openshift nginx php '
      status: "True"
      type: ImageChangesInProgress
    - lastTransitionTime: "2020-02-03T07:28:49Z"
      lastUpdateTime: "2020-02-03T07:28:49Z"
      message: <imagestream/apicast-gateway>dockerimage.image.openshift.io "xiuwang-gcp-dis.mirror-registry.qe.gcp.devcluster.openshift.com:5000/3scale-amp21/apicast-gateway:1.4-2" not found<imagestream/apicast-gateway>
      reason: 'apicast-gateway '
      status: "True"
      type: ImportImageErrorsExist
    - lastTransitionTime: "2020-02-03T07:03:43Z"
      lastUpdateTime: "2020-02-03T07:03:43Z"
      status: "True"
      type: ConfigurationValid
    - lastTransitionTime: "2020-02-03T07:03:43Z"
      lastUpdateTime: "2020-02-03T07:03:43Z"
      status: "False"
      type: MigrationInProgress
    managementState: Managed
    version: 4.4.0-0.nightly-2020-02-03-021633
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

$ oc get co openshift-samples -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2020-02-03T06:57:08Z"
  generation: 1
  name: openshift-samples
  resourceVersion: "33548"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/openshift-samples
  uid: 9fe06cb4-d4ba-4739-b8e0-a1340b0ceb68
spec: {}
status:
  conditions:
  - lastTransitionTime: "2020-02-03T07:33:04Z"
    message: Samples installation was previously successful at 4.4.0-0.nightly-2020-02-03-021633 but the samples operator is now Removed
    reason: CurrentlyRemoved
    status: "False"
    type: Progressing
  - lastTransitionTime: "2020-02-03T07:33:04Z"
    message: Samples installation was previously successful at 4.4.0-0.nightly-2020-02-03-021633 but the samples operator is now Removed
    reason: CurrentlyRemoved
    status: "False"
    type: Degraded
  - lastTransitionTime: "2020-02-03T07:33:04Z"
    message: Samples installation was previously successful at 4.4.0-0.nightly-2020-02-03-021633 but the samples operator is now Removed
    reason: CurrentlyRemoved
    status: "True"
    type: Available
  extension: null
  relatedObjects:
  - group: samples.operator.openshift.io
    name: cluster
    resource: configs
  - group: ""
    name: openshift-cluster-samples-operator
    resource: namespaces
  - group: ""
    name: openshift
    resource: namespaces
  versions:
  - name: operator
    version: 4.4.0-0.nightly-2020-02-03-021633

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581
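The verification above compares the ClusterOperator conditions before and after the switch to Removed. A rough sketch of that comparison follows, assuming the conditions have already been loaded into Python dictionaries (for example from `oc get co openshift-samples -o json`). The helper names are hypothetical, and treating `reason: CurrentlyRemoved` on the Progressing condition as the completion signal is an assumption drawn from this one output rather than a documented contract. The difference between the two `lastTransitionTime` values is only an approximation of how long the removal took.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # timestamp layout used in the output above


def find_condition(conditions, cond_type):
    """Return the condition dict with the given type, or None if absent."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return None


def removal_completed(conditions):
    """True when Progressing is False with reason CurrentlyRemoved (assumed signal)."""
    progressing = find_condition(conditions, "Progressing")
    return (
        progressing is not None
        and progressing.get("status") == "False"
        and progressing.get("reason") == "CurrentlyRemoved"
    )


def seconds_between(start, end):
    """Whole seconds elapsed between two timestamps in TIME_FORMAT."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return int(delta.total_seconds())


# Progressing conditions copied from the two ClusterOperator outputs above.
before = [{"type": "Progressing", "status": "True",
           "lastTransitionTime": "2020-02-03T07:28:52Z"}]
after = [{"type": "Progressing", "status": "False", "reason": "CurrentlyRemoved",
          "lastTransitionTime": "2020-02-03T07:33:04Z"}]

print(removal_completed(before))  # False
print(removal_completed(after))   # True
print(seconds_between(before[0]["lastTransitionTime"],
                      after[0]["lastTransitionTime"]))  # 252
```

For this run the gap is 252 seconds, a little over four minutes, which is consistent with the reported longest duration of 5 minutes.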