Bug 1782683
| Summary: | [Disconnected] openshift-samples operator setting management state to Removed does not complete while Progressing==true | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Johnny Liu <jialiu> |
| Component: | Samples | Assignee: | Gabe Montero <gmontero> |
| Status: | CLOSED ERRATA | QA Contact: | XiuJuan Wang <xiuwang> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.2.z | CC: | adam.kaplan, bparees, gmontero, jialiu, wzheng, xiuwang |
| Target Milestone: | --- | Keywords: | Regression, Reopened |
| Target Release: | 4.4.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Cause: The samples operator delayed moving to the Removed management state while imagestream imports were in progress. Consequence: If those imports were doomed to fail and retry forever, for example because the source registry was unreachable, they stayed in progress for a very long time and prevented Removed processing from occurring. Fix: The samples operator no longer gates the move to the Removed state on imagestream imports that are still in progress. Result: Administrators can now switch the samples operator to Removed quickly when sample imagestream imports are doomed to fail, such as when there is no connectivity to the source registry. | Story Points: | --- |
| Clone Of: | 1772178 | | |
| : | 1805615 1805815 | Environment: | |
| Last Closed: | 2020-05-13 21:54:56 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1805615, 1805815 | | |
| Attachments: | | | |
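The gating change described in the Doc Text can be illustrated with a minimal sketch. This is not the operator's actual code: the `SamplesConfig` type, the `can_process_removed` function, and the `gate_on_imports` flag are hypothetical names introduced here only to contrast the old and fixed behavior.

```python
from dataclasses import dataclass


@dataclass
class SamplesConfig:
    """Hypothetical, simplified view of the samples Config resource."""
    spec_management_state: str       # desired state, e.g. "Managed" or "Removed"
    image_changes_in_progress: bool  # mirrors the ImageChangesInProgress condition


def can_process_removed(cfg: SamplesConfig, gate_on_imports: bool) -> bool:
    """Return True if Removed processing may start now.

    gate_on_imports=True models the old behavior from the Doc Text;
    gate_on_imports=False models the fixed behavior.
    """
    if cfg.spec_management_state != "Removed":
        return False
    if gate_on_imports and cfg.image_changes_in_progress:
        # Old behavior: wait for imports, which may retry forever when
        # the source registry is unreachable.
        return False
    return True


cfg = SamplesConfig("Removed", image_changes_in_progress=True)
print(can_process_removed(cfg, gate_on_imports=True))   # False
print(can_process_removed(cfg, gate_on_imports=False))  # True
```

With the gate in place, a disconnected cluster whose imports never finish would never satisfy the check; without it, the spec change alone is enough.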
Comment 1
Gabe Montero
2019-12-12 14:49:25 UTC
QE's CI job is NOT on https://openshift-release.svc.ci.openshift.org/releasestream/4.2.0-0.nightly/release/4.2.0-0.nightly-2019-12-11-171302.

I just tried to reproduce this bug with the same payload image and could not reproduce it; maybe it was just a flake. I will keep an eye on it.

OK, thanks for the update. Yeah, let's keep this open for a bit and see what happens. If it happens again, get me the pod logs along with the samples config YAML.

Adjusting severity given the intermittent nature.

Created attachment 1649330 [details]
cluster-samples-operator.log

Created attachment 1652647 [details]
samples config logs

Created attachment 1652648 [details]
samples operator pod log

I did not hit an installation blocked by the samples operator failure this time, so the removal processing did not take as long as in comment #11; it took around 10 minutes. Let me know if the log is enough; I will paste more logs when I hit an installation blocked by the samples operator failure.

Set the samples operator to Removed while Progressing was true; tried ten times, and all attempts succeeded on 4.4.0-0.nightly-2020-02-03-021633.
The longest duration was 5 minutes, which is acceptable.
$ oc get co openshift-samples -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2020-02-03T06:57:08Z"
  generation: 1
  name: openshift-samples
  resourceVersion: "32088"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/openshift-samples
  uid: 9fe06cb4-d4ba-4739-b8e0-a1340b0ceb68
spec: {}
status:
  conditions:
  - lastTransitionTime: "2020-02-03T07:28:52Z"
    message: Samples processing to 4.4.0-0.nightly-2020-02-03-021633
    status: "True"
    type: Progressing
  - lastTransitionTime: "2020-02-03T07:03:46Z"
    status: "False"
    type: Degraded
  - lastTransitionTime: "2020-02-03T07:03:50Z"
    message: Samples installation successful at 4.4.0-0.nightly-2020-02-03-021633
    status: "True"
    type: Available
  extension: null
  relatedObjects:
  - group: samples.operator.openshift.io
    name: cluster
    resource: configs
  - group: ""
    name: openshift-cluster-samples-operator
    resource: namespaces
  - group: ""
    name: openshift
    resource: namespaces
  versions:
  - name: operator
    version: 4.4.0-0.nightly-2020-02-03-021633
$ oc get config.samples -o yaml
apiVersion: v1
items:
- apiVersion: samples.operator.openshift.io/v1
  kind: Config
  metadata:
    creationTimestamp: "2020-02-03T06:57:08Z"
    finalizers:
    - samples.operator.openshift.io/finalizer
    generation: 4
    name: cluster
    resourceVersion: "32307"
    selfLink: /apis/samples.operator.openshift.io/v1/configs/cluster
    uid: e07adbca-edab-4347-bfec-b527b4eaf9a0
  spec:
    architectures:
    - x86_64
    managementState: Removed
  status:
    architectures:
    - x86_64
    conditions:
    - lastTransitionTime: "2020-02-03T06:57:14Z"
      lastUpdateTime: "2020-02-03T06:57:14Z"
      status: "False"
      type: RemovePending
    - lastTransitionTime: "2020-02-03T06:57:10Z"
      lastUpdateTime: "2020-02-03T06:57:10Z"
      status: "True"
      type: ImportCredentialsExist
    - lastTransitionTime: "2020-02-03T07:03:46Z"
      lastUpdateTime: "2020-02-03T07:03:46Z"
      status: "True"
      type: SamplesExist
    - lastTransitionTime: "2020-02-03T07:28:49Z"
      lastUpdateTime: "2020-02-03T07:28:49Z"
      reason: 'jboss-eap72-openshift apicurito-ui jboss-webserver31-tomcat7-openshift
        jboss-eap70-openshift jenkins-agent-nodejs rhpam-kieserver-rhel8 ruby jboss-webserver30-tomcat8-openshift
        fis-karaf-openshift jboss-fuse70-java-openshift redhat-openjdk18-openshift
        postgresql rhpam-businesscentral-monitoring-rhel8 fis-java-openshift openjdk-8-rhel8
        redis rhpam-smartrouter-rhel8 jboss-webserver50-tomcat9-openshift jboss-datagrid71-openshift
        java mongodb redhat-sso73-openshift jboss-amq-63 jboss-datavirt64-openshift
        jboss-fuse70-console redhat-sso72-openshift jboss-webserver31-tomcat8-openshift
        dotnet-runtime eap-cd-openshift mariadb mysql jboss-processserver64-openshift
        openjdk-11-rhel7 openjdk-11-rhel8 jenkins jboss-datavirt64-driver-openshift
        jboss-eap71-openshift fuse7-eap-openshift fuse7-java-openshift jboss-fuse70-karaf-openshift
        perl apicast-gateway jboss-datagrid73-openshift nodejs jboss-eap64-openshift
        golang redhat-sso71-openshift jenkins-agent-maven modern-webapp rhpam-businesscentral-rhel8
        jboss-datagrid65-client-openshift jboss-datagrid71-client-openshift jboss-datagrid72-openshift
        jboss-decisionserver64-openshift httpd redhat-sso70-openshift jboss-webserver30-tomcat7-openshift
        python dotnet rhdm-optaweb-employee-rostering-rhel8 jboss-amq-62 jboss-datagrid65-openshift
        jboss-fuse70-eap-openshift rhdm-decisioncentral-rhel8 rhdm-kieserver-rhel8
        fuse-apicurito-generator fuse7-console fuse7-karaf-openshift nginx php '
      status: "True"
      type: ImageChangesInProgress
    - lastTransitionTime: "2020-02-03T07:28:49Z"
      lastUpdateTime: "2020-02-03T07:28:49Z"
      message: <imagestream/apicast-gateway>dockerimage.image.openshift.io "xiuwang-gcp-dis.mirror-registry.qe.gcp.devcluster.openshift.com:5000/3scale-amp21/apicast-gateway:1.4-2"
        not found<imagestream/apicast-gateway>
      reason: 'apicast-gateway '
      status: "True"
      type: ImportImageErrorsExist
    - lastTransitionTime: "2020-02-03T07:03:43Z"
      lastUpdateTime: "2020-02-03T07:03:43Z"
      status: "True"
      type: ConfigurationValid
    - lastTransitionTime: "2020-02-03T07:03:43Z"
      lastUpdateTime: "2020-02-03T07:03:43Z"
      status: "False"
      type: MigrationInProgress
    managementState: Managed
    version: 4.4.0-0.nightly-2020-02-03-021633
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
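The conditions in the output above carry the affected imagestream names as a space-separated list in the `reason` field. A small sketch of pulling those names out of one Config item follows, assuming the resource has already been loaded into a dict (for example from `oc get config.samples cluster -o json`). The helper name `names_in_condition` is hypothetical, and the sample data is abbreviated from the output above.

```python
from typing import List


def names_in_condition(config: dict, cond_type: str) -> List[str]:
    """Return the names in the `reason` field of a condition that is "True"."""
    for cond in config.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type and cond.get("status") == "True":
            # The reason is a space-separated list with a trailing space.
            return cond.get("reason", "").split()
    return []


# Abbreviated sample, shaped like one item of the output above.
sample = {
    "spec": {"managementState": "Removed"},
    "status": {
        "managementState": "Managed",
        "conditions": [
            {"type": "ImageChangesInProgress", "status": "True",
             "reason": "jboss-eap72-openshift apicurito-ui apicast-gateway "},
            {"type": "ImportImageErrorsExist", "status": "True",
             "reason": "apicast-gateway "},
        ],
    },
}

print(names_in_condition(sample, "ImageChangesInProgress"))
print(names_in_condition(sample, "ImportImageErrorsExist"))
```

Note that `spec.managementState` is already Removed here while `status.managementState` is still Managed, which is the in-between state this bug is about.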
$ oc get co openshift-samples -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2020-02-03T06:57:08Z"
  generation: 1
  name: openshift-samples
  resourceVersion: "33548"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/openshift-samples
  uid: 9fe06cb4-d4ba-4739-b8e0-a1340b0ceb68
spec: {}
status:
  conditions:
  - lastTransitionTime: "2020-02-03T07:33:04Z"
    message: Samples installation was previously successful at 4.4.0-0.nightly-2020-02-03-021633
      but the samples operator is now Removed
    reason: CurrentlyRemoved
    status: "False"
    type: Progressing
  - lastTransitionTime: "2020-02-03T07:33:04Z"
    message: Samples installation was previously successful at 4.4.0-0.nightly-2020-02-03-021633
      but the samples operator is now Removed
    reason: CurrentlyRemoved
    status: "False"
    type: Degraded
  - lastTransitionTime: "2020-02-03T07:33:04Z"
    message: Samples installation was previously successful at 4.4.0-0.nightly-2020-02-03-021633
      but the samples operator is now Removed
    reason: CurrentlyRemoved
    status: "True"
    type: Available
  extension: null
  relatedObjects:
  - group: samples.operator.openshift.io
    name: cluster
    resource: configs
  - group: ""
    name: openshift-cluster-samples-operator
    resource: namespaces
  - group: ""
    name: openshift
    resource: namespaces
  versions:
  - name: operator
    version: 4.4.0-0.nightly-2020-02-03-021633
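The difference between the two ClusterOperator outputs above comes down to the Progressing condition: removal is done once it reports `status: "False"` with `reason: CurrentlyRemoved`. A minimal sketch of that check follows, assuming the resource has been loaded into a dict (for example from `oc get co openshift-samples -o json`). The function name `removal_complete` is hypothetical, and the two sample dicts are trimmed copies of the outputs above.

```python
def removal_complete(cluster_operator: dict) -> bool:
    """True when Progressing is "False" with reason CurrentlyRemoved."""
    for cond in cluster_operator.get("status", {}).get("conditions", []):
        if cond.get("type") == "Progressing":
            return (cond.get("status") == "False"
                    and cond.get("reason") == "CurrentlyRemoved")
    return False


# Trimmed from the first output above: still processing.
before = {"status": {"conditions": [
    {"type": "Progressing", "status": "True",
     "message": "Samples processing to 4.4.0-0.nightly-2020-02-03-021633"},
]}}

# Trimmed from the second output above: removal finished.
after = {"status": {"conditions": [
    {"type": "Progressing", "status": "False", "reason": "CurrentlyRemoved"},
]}}

print(removal_complete(before))  # False
print(removal_complete(after))   # True
```

A verification loop like the ten attempts described earlier could poll this check and record how long each attempt takes.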
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0581