Bug 1688582
Summary: | [Upgrade] cluster-storage-operator incorrectly reporting status and version after the upgrade | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Wenqi He <wehe> |
Component: | Storage | Assignee: | Bradley Childs <bchilds> |
Status: | CLOSED ERRATA | QA Contact: | Wenqi He <wehe> |
Severity: | medium | Priority: | high |
Version: | 4.1.0 | Target Release: | 4.1.0 |
Hardware: | Unspecified | OS: | Unspecified |
Keywords: | BetaBlocker | Type: | Bug |
Last Closed: | 2019-06-04 10:45:47 UTC | CC: | aos-bugs, aos-storage-staff, bbennett, bchilds, chaoyang, eparis, jokerman, mmccomas, sponnaga, wsun |
Description
Wenqi He
2019-03-14 03:16:41 UTC
https://github.com/openshift/cluster-storage-operator/pull/16 should fix this; it's not in the installer yet (currently 0.14).

On second thought, moving this over to the installer component in case there's some work to be done there.

Putting this back on storage. This is a bug in your code.

Seems like it sits in MODIFIED until QE tests a nightly.

I tested an upgrade between the two versions below; the status still did not change after the upgrade.

Before upgrade:

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-03-18-200009   True        False         4h55m   Cluster version is 4.0.0-0.nightly-2019-03-18-200009

$ oc get pods -oyaml -n openshift-cluster-storage-operator | grep image
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:27e821eabac565c10d0f8833dc812a26d5803c5847b12f2d5c1951d4b257d96f

$ oc image info quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:27e821eabac565c10d0f8833dc812a26d5803c5847b12f2d5c1951d4b257d96f | grep io.openshift.build.commit.id
io.openshift.build.commit.id=4cdc1e782067eacd0eed79cc886b023868498194

The output above shows that the image already contains the fix.

$ oc get pods -oyaml -n openshift-cluster-storage-operator | grep conditions -A 25
conditions:
- lastProbeTime: null
  lastTransitionTime: 2019-03-19T02:52:35Z
  status: "True"
  type: Initialized
- lastProbeTime: null
  lastTransitionTime: 2019-03-19T02:52:46Z
  status: "True"
  type: Ready
- lastProbeTime: null
  lastTransitionTime: 2019-03-19T02:52:46Z
  status: "True"
  type: ContainersReady
- lastProbeTime: null
  lastTransitionTime: 2019-03-19T02:52:35Z
  status: "True"
  type: PodScheduled

$ oc describe clusteroperator storage | grep conditions
Conditions:
  Last Transition Time: 2019-03-19T02:52:47Z
  Status: False
  Type: Progressing
  Last Transition Time: 2019-03-19T02:52:47Z
  Status: True
  Type: Available
  Last Transition Time: 2019-03-19T02:52:47Z
  Status: False
  Type: Failing

After upgrade:

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-03-18-223058   True        False         7m55s   Cluster version is 4.0.0-0.nightly-2019-03-18-223058

$ oc get pods -oyaml -n openshift-cluster-storage-operator | grep conditions -A 25
conditions:
- lastProbeTime: null
  lastTransitionTime: 2019-03-19T08:15:23Z
  status: "True"
  type: Initialized
- lastProbeTime: null
  lastTransitionTime: 2019-03-19T08:15:34Z
  status: "True"
  type: Ready
- lastProbeTime: null
  lastTransitionTime: 2019-03-19T08:15:34Z
  status: "True"
  type: ContainersReady
- lastProbeTime: null
  lastTransitionTime: 2019-03-19T08:15:23Z
  status: "True"
  type: PodScheduled

$ oc get clusteroperator
NAME                                 VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
service-catalog-apiserver            4.0.0-0.nightly-2019-03-18-223058   True        False         False     12m
service-catalog-controller-manager   4.0.0-0.nightly-2019-03-18-223058   True        False         False     69s
storage                              4.0.0-0.nightly-2019-03-18-223058   True        False         False     5h35m

$ oc describe clusteroperator storage | grep Conditions -A 15
Conditions:
  Last Transition Time: 2019-03-19T02:52:47Z
  Status: False
  Type: Progressing
  Last Transition Time: 2019-03-19T02:52:47Z
  Status: True
  Type: Available
  Last Transition Time: 2019-03-19T02:52:47Z
  Status: False
  Type: Failing

I don't understand: the clusteroperator has Available True and is reporting version 4.0.0-0.nightly-2019-03-18-223058. Is this not expected? What is the expected behaviour? Is it because Last Transition Time hasn't changed? It is a bit tricky because, unlike other operators, the storage operator's operand is just a StorageClass at the moment: the class is either created or not, and there is no in-between state. But it might make sense to make the status Progressing for the x milliseconds during which the version does not match.

Hi Matthew, yes, you are right: the bug was reported because the Last Transition Time has not changed. When we run `oc get clusteroperator`, storage still shows SINCE 5h35m, while the other operators show times from just after the upgrade. BTW, my manager asked me to report this bug, and it is also related to bug#1686121, which I mentioned in the Description at the beginning. That bug says each operator owner should be responsible for their own operator. Sorry for the confusion.

No problem, I did not read carefully.
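
The suggestion above, reporting Progressing for the short window in which the reported version does not match the target version, can be sketched roughly as follows. This is only an illustration in Python under stated assumptions, not the operator's actual Go code or the content of any PR referenced in this bug: `Condition`, `set_condition`, and `sync_status` are hypothetical names, and flipping Available to False during the window is an assumption made so that its transition time also moves.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Condition:
    type: str
    status: str
    last_transition_time: datetime


def set_condition(conditions, cond_type, status, now):
    """Set a condition; bump last_transition_time only when the status changes."""
    for cond in conditions:
        if cond.type == cond_type:
            if cond.status != status:
                cond.status = status
                cond.last_transition_time = now
            return
    conditions.append(Condition(cond_type, status, now))


def sync_status(conditions, reported_version, target_version, now):
    """Hypothetical sync step; returns the version to report afterwards."""
    if reported_version != target_version:
        # Assumption: briefly report "not available, progressing" so that the
        # transition times of both conditions reflect the upgrade.
        set_condition(conditions, "Available", "False", now)
        set_condition(conditions, "Progressing", "True", now)
        # The operand (a StorageClass) would be reconciled here.
    set_condition(conditions, "Available", "True", now)
    set_condition(conditions, "Progressing", "False", now)
    set_condition(conditions, "Failing", "False", now)
    return target_version


if __name__ == "__main__":
    conditions = []
    install = datetime(2019, 3, 26, 6, 1, 36, tzinfo=timezone.utc)
    upgrade = datetime(2019, 3, 27, 4, 5, 40, tzinfo=timezone.utc)
    version = sync_status(conditions, None, "4.0.0-0.nightly-2019-03-25-180911", install)
    sync_status(conditions, version, "4.0.0-0.nightly-2019-03-26-034754", upgrade)
    for cond in conditions:
        print(cond.type, cond.status, cond.last_transition_time.isoformat())
```

Under these assumptions, a resync with an unchanged version leaves every transition time alone, while an upgrade moves the Available and Progressing times and leaves Failing at its original time.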

I've opened a PR: https://github.com/openshift/cluster-storage-operator/pull/22

Hi mawong, I tried to verify this bug, but the Last Transition Time is still not updated.

Before upgrade:

oc get pods -oyaml -n openshift-cluster-storage-operator | grep conditions -A 25
conditions:
- lastProbeTime: null
  lastTransitionTime: 2019-03-25T03:16:14Z
  status: "True"
  type: Initialized
- lastProbeTime: null
  lastTransitionTime: 2019-03-25T03:16:24Z
  status: "True"
  type: Ready
- lastProbeTime: null
  lastTransitionTime: 2019-03-25T03:16:24Z
  status: "True"
  type: ContainersReady
- lastProbeTime: null
  lastTransitionTime: 2019-03-25T03:16:14Z
  status: "True"
  type: PodScheduled
containerStatuses:
- containerID: cri-o://45039e6910f5a7f4a166852d9fdc390a07a8894ff38213e9559730cbaa280a7d
  image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dd43925dad987bc527cb625dc0316e1ddbf92c45c1fbb3198b344ecbb028f541
  imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dd43925dad987bc527cb625dc0316e1ddbf92c45c1fbb3198b344ecbb028f541
  lastState: {}
  name: cluster-storage-operator
  ready: true
  restartCount: 0
  state:

After upgrade:

oc describe clusteroperator storage
Name: storage
Namespace:
Labels: <none>
Annotations: <none>
API Version: config.openshift.io/v1
Kind: ClusterOperator
Metadata:
  Creation Timestamp: 2019-03-25T03:16:24Z
  Generation: 1
  Resource Version: 107915
  Self Link: /apis/config.openshift.io/v1/clusteroperators/storage
  UID: 5ef48151-4eac-11e9-b15a-0227bde3ac64
Spec:
Status:
  Conditions:
    Last Transition Time: 2019-03-25T03:16:24Z
    Status: False
    Type: Progressing
    Last Transition Time: 2019-03-25T03:16:24Z
    Status: True
    Type: Available
    Last Transition Time: 2019-03-25T03:16:24Z
    Status: False
    Type: Failing
  Extension: <nil>
  Related Objects: <nil>
  Versions:
    Name: operator
    Version: 4.0.0-0.nightly-2019-03-23-183709
Events: <none>

Please try on nightly 4.0.0-0.alpha-2019-03-24-171037 or later; it worked for me. I am using libvirt: I bump the release version and the storage clusteroperator goes Available(2019-03-25T16:42:43Z)->Progressing(2019-03-25T16:47:39Z)->Available(2019-03-25T16:47:39Z).

- lastTransitionTime: 2019-03-25T16:47:39Z
  message: Unsupported platform for storageclass creation
  status: "True"
  type: Available
- lastTransitionTime: 2019-03-25T16:42:43Z
  status: "False"
  type: Failing
- lastTransitionTime: 2019-03-25T16:47:39Z
  status: "False"
  type: Progressing

The change is present in 4.0.0-0.alpha-2019-03-22-235916 (https://origin-release.svc.ci.openshift.org/releasestream/4.0.0-0.alpha/release/4.0.0-0.alpha-2019-03-25-172405?from=4.0.0-0.alpha-2019-03-22-235916), so it should work if the pre-upgrade version is at least 4.0.0-0.alpha-2019-03-22-235916.

Verification passed when upgrading from 4.0.0-0.nightly-2019-03-25-180911 to 4.0.0-0.nightly-2019-03-26-034754:

oc describe clusteroperator storage
Name: storage
Namespace:
Labels: <none>
Annotations: <none>
API Version: config.openshift.io/v1
Kind: ClusterOperator
Metadata:
  Creation Timestamp: 2019-03-26T06:01:36Z
  Generation: 1
  Resource Version: 545806
  Self Link: /apis/config.openshift.io/v1/clusteroperators/storage
  UID: 9d6d30bd-4f8c-11e9-8526-0ab0e92a085a
Spec:
Status:
  Conditions:
    Last Transition Time: 2019-03-27T04:05:40Z
    Status: True
    Type: Available
    Last Transition Time: 2019-03-26T06:01:36Z
    Status: False
    Type: Failing
    Last Transition Time: 2019-03-27T04:05:40Z
    Status: False
    Type: Progressing
  Extension: <nil>
  Related Objects: <nil>
  Versions:
    Name: operator
    Version: 4.0.0-0.nightly-2019-03-26-034754
Events: <none>

@chaoyang, thanks for helping to drive this!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758