1690342 – [upgrade] the status.conditions of DNS operator is not updated after upgrade

Summary: [upgrade] the status.conditions of DNS operator is not updated after upgrade

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.1.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	---
Target Release:	4.1.0
Assignee:	Daneyon Hansen
QA Contact:	Hongan Li
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1695204
Depends On:
Blocks:

Reported:	2019-03-19 10:21 UTC by Hongan Li
Modified:	2022-08-04 22:24 UTC
CC List:	12 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-04 10:46:06 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift cluster-dns-operator pull 104	None	closed	Bug 1690342: Update Progressing condition if upgrade happened	2020-03-17 08:38:50 UTC
Github	openshift cluster-dns-operator pull 106	None	closed	Bug 1690342: Fix Progressing condition upgrade check	2020-03-17 08:38:50 UTC
Github	openshift cluster-dns-operator pull 88	None	closed	Bug 1690342: Initial DNS status conditions implementation	2020-03-17 08:38:50 UTC
Red Hat Product Errata	RHBA-2019:0758	None	None	None	2019-06-04 10:46:15 UTC

Description Hongan Li 2019-03-19 10:21:09 UTC

Description of problem:
The status.conditions of the DNS operator are not updated after an upgrade.

Version-Release number of selected component (if applicable):
4.0.0-0.nightly-2019-03-18-200009

How reproducible:
always

Steps to Reproduce:
1. Install a cluster with 4.0.0-0.nightly-2019-03-15-043409
2. $ oc get clusteroperator dns -o yaml
3. Upgrade to 4.0.0-0.nightly-2019-03-18-200009
4. $ oc get clusteroperator dns -o yaml

Actual results:
The status.conditions are not updated and still show the old timestamps.

---> step2 (before upgrade):
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: 2019-03-18T09:38:08Z
  generation: 1
  name: dns
  resourceVersion: "2945"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/dns
  uid: 89ac4631-4961-11e9-99b1-06983a957c1e
spec: {}
status:
  conditions:
  - lastTransitionTime: 2019-03-18T09:38:09Z
    status: "False"
    type: Failing
  - lastTransitionTime: 2019-03-18T09:38:09Z
    status: "False"
    type: Progressing
  - lastTransitionTime: 2019-03-18T09:38:34Z
    status: "True"
    type: Available
  extension: null
  relatedObjects:
  - group: ""
    name: openshift-dns-operator
    resource: namespaces
  - group: ""
    name: openshift-dns
    resource: namespaces
  versions:
  - name: operator
    version: 4.0.0-0.nightly-2019-03-15-043409
  - name: coredns
    version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b9e3107ae30e589be00c2c28fab543e0c05a1dff5188de479cf37693834eabe1


---> step4 (after upgrade):
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: 2019-03-18T09:38:08Z
  generation: 1
  name: dns
  resourceVersion: "731397"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/dns
  uid: 89ac4631-4961-11e9-99b1-06983a957c1e
spec: {}
status:
  conditions:
  - lastTransitionTime: 2019-03-18T09:38:09Z
    status: "False"
    type: Failing
  - lastTransitionTime: 2019-03-18T09:38:09Z
    status: "False"
    type: Progressing
  - lastTransitionTime: 2019-03-18T09:38:34Z
    status: "True"
    type: Available
  extension: null
  relatedObjects:
  - group: ""
    name: openshift-dns-operator
    resource: namespaces
  - group: ""
    name: openshift-dns
    resource: namespaces
  versions:
  - name: operator
    version: 4.0.0-0.nightly-2019-03-18-200009
  - name: coredns
    version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b9e3107ae30e589be00c2c28fab543e0c05a1dff5188de479cf37693834eabe1


Expected results:
status.conditions should be updated after the upgrade.
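
The comparison between step 2 and step 4 can be expressed as a small check. The following is a minimal sketch in Python and is not part of the original report: the helper name `changed_transition_times` is hypothetical, and the condition list is copied by hand from the two outputs above.

```python
def changed_transition_times(before, after):
    """Return the condition types whose lastTransitionTime differs between two snapshots."""
    old = {c["type"]: c["lastTransitionTime"] for c in before}
    return sorted(
        c["type"] for c in after
        if old.get(c["type"]) != c["lastTransitionTime"]
    )


# Conditions copied from step 2 (before upgrade).
BEFORE = [
    {"type": "Failing", "status": "False", "lastTransitionTime": "2019-03-18T09:38:09Z"},
    {"type": "Progressing", "status": "False", "lastTransitionTime": "2019-03-18T09:38:09Z"},
    {"type": "Available", "status": "True", "lastTransitionTime": "2019-03-18T09:38:34Z"},
]

# Conditions from step 4 (after upgrade); the report shows them unchanged.
AFTER = [dict(c) for c in BEFORE]

# An empty list means no condition timestamp moved, which is the reported bug.
print(changed_transition_times(BEFORE, AFTER))
```

With the values from this report the script prints an empty list, even though the operator version in status.versions changed.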

Additional info:

Comment 1 Daneyon Hansen 2019-03-25 21:31:38 UTC

This bug appears to share the same fundamental problem as [1]. I believe [2] will fix [1] and the operator upgrade use case will be tested after the PR is merged. If [2] does fix [1], the approach in [2] can be duplicated to fix this bug.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1690333
[2] https://github.com/openshift/cluster-ingress-operator/pull/175

Comment 2 Daneyon Hansen 2019-04-02 17:05:20 UTC

*** Bug 1695204 has been marked as a duplicate of this bug. ***

Comment 3 Clayton Coleman 2019-04-02 17:36:39 UTC

https://github.com/openshift/cluster-version-operator/pull/154 will document this and an e2e test will verify it in the future post-upgrade

Comment 4 Clayton Coleman 2019-04-02 17:38:06 UTC

Increasing severity to match other bugs.  Operators MUST report accurate status post upgrade or they will cause customer confusion.

Comment 5 Daneyon Hansen 2019-04-04 23:24:32 UTC

PR https://github.com/openshift/cluster-dns-operator/pull/88 has been submitted to fix this bug.

Comment 7 Wei Sun 2019-04-10 03:29:26 UTC

The PR is still open; moving the bug to MODIFIED status.

Comment 11 Hongkai Liu 2019-05-03 18:10:02 UTC

Ran the test on 2 clusters and got the same results on both.
status.conditions looked the same before and after the upgrade.

$ oc get clusterversions.config.openshift.io 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-03-045206   True        False         3m1s    Cluster version is 4.1.0-0.nightly-2019-05-03-045206

$ oc get clusteroperator dns -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-05-03T15:22:55Z"
  generation: 1
  name: dns
  resourceVersion: "3036"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/dns
  uid: 535a16c4-6db7-11e9-b6f4-02be967dc830
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-05-03T15:23:10Z"
    message: All desired DNS DaemonSets available and operand Namespace exists
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-05-03T15:23:10Z"
    message: Desired and available number of DNS DaemonSets are equal
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-05-03T15:23:10Z"
    message: At least 1 DNS DaemonSet available
    reason: AsExpected
    status: "True"
    type: Available
  extension: null
  relatedObjects:
  - group: ""
    name: openshift-dns-operator
    resource: namespaces
  - group: ""
    name: openshift-dns
    resource: namespaces
  versions:
  - name: operator
    version: 4.1.0-0.nightly-2019-05-03-045206
  - name: coredns
    version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3f2209f60c556962fa4bcf681ea3996bb47cb2f7ae18a4129d20827b71238b13
  - name: openshift-cli
    version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3f263410c83d80fefedbad910042c4648d7e81b339011709e8c238f456e9e6e2

$ oc version
Client Version: version.Info{Major:"4", Minor:"1+", GitVersion:"v4.1.0-201905030232+533d5bc-dirty", GitCommit:"533d5bc", GitTreeState:"dirty", BuildDate:"2019-05-03T07:08:56Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.4+b9f554d", GitCommit:"b9f554d", GitTreeState:"clean", BuildDate:"2019-05-03T03:32:34Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

$ oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-03-121122 --force

$ oc get clusterversions.config.openshift.io 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-03-121122   True        False         112m    Cluster version is 4.1.0-0.nightly-2019-05-03-121122


$ oc get clusteroperator dns -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-05-03T15:22:55Z"
  generation: 1
  name: dns
  resourceVersion: "23442"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/dns
  uid: 535a16c4-6db7-11e9-b6f4-02be967dc830
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-05-03T15:23:10Z"
    message: All desired DNS DaemonSets available and operand Namespace exists
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-05-03T15:23:10Z"
    message: Desired and available number of DNS DaemonSets are equal
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-05-03T15:23:10Z"
    message: At least 1 DNS DaemonSet available
    reason: AsExpected
    status: "True"
    type: Available
  extension: null
  relatedObjects:
  - group: ""
    name: openshift-dns-operator
    resource: namespaces
  - group: ""
    name: openshift-dns
    resource: namespaces
  versions:
  - name: operator
    version: 4.1.0-0.nightly-2019-05-03-121122
  - name: coredns
    version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3f2209f60c556962fa4bcf681ea3996bb47cb2f7ae18a4129d20827b71238b13
  - name: openshift-cli
    version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3e8679cfc635d7037e933ef025725bf4c000b7d89b0486d1cef063fc2007c122

Comment 12 Peter Ruan 2019-05-04 06:48:05 UTC

It looks like it's not just the DNS clusteroperator.  Many other operators are not updating after a successful upgrade either.


$ oc get clusteroperators
NAME                                 VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                       4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
cloud-credential                     4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
cluster-autoscaler                   4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
console                              4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
dns                                  4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
image-registry                       4.1.0-0.nightly-2019-05-03-152614   True        False         False      8h
ingress                              4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
kube-apiserver                       4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
kube-controller-manager              4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
kube-scheduler                       4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
machine-api                          4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
machine-config                       4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
marketplace                          4.1.0-0.nightly-2019-05-03-152614   True        False         False      49m
monitoring                           4.1.0-0.nightly-2019-05-03-152614   True        False         False      46m
network                              4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
node-tuning                          4.1.0-0.nightly-2019-05-03-152614   True        False         False      62m
openshift-apiserver                  4.1.0-0.nightly-2019-05-03-152614   True        False         False      8h
openshift-controller-manager         4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
openshift-samples                    4.1.0-0.nightly-2019-05-03-152614   True        False         False      62m
operator-lifecycle-manager           4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
operator-lifecycle-manager-catalog   4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
service-ca                           4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
service-catalog-apiserver            4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
service-catalog-controller-manager   4.1.0-0.nightly-2019-05-03-152614   True        False         False      13h
storage                              4.1.0-0.nightly-2019-05-03-152614   True        False         False      62m

Comment 13 Dan Mace 2019-05-06 13:20:29 UTC

https://bugzilla.redhat.com/show_bug.cgi?id=1690342#c12 Please note that "Since" isn't the criterion here: for this particular bug, you need to check specifically whether the Progressing last transition time was updated. The "since" value may not align with the progressing transition time.

Comment 15 Miciah Dashiel Butler Masters 2019-05-06 18:44:28 UTC

PR: https://github.com/openshift/cluster-dns-operator/pull/104

Comment 16 Miciah Dashiel Butler Masters 2019-05-06 20:02:35 UTC

Follow-up PR: https://github.com/openshift/cluster-dns-operator/pull/106
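
The behaviour named in the PR titles ("Update Progressing condition if upgrade happened") can be illustrated roughly as follows. This is a sketch under stated assumptions, not the code from either PR: the operator itself is written in Go, Python is used here only for illustration, and the function name `update_progressing` and its parameters are hypothetical.

```python
from datetime import datetime, timezone


def update_progressing(conditions, old_version, new_version, now=None):
    """Return a copy of conditions; bump Progressing's lastTransitionTime if the version changed.

    Assumption: a version change alone counts as a transition worth recording,
    even when the Progressing status value itself stays "False".
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    updated = []
    for cond in conditions:
        cond = dict(cond)  # do not mutate the caller's data
        if cond["type"] == "Progressing" and old_version != new_version:
            cond["lastTransitionTime"] = stamp
        updated.append(cond)
    return updated
```

In this sketch the other conditions keep their original timestamps, and nothing changes when the old and new versions are equal.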

Comment 17 Hongan Li 2019-05-08 06:24:17 UTC

(In reply to Dan Mace from comment #13)
> https://bugzilla.redhat.com/show_bug.cgi?id=1690342#c12 Please note that
> "Since" isn't the criteria here — you need to check specifically whether the
> Progressing last transition time was updated for this particular bug. The
> "since" value may not align with the progressing transition time.

Hi Dan, do you mean that only the lastTransitionTime of `Progressing` should be updated during an upgrade? The lastTransitionTime of `Available` and `Degraded` will stay the same, right?

Does `SINCE` align with `Available` transition time? I created a new ingresscontroller and saw the three items `Progressing`, `Available` and `SINCE` were updated.

Comment 18 Miciah Dashiel Butler Masters 2019-05-08 16:54:52 UTC

Yes, the "SINCE" column is from the Available condition: https://github.com/openshift/cluster-version-operator/blob/6750bf41fa6b291d29c373e2dae6db3892e08c32/install/0000_00_cluster-version-operator_01_clusteroperator.crd.yaml#L23-L26

We need to make sure that the last transition time of the Progressing condition is updated, whether or not other conditions are updated.  To get the progressing condition's last transition time, you can use json or yaml output, or you can use custom columns as in the following command:

    oc get clusteroperators -o 'custom-columns=NAME:.metadata.name,VERSION:.status.versions[?(@.name=="operator")].version,PROGRESSING LAST TRANSITION:.status.conditions[?(@.type=="Progressing")].lastTransitionTime'
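
For scripted checks, the same two fields can be read from JSON output. The following is a minimal sketch in Python, assuming input in the shape produced by `oc get clusteroperator dns -o json`; the helper name `progressing_summary` is hypothetical, and the sample document is trimmed from the output in comment 11.

```python
import json


def progressing_summary(clusteroperator_json):
    """Extract name, operator version and Progressing lastTransitionTime from one ClusterOperator."""
    obj = json.loads(clusteroperator_json)
    status = obj.get("status", {})
    version = next(
        (v.get("version") for v in status.get("versions", []) if v.get("name") == "operator"),
        None,
    )
    transition = next(
        (c.get("lastTransitionTime") for c in status.get("conditions", []) if c.get("type") == "Progressing"),
        None,
    )
    return {
        "name": obj["metadata"]["name"],
        "version": version,
        "progressing_last_transition": transition,
    }


# Trimmed sample based on the pre-upgrade output in comment 11.
SAMPLE = json.dumps({
    "metadata": {"name": "dns"},
    "status": {
        "conditions": [
            {"type": "Degraded", "status": "False", "lastTransitionTime": "2019-05-03T15:23:10Z"},
            {"type": "Progressing", "status": "False", "lastTransitionTime": "2019-05-03T15:23:10Z"},
            {"type": "Available", "status": "True", "lastTransitionTime": "2019-05-03T15:23:10Z"},
        ],
        "versions": [
            {"name": "operator", "version": "4.1.0-0.nightly-2019-05-03-045206"},
        ],
    },
})

print(progressing_summary(SAMPLE))
```

Running the helper once before and once after an upgrade and comparing the two results gives the check described above: the version should change, and so should the Progressing timestamp.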

Comment 20 Hongan Li 2019-05-09 06:54:19 UTC

Thank you for your reply, Miciah.

I have upgraded a cluster from 4.1.0-0.nightly-2019-05-08-012425 to 4.1.0-0.nightly-2019-05-08-065958 and the issue has been fixed.

status:
  conditions:
  - lastTransitionTime: "2019-05-09T02:31:54Z"
    message: All desired DNS DaemonSets available and operand Namespace exists
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-05-09T06:43:36Z"
    message: Desired and available number of DNS DaemonSets are equal
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-05-09T02:31:54Z"
    message: At least 1 DNS DaemonSet available
    reason: AsExpected
    status: "True"
    type: Available
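
These values also show why the SINCE column is not a reliable indicator for this bug: per comment 18 it follows the Available condition, which did not change here. The following is a minimal sketch in Python that computes both ages; the helper name `age_of` is hypothetical, and the timestamp of this comment is used as an arbitrary "now".

```python
from datetime import datetime, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def age_of(conditions, condition_type, now):
    """Return how long ago the given condition last transitioned, or None if it is absent."""
    for cond in conditions:
        if cond["type"] == condition_type:
            then = datetime.strptime(cond["lastTransitionTime"], TIME_FORMAT)
            return now - then.replace(tzinfo=timezone.utc)
    return None


# Conditions copied from the output above.
CONDITIONS = [
    {"type": "Degraded", "status": "False", "lastTransitionTime": "2019-05-09T02:31:54Z"},
    {"type": "Progressing", "status": "False", "lastTransitionTime": "2019-05-09T06:43:36Z"},
    {"type": "Available", "status": "True", "lastTransitionTime": "2019-05-09T02:31:54Z"},
]

# Arbitrary reference time: the timestamp of this comment.
NOW = datetime(2019, 5, 9, 6, 54, 19, tzinfo=timezone.utc)

print(age_of(CONDITIONS, "Available", NOW))    # 4:22:25
print(age_of(CONDITIONS, "Progressing", NOW))  # 0:10:43
```

The Available age reflects the time since the condition was first set, while the Progressing age reflects the upgrade.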

Comment 22 errata-xmlrpc 2019-06-04 10:46:06 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
