Bug 1689139 - OLM upgrade failed via the OTA
Summary: OLM upgrade failed via the OTA
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Evan Cordell
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-15 09:24 UTC by Jian Zhang
Modified: 2019-06-04 10:46 UTC (History)
CC: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:45:52 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:45:59 UTC

Description Jian Zhang 2019-03-15 09:24:25 UTC
Description of problem:
OLM upgrade failed.

Version-Release number of selected component (if applicable):
The cluster version:
Before upgrade:
4.0.0-0.nightly-2019-03-13-233958
After upgrade:
4.0.0-0.nightly-2019-03-14-040908 

How reproducible:
always

Steps to Reproduce:
1. Install OCP 4.0 with payload: 4.0.0-0.nightly-2019-03-13-233958
2. Record the OLM clusteroperator CR:
[jzhang@dhcp-140-18 upgrade15]$ oc get clusteroperator operator-lifecycle-manager -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: 2019-03-15T02:38:20Z
  generation: 1
  name: operator-lifecycle-manager
  resourceVersion: "1654"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/operator-lifecycle-manager
  uid: 653821e6-46cb-11e9-acf2-02ca74d1ac00
spec: {}
status:
  conditions:
  - lastTransitionTime: 2019-03-15T02:38:20Z
    message: Done deploying 0.8.1.
    status: "False"
    type: Progressing
  - lastTransitionTime: 2019-03-15T02:38:20Z
    message: Done deploying 0.8.1.
    status: "False"
    type: Failing
  - lastTransitionTime: 2019-03-15T02:38:20Z
    message: Done deploying 0.8.1.
    status: "True"
    type: Available
  extension: null
  relatedObjects: null
  versions:
  - name: operator
    version: 0.8.1-31e16a9

3. Check the available new versions:
[jzhang@dhcp-140-18 upgrade15]$ oc adm upgrade
Cluster version is 4.0.0-0.nightly-2019-03-13-233958

Updates:

VERSION                           IMAGE
4.0.0-0.nightly-2019-03-14-040908 registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-03-14-040908
4.0.0-0.nightly-2019-03-14-135819 registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-03-14-135819

4. Upgrade to the new version:
[jzhang@dhcp-140-18 upgrade15]$ oc adm upgrade --to-image registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-03-14-040908 
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-03-14-040908

5. Check the OLM clusteroperator CR.

Actual results:
OLM was not upgraded; it still uses the old image.
[jzhang@dhcp-140-18 upgrade15]$ oc get clusteroperator operator-lifecycle-manager -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: 2019-03-15T02:38:20Z
  generation: 1
  name: operator-lifecycle-manager
  resourceVersion: "1654"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/operator-lifecycle-manager
  uid: 653821e6-46cb-11e9-acf2-02ca74d1ac00
spec: {}
status:
  conditions:
  - lastTransitionTime: 2019-03-15T02:38:20Z
    message: Done deploying 0.8.1.
    status: "False"
    type: Progressing
  - lastTransitionTime: 2019-03-15T02:38:20Z
    message: Done deploying 0.8.1.
    status: "False"
    type: Failing
  - lastTransitionTime: 2019-03-15T02:38:20Z
    message: Done deploying 0.8.1.
    status: "True"
    type: Available
  extension: null
  relatedObjects: null
  versions:
  - name: operator
    version: 0.8.1-31e16a9

[jzhang@dhcp-140-18 upgrade15]$ oc get clusteroperator
NAME                                  VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
cluster-autoscaler                    4.0.0-0.nightly-2019-03-14-040908   True        False         False     34m
console                               4.0.0-0.nightly-2019-03-14-040908   True        False         False     33m
dns                                   4.0.0-0.nightly-2019-03-14-040908   True        False         False     6h31m
image-registry                        4.0.0-0.nightly-2019-03-14-040908   True        False         False     34m
ingress                               4.0.0-0.nightly-2019-03-14-040908   True        False         False     6h25m
kube-apiserver                        4.0.0-0.nightly-2019-03-14-040908   True        False         False     40m
kube-controller-manager               4.0.0-0.nightly-2019-03-14-040908   True        False         False     39m
kube-scheduler                        4.0.0-0.nightly-2019-03-14-040908   True        False         False     40m
machine-api                           4.0.0-0.nightly-2019-03-14-040908   True        False         False     6h31m
machine-config                        4.0.0-0.nightly-2019-03-14-040908   True        False         False     6h30m
marketplace-operator                  4.0.0-0.nightly-2019-03-14-040908   True        False         False     34m
monitoring                            4.0.0-0.nightly-2019-03-14-040908   True        False         False     27m
network                               4.0.0-0.nightly-2019-03-14-040908   True        False         False     6h26m
node-tuning                           4.0.0-0.nightly-2019-03-14-040908   True        False         False     34m
openshift-apiserver                   4.0.0-0.nightly-2019-03-14-040908   True        False         False     35m
openshift-authentication                                                  True        False         False     101m
openshift-cloud-credential-operator                                       True        False         False     6h30m
openshift-controller-manager          4.0.0-0.nightly-2019-03-14-040908   True        False         False     34m
openshift-samples                     4.0.0-0.nightly-2019-03-14-040908   True        False         False     33m
operator-lifecycle-manager            0.8.1-31e16a9                       True        False         False     6h31m
service-ca                                                                True        False         False     40m
service-catalog-apiserver             4.0.0-0.nightly-2019-03-14-040908   True        False         False     28m
service-catalog-controller-manager    4.0.0-0.nightly-2019-03-14-040908   True        False         False     33m
storage                               4.0.0-0.nightly-2019-03-14-040908   True        False         False     6h26m
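The mismatch in the table above can be confirmed programmatically by comparing the version the OLM ClusterOperator reports with the cluster's desired version. A minimal sketch, with the live `oc` queries shown as comments (the jsonpath filter is an assumption) and the values hardcoded from the output above:

```shell
# On a live cluster these could be fetched with:
#   olm_ver=$(oc get clusteroperator operator-lifecycle-manager \
#       -o jsonpath='{.status.versions[?(@.name=="operator")].version}')
#   cluster_ver=$(oc get clusterversion version -o jsonpath='{.status.desired.version}')
# Hardcoded here from the output above:
olm_ver="0.8.1-31e16a9"
cluster_ver="4.0.0-0.nightly-2019-03-14-040908"

if [ "$olm_ver" = "$cluster_ver" ]; then
  echo "OLM reports the payload version"
else
  echo "MISMATCH: OLM still reports $olm_ver"
fi
```

After a correct upgrade the two strings would agree, as the other cluster operators in the table already do.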


Expected results:
The OLM should be upgraded successfully and use the new image from the newer payload.

Additional info:
[jzhang@dhcp-140-18 upgrade15]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-03-14-040908   True        False         10m     Cluster version is 4.0.0-0.nightly-2019-03-14-040908
...
  history:
  - completionTime: 2019-03-15T08:36:39Z
    image: registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-03-14-040908
    startedTime: 2019-03-15T08:26:38Z
    state: Completed
    version: 4.0.0-0.nightly-2019-03-14-040908
  - completionTime: 2019-03-15T08:26:38Z
    image: registry.svc.ci.openshift.org/ocp/release@sha256:128bf3c22c7f7fdc3747e481031022cc995d7282f7c53bc6676cc7e91931c73c
    startedTime: 2019-03-15T02:37:05Z
    state: Completed
    version: 4.0.0-0.nightly-2019-03-13-233958

Comment 2 Jian Zhang 2019-03-19 09:15:47 UTC
Before upgrade:
Cluster version is 4.0.0-0.nightly-2019-03-18-200009
OLM version commit: io.openshift.build.commit.id=69457423c2da01da0110b17fac1ac48b994b99e8
[jzhang@dhcp-140-18 ocp119]$ oc exec catalog-operator-657f5ddf79-nqmwg -- olm --version
OLM version: 0.8.1
git commit: e528ffb

After upgrade:
Cluster version is 4.0.0-0.nightly-2019-03-18-223058
OLM version commit: io.openshift.build.commit.id=5159b0a1c0dfe2cb76eb706afb4e3cc2ac4447fd
[jzhang@dhcp-140-18 ocp119]$ oc exec catalog-operator-657f5ddf79-nqmwg -- olm --version
OLM version: 0.8.1
git commit: e528ffb
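Another way to check whether OLM picked up the new payload is to compare the image the release specifies for OLM with the image the olm-operator deployment actually runs. A hypothetical sketch: the payload tag name `operator-lifecycle-manager`, the deployment name `olm-operator`, and the namespace `openshift-operator-lifecycle-manager` are assumptions, and the image values below are illustrative placeholders, not real digests:

```shell
# On a live cluster (names assumed, see above):
#   expected=$(oc adm release info "$RELEASE_IMAGE" --image-for=operator-lifecycle-manager)
#   actual=$(oc get deployment olm-operator -n openshift-operator-lifecycle-manager \
#       -o jsonpath='{.spec.template.spec.containers[0].image}')
# Placeholder values for illustration:
expected="registry.example/olm@sha256:new"
actual="registry.example/olm@sha256:old"

if [ "$expected" = "$actual" ]; then
  echo "OLM image matches the payload"
else
  echo "OLM still runs the old image: $actual"
fi
```

Comparing image references is more reliable than comparing version strings here, since both builds report `OLM version: 0.8.1` even though the commit ids differ.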

[jzhang@dhcp-140-18 ocp119]$ oc get clusterversion -o yaml|grep history -A 9
    history:
    - completionTime: 2019-03-19T08:52:01Z
      image: registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-03-18-223058
      startedTime: 2019-03-19T08:29:56Z
      state: Completed
      version: 4.0.0-0.nightly-2019-03-18-223058
    - completionTime: 2019-03-19T08:29:56Z
      image: registry.svc.ci.openshift.org/ocp/release@sha256:e3f2bff3e7a40f7ca0777ada2ad89197a5ab6d7296d3bd12a28dc5aa6b4311dc
      startedTime: 2019-03-19T07:18:49Z
      state: Completed


LGTM, but there is still a small issue: the "SINCE" time is incorrect, as shown below:

[jzhang@dhcp-140-18 ocp119]$ oc get clusteroperator
NAME                                  VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
authentication                        4.0.0-0.nightly-2019-03-18-223058   True        False         False     22m
cluster-autoscaler                    4.0.0-0.nightly-2019-03-18-223058   True        False         False     22m
console                               4.0.0-0.nightly-2019-03-18-223058   True        False         False     21m
dns                                   4.0.0-0.nightly-2019-03-18-223058   True        False         False     111m
image-registry                        4.0.0-0.nightly-2019-03-18-223058   True        False         False     22m
ingress                               4.0.0-0.nightly-2019-03-18-223058   True        False         False     104m
kube-apiserver                        4.0.0-0.nightly-2019-03-18-223058   True        False         False     25m
kube-controller-manager               4.0.0-0.nightly-2019-03-18-223058   True        False         False     38m
kube-scheduler                        4.0.0-0.nightly-2019-03-18-223058   True        False         False     31m
machine-api                           4.0.0-0.nightly-2019-03-18-223058   True        False         False     112m
machine-config                        4.0.0-0.nightly-2019-03-18-223058   True        False         False     24m
marketplace-operator                  4.0.0-0.nightly-2019-03-18-223058   True        False         False     22m
monitoring                            4.0.0-0.nightly-2019-03-18-223058   True        False         False     23m
network                               4.0.0-0.nightly-2019-03-18-223058   True        False         False     112m
node-tuning                           4.0.0-0.nightly-2019-03-18-223058   True        False         False     22m
openshift-apiserver                   4.0.0-0.nightly-2019-03-18-223058   True        False         False     23m
openshift-cloud-credential-operator   4.0.0-0.nightly-2019-03-18-223058   True        False         False     112m
openshift-controller-manager          4.0.0-0.nightly-2019-03-18-223058   True        False         False     22m
openshift-samples                     4.0.0-0.nightly-2019-03-18-223058   True        False         False     21m
operator-lifecycle-manager            4.0.0-0.nightly-2019-03-18-223058   True        False         False     112m
service-ca                                                                True        False         False     27m
service-catalog-apiserver             4.0.0-0.nightly-2019-03-18-223058   True        False         False     22m
service-catalog-controller-manager    4.0.0-0.nightly-2019-03-18-223058   True        False         False     22m
storage                               4.0.0-0.nightly-2019-03-18-223058   True        False         False     105m

Comment 3 Jian Zhang 2019-03-19 09:24:09 UTC
> LGTM, but there is still a little issue, the "since" time is incorrect. As below:

For the above issue, I think it is the same as bug 1688611; we can track it there.
For this issue, LGTM since the OLM images/versions are updated successfully. Marking it verified.

Comment 5 errata-xmlrpc 2019-06-04 10:45:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

