Bug 1890038 - Infrastructure status.platform not migrated to status.platformStatus causes warnings
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: config-operator
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Matthew Staebler
QA Contact: Ke Wang
URL:
Whiteboard:
Duplicates: 1911467
Depends On:
Blocks: 1936543
Reported: 2020-10-21 09:18 UTC by Pablo Alonso Rodriguez
Modified: 2021-03-08 17:40 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The .status.platformStatus field of the Infrastructure resource is not populated when migrating from earlier OpenShift versions.
Consequence: The cluster-config-operator emits warnings about the unpopulated field.
Fix: Update the migration controller in the cluster-config-operator to populate the .status.platformStatus field when it is unpopulated.
Result: The .status.platformStatus field is populated for all platforms regardless of the OpenShift version originally installed.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:27:07 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-config-operator pull 174 | 0 | None | closed | Bug 1890038: update AWS platform status migration controller for all platforms | 2021-02-12 19:27:26 UTC
Red Hat Knowledge Base (Solution) | 5519741 | 0 | None | None | None | 2020-10-27 16:37:05 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:27:37 UTC

Description Pablo Alonso Rodriguez 2020-10-21 09:18:48 UTC
Description of problem:

In 4.1, Infrastructure objects used the status.platform field[1]. Since 4.2, it has been replaced by the status.platformStatus field, which contains a type subfield; this is what is currently used in 4.5[2].

As a result, if a cluster was installed at 4.1 and upgraded all the way to 4.5 (i.e. 4.1-->4.2-->4.3-->4.4-->4.5), the following warning is constantly emitted:

"Falling back to deprecated status.platform because infrastructures.config.openshift.io/cluster status.platformStatus.type is empty"

This warning seems to come from here[3].

Emitting this warning and falling back to the older field is fine, but it would be best if the field were migrated properly. Compatibility with the old field should not be removed without a migration, as that would impact clusters installed at 4.1 and updated all the way to the latest version. A proper migration would also stop the warning spam.
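
For illustration, the fallback behaviour described above can be sketched as follows. This is a minimal sketch in Python, not the operator's actual code (the operator is written in Go; see reference [3]). The function name effective_platform_type and the logger name are hypothetical; only the field names and the warning text are taken from this report.

```python
import logging

logger = logging.getLogger("infra-fallback-sketch")  # hypothetical logger name

# Warning text as quoted in this bug report.
FALLBACK_WARNING = (
    "Falling back to deprecated status.platform because "
    "infrastructures.config.openshift.io/cluster status.platformStatus.type is empty"
)


def effective_platform_type(infra):
    """Return the platform type of an Infrastructure object given as a dict.

    Prefers status.platformStatus.type; if that is empty, logs the warning
    and falls back to the deprecated status.platform field.
    """
    status = infra.get("status") or {}
    platform_status = status.get("platformStatus") or {}
    platform_type = platform_status.get("type") or ""
    if platform_type:
        return platform_type
    logger.warning(FALLBACK_WARNING)
    return status.get("platform") or ""
```

An object from a cluster installed at 4.1, such as {"status": {"platform": "AWS"}}, takes the fallback branch on every call, which is why the warning repeats until the field is migrated.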

Version-Release number of selected component (if applicable):

4.5

How reproducible:

Always, but only if the cluster was installed at 4.1 and updated all the way to 4.5

Steps to Reproduce:
1. Install a 4.1 cluster
2. Upgrade it all the way to 4.5: 4.1-->4.2-->4.3-->4.4-->4.5
3. Bug reproducible

Actual results:

The Infrastructure object uses the deprecated field and the warning is raised.

Expected results:

The Infrastructure object has its fields migrated and no warning is raised.

References:

[1] - https://github.com/openshift/installer/blob/release-4.1/pkg/asset/manifests/infrastructure.go#L74
[2] - https://github.com/openshift/installer/blob/release-4.5/pkg/asset/manifests/infrastructure.go#L65
[3] - https://github.com/openshift/cluster-config-operator/blob/release-4.5/pkg/operator/kube_cloud_config/controller.go#L88

Comment 3 Stefan Schimanski 2020-10-22 08:26:36 UTC
infrastructure.status.platformStatus is owned by installer:

$ git blame config/v1/types_infrastructure.go
dedfb47b1 (W. Trevor King             2019-04-26 11:49:00 -0700  59)    // platformStatus holds status information specific to the underlying
dedfb47b1 (W. Trevor King             2019-04-26 11:49:00 -0700  60)    // infrastructure provider.
dedfb47b1 (W. Trevor King             2019-04-26 11:49:00 -0700  61)    // +optional
dedfb47b1 (W. Trevor King             2019-04-26 11:49:00 -0700  62)    PlatformStatus *PlatformStatus `json:"platformStatus,omitempty"`
dedfb47b1 (W. Trevor King             2019-04-26 11:49:00 -0700  63)

Changing component.

Comment 4 Scott Dodson 2020-10-22 12:35:09 UTC
The installer is not a runtime component, so it can't resolve items like this. This needs to be reconciled by the config-operator.

Moving back to config-operator and assigning to aos-install@redhat.com

Comment 5 Pablo Alonso Rodriguez 2020-11-30 15:18:14 UTC
Increasing sev and prio of this bug.

It seems that machine-api does not properly fall back to the deprecated status.platform field in the same way as the config operator, so a cluster installed at 4.1 and updated all the way up to 4.6 can end up with machine-api degraded and misbehaving.

I am still working on opening a separate bug against the machine-api component, so that it properly falls back to the deprecated status.platform field.

However, we are at risk that other components make the same mistake, potentially causing equal or worse issues, and I understand that, at some point, status.platform will be completely removed. So we do need a proper, automatic migration to happen.
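
As a rough illustration of the migration being requested, the sketch below copies the deprecated status.platform value into status.platformStatus.type when the latter is empty. It is a Python sketch under stated assumptions, not the fix that was eventually merged: the function name migrate_platform_status is hypothetical, the object is modelled as a plain dict, and only the type subfield is handled (platform-specific details are out of scope here).

```python
import copy


def migrate_platform_status(infra):
    """Return a copy of the Infrastructure dict with platformStatus.type set.

    The copy is changed only when status.platformStatus.type is empty and the
    deprecated status.platform field has a value. An existing type is never
    overwritten, so running the migration repeatedly is harmless.
    """
    migrated = copy.deepcopy(infra)
    status = migrated.get("status") or {}
    migrated["status"] = status
    platform_status = status.get("platformStatus") or {}
    legacy_platform = status.get("platform") or ""
    if not platform_status.get("type") and legacy_platform:
        platform_status["type"] = legacy_platform
        status["platformStatus"] = platform_status
    return migrated
```

Keeping status.platform in place while populating the new field means components that still read the old field keep working, and components that only read status.platformStatus.type no longer need their own fallback.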

Comment 7 Scott Dodson 2020-12-09 16:15:22 UTC
I'm setting the blocker+ flag; we need to close this gap no later than 4.7.

Comment 17 Joel Speed 2021-01-04 10:48:04 UTC
*** Bug 1911467 has been marked as a duplicate of this bug. ***

Comment 23 Ke Wang 2021-01-28 10:38:06 UTC
One successful upgrade path is shown below:
$ oc get clusterversion -o json|jq ".items[0].status.history"
[
  {
    "completionTime": "2021-01-28T09:28:04Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:465d130601325059554b57dfc9553b826918f356b1362972ef21b5112a4e1e71",
    "startedTime": "2021-01-28T08:20:35Z",
    "state": "Completed",
    "verified": false,
    "version": "4.7.0-fc.4"
  },
  {
    "completionTime": "2021-01-28T07:44:35Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:5c3618ab914eb66267b7c552a9b51c3018c3a8f8acf08ce1ff7ae4bfdd3a82bd",
    "startedTime": "2021-01-28T06:42:17Z",
    "state": "Completed",
    "verified": false,
    "version": "4.6.12"
  },
  {
    "completionTime": "2021-01-28T05:54:17Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:412276155bfe186c35322a788321ebf110130a272e18f55a1a2510f15ee0bb04",
    "startedTime": "2021-01-28T04:56:06Z",
    "state": "Completed",
    "verified": true,
    "version": "4.5.27"
  },
  {
    "completionTime": "2021-01-28T04:46:02Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:fcffc8b6c05f9cadd1ab96b134fcc4de28bcd8e11dd8aadb3a040baf54a0a072",
    "startedTime": "2021-01-28T03:58:00Z",
    "state": "Completed",
    "verified": true,
    "version": "4.4.32"
  },
  {
    "completionTime": "2021-01-28T03:34:30Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:9ff90174a170379e90a9ead6e0d8cf6f439004191f80762764a5ca3dbaab01dc",
    "startedTime": "2021-01-28T02:51:45Z",
    "state": "Completed",
    "verified": true,
    "version": "4.3.40"
  },
  {
    "completionTime": "2021-01-28T02:36:45Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:f097ce3fb313ec1613a146b6f1dec64dbb1e85b1b1c8d01bd95ef29525a32b65",
    "startedTime": "2021-01-28T01:58:34Z",
    "state": "Completed",
    "verified": true,
    "version": "4.2.34"
  },
  {
    "completionTime": "2021-01-28T01:49:49Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:a8f706d139c8e77d884ccedbf67d69eefd67b66dcf69ee1032b507fe3acbf8c8",
    "startedTime": "2021-01-28T01:36:41Z",
    "state": "Completed",
    "verified": false,
    "version": "4.1.41"
  }
]
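
The history above is listed newest first. As a convenience, a small Python sketch like the following could turn that JSON into the upgrade path in chronological order. The function name upgrade_path is hypothetical; the field names (state, startedTime, version) are the ones visible in the output above, and the sort relies on the timestamps all being UTC in the same format.

```python
import json
from typing import List


def upgrade_path(history_json: str) -> List[str]:
    """Return completed versions from a clusterversion history, oldest first."""
    history = json.loads(history_json)
    completed = [entry for entry in history if entry.get("state") == "Completed"]
    # Timestamps like 2021-01-28T01:36:41Z sort correctly as plain strings.
    completed.sort(key=lambda entry: entry["startedTime"])
    return [entry["version"] for entry in completed]
```

Fed the jq output above, this would list the path from 4.1.41 through 4.2.34, 4.3.40, 4.4.32, 4.5.27 and 4.6.12 to 4.7.0-fc.4.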

After upgrading from 4.4 to 4.5, a new platformStatus field was added.
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.27    True        False         46m     Cluster version is 4.5.27

$ oc get infrastructures.config.openshift.io/cluster -oyaml
apiVersion: config.openshift.io/v1
kind: Infrastructure
...
status:
...
  platform: AWS
  platformStatus:
    aws:
      region: us-east-2
    type: AWS

From the above results, the fix works fine; moving the bug to VERIFIED.
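
The manual check above could also be scripted. The following Python sketch inspects an Infrastructure object (for example, parsed from the JSON form of the same oc get command) and reports anything that suggests the migration did not happen. The function name platform_status_problems and the message wording are hypothetical.

```python
from typing import List


def platform_status_problems(infra: dict) -> List[str]:
    """Return a list of problems found in the object's platform fields.

    An empty list means status.platformStatus.type is populated and, when the
    deprecated status.platform field is still present, that the two agree.
    """
    status = infra.get("status") or {}
    legacy = status.get("platform") or ""
    current = (status.get("platformStatus") or {}).get("type") or ""
    problems = []
    if not current:
        problems.append("status.platformStatus.type is empty")
    elif legacy and legacy != current:
        problems.append(
            "status.platform (%s) differs from status.platformStatus.type (%s)"
            % (legacy, current)
        )
    return problems
```

For the object shown above (platform: AWS, platformStatus.type: AWS) the list is empty, while an unmigrated object that only carries status.platform is flagged.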

Comment 26 errata-xmlrpc 2021-02-24 15:27:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

