Bug 1955418 - 4.8 -> 4.7 rollbacks broken on unrecognized flowschema openshift-etcd-operator
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.z
Assignee: W. Trevor King
QA Contact: ge liu
URL:
Whiteboard:
Duplicates: 1961451
Depends On: 1955414
Blocks:
Reported: 2021-04-30 05:10 UTC by W. Trevor King
Modified: 2021-06-15 09:27 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1955414
Environment:
Last Closed: 2021-06-15 09:27:08 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-etcd-operator pull 582 | 0 | None | open | Bug 1955418: manifests: Shift FlowSchema to level 50 | 2021-04-30 05:12:52 UTC
Red Hat Product Errata | RHSA-2021:2286 | 0 | None | None | None | 2021-06-15 09:27:55 UTC

Description W. Trevor King 2021-04-30 05:10:07 UTC
+++ This bug was initially created as a clone of Bug #1955414 +++

A recent job [1] got stuck rolling back to 4.7.9:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-upgrade-rollback/1387886269920448512/artifacts/e2e-aws-upgrade-rollback/gather-extra/artifacts/clusterversion.json | jq -r '.items[].status.history[] | .startedTime + " " + (.completionTime // "-") + " " + .state + " " + .version'
  2021-04-29T23:51:44Z - Partial 4.7.9
  2021-04-29T22:45:09Z 2021-04-29T23:51:44Z Partial 4.8.0-0.ci-2021-04-29-151002
  2021-04-29T22:16:20Z 2021-04-29T22:42:02Z Completed 4.7.9
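
For readers without jq at hand, the history filter above can be approximated in Python. This is only a sketch: SAMPLE is a trimmed, hypothetical stand-in for clusterversion.json built from the three lines printed above, and history_lines and unfinished_versions are illustrative names, not part of any OpenShift tooling.

```python
import json

# Hypothetical, trimmed stand-in for clusterversion.json; the three entries
# mirror the history lines printed by the jq command above.
SAMPLE = json.loads("""
{"items": [{"status": {"history": [
  {"startedTime": "2021-04-29T23:51:44Z", "completionTime": null,
   "state": "Partial", "version": "4.7.9"},
  {"startedTime": "2021-04-29T22:45:09Z", "completionTime": "2021-04-29T23:51:44Z",
   "state": "Partial", "version": "4.8.0-0.ci-2021-04-29-151002"},
  {"startedTime": "2021-04-29T22:16:20Z", "completionTime": "2021-04-29T22:42:02Z",
   "state": "Completed", "version": "4.7.9"}
]}}]}
""")


def history_lines(cluster_versions):
    """Format each history entry as 'started completed state version'."""
    lines = []
    for item in cluster_versions["items"]:
        for entry in item["status"]["history"]:
            # Like jq's `// "-"`, fall back to "-" when completionTime is null.
            completed = entry.get("completionTime") or "-"
            lines.append(" ".join(
                [entry["startedTime"], completed, entry["state"], entry["version"]]
            ))
    return lines


def unfinished_versions(cluster_versions):
    """Versions whose history entry has no completionTime yet."""
    return [
        entry["version"]
        for item in cluster_versions["items"]
        for entry in item["status"]["history"]
        if not entry.get("completionTime")
    ]


if __name__ == "__main__":
    print("\n".join(history_lines(SAMPLE)))
    print("still in progress:", unfinished_versions(SAMPLE))
```

The entry with no completion time is the rollback target, which is what marks the job as stuck.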

Stuck on the etcd-operator FlowSchema:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-upgrade-rollback/1387886269920448512/artifacts/e2e-aws-upgrade-rollback/gather-extra/artifacts/clusterversion.json | jq -r '.items[].status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
  2021-04-29T22:16:20Z RetrievedUpdates=False NoChannel: The update channel has not been configured.
  2021-04-29T22:42:02Z Available=True : Done applying 4.7.9
  2021-04-30T01:55:35Z Failing=True UpdatePayloadResourceTypeMissing: Could not update flowschema "openshift-etcd-operator" (73 of 668): the server does not recognize this resource, check extension API servers
  2021-04-29T22:45:09Z Progressing=True UpdatePayloadResourceTypeMissing: Unable to apply 4.7.9: a required extension is not available to update
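
To pick out just the blocking condition from output like the above, a small Python sketch follows. CONDITIONS is a hand-copied, hypothetical reconstruction of the four conditions shown; the Available entry is assumed to carry no reason, since the jq output prints an empty one. The helper names are illustrative only.

```python
# Hypothetical reconstruction of .status.conditions from the output above.
CONDITIONS = [
    {"lastTransitionTime": "2021-04-29T22:16:20Z", "type": "RetrievedUpdates",
     "status": "False", "reason": "NoChannel",
     "message": "The update channel has not been configured."},
    # Assumption: no reason is set here, because the output shows "Available=True : ...".
    {"lastTransitionTime": "2021-04-29T22:42:02Z", "type": "Available",
     "status": "True", "message": "Done applying 4.7.9"},
    {"lastTransitionTime": "2021-04-30T01:55:35Z", "type": "Failing",
     "status": "True", "reason": "UpdatePayloadResourceTypeMissing",
     "message": 'Could not update flowschema "openshift-etcd-operator" (73 of 668): '
                "the server does not recognize this resource, check extension API servers"},
    {"lastTransitionTime": "2021-04-29T22:45:09Z", "type": "Progressing",
     "status": "True", "reason": "UpdatePayloadResourceTypeMissing",
     "message": "Unable to apply 4.7.9: a required extension is not available to update"},
]


def format_condition(condition):
    """Render one condition in the same shape as the jq filter above."""
    return "{} {}={} {}: {}".format(
        condition["lastTransitionTime"],
        condition["type"],
        condition["status"],
        condition.get("reason", ""),
        condition["message"],
    )


def failing(conditions):
    """Return the conditions that report Failing=True."""
    return [c for c in conditions if c["type"] == "Failing" and c["status"] == "True"]


if __name__ == "__main__":
    for condition in failing(CONDITIONS):
        print(format_condition(condition))
```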

Sounds like the issue is that 4.7 supports both v1alpha1 and v1beta1 FlowSchema, but 4.8 only supports v1beta1 [2].  So on rollback, the update is trying to push out the etcd operator's 4.7 v1alpha1 FlowSchema, and the 4.8 API-server is saying "v1alpha1?  No idea...".  The fix is probably bumping 4.7's manifest to use v1beta1, but that may make life exciting for 4.6 -> 4.7 updates, unless the 4.6 API-server also understands v1beta1, which it doesn't sound like it does.  We could possibly close this as a WONTFIX dup of 1954481, but I thought I'd file a bug explaining why the 4.8 -> 4.7 rollback jobs are broken.  The original motivation for this FlowSchema manifest is in [3].

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-upgrade-rollback/1387886269920448512
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1954481#c2
[3]: https://github.com/openshift/cluster-etcd-operator/pull/462
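
The version mismatch described above can be sketched as a small lookup. The SERVED table is an assumption taken from this comment (including the guess that 4.6 does not serve v1beta1), not something verified against the releases here, and can_apply and safe_versions are hypothetical helper names.

```python
# Assumption from the comment above: FlowSchema API versions served by the
# kube-apiserver in each release.
SERVED = {
    "4.6": {"v1alpha1"},
    "4.7": {"v1alpha1", "v1beta1"},
    "4.8": {"v1beta1"},
}


def can_apply(manifest_version, apiserver_release):
    """True if an API server of that release recognizes the manifest's version."""
    return manifest_version in SERVED[apiserver_release]


def safe_versions(*releases):
    """FlowSchema versions recognized by every listed release."""
    return set.intersection(*(SERVED[release] for release in releases))


if __name__ == "__main__":
    # The reported failure: a 4.7 v1alpha1 manifest against a 4.8 API server.
    print(can_apply("v1alpha1", "4.8"))
    # Versions that work for both directions between 4.7 and 4.8.
    print(safe_versions("4.7", "4.8"))
    # Under these assumptions no single version is recognized by 4.6, 4.7, and 4.8.
    print(safe_versions("4.6", "4.7", "4.8"))
```

Under these assumptions v1beta1 is the only version safe for 4.7 <-> 4.8, and no version is safe across all three releases, which is why a plain bump in 4.7 risks breaking 4.6 -> 4.7.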

--- Additional comment from W. Trevor King on 2021-04-30 04:49:54 UTC ---

Hmm, or we can move the FlowSchema manifest after the kube-apiserver manifests, and then use v1beta1 in 4.7.z?  4.6 -> 4.7 updates would have their v1alpha1 FlowSchema thrown out on the kube-apiserver bump, but it would be restored shortly thereafter via the v1beta1 manifest.  4.8 -> 4.7 downgrades would no longer involve any FlowSchema API version change, because v1beta1 is served by both versions.  Moving to the Etcd component to see if we can get that to work...
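
A rough model of why the ordering matters, as a Python sketch. It assumes (as a simplification, not a description of the real update machinery) that the kube-apiserver has fully moved to the target release before any later manifest is applied, and it reuses the served-version guesses from the description. All names are hypothetical.

```python
# Assumption: FlowSchema API versions served per release, as guessed in the
# bug description.
SERVED = {
    "4.6": {"v1alpha1"},
    "4.7": {"v1alpha1", "v1beta1"},
    "4.8": {"v1beta1"},
}


def apiserver_at_flowschema_step(source, target, flowschema_after_apiserver):
    """Release of the kube-apiserver at the moment the FlowSchema manifest is applied.

    Simplification: before the kube-apiserver manifests the server still runs
    the source release; after them it runs the target release.
    """
    return target if flowschema_after_apiserver else source


def flowschema_applies(source, target, manifest_version, flowschema_after_apiserver):
    """True if the target payload's FlowSchema manifest is recognized when applied."""
    release = apiserver_at_flowschema_step(source, target, flowschema_after_apiserver)
    return manifest_version in SERVED[release]


if __name__ == "__main__":
    # Reported failure: 4.8 -> 4.7 rollback, v1alpha1 manifest applied early.
    print(flowschema_applies("4.8", "4.7", "v1alpha1", False))
    # Proposal: v1beta1 manifest ordered after the kube-apiserver manifests.
    print(flowschema_applies("4.6", "4.7", "v1beta1", True))
    print(flowschema_applies("4.8", "4.7", "v1beta1", True))
```

In this model a v1beta1 manifest ordered before the kube-apiserver manifests would still fail on 4.6 -> 4.7, which is the reason for moving it later.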

Comment 1 Sam Batschelet 2021-05-27 13:21:44 UTC
*** Bug 1961451 has been marked as a duplicate of this bug. ***

Comment 5 Siddharth Sharma 2021-06-04 18:39:02 UTC
The fix for this bug will ship as part of the next z-stream release, 4.7.15, on June 14th, because 4.7.14 was dropped due to a regression: https://bugzilla.redhat.com/show_bug.cgi?id=1967614

Comment 9 errata-xmlrpc 2021-06-15 09:27:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.16
security and bug fix update), and where to find the updated files, follow the
link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2286

