Bug 1787765
Summary: | Audit for schema/defaulting config changes between releases | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Clayton Coleman <ccoleman> |
Component: | Installer | Assignee: | W. Trevor King <wking> |
Installer sub component: | openshift-installer | QA Contact: | Johnny Liu <jialiu> |
Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
Severity: | high | ||
Priority: | high | CC: | aconstan, adahiya, arghosh, ccoleman, ChetRHosey, jmalde, lmohanty, nagrawal, openshift-bugs-escalate, ricarril, ssadhale, svaughn, trees, wking, zzhao |
Version: | 4.3.0 | ||
Target Milestone: | --- | ||
Target Release: | 4.5.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Clone Of: | 1773870 | Environment: | |
Last Closed: | 2020-04-27 17:37:14 UTC | Type: | --- |
Bug Depends On: | 1773870 | ||
Bug Blocks: | 1778235, 1781558 |
Description
Clayton Coleman
2020-01-05 06:46:53 UTC
All components need to:

1. Review their status or spec fields in global config that changed from 4.1.0 on
2. Identify any fields that should have been filled in during/after upgrade
3. Test a 4.1 to 4.2 to 4.3 upgrade on any configuration that is relevant
4. Normalize current status to the appropriate value.

We also need a CI job that installs 4.1 and upgrades serially.

In case it falls under the same scope, BZ 1786246 is another case where 4.1 settings break under 4.2. In that case the Jenkins image can no longer be pulled with settings that were set under 4.1 (presumably by the installer). Ideally it would be caught by a CI job such as the one proposed. But it's hard to test everything, so I wanted to raise awareness in case it's in scope.

*** Bug 1779299 has been marked as a duplicate of this bug. ***

Comparing 4.2->4.3 vs. 4.3 (using 4.3->4.3 as a stand-in for a raw 4.3 job, because it's easier for me to find update jobs). Looking at *->4.3.3 update CI [1] and picking successful jobs:

* 4.2.20 -> 4.3.3 [2]. Drilling down to the config.openshift.io directory [3].
* 4.3.0 -> 4.3.3 [4]. Drilling down to the config.openshift.io directory [5].

Fetching:

$ wget -r -e robots=off -np -H https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/18062/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-8c22ff0f9629be3dab2f1f9b773ae800438e7db3a57e1d422aa6ee9a8ff6abfc/cluster-scoped-resources/config.openshift.io/
$ wget -r -e robots=off -np -H https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/18080/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-8c22ff0f9629be3dab2f1f9b773ae800438e7db3a57e1d422aa6ee9a8ff6abfc/cluster-scoped-resources/config.openshift.io/

Diffing, and removing fields with timestamps and other expected divergence:

$ diff -ru storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/* | $ grep '^+\|^-' | grep -v '^---\|^+++\|resourceVersion\|creationTimestamp\|lastTransitionTime:\| uid:\|apiServerInternalURI:\|apiServerURL:\|infrastructureName:\|versionHash:\|clusterID:\|consoleURL:\|baseDomain:\|ci-op-\|message:\|lastReportTime:\|domain:\|etcdDiscoveryDomain:\|startedTime:\|completionTime:\|are at latest configuration\|/release@sha256' | sort | uniq
+---
- 4.3.3 not found in the "stable-4.2" channel'
+ 4.3.3 not found in the "stable-4.3" channel'
- channel: stable-4.2
- channel: stable-4.2
+ channel: stable-4.3
+ channel: stable-4.3
- finishes.
- finishes.
- image: registry.svc.ci.openshift.org/ocp/release:4.3.3
- image: registry.svc.ci.openshift.org/ocp/release:4.3.3
- name: certified-operators
+ name: certified-operators
- name: community-operators
+ name: community-operators
- - name: operator
- - name: operator
+ - name: operator
+ - name: operator
- name: redhat-operators
+ name: redhat-operators
- not found in the "stable-4.2" channel'
+ not found in the "stable-4.3" channel'
- reason: RollOutInProgress
- reason: RollOutInProgress
- region: us-east-2
- region: us-east-2
+ region: us-west-1
+ region: us-west-1
- verified: false
- verified: false
+ verified: true
+ verified: true
- version: 4.2.20
- version: 4.2.20
+ version: 4.3.0
+ version: 4.3.0
- version: 4.3.3
- version: 4.3.3
+ version: 4.3.3
+ version: 4.3.3

And the bulk of those changes are just unstable ordering. For example:

$ diff -u $(grep -rl redhat-operators storage.googleapis.com)
--- storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/18080/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-8c22ff0f9629be3dab2f1f9b773ae800438e7db3a57e1d422aa6ee9a8ff6abfc/cluster-scoped-resources/config.openshift.io/operatorhubs.yaml 2020-02-19 13:59:43.000000000 -0800
+++ storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/18062/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-8c22ff0f9629be3dab2f1f9b773ae800438e7db3a57e1d422aa6ee9a8ff6abfc/cluster-scoped-resources/config.openshift.io/operatorhubs.yaml 2020-02-19 10:55:21.000000000 -0800
@@ -6,26 +6,26 @@
   metadata:
     annotations:
       release.openshift.io/create-only: "true"
-    creationTimestamp: "2020-02-19T20:51:51Z"
+    creationTimestamp: "2020-02-19T17:42:22Z"
     generation: 1
     name: cluster
-    resourceVersion: "23400"
+    resourceVersion: "47685"
     selfLink: /apis/config.openshift.io/v1/operatorhubs/cluster
-    uid: e921c768-d891-4ea9-9047-2099d9a7c912
+    uid: 2eedf533-533f-11ea-bf80-02b1edf5936c
   spec: {}
   status:
     sources:
     - disabled: false
-      name: community-operators
-      status: Success
-    - disabled: false
       name: redhat-operators
       status: Success
     - disabled: false
       name: certified-operators
       status: Success
+    - disabled: false
+      name: community-operators
+      status: Success
 kind: OperatorHubList
 metadata:
   continue: ""
-  resourceVersion: "43003"
+  resourceVersion: "53562"
   selfLink: /apis/config.openshift.io/v1/operatorhubs

Still would be nice to check 4.3->4.3, 4.1->4.2->4.3->4.4, etc.

[1]: https://openshift-release.svc.ci.openshift.org/releasestream/4-stable/release/4.3.3#upgrades-from
[2]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/18062
[3]: https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/18062/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-8c22ff0f9629be3dab2f1f9b773ae800438e7db3a57e1d422aa6ee9a8ff6abfc/cluster-scoped-resources/config.openshift.io/
[4]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/18080
[5]: https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/18080/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-8c22ff0f9629be3dab2f1f9b773ae800438e7db3a57e1d422aa6ee9a8ff6abfc/cluster-scoped-resources/config.openshift.io/

Repeating the above with *->4.2.4 [1].

* 4.1.23 -> 4.2.4 [2]. Drilling down to the config.openshift.io directory [3].
* 4.2.2 -> 4.2.4 [4]. Drilling down to the config.openshift.io directory [5].

Fetching:

$ wget -r -e robots=off -np -H https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/10737/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-2bebbc3d547d70cb8caea206a567642f5ab1c7e098ddba55bf7e64b5c58534f2/cluster-scoped-resources/config.openshift.io/
$ wget -r -e robots=off -np -H https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/10738/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-2bebbc3d547d70cb8caea206a567642f5ab1c7e098ddba55bf7e64b5c58534f2/cluster-scoped-resources/config.openshift.io/

Diffing, and removing fields with timestamps and other expected divergence (and removing a $ from '$ grep' from a sloppy copy/paste from my previous comment):

$ diff -ru storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/* | grep '^+\|^-' | grep -v '^---\|^+++\|resourceVersion\|creationTimestamp\|lastTransitionTime:\| uid:\|apiServerInternalURI:\|apiServerURL:\|infrastructureName:\|versionHash:\|clusterID:\|consoleURL:\|baseDomain:\|ci-op-\|message:\|lastReportTime:\|domain:\|etcdDiscoveryDomain:\|startedTime:\|completionTime:\|are at latest configuration\|/release@sha256' | sort | uniq
- (0.3)
- 4.2.4 not found in the "stable-4.1" channel'
+ 4.2.4 not found in the "stable-4.2" channel'
- annotations:
- annotations:
+ aws:
+ aws:
- channel: stable-4.1
- channel: stable-4.1
+ channel: stable-4.2
+ channel: stable-4.2
+ externalIP:
- - group: cloudcredential.openshift.io
- - group: cloudcredential.openshift.io
+ mastersSchedulable: false
+ name: ""
+ name: ""
- name: certified-operators
+ name: certified-operators
- name: community-operators
+ name: community-operators
- - name: kube-apiserver
- - name: kube-apiserver
+ - name: kube-apiserver
+ - name: kube-apiserver
- - name: kube-controller-manager
- - name: kube-controller-manager
+ - name: kube-controller-manager
+ - name: kube-controller-manager
- - name: oauth-openshift
- - name: oauth-openshift
+ - name: oauth-openshift
+ - name: oauth-openshift
- name: openshift-machine-api
- name: openshift-machine-api
- name: redhat-operators
+ name: redhat-operators
- namespace: openshift-cloud-credential-operator
- namespace: openshift-cloud-credential-operator
- not found in the "stable-4.1" channel'
+ not found in the "stable-4.2" channel'
+ platformStatus:
+ platformStatus:
+ policy: {}
+ policy:
- reason: AsExpected
- reason: AsExpected
+ reason: AsExpected
+ reason: AsExpected
- reason: OperandTransitionsSucceeding
- reason: OperandTransitionsSucceeding
+ reason: OperandTransitionsSucceeding
+ reason: OperandTransitionsSucceeding
+ region: us-east-1
+ region: us-east-1
- release.openshift.io/create-only: "true"
- release.openshift.io/create-only: "true"
- resource: CredentialsRequest
- resource: CredentialsRequest
- spec: {}
-spec: {}
+ spec:
+spec:
+ status: {}
+status: {}
- status: "False"
- status: "False"
+ status: "False"
+ status: "False"
- status: "True"
- status: "True"
+ status: "True"
+ status: "True"
+ trustedCA:
+ trustedCA:
+ type: AWS
+ type: AWS
- type: Degraded
- type: Degraded
+ type: Degraded
+ type: Degraded
- type: Progressing
- type: Progressing
+ type: Progressing
+ type: Progressing
- type: Upgradeable
- type: Upgradeable
+ type: Upgradeable
+ type: Upgradeable
- version: 1.14.6
- version: 1.14.6
+ version: 1.14.6
+ version: 1.14.6
- version: 4.1.23
- version: 4.1.23
+ version: 4.2.2
+ version: 4.2.2
- version: 4.2.4_openshift
- version: 4.2.4_openshift
+ version: 4.2.4_openshift
+ version: 4.2.4_openshift

so you can see the platformStatus bit from bug 1773870.

[1]: https://openshift-release.svc.ci.openshift.org/releasestream/4-stable/release/4.2.4#upgrades-from
[2]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/10737
[3]: https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/10737/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-2bebbc3d547d70cb8caea206a567642f5ab1c7e098ddba55bf7e64b5c58534f2/cluster-scoped-resources/config.openshift.io/
[4]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/10738
[5]: https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/10738/artifacts/e2e-aws-upgrade/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-2bebbc3d547d70cb8caea206a567642f5ab1c7e098ddba55bf7e64b5c58534f2/cluster-scoped-resources/config.openshift.io/

We also have chained update jobs, although on nightlies, not release candidates. E.g. here's 4.1->4.2->4.3 [1]. Currently nothing includes 4.4 in that chain, but we can probably bump that since we've had 4.4 nightlies for a while now.

[1]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.1-to-4.2-to-4.3-nightly/10

We've performed a one-time audit of config drift between upgraded and greenfield installations and found only the platform issue previously identified. The team will continue the upgrade-chaining work that Clayton started in the 4.1 to 4.2 to 4.3 upgrade jobs, and that work will be tracked via Jira. If it is not completed ahead of 4.5, we'll track this as a 4.5 blocker bug to audit again.

We now have 4.1->4.2->4.3->4.4 CI since [1].

[1]: https://github.com/openshift/release/pull/7230

We have CI jobs that test this now.
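
For anyone repeating this audit, the two manual steps used in this bug (filtering out expected divergence with grep -v, and discounting unstable list ordering) could be scripted. What follows is only a rough sketch, not part of any OpenShift tooling: the names NOISY, changed_lines and normalize are hypothetical, the pattern list is a subset of the grep expression quoted in the comments, and normalize assumes the must-gather YAML has already been parsed into Python dicts and lists by some YAML loader. Sorting by "name" is itself an assumption and would hide real drift in any list whose order is significant.

```python
import difflib
import re

# Expected-divergence patterns, a subset of the grep -v expression from the
# comments in this bug. Extend as needed; this list is illustrative only.
NOISY = re.compile(
    r"resourceVersion|creationTimestamp|lastTransitionTime:| uid:|"
    r"clusterID:|ci-op-|message:|lastReportTime:|startedTime:|completionTime:"
)


def changed_lines(old_text, new_text):
    """Return sorted, de-duplicated +/- diff lines with noisy fields removed."""
    diff = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(), lineterm="", n=0
    )
    kept = set()
    for line in diff:
        # Skip file headers and hunk markers, like grep -v '^---\|^+++'.
        if line.startswith(("---", "+++", "@@")):
            continue
        # Keep only added/removed lines, like grep '^+\|^-'.
        if not line.startswith(("+", "-")):
            continue
        if NOISY.search(line):
            continue
        kept.add(line)
    return sorted(kept)


def normalize(obj):
    """Recursively sort lists of mappings that all carry a "name" key.

    Assumption: such lists are unordered sets (as with the OperatorHub
    status.sources entries), so reordering should not count as drift.
    """
    if isinstance(obj, dict):
        return {key: normalize(value) for key, value in obj.items()}
    if isinstance(obj, list):
        items = [normalize(value) for value in obj]
        if items and all(isinstance(v, dict) and "name" in v for v in items):
            items.sort(key=lambda v: str(v["name"]))
        return items
    return obj
```

Running normalize on both parsed documents, re-serializing them, and then passing the text through changed_lines should leave only differences like the channel and platformStatus lines, with the operator-source reordering dropped.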