Bug 2104135 - changes must be reverted before upgrading, but no diff explaining what needs changing
Keywords:
Status: CLOSED DUPLICATE of bug 2097555
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Candace Holman
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-07-05 15:41 UTC by W. Trevor King
Modified: 2022-08-04 21:58 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-07-06 05:52:40 UTC
Target Upstream Version:
Embargoed:


Description W. Trevor King 2022-07-05 15:41:00 UTC
Several CI jobs are failing on this message:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=48h&type=junit&search=changes+must+be+reverted+before+upgrading' | grep 'failures match' | sort
periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-azure-ovn-upgrade (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-gcp-ovn-upgrade (all) - 14 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.9-e2e-azure-ovn (all) - 4 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.9-e2e-azure-upgrade-ovn-single-node (all) - 5 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn (all) - 4 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-azure-ovn-upgrade (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-gcp-ovn-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.9-e2e-azure-upgrade-ovn-single-node (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-okd-4.10-upgrade-from-okd-4.9-e2e-upgrade-gcp (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
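
For tallying the search output above, here is a minimal Python sketch that parses the "failures match" lines. The regular expression is inferred only from the lines quoted in this bug, and the names `LINE_RE` and `parse_impact` are made up for illustration; the search tool's output format may differ elsewhere.

```python
import re

# Pattern inferred from the lines quoted above, e.g.:
#   <job> (all) - 14 runs, 100% failed, 100% of failures match = 100% impact
LINE_RE = re.compile(
    r"^(?P<job>\S+) \(all\) - (?P<runs>\d+) runs, "
    r"(?P<failed>\d+)% failed, (?P<match>\d+)% of failures match = "
    r"(?P<impact>\d+)% impact$"
)


def parse_impact(lines):
    """Return (job, runs, impact_percent) tuples for lines that match LINE_RE."""
    results = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            results.append((m["job"], int(m["runs"]), int(m["impact"])))
    return results


# Two lines copied from the output above.
sample = [
    "periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-azure-ovn-upgrade (all) - 2 runs, 100% failed, 100% of failures match = 100% impact",
    "periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-gcp-ovn-upgrade (all) - 14 runs, 100% failed, 100% of failures match = 100% impact",
]

rows = parse_impact(sample)
total_runs = sum(runs for _, runs, _ in rows)
for job, runs, impact in rows:
    print(f"{job}: {runs} runs, {impact}% impact")
```

Lines that do not match the pattern are skipped silently, which suits a quick triage pass but would hide a format change.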

Digging into periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn with TestGrid [1], the last passing and first failing runs seem to be:

* 4.9.0-0.ci-2022-05-31-184403 passed [2]
* 4.9.0-0.nightly-2022-06-06-070133 failed [3]

I'm not clear on why a *-master-ci-* job pivoted from a CI release to a nightly.  Perhaps that plays into this?  But diffing the two releases:

$ REF_A=https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn/1533464045682692096/artifacts/release/artifacts/release-images-latest
$ REF_B=https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn/1533706341426663424/artifacts/release/artifacts/release-images-latest
$ JQ='[.spec.tags[] | .name + " " + .annotations["io.openshift.build.source-location"] + "/commit/" + .annotations["io.openshift.build.commit.id"]] | sort[]'
$ diff -U0 <(curl -s "${REF_A}" | jq -r "${JQ}") <(curl -s "${REF_B}" | jq -r "${JQ}") | grep ingress
-cluster-ingress-operator https://github.com/openshift/cluster-ingress-operator/commit/fd6b1ec051115f4d520b27977fa0bceea9c418a8
+cluster-ingress-operator https://github.com/openshift/cluster-ingress-operator/commit/cb650259c650b77f28903ad50733703e05238e13
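
For readers without jq handy, the comparison above can be sketched in Python. This assumes only the JSON shape that the jq filter itself implies (`.spec.tags[]` entries with `name` and `annotations`); the helper names are made up for illustration. The ingress operator commits are the two from the diff above, and `example-component` is a hypothetical placeholder tag, not a real release component.

```python
LOCATION = "io.openshift.build.source-location"
COMMIT = "io.openshift.build.commit.id"


def component_commits(image_stream):
    """Map each tag name to '<source-location>/commit/<commit-id>'."""
    out = {}
    for tag in image_stream["spec"]["tags"]:
        ann = tag.get("annotations") or {}
        out[tag["name"]] = ann.get(LOCATION, "") + "/commit/" + ann.get(COMMIT, "")
    return out


def changed_components(ref_a, ref_b):
    """Return {name: (old, new)} for tags whose commit URL differs."""
    a, b = component_commits(ref_a), component_commits(ref_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


def make_stream(tags):
    """Build a minimal image-stream-shaped dict from (name, location, commit) tuples."""
    return {"spec": {"tags": [
        {"name": n, "annotations": {LOCATION: loc, COMMIT: commit}}
        for n, loc, commit in tags
    ]}}


INGRESS = "https://github.com/openshift/cluster-ingress-operator"
# Hypothetical unchanged component, included only to show that it is filtered out.
PLACEHOLDER = ("example-component", "https://example.com/example-component", "0" * 40)

ref_a = make_stream([
    ("cluster-ingress-operator", INGRESS, "fd6b1ec051115f4d520b27977fa0bceea9c418a8"),
    PLACEHOLDER,
])
ref_b = make_stream([
    ("cluster-ingress-operator", INGRESS, "cb650259c650b77f28903ad50733703e05238e13"),
    PLACEHOLDER,
])

diff = changed_components(ref_a, ref_b)
for name, (old, new) in diff.items():
    print(f"-{name} {old}")
    print(f"+{name} {new}")
```

To run this against the real release payloads, the two dicts would come from `json.load` on the downloaded `release-images-latest` files instead of `make_stream`.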

Getting the commit log between those two commits:

$ git clone --branch release-4.9 --depth 50 https://github.com/openshift/cluster-ingress-operator
$ cd cluster-ingress-operator
$ git --no-pager log --first-parent --oneline fd6b1ec051..cb650259c6
cb65025 Merge pull request #713 from Miciah/BZ2060542-use-externalTrafficPolicy-Cluster-with-OVN

So possibly introduced with 4.9.38 [4]?

[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn/1533464045682692096
[3]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn/1533706341426663424
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=2079517#c6

Comment 1 W. Trevor King 2022-07-05 15:46:08 UTC
Details from that first failing run [1]:

: [sig-arch][Early] Managed cluster should start all core operators [Skipped:Disconnected] [Suite:openshift/conformance/parallel]
Run #0: Failed	1s

{  fail [github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Jun  6 07:41:00.449: Some cluster operators are not ready: ingress (Upgradeable=False IngressControllersNotUpgradeable: Some ingresscontrollers are not upgradeable: ingresscontroller "default" is not upgradeable: OperandsNotUpgradeable: One or more managed resources are not upgradeable: load balancer service has been modified; changes must be reverted before upgrading: )}

And the backing ClusterOperator condition:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn/1533706341426663424/artifacts/e2e-gcp-ovn/gather-extra/artifacts/clusteroperators.json | jq -r '.items[] | select(.metadata.name == "ingress").status.conditions[] | select(.type == "Upgradeable")'
{
  "lastTransitionTime": "2022-06-06T08:09:21Z",
  "message": "Some ingresscontrollers are not upgradeable: ingresscontroller \"default\" is not upgradeable: OperandsNotUpgradeable: One or more managed resources are not upgradeable: load balancer service has been modified; changes must be reverted before upgrading: ",
  "reason": "IngressControllersNotUpgradeable",
  "status": "False",
  "type": "Upgradeable"
}
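
The same selection can be sketched in Python for scripting across several runs. The function name `operator_condition` is made up for illustration, the ingress condition fields are copied from the output above (the message is omitted for brevity), and `example-operator` is a hypothetical entry added only to show the filtering.

```python
def operator_condition(cluster_operators, name, condition_type):
    """Return the named operator's condition of the given type, or None."""
    for item in cluster_operators["items"]:
        if item["metadata"]["name"] != name:
            continue
        for cond in item["status"]["conditions"]:
            if cond["type"] == condition_type:
                return cond
    return None


sample = {
    "items": [
        {
            # Hypothetical operator, not taken from the CI artifacts.
            "metadata": {"name": "example-operator"},
            "status": {"conditions": [{"type": "Upgradeable", "status": "True"}]},
        },
        {
            "metadata": {"name": "ingress"},
            "status": {
                "conditions": [
                    {
                        "lastTransitionTime": "2022-06-06T08:09:21Z",
                        "reason": "IngressControllersNotUpgradeable",
                        "status": "False",
                        "type": "Upgradeable",
                    }
                ]
            },
        },
    ]
}

cond = operator_condition(sample, "ingress", "Upgradeable")
print(cond["status"], cond["reason"])
```

With the real artifact, `sample` would be the result of `json.load` on the downloaded `clusteroperators.json`.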

Operator logs are in [2], but I'm not sure what to look for in there.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn/1533706341426663424
[2]: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-ovn/1533706341426663424/artifacts/e2e-gcp-ovn/gather-extra/artifacts/pods/openshift-ingress-operator_ingress-operator-5df94bf455-r6pzj_ingress-operator.log

Comment 3 W. Trevor King 2022-07-06 05:52:40 UTC
Looks like a dup of bug 2097555, which is being backported to 4.10.z via bug 2097735 and to 4.9.z via bug 2097736.

*** This bug has been marked as a duplicate of bug 2097555 ***

