Bug 2097736 - IngressControllersNotUpgradeable: load balancer service has been modified; changes must be reverted before upgrading
Summary: IngressControllersNotUpgradeable: load balancer service has been modified; ch...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: 4.9.z
Assignee: Miciah Dashiel Butler Masters
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On: 2097735
Blocks: 2097737
Reported: 2022-06-16 12:40 UTC by OpenShift BugZilla Robot
Modified: 2022-08-04 21:58 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: When an ingresscontroller is configured to use a LoadBalancer-type service, the ingress operator creates and manages this service. If the operator detects that the user has modified an annotation that the operator manages on this service, it sets the ingress clusteroperator's "Upgradeable" status condition to "False" to block upgrades. However, the operator's check of the service's annotations had a logic error that could falsely report that the user had modified the annotations when the service had no annotations.
Consequence: The ingress operator could erroneously set the ingress clusteroperator's "Upgradeable" status condition to "False", blocking upgrades, if the service had no annotations. In particular, this could happen on Alibaba, Azure, and GCP clusters when using OVN with a public (not internal) load balancer.
Fix: The logic that checks the service's annotations was fixed to handle empty annotations correctly.
Result: The ingress operator should no longer erroneously block upgrades.
Clone Of:
Environment:
Last Closed: 2022-07-20 10:52:59 UTC
Target Upstream Version:
Embargoed:
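
The Doc Text above describes a check that misreported a modification when the service had no annotations. The following is a minimal Go sketch of how a false positive of that kind can arise in general and one way to avoid it. It is not the operator's actual code: the function names `naiveChanged` and `managedAnnotationsChanged` and the annotation key `example.com/managed-annotation` are hypothetical, chosen only for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// naiveChanged is a hypothetical check that compares whole annotation maps.
// reflect.DeepEqual treats a nil map and an empty map as different, so a
// service with no annotations can look "modified".
func naiveChanged(current, expected map[string]string) bool {
	return !reflect.DeepEqual(current, expected)
}

// managedAnnotationsChanged is a hypothetical check that compares only the
// keys the operator manages. Reading from a nil map is safe in Go, so nil
// and empty maps are handled identically.
func managedAnnotationsChanged(current, expected map[string]string, managed []string) bool {
	for _, key := range managed {
		curVal, curOK := current[key]
		expVal, expOK := expected[key]
		if curOK != expOK || curVal != expVal {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical managed annotation key, for illustration only.
	managed := []string{"example.com/managed-annotation"}

	var none map[string]string   // service with no annotations (nil map)
	empty := map[string]string{} // expected service with no annotations

	fmt.Println(naiveChanged(none, empty))                       // true: false positive
	fmt.Println(managedAnnotationsChanged(none, empty, managed)) // false: nothing changed
}
```

Comparing key by key also ignores annotations that the operator does not manage, so unrelated annotations added by other components would not be reported as a modification under this sketch.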




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-ingress-operator pull 786 | 0 | None | open | [release-4.9] Bug 2097736: Fix loadBalancerServiceAnnotationsChanged check and update | 2022-06-16 22:03:05 UTC
Red Hat Product Errata | RHBA-2022:5561 | 0 | None | None | None | 2022-07-20 10:53:07 UTC

Description OpenShift BugZilla Robot 2022-06-16 12:40:27 UTC
+++ This bug was initially created as a clone of Bug #2097735 +++

+++ This bug was initially created as a clone of Bug #2097555 +++

Description of problem:

The cluster-ingress-operator is causing 4.9 -> 4.10 CI upgrade tests to fail:

{  fail [github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Jun 15 18:22:34.283: Some cluster operators are not ready: ingress (Upgradeable=False IngressControllersNotUpgradeable: Some ingresscontrollers are not upgradeable: ingresscontroller "default" is not upgradeable: OperandsNotUpgradeable: One or more managed resources are not upgradeable: load balancer service has been modified; changes must be reverted before upgrading: )}


OpenShift release version: 4.9


Cluster Platform: GCP, OVN


How reproducible: Unclear, but most recent nightly GCP OVN upgrade tests are failing due to this bug, even with three retries per nightly run.


Steps to Reproduce (in detail):
1. Run OCP 4.9 to 4.10 nightly upgrade test


Actual results: OCP cluster upgrade fails


Expected results: OCP cluster upgrade succeeds


Impact of the problem: This is currently blocking releases in the 4.10.z stream.


Additional info:

Examples of failing runs:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-gcp-ovn-upgrade/1537128423896387584

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-gcp-ovn-upgrade/1536843529286848512

Comment 2 Hongan Li 2022-07-07 02:55:36 UTC
Verified with 4.9.0-0.nightly-2022-07-06-220327; the check passed.

$ oc get co/ingress 
NAME      VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
ingress   4.9.0-0.nightly-2022-07-06-220327   True        False         False      28m     

$ oc get co/ingress -oyaml
<---snip--->
  - lastTransitionTime: "2022-07-07T02:09:25Z"
    reason: IngressControllersUpgradeable
    status: "True"
    type: Upgradeable


In the previous build, Upgradeable=False is reported:

$ oc get co/ingress
NAME      VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
ingress   4.9.0-0.nightly-2022-06-24-070308   True        False         False      45m     

$ oc get co/ingress -oyaml
<---snip--->
  - lastTransitionTime: "2022-07-07T01:52:46Z"
    message: 'Some ingresscontrollers are not upgradeable: ingresscontroller "default"
      is not upgradeable: OperandsNotUpgradeable: One or more managed resources are
      not upgradeable: load balancer service has been modified; changes must be reverted
      before upgrading: '
    reason: IngressControllersNotUpgradeable
    status: "False"
    type: Upgradeable
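
The manual check in this comment could also be scripted. The following is a minimal Go sketch that extracts the "Upgradeable" condition from ClusterOperator JSON, such as the output of `oc get co/ingress -o json`. The type and function names (`condition`, `clusterOperator`, `findUpgradeable`) are hypothetical, and the sample document in `main` is a trimmed stand-in based on the YAML shown above, not real cluster output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition mirrors the fields of a status condition shown in the YAML above.
type condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// clusterOperator models only the part of the object needed for this check.
type clusterOperator struct {
	Status struct {
		Conditions []condition `json:"conditions"`
	} `json:"status"`
}

// findUpgradeable returns the "Upgradeable" condition, if present.
func findUpgradeable(data []byte) (condition, bool, error) {
	var co clusterOperator
	if err := json.Unmarshal(data, &co); err != nil {
		return condition{}, false, err
	}
	for _, c := range co.Status.Conditions {
		if c.Type == "Upgradeable" {
			return c, true, nil
		}
	}
	return condition{}, false, nil
}

func main() {
	// Trimmed sample resembling the failing build's condition.
	sample := []byte(`{"status":{"conditions":[
		{"type":"Upgradeable","status":"False","reason":"IngressControllersNotUpgradeable"}
	]}}`)

	cond, found, err := findUpgradeable(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if !found {
		fmt.Println("Upgradeable condition not reported")
		return
	}
	fmt.Printf("Upgradeable=%s reason=%s\n", cond.Status, cond.Reason)
}
```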

Comment 5 errata-xmlrpc 2022-07-20 10:52:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.9.43 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5561

