Bug 2094051 - Custom created services in openshift-ingress removed even though the services are not of type LoadBalancer
Summary: Custom created services in openshift-ingress removed even though the services are not of type LoadBalancer
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: x86_64
OS: Linux
Severity: high
Priority: high
Target Milestone: ---
Target Release: 4.10.z
Assignee: Grant Spence
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On: 2054200
Blocks:
 
Reported: 2022-06-06 17:07 UTC by OpenShift BugZilla Robot
Modified: 2022-10-04 11:28 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The logic in the ingress-operator did not validate whether a Kubernetes service object in the openshift-ingress namespace was actually created and owned by the ingress controller it was attempting to reconcile.
Consequence: The ingress-operator would modify or remove Kubernetes services with a matching name and namespace regardless of ownership, which could cause unexpected behavior. This is quite rare because the service must have a very specific name and must also be in the openshift-ingress namespace.
Fix: The ingress-operator now checks the ownership of existing Kubernetes services it attempts to create or remove; if ownership does not match, the ingress-operator reports an error and takes no action (a minimal sketch of this check appears after the bug metadata below).
Result: The ingress-operator no longer modifies or deletes custom Kubernetes services that share a name with one it wants to modify or remove in the openshift-ingress namespace.
Clone Of:
Environment:
Last Closed: 2022-10-04 11:28:44 UTC
Target Upstream Version:
Embargoed:
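
The fix described in the doc text amounts to an ownership check before the operator touches a service. Below is a minimal sketch of that idea in Go, assuming ownership is tracked through the service's owner references (the actual cluster-ingress-operator may use labels or other metadata for this); the helper names, package layout, and the UID value are illustrative only and are not the operator's real code:
------
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// isOwnedBy reports whether svc carries an owner reference pointing at the
// ingress controller with the given UID.
func isOwnedBy(svc *corev1.Service, icUID types.UID) bool {
	for _, ref := range svc.OwnerReferences {
		if ref.UID == icUID {
			return true
		}
	}
	return false
}

// ensureOwnership refuses to act on a service that the reconciled ingress
// controller does not own (hypothetical helper, not the operator's real API).
func ensureOwnership(svc *corev1.Service, icUID types.UID) error {
	if !isOwnedBy(svc, icUID) {
		return fmt.Errorf("a conflicting load balancer service exists that is not owned by the ingress controller: %s/%s",
			svc.Namespace, svc.Name)
	}
	return nil
}

func main() {
	// A custom, user-created service that happens to use the router naming
	// convention but carries no owner reference back to the ingress controller.
	custom := &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{
			Namespace: "openshift-ingress",
			Name:      "router-internalapps2",
		},
	}
	// "abc-123" stands in for the UID of the IngressController being reconciled.
	if err := ensureOwnership(custom, "abc-123"); err != nil {
		fmt.Println("skipping reconcile:", err) // the service is left untouched
	}
}
------
In this sketch, the reconcile path would call ensureOwnership before updating or deleting the service; on a mismatch it surfaces an error like the "not owned by the ingress controller" message shown in comment 2 and leaves the conflicting service untouched.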




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-ingress-operator pull 820 0 None open BUG 2094051: Fix removing custom created service in openshift-ingress with same naming convention 2022-08-26 04:55:40 UTC
Red Hat Product Errata RHBA-2022:6728 0 None None None 2022-10-04 11:28:48 UTC

Comment 1 Miciah Dashiel Butler Masters 2022-06-09 15:55:44 UTC
Copying doc text from bug 2054200.

Comment 2 Arvind iyengar 2022-09-05 07:36:54 UTC
Verified with the "4.10.0-0.ci.test-2022-09-05-050839-ci-ln-vx21mkk-latest" CI image containing the merge. With this setup, the identically named service is no longer deleted when the operator pod is restarted, and errors are logged for the conflicting name:
------
oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci.test-2022-09-05-050839-ci-ln-vx21mkk-latest   True        False         103m    Cluster version is 4.10.0-0.ci.test-2022-09-05-050839-ci-ln-vx21mkk-latest

oc -n openshift-ingress get all
NAME                                        READY   STATUS    RESTARTS   AGE
pod/router-default-79d8554cb7-qgwrp         1/1     Running   0          68m
pod/router-default-79d8554cb7-qz56m         1/1     Running   0          68m
pod/router-internalapps2-6c94d78bf4-pfqtf   2/2     Running   0          48s

NAME                                    TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)                      AGE
service/router-default                  LoadBalancer   172.30.241.27   ac98f2623ddd04595b1b388850d88296-1359022875.us-east-2.elb.amazonaws.com   80:32004/TCP,443:30115/TCP   69m
service/router-internal-default         ClusterIP      172.30.35.189   <none>                                                                    80/TCP,443/TCP,1936/TCP      69m
service/router-internal-internalapps2   ClusterIP      172.30.14.239   <none>                                                                    80/TCP,443/TCP,1936/TCP      49s


oc create svc nodeport router-internalapps2 --tcp=80 -n openshift-ingress
service/router-internalapps2 created

Post creation:
NAME                                    TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)                      AGE
service/router-default                  LoadBalancer   172.30.241.27   ac98f2623ddd04595b1b388850d88296-1359022875.us-east-2.elb.amazonaws.com   80:32004/TCP,443:30115/TCP   110m
service/router-internal-default         ClusterIP      172.30.35.189   <none>                                                                    80/TCP,443/TCP,1936/TCP      110m
service/router-internal-internalapps2   ClusterIP      172.30.14.239   <none>                                                                    80/TCP,443/TCP,1936/TCP      42m
service/router-internalapps2            NodePort       172.30.220.35   <none>                                                                    80:32132/TCP                 113s


Post operator pod removal:
oc -n openshift-ingress-operator delete  pod/ingress-operator-68c989d4f-h2ggp
pod "ingress-operator-68c989d4f-h2ggp" deleted


oc -n openshift-ingress get all
NAME                                        READY   STATUS    RESTARTS   AGE
pod/router-default-79d8554cb7-qgwrp         1/1     Running   0          116m
pod/router-default-79d8554cb7-qz56m         1/1     Running   0          116m
pod/router-internalapps2-6c94d78bf4-pfqtf   2/2     Running   0          48m


NAME                                    TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)                      AGE
service/router-default                  LoadBalancer   172.30.241.27   ac98f2623ddd04595b1b388850d88296-1359022875.us-east-2.elb.amazonaws.com   80:32004/TCP,443:30115/TCP   116m
service/router-internal-default         ClusterIP      172.30.35.189   <none>                                                                    80/TCP,443/TCP,1936/TCP      116m
service/router-internal-internalapps2   ClusterIP      172.30.14.239   <none>                                                                    80/TCP,443/TCP,1936/TCP      48m
service/router-internalapps2            NodePort       172.30.220.35   <none>                                                                    80:32132/TCP                 7m19s


Messages noted after the above operation:
----
2022-09-05T07:20:16.202Z        INFO    operator.ingress_controller     controller/controller.go:114    reconciling     {"request": "openshift-ingress-operator/internalapps2"}
2022-09-05T07:20:16.277Z        ERROR   operator.init.controller.ingress_controller     controller/controller.go:266    Reconciler error        {"name": "internalapps2", "namespace": "openshift-ingress-operator", "error": "failed to ensure load balancer service for internalapps2: a conflicting load balancer service exists that is not owned by the ingress controller: openshift-ingress/router-internalapps2", "errorCauses": [{"error": "failed to ensure load balancer service for internalapps2: a conflicting load balancer service exists that is not owned by the ingress controller: openshift-ingress/router-internalapps2"}]}
----

------

Comment 5 Hongan Li 2022-09-28 08:45:20 UTC
Moving to VERIFIED since the PR has been tested through the pre-merge process and has been merged.

Comment 8 errata-xmlrpc 2022-10-04 11:28:44 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.35 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:6728

