Bug 2094051
| Summary: | Custom created services in openshift-ingress removed even though the services are not of type LoadBalancer | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | OpenShift BugZilla Robot <openshift-bugzilla-robot> |
| Component: | Networking | Assignee: | Grant Spence <gspence> |
| Networking sub component: | router | QA Contact: | Hongan Li <hongli> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | aos-bugs, gspence, hongli, mmasters |
| Version: | 4.9 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.10.z | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-10-04 11:28:44 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 2054200 | | |
| Bug Blocks: | | | |

Doc Text (Bug Fix):

Cause: The logic in the ingress-operator didn't validate whether a Kubernetes service object in the openshift-ingress namespace was actually created/owned by the ingress controller that was reconciling it.

Consequence: The ingress-operator would modify/remove Kubernetes services with the matching name and namespace regardless of ownership, which could cause unexpected behavior. This was quite rare because the service had to have a very specific name and had to be in the openshift-ingress namespace.

Fix: The ingress-operator now checks the ownership of existing Kubernetes services before it creates/removes them; if the ownership doesn't match, the ingress-operator reports an error and takes no action.

Result: The ingress-operator no longer modifies/deletes custom Kubernetes services in the openshift-ingress namespace that happen to share a name with a service it manages.
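The ownership check described in the Doc Text can be sketched in Go. This is a minimal, self-contained illustration, not the operator's actual code: the real ingress-operator works with the metav1 types from k8s.io/apimachinery, and the struct and helper names below are hypothetical stand-ins.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes metadata types; the real
// definitions live in k8s.io/apimachinery/pkg/apis/meta/v1 (where
// OwnerReference.Controller is a *bool).
type OwnerReference struct {
	Kind       string
	Name       string
	Controller bool
}

type Service struct {
	Name      string
	Namespace string
	Owners    []OwnerReference
}

// ownedByIngressController reports whether the service's controlling owner
// reference points at the given IngressController. It illustrates the kind
// of check the fix adds before the operator mutates or deletes a service it
// finds under its expected name.
func ownedByIngressController(svc Service, icName string) bool {
	for _, ref := range svc.Owners {
		if ref.Controller && ref.Kind == "IngressController" && ref.Name == icName {
			return true
		}
	}
	return false
}

func main() {
	// A service the operator created for the "internalapps2" controller.
	operatorOwned := Service{
		Name:      "router-internal-internalapps2",
		Namespace: "openshift-ingress",
		Owners: []OwnerReference{
			{Kind: "IngressController", Name: "internalapps2", Controller: true},
		},
	}
	// A custom service a user created under the operator's expected name.
	custom := Service{Name: "router-internalapps2", Namespace: "openshift-ingress"}

	fmt.Println(ownedByIngressController(operatorOwned, "internalapps2")) // true: safe to reconcile
	fmt.Println(ownedByIngressController(custom, "internalapps2"))        // false: report an error, touch nothing
}
```

With this shape of check, the "mismatch" branch is where the operator stops and surfaces the "not owned by the ingress controller" error seen in the verification logs below.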
Comment 1
Miciah Dashiel Butler Masters
2022-06-09 15:55:44 UTC
Verified with the "4.10.0-0.ci.test-2022-09-05-050839-ci-ln-vx21mkk-latest" CI image containing the merge. With this setup, the identically named service is no longer deleted when the operator pod is restarted, and warnings are emitted for the duplicate name:
------
oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci.test-2022-09-05-050839-ci-ln-vx21mkk-latest   True        False         103m    Cluster version is 4.10.0-0.ci.test-2022-09-05-050839-ci-ln-vx21mkk-latest
oc -n openshift-ingress get all
NAME                                        READY   STATUS    RESTARTS   AGE
pod/router-default-79d8554cb7-qgwrp         1/1     Running   0          68m
pod/router-default-79d8554cb7-qz56m         1/1     Running   0          68m
pod/router-internalapps2-6c94d78bf4-pfqtf   2/2     Running   0          48s

NAME                                     TYPE           CLUSTER-IP      EXTERNAL-IP                                                                PORT(S)                      AGE
service/router-default                   LoadBalancer   172.30.241.27   ac98f2623ddd04595b1b388850d88296-1359022875.us-east-2.elb.amazonaws.com   80:32004/TCP,443:30115/TCP   69m
service/router-internal-default          ClusterIP      172.30.35.189   <none>                                                                     80/TCP,443/TCP,1936/TCP      69m
service/router-internal-internalapps2    ClusterIP      172.30.14.239   <none>                                                                     80/TCP,443/TCP,1936/TCP      49s
oc create svc nodeport router-internalapps2 --tcp=80 -n openshift-ingress
service/router-internalapps2 created
Post creation:
NAME                                     TYPE           CLUSTER-IP      EXTERNAL-IP                                                                PORT(S)                      AGE
service/router-default                   LoadBalancer   172.30.241.27   ac98f2623ddd04595b1b388850d88296-1359022875.us-east-2.elb.amazonaws.com   80:32004/TCP,443:30115/TCP   110m
service/router-internal-default          ClusterIP      172.30.35.189   <none>                                                                     80/TCP,443/TCP,1936/TCP      110m
service/router-internal-internalapps2    ClusterIP      172.30.14.239   <none>                                                                     80/TCP,443/TCP,1936/TCP      42m
service/router-internalapps2             NodePort       172.30.220.35   <none>                                                                     80:32132/TCP                 113s
Post operator pod removal:
oc -n openshift-ingress-operator delete pod/ingress-operator-68c989d4f-h2ggp
pod "ingress-operator-68c989d4f-h2ggp" deleted
oc -n openshift-ingress get all
NAME                                        READY   STATUS    RESTARTS   AGE
pod/router-default-79d8554cb7-qgwrp         1/1     Running   0          116m
pod/router-default-79d8554cb7-qz56m         1/1     Running   0          116m
pod/router-internalapps2-6c94d78bf4-pfqtf   2/2     Running   0          48m

NAME                                     TYPE           CLUSTER-IP      EXTERNAL-IP                                                                PORT(S)                      AGE
service/router-default                   LoadBalancer   172.30.241.27   ac98f2623ddd04595b1b388850d88296-1359022875.us-east-2.elb.amazonaws.com   80:32004/TCP,443:30115/TCP   116m
service/router-internal-default          ClusterIP      172.30.35.189   <none>                                                                     80/TCP,443/TCP,1936/TCP      116m
service/router-internal-internalapps2    ClusterIP      172.30.14.239   <none>                                                                     80/TCP,443/TCP,1936/TCP      48m
service/router-internalapps2             NodePort       172.30.220.35   <none>                                                                     80:32132/TCP                 7m19s
Messages noted after the above operation:
----
2022-09-05T07:20:16.202Z INFO operator.ingress_controller controller/controller.go:114 reconciling {"request": "openshift-ingress-operator/internalapps2"}
2022-09-05T07:20:16.277Z ERROR operator.init.controller.ingress_controller controller/controller.go:266 Reconciler error {"name": "internalapps2", "namespace": "openshift-ingress-operator", "error": "failed to ensure load balancer service for internalapps2: a conflicting load balancer service exists that is not owned by the ingress controller: openshift-ingress/router-internalapps2", "errorCauses": [{"error": "failed to ensure load balancer service for internalapps2: a conflicting load balancer service exists that is not owned by the ingress controller: openshift-ingress/router-internalapps2"}]}
----
------
Moving to verified since the PR has been tested with the pre-merge process and merged.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.10.35 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:6728