Description of problem:
Deleting the router-default service with `oc delete svc router-default -n openshift-ingress` hangs.

Version-Release number of selected component (if applicable): OCP 4.6

How reproducible: Always

Steps to Reproduce:
1. [miheer@miheer gcp-ocp]$ oc delete svc router-default -n openshift-ingress
   service "router-default" deleted
   ..........hangs

Actual results: Deletion of the svc hangs.

Expected results: Deletion of the svc should not hang.

Additional info: The workaround is to delete the finalizers from the svc, which then allows the oc delete command to complete.
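A minimal sketch of that workaround (my own suggested `oc patch` invocation, not taken from the report; it assumes you want to clear the finalizer list wholesale):

  # Clear all finalizers from the stuck service so the pending delete can
  # complete; a merge patch with "finalizers": null removes every entry,
  # including ingress.openshift.io/operator.
  oc patch svc router-default -n openshift-ingress \
    --type=merge -p '{"metadata":{"finalizers":null}}'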
2021-01-08T07:53:41.731Z INFO operator.ingress_controller handler/enqueue_mapped.go:104 queueing ingress {"name": "default", "related": "/api/v1/namespaces/openshift-ingress/services/router-default"}
2021-01-08T07:53:41.731Z INFO operator.ingress_controller handler/enqueue_mapped.go:104 queueing ingress {"name": "default", "related": "/api/v1/namespaces/openshift-ingress/services/router-default"}
2021-01-08T07:53:41.731Z INFO operator.ingress_controller controller/controller.go:235 reconciling {"request": "openshift-ingress-operator/default"}
2021-01-08T07:53:41.843Z DEBUG operator.init.controller controller/controller.go:209 Successfully Reconciled {"controller": "ingress_controller", "name": "default", "namespace": "openshift-ingress-operator"}
2021-01-08T07:54:09.806Z INFO operator.ingress_controller handler/enqueue_mapped.go:104 queueing ingress {"name": "default", "related": "/api/v1/namespaces/openshift-ingress/services/router-default"}
2021-01-08T07:54:09.806Z INFO operator.ingress_controller handler/enqueue_mapped.go:104 queueing ingress {"name": "default", "related": "/api/v1/namespaces/openshift-ingress/services/router-default"}
2021-01-08T07:54:09.807Z INFO operator.ingress_controller controller/controller.go:235 reconciling {"request": "openshift-ingress-operator/default"}
2021-01-08T07:54:09.822Z INFO operator.ingress_controller handler/enqueue_mapped.go:104 queueing ingress {"name": "default", "related": "/api/v1/namespaces/openshift-ingress/services/router-default"}
2021-01-08T07:54:09.822Z INFO operator.ingress_controller handler/enqueue_mapped.go:104 queueing ingress {"name": "default", "related": "/api/v1/namespaces/openshift-ingress/services/router-default"}
2021-01-08T07:54:09.917Z ERROR operator.ingress_controller controller/controller.go:235 got retryable error; requeueing {"after": "1m29.999991841s", "error": "IngressController may become degraded soon: LoadBalancerReady=False"}
2021-01-08T07:54:09.917Z INFO operator.ingress_controller controller/controller.go:235 reconciling {"request": "openshift-ingress-operator/default"}
2021-01-08T07:54:09.918Z INFO operator.status_controller controller/controller.go:235 Reconciling {"request": "openshift-ingress-operator/default"}
2021-01-08T07:54:09.928Z DEBUG operator.init.controller controller/controller.go:209 Successfully Reconciled {"controller": "status_controller", "name": "default", "namespace": "openshift-ingress-operator"}
2021-01-08T07:54:10.016Z ERROR operator.ingress_controller controller/controller.go:235 got retryable error; requeueing {"after": "1m28.984842132s", "error": "IngressController may become degraded soon: LoadBalancerReady=False"}
2021-01-08T07:54:10.016Z INFO operator.ingress_controller controller/controller.go:235 reconciling {"request": "openshift-ingress-operator/default"}
2021-01-08T07:54:10.131Z ERROR operator.ingress_controller controller/controller.go:235 got retryable error; requeueing {"after": "1m28.87058498s", "error": "IngressController may become degraded soon: LoadBalancerReady=False"}
2021-01-08T07:54:10.277Z DEBUG operator.init.controller controller/controller.go:209 Successfully Reconciled {"controller": "certificate_controller", "name": "default", "namespace": "openshift-ingress-operator"}
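(For reference, a capture like the ingress operator logs above can presumably be taken with something along these lines; the deployment and container names are my assumption for a stock cluster:)

  # Tail the ingress operator's logs.
  oc -n openshift-ingress-operator logs deployment/ingress-operator -c ingress-operator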
Can you check the service controller logs? The service controller runs in the kube-controller-manager pod; use `oc -n openshift-kube-controller-manager get pods -l app=kube-controller-manager` to list the pods, and then use something like `oc logs -n openshift-kube-controller-manager -c kube-controller-manager kube-controller-manager-foo` to check each pod's logs. The relevant error messages will probably have "gce" or "load balancer" in them.
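For example, a small loop along these lines (assuming the label and container name above) would cover every pod:

  # Grep each kube-controller-manager pod's logs for cloud-provider and
  # load-balancer activity.
  for pod in $(oc -n openshift-kube-controller-manager get pods \
      -l app=kube-controller-manager -o name); do
    oc -n openshift-kube-controller-manager logs -c kube-controller-manager \
      "$pod" | grep -Ei 'gce|load balancer'
  done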
Related kube controller logs after deletion:

I0108 10:47:48.663169 1 deployment_controller.go:490] "Error syncing deployment" deployment="openshift-monitoring/grafana" err="Operation cannot be fulfilled on deployments.apps \"grafana\": the object has been modified; please apply your changes to the latest version and try again"
I0108 10:49:58.197632 1 controller.go:353] Deleting existing load balancer for service openshift-ingress/router-default
I0108 10:49:58.198494 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="DeletingLoadBalancer" message="Deleting load balancer"
I0108 10:49:58.643140 1 gce_loadbalancer_external.go:337] ensureExternalLoadBalancerDeleted(a2056239fcfed49cba62027a81fa49ff(openshift-ingress/router-default)): Deleting forwarding rule.
I0108 10:49:58.643148 1 gce_loadbalancer_external.go:319] ensureExternalLoadBalancerDeleted(a2056239fcfed49cba62027a81fa49ff(openshift-ingress/router-default)): Deleting firewall rule.
I0108 10:49:58.643187 1 gce_loadbalancer_external.go:333] ensureExternalLoadBalancerDeleted(a2056239fcfed49cba62027a81fa49ff(openshift-ingress/router-default)): Deleting IP address.
I0108 10:50:16.191945 1 gce_loadbalancer_external.go:343] ensureExternalLoadBalancerDeleted(a2056239fcfed49cba62027a81fa49ff(openshift-ingress/router-default)): Deleting target pool.
I0108 10:50:19.794303 1 gce_loadbalancer_external.go:379] DeleteExternalTargetPoolAndChecks(a2056239fcfed49cba62027a81fa49ff(openshift-ingress/router-default)): Deleting health check a2056239fcfed49cba62027a81fa49ff.
I0108 10:50:22.054471 1 gce_loadbalancer_external.go:401] DeleteExternalTargetPoolAndChecks(a2056239fcfed49cba62027a81fa49ff(openshift-ingress/router-default)): Deleting health check firewall k8s-a2056239fcfed49cba62027a81fa49ff-http-hc.
I0108 10:50:27.099495 1 controller.go:868] Removing finalizer from service openshift-ingress/router-default
I0108 10:50:27.117322 1 controller.go:894] Patching status for service openshift-ingress/router-default
I0108 10:50:27.117629 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="DeletedLoadBalancer" message="Deleted load balancer"
I0108 10:50:27.132544 1 controller.go:368] Ensuring load balancer for service openshift-ingress/router-default
I0108 10:50:27.132663 1 controller.go:853] Adding finalizer to service openshift-ingress/router-default
I0108 10:50:27.133766 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
E0108 10:50:27.143730 1 controller.go:275] error processing service openshift-ingress/router-default (will retry): failed to add load balancer cleanup finalizer: Service "router-default" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{"service.kubernetes.io/load-balancer-cleanup"}
I0108 10:50:27.143835 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to add load balancer cleanup finalizer: Service \"router-default\" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{\"service.kubernetes.io/load-balancer-cleanup\"}"
I0108 10:50:32.144102 1 controller.go:368] Ensuring load balancer for service openshift-ingress/router-default
I0108 10:50:32.144173 1 controller.go:853] Adding finalizer to service openshift-ingress/router-default
I0108 10:50:32.144318 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
E0108 10:50:32.152049 1 controller.go:275] error processing service openshift-ingress/router-default (will retry): failed to add load balancer cleanup finalizer: Service "router-default" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{"service.kubernetes.io/load-balancer-cleanup"}
I0108 10:50:32.152143 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to add load balancer cleanup finalizer: Service \"router-default\" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{\"service.kubernetes.io/load-balancer-cleanup\"}"
I0108 10:50:42.152320 1 controller.go:368] Ensuring load balancer for service openshift-ingress/router-default
I0108 10:50:42.152436 1 controller.go:853] Adding finalizer to service openshift-ingress/router-default
I0108 10:50:42.152573 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
E0108 10:50:42.166954 1 controller.go:275] error processing service openshift-ingress/router-default (will retry): failed to add load balancer cleanup finalizer: Service "router-default" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{"service.kubernetes.io/load-balancer-cleanup"}
I0108 10:50:42.167042 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to add load balancer cleanup finalizer: Service \"router-default\" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{\"service.kubernetes.io/load-balancer-cleanup\"}"
I0108 10:51:02.167300 1 controller.go:368] Ensuring load balancer for service openshift-ingress/router-default
I0108 10:51:02.167411 1 controller.go:853] Adding finalizer to service openshift-ingress/router-default
I0108 10:51:02.167561 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
E0108 10:51:02.181051 1 controller.go:275] error processing service openshift-ingress/router-default (will retry): failed to add load balancer cleanup finalizer: Service "router-default" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{"service.kubernetes.io/load-balancer-cleanup"}
I0108 10:51:02.181134 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to add load balancer cleanup finalizer: Service \"router-default\" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{\"service.kubernetes.io/load-balancer-cleanup\"}"
I0108 10:51:42.181329 1 controller.go:368] Ensuring load balancer for service openshift-ingress/router-default
I0108 10:51:42.181409 1 controller.go:853] Adding finalizer to service openshift-ingress/router-default
I0108 10:51:42.181622 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
E0108 10:51:42.191909 1 controller.go:275] error processing service openshift-ingress/router-default (will retry): failed to add load balancer cleanup finalizer: Service "router-default" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{"service.kubernetes.io/load-balancer-cleanup"}
I0108 10:51:42.192040 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to add load balancer cleanup finalizer: Service \"router-default\" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{\"service.kubernetes.io/load-balancer-cleanup\"}"

[miheer@miheer gcp-ocp]$ oc get svc router-default -n openshift-ingress
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
router-default   LoadBalancer   172.30.97.221   <pending>     80:31633/TCP,443:31368/TCP   11m

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-01-08T10:43:14Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2021-01-08T10:49:57Z"
  finalizers:
  - ingress.openshift.io/operator
  labels:
    app: router
    ingresscontroller.operator.openshift.io/owning-ingresscontroller: default
    router: router-default
  name: router-default
  namespace: openshift-ingress
  ownerReferences:
  - apiVersion: apps/v1
    controller: true
    kind: Deployment
    name: router-default
    uid: afb5630c-32e8-436a-b843-2265d60ab02e
  resourceVersion: "111945"
  selfLink: /api/v1/namespaces/openshift-ingress/services/router-default
  uid: 2056239f-cfed-49cb-a620-27a81fa49ff7
spec:
  clusterIP: 172.30.97.221
  externalTrafficPolicy: Local
  healthCheckNodePort: 30623
  ports:
  - name: http
    nodePort: 31633
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    nodePort: 31368
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer: {}

After deleting the finalizer:

[miheer@miheer gcp-ocp]$ oc get svc router-default -n openshift-ingress -w
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE
router-default   LoadBalancer   172.30.63.192   34.87.251.58   80:31171/TCP,443:31356/TCP   82s

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-01-08T10:56:06Z"
  finalizers:
  - ingress.openshift.io/operator
  - service.kubernetes.io/load-balancer-cleanup
  labels:
    app: router
    ingresscontroller.operator.openshift.io/owning-ingresscontroller: default
    router: router-default
  name: router-default
  namespace: openshift-ingress
  ownerReferences:
  - apiVersion: apps/v1
    controller: true
    kind: Deployment
    name: router-default
    uid: afb5630c-32e8-436a-b843-2265d60ab02e
  resourceVersion: "113798"
  selfLink: /api/v1/namespaces/openshift-ingress/services/router-default
  uid: f1911857-3775-4efd-8cfb-b8797ca1f8c6
spec:
  clusterIP: 172.30.63.192
  externalTrafficPolicy: Local
  healthCheckNodePort: 31157
  ports:
  - name: http
    nodePort: 31171
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    nodePort: 31356
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 34.87.251.58

One interesting thing happened: I removed ingress.openshift.io/operator from the finalizers and ran oc delete svc router-default, and it worked. The related logs look good:

I0108 10:58:39.950289 1 controller.go:353] Deleting existing load balancer for service openshift-ingress/router-default
I0108 10:58:39.950568 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="DeletingLoadBalancer" message="Deleting load balancer"
I0108 10:58:40.404331 1 gce_loadbalancer_external.go:337] ensureExternalLoadBalancerDeleted(af191185737754efd8cfbb8797ca1f8c(openshift-ingress/router-default)): Deleting forwarding rule.
I0108 10:58:40.404344 1 gce_loadbalancer_external.go:319] ensureExternalLoadBalancerDeleted(af191185737754efd8cfbb8797ca1f8c(openshift-ingress/router-default)): Deleting firewall rule.
I0108 10:58:40.404382 1 gce_loadbalancer_external.go:333] ensureExternalLoadBalancerDeleted(af191185737754efd8cfbb8797ca1f8c(openshift-ingress/router-default)): Deleting IP address.
I0108 10:58:58.095853 1 gce_loadbalancer_external.go:343] ensureExternalLoadBalancerDeleted(af191185737754efd8cfbb8797ca1f8c(openshift-ingress/router-default)): Deleting target pool.
I0108 10:59:01.594496 1 gce_loadbalancer_external.go:379] DeleteExternalTargetPoolAndChecks(af191185737754efd8cfbb8797ca1f8c(openshift-ingress/router-default)): Deleting health check af191185737754efd8cfbb8797ca1f8c.
I0108 10:59:03.984714 1 gce_loadbalancer_external.go:401] DeleteExternalTargetPoolAndChecks(af191185737754efd8cfbb8797ca1f8c(openshift-ingress/router-default)): Deleting health check firewall k8s-af191185737754efd8cfbb8797ca1f8c-http-hc.
I0108 10:59:09.539832 1 controller.go:868] Removing finalizer from service openshift-ingress/router-default
I0108 10:59:09.557441 1 controller.go:894] Patching status for service openshift-ingress/router-default
I0108 10:59:09.558677 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="DeletedLoadBalancer" message="Deleted load balancer"
I0108 10:59:09.558803 1 garbagecollector.go:404] "Processing object" object="openshift-ingress/router-default-wwqfr" objectUID=bc641926-5410-4c63-8c4a-dfa2d6442281 kind="EndpointSlice"
I0108 10:59:09.606598 1 garbagecollector.go:519] "Deleting object" object="openshift-ingress/router-default-wwqfr" objectUID=bc641926-5410-4c63-8c4a-dfa2d6442281 kind="EndpointSlice" propagationPolicy=Background
I0108 10:59:09.665930 1 controller.go:368] Ensuring load balancer for service openshift-ingress/router-default
I0108 10:59:09.666110 1 controller.go:853] Adding finalizer to service openshift-ingress/router-default
I0108 10:59:09.668159 1 event.go:291] "Event occurred" object="openshift-ingress/router-default" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
I0108 10:59:11.355734 1 gce_loadbalancer_external.go:74] ensureExternalLoadBalancer(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default), australia-southeast1, , [TCP/80 TCP/443], [misalunk-w2658-master-1.c.openshift-gce-devel.internal misalunk-w2658-master-2.c.openshift-gce-devel.internal misalunk-w2658-worker-b-8rp79.c.openshift-gce-devel.internal misalunk-w2658-worker-c-k4nh5.c.openshift-gce-devel.internal misalunk-w2658-worker-a-g9tcj.c.openshift-gce-devel.internal misalunk-w2658-master-0.c.openshift-gce-devel.internal], map[])
I0108 10:59:12.565721 1 gce_loadbalancer_external.go:92] ensureExternalLoadBalancer(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default)): Forwarding rule a737b6a846e8340beb69980b33c5f2c6 doesn't exist.
I0108 10:59:15.230402 1 gce_loadbalancer_external.go:155] ensureExternalLoadBalancer(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default)): Ensured IP address 34.116.73.192 (tier: Premium).
I0108 10:59:15.656542 1 gce_loadbalancer_external.go:189] ensureExternalLoadBalancer(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default)): Creating firewall.
I0108 10:59:19.339921 1 gce_loadbalancer_external.go:193] ensureExternalLoadBalancer(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default)): Created firewall.
I0108 10:59:19.727391 1 gce_loadbalancer_external.go:202] ensureExternalLoadBalancer(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default)): Target pool for service doesn't exist.
I0108 10:59:20.165109 1 gce_loadbalancer_external.go:218] ensureExternalLoadBalancer(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default)): Updating from nodes health checks to local traffic health checks.
I0108 10:59:20.597179 1 gce_loadbalancer_external.go:901] Creating firewall k8s-a737b6a846e8340beb69980b33c5f2c6-http-hc for health checks.
I0108 10:59:24.203646 1 gce_loadbalancer_external.go:905] Created firewall k8s-a737b6a846e8340beb69980b33c5f2c6-http-hc for health checks.
I0108 10:59:24.630610 1 gce_loadbalancer_external.go:694] Did not find health check a737b6a846e8340beb69980b33c5f2c6, creating port 30902 path /healthz
I0108 10:59:27.357522 1 gce_loadbalancer_external.go:703] Created HTTP health check a737b6a846e8340beb69980b33c5f2c6 healthCheckNodePort: 30902
I0108 10:59:27.357563 1 gce_loadbalancer_external.go:553] Creating targetpool a737b6a846e8340beb69980b33c5f2c6 with 1 healthchecks
I0108 10:59:31.181479 1 gce_loadbalancer_external.go:495] ensureTargetPoolAndHealthCheck(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default)): Created health checks a737b6a846e8340beb69980b33c5f2c6.
I0108 10:59:31.181514 1 gce_loadbalancer_external.go:498] ensureTargetPoolAndHealthCheck(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default)): Created target pool.
I0108 10:59:31.181523 1 gce_loadbalancer_external.go:262] ensureExternalLoadBalancer(a737b6a846e8340beb69980b33c5f2c6(openshift-ingress/router-default)): Creating forwarding rule, IP 34.116.73.192 (tier: Premium).
^C
[miheer@miheer gcp-ocp]$

My observations: Initially, both finalizers are present, one added by the ingress operator and the other by the kube-controller-manager:

[miheer@miheer gcp-ocp]$ oc get svc router-default -n openshift-ingress -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-01-08T11:12:57Z"
  finalizers:
  - ingress.openshift.io/operator
  - service.kubernetes.io/load-balancer-cleanup
  labels:
    app: router
    ingresscontroller.operator.openshift.io/owning-ingresscontroller: default
    router: router-default
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"ingress.openshift.io/operator": {}
        f:labels:
          .: {}
          f:app: {}
          f:ingresscontroller.operator.openshift.io/owning-ingresscontroller: {}
          f:router: {}
        f:ownerReferences:
          .: {}
          k:{"uid":"afb5630c-32e8-436a-b843-2265d60ab02e"}:
            .: {}
            f:apiVersion: {}
            f:controller: {}
            f:kind: {}
            f:name: {}
            f:uid: {}
      f:spec:
        f:externalTrafficPolicy: {}
        f:ports:
          .: {}
          k:{"port":80,"protocol":"TCP"}:
            .: {}
            f:name: {}
            f:port: {}
            f:protocol: {}
            f:targetPort: {}
          k:{"port":443,"protocol":"TCP"}:
            .: {}
            f:name: {}
            f:port: {}
            f:protocol: {}
            f:targetPort: {}
        f:selector:
          .: {}
          f:ingresscontroller.operator.openshift.io/deployment-ingresscontroller: {}
        f:sessionAffinity: {}
        f:type: {}
    manager: ingress-operator
    operation: Update
    time: "2021-01-08T11:12:57Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          v:"service.kubernetes.io/load-balancer-cleanup": {}
      f:status:
        f:loadBalancer:
          f:ingress: {}
    manager: kube-controller-manager
    operation: Update
    time: "2021-01-08T11:13:44Z"
  name: router-default
  namespace: openshift-ingress
  ownerReferences:
  - apiVersion: apps/v1
    controller: true
    kind: Deployment
    name: router-default
    uid: afb5630c-32e8-436a-b843-2265d60ab02e
  resourceVersion: "118756"
  selfLink: /api/v1/namespaces/openshift-ingress/services/router-default
  uid: b44a9a1c-82a0-43ab-953b-536ef54354d9
spec:
  clusterIP: 172.30.252.204
  externalTrafficPolicy: Local
  healthCheckNodePort: 30375
  ports:
  - name: http
    nodePort: 31229
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    nodePort: 30183
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 34.87.251.58
[miheer@miheer gcp-ocp]$

When I run oc delete svc router-default -n openshift-ingress, the service.kubernetes.io/load-balancer-cleanup finalizer is removed as expected, but ingress.openshift.io/operator remains, which (from my understanding) is not correct and causes the oc delete command to hang. Once ingress.openshift.io/operator is removed manually, the oc delete command completes and a new svc is created with a new LB external IP.

So it looks like the ingress operator code needs to delete the ingress.openshift.io/operator finalizer from the router-default service after a delete command is invoked for the router-default svc in openshift-ingress. It looks like we only delete that finalizer once the ingress controller itself is deleted:

https://github.com/openshift/cluster-ingress-operator/blob/87e9d6cf3fa320f85ad4e1ffd4552f579a65e857/pkg/operator/controller/ingress/controller.go#L187
https://github.com/openshift/cluster-ingress-operator/blob/87e9d6cf3fa320f85ad4e1ffd4552f579a65e857/pkg/operator/controller/ingress/controller.go#L544
https://github.com/openshift/cluster-ingress-operator/blob/87e9d6cf3fa320f85ad4e1ffd4552f579a65e857/pkg/operator/controller/ingress/load_balancer_service.go#L259

But the question is how to handle this at the Kubernetes level, because acting on service deletion would require changes in the Kubernetes service controller:

https://github.com/kubernetes/kubernetes/blob/43ce28b9954c0d0b8b43b02724f12dce795befec/staging/src/k8s.io/cloud-provider/controllers/service/controller.go#L324
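For completeness, a quick way to see which finalizers are still pinned to the service while the delete hangs (a check I'm adding here, not one from the original report):

  # Print the finalizers currently set on the service.
  oc get svc router-default -n openshift-ingress \
    -o jsonpath='{.metadata.finalizers}{"\n"}'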
I think we don't have any control over the Kubernetes service-controller code, so before deleting the service we need to delete the finalizers and then perform the delete action. Shall we close this BZ?
The issue is that the service controller tries to re-add its finalizer after the service has been marked for deletion:

E0108 10:50:27.143730 1 controller.go:275] error processing service openshift-ingress/router-default (will retry): failed to add load balancer cleanup finalizer: Service "router-default" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{"service.kubernetes.io/load-balancer-cleanup"}

That seems like a logic error in the service controller, right?

As for the ingress.openshift.io/operator finalizer, I think I understand what happened. Earlier, we merged <https://github.com/openshift/cluster-ingress-operator/pull/472>, which deleted the logic that adds the ingress.openshift.io/operator finalizer and added logic to delete it, so that finalizer did not block deletion of the service. Subsequently, bug 1898417 was reported, which included deleting the service as part of the steps to reproduce the issue, and those steps worked because #472 had removed the ingress.openshift.io/operator finalizer. Then we merged <https://github.com/openshift/cluster-ingress-operator/pull/514>, which reverted #472, meaning the ingress.openshift.io/operator finalizer is again added, so bug 1898417's steps to reproduce the issue no longer work.

So as far as the deletion hanging goes, I think we are really just back to the pre-#472 behavior, which is undesirable, and we should go ahead and get rid of the ingress.openshift.io/operator finalizer (i.e., restore that part of #472 without the other parts of #472), but this BZ seems less urgent than it initially seemed to be, so it can be deferred until post-4.7. As for the logic error in the service controller, that appears to be harmless (the API's validation prevents the service controller's erroneous re-adding of the finalizer from succeeding), so that too can be deferred until post-4.7.
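One way to watch that retry loop from the outside while a deletion is pending (a hypothetical check, not part of the original analysis):

  # Stream events for the stuck service; the SyncLoadBalancerFailed warnings
  # repeat until the finalizer situation is resolved.
  oc get events -n openshift-ingress \
    --field-selector involvedObject.name=router-default -w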
"oc delete svc router-default -n openshift-ingress" finished within reasonable time, did not hang [jechen@jechen ~]$ oc version Client Version: 4.8.0-0.nightly-2021-02-23-200827 Server Version: 4.8.0-0.nightly-2021-02-24-063313 Kubernetes Version: v1.20.0+6f8878d [jechen@jechen ~]$ oc -n openshift-ingress get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE router-default LoadBalancer 172.30.139.55 35.227.59.35 80:30563/TCP,443:30122/TCP 24m router-internal-default ClusterIP 172.30.47.162 <none> 80/TCP,443/TCP,1936/TCP 24m [jechen@jechen ~]$ oc -n openshift-ingress delete svc router-default service "router-default" deleted [jechen@jechen ~]$ oc -n openshift-ingress get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE router-default LoadBalancer 172.30.66.86 104.196.114.95 80:31815/TCP,443:30425/TCP 42s router-internal-default ClusterIP 172.30.196.65 <none> 80/TCP,443/TCP,1936/TCP 90s
Hi, does this bug require doc text? If so, please update the doc text field.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438