Bug 1970231

Summary: Google Cloud does not reflect correct backend information for load balancer services of the OCP cluster
Product: OpenShift Container Platform
Reporter: Pamela Escorza <pescorza>
Component: Cloud Compute
Sub component: Cloud Controller Manager
Assignee: Joel Speed <jspeed>
QA Contact: sunzhaohua <zhsun>
Status: CLOSED NOTABUG
Severity: low
Priority: unspecified
CC: aos-bugs
Version: 4.6
Hardware: All
OS: All
Type: Bug
Last Closed: 2021-06-14 09:42:34 UTC

Attachments:
  LB health check information

Description Pamela Escorza 2021-06-10 06:24:23 UTC
Description of problem:

When an internal OCP IPI cluster is deployed on Google Cloud Platform, the cloud provider creates load balancers with their respective health checks:
~~~
$ gcloud compute health-checks list 
NAME                              PROTOCOL
aa973131c99c242a58428421be18b117  HTTP
ac5159cb260fc427691ed33f57446bdf  HTTP
k8s-9dbb74ab8cd189b6-node         HTTP
ocp-int-79462-api-internal        HTTPS
~~~

One of them serves the default ingress route, in this case "ac5159cb260fc427691ed33f57446bdf". Its health check status remains in warning[0] because it verifies router availability across all the instance groups of the cluster, which are included as backends of the load balancer created by Google Cloud. This does not match the information for the default router from the OCP cluster.

Below is detailed information about the default route's load balancer, from the Google Cloud Console:

Forwarding rule name:                 ac5159cb260fc427691ed33f57446bdf
Scope:                                Regional (europe-west2)
Address:                              10.17.0.32:80,443
Protocol:                             TCP (Internal)
Network Tier:                         Premium
Load balancer:                        ac5159cb260fc427691ed33f57446bdf
Regional backend service details

  ac5159cb260fc427691ed33f57446bdf    {"kubernetes.io/service-name":"openshift-ingress/router-default"}
  - General properties 
      Region:            europe-west2
      Protocol:          TCP
      Session affinity:  None
      In use by:         ac5159cb260fc427691ed33f57446bdf
      Backends:
        ocp-int-79462-master-europe-west2-c
        ocp-int-79462-master-europe-west2-b
        ocp-int-79462-master-europe-west2-a
        k8s-ig--9dbb74ab8cd189b6
        k8s-ig--9dbb74ab8cd189b6
        k8s-ig--9dbb74ab8cd189b6
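
For reference, the per-backend health behind the warning can be inspected directly with gcloud (a sketch using the backend service name and region shown above; the annotated output is illustrative):

~~~
$ gcloud compute backend-services get-health ac5159cb260fc427691ed33f57446bdf \
    --region europe-west2
# Instance groups that contain a router endpoint report HEALTHY; the
# remaining instance groups report UNHEALTHY, which the console surfaces
# as the "warning" state on the load balancer.
~~~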

Information from OCP cluster:

$ oc describe service router-default -n openshift-ingress
Name:                     router-default
Namespace:                openshift-ingress
Labels:                   app=router
                          ingresscontroller.operator.openshift.io/owning-ingresscontroller=default
                          router=router-default
Annotations:              cloud.google.com/load-balancer-type: Internal
Selector:                 ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
Type:                     LoadBalancer
IP:                       172.30.114.68
LoadBalancer Ingress:     10.17.0.32
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  30389/TCP
Endpoints:                10.153.4.4:80,10.155.4.4:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  30406/TCP
Endpoints:                10.153.4.4:443,10.155.4.4:443
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     32265
Events:                   <none>

$  oc -n openshift-ingress get svc
NAME                      TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
router-default            LoadBalancer   172.30.114.68   10.17.0.32    80:30389/TCP,443:30406/TCP   25h
router-internal-default   ClusterIP      172.30.41.13    <none>        80/TCP,443/TCP,1936/TCP      25h
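
Because `External Traffic Policy: Local` is set, kube-proxy answers the cloud load balancer's health check on the HealthCheck NodePort (32265 above). A minimal sketch of probing it directly, with illustrative node names, shows why only the nodes hosting router pods pass:

~~~
# On a node hosting a router-default pod: HTTP 200
$ curl -s http://<infra-node>:32265/healthz
{
  "service": {
    "namespace": "openshift-ingress",
    "name": "router-default"
  },
  "localEndpoints": 1
}

# On any other node the same probe returns HTTP 503 with
# "localEndpoints": 0, so GCP marks that backend unhealthy.
~~~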


The Taint and Toleration are configured to host the default router on the infra nodes:

$ oc get ingresscontroller/default -n  openshift-ingress-operator -o jsonpath='{.spec.nodePlacement}' | jq -r
{
  "nodeSelector": {
    "matchLabels": {
      "node-role.kubernetes.io/infra": ""
    }
  },
  "tolerations": [
    {
      "effect": "NoSchedule",
      "key": "infra",
      "value": "reserved"
    },
    {
      "effect": "NoExecute",
      "key": "infra",
      "value": "reserved"
    }
  ]
}
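
A minimal sketch of the matching node configuration, assuming illustrative node names, would be:

~~~
# Label the infra nodes so the nodeSelector above matches them
$ oc label node <infra-node> node-role.kubernetes.io/infra=""

# Taint them so only workloads carrying the matching tolerations land there
$ oc adm taint nodes <infra-node> infra=reserved:NoSchedule infra=reserved:NoExecute
~~~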

This bug is opened to request information on how to remediate this warning: is there any annotation that can be sent to the Cloud Controller Manager in order to reflect the correct configuration for the load balancer?

Manual modification of the load balancer details from the GCP console is not preserved.
There is also a feature gate that Kubernetes offers to disable the HTTP load balancer health verification, but it does not seem helpful, as it would affect the HTTPS load balancer services that the customer uses for their applications.

The customer has OCP IPI clusters published as Internal, but the behavior is the same for clusters published as External.

The customer has also opened a case with Google Cloud Support to provide further information about this.

Version-Release number of selected component (if applicable):
OpenShift 4.6 IPI on Google Cloud 

How reproducible:
Deploy an IPI OCP 4.6 cluster on GCP.

Steps to Reproduce:
1. Once the cluster is installed, go to the Load Balancer information in the Google Cloud Console and inspect the load balancer created for the default router.

Actual results:
Wrong assignment of the backends for the default router load balancer created by the cloud provider: it includes all the instances of the cluster as backends, when it should include only the instances hosting the endpoints of the default router service.

Expected results:
Correct information about the HTTP load balancer backends in the Google Cloud Console, based on the information provided by the OCP cluster.

Additional info:
[0] Pictures 1 and 2 of the PDF attached to this bug

Comment 1 Pamela Escorza 2021-06-10 06:26:46 UTC
Created attachment 1789732 [details]
LB health check information

Comment 3 Joel Speed 2021-06-10 08:50:22 UTC
This is a part of the design of the core Kubernetes service controllers and not something we are going to be able to change.

But for some context, it is intended that all instances in a cluster are targets for the load balancer.
When a service is created, the `externalTrafficPolicy` can be set to make sure that only instances running the router pods actually accept traffic.
This means that these instances pass the health checks, and the instances which don't contain the pods fail the health checks.

It is then the responsibility of the cloud load balancer to route the traffic appropriately.

If you don't set the `externalTrafficPolicy`, then all instances will accept the traffic and proxy within the cluster, so there would be no failures in the health checks, but there would be an extra hop for some requests.
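
For illustration, a minimal sketch of a `LoadBalancer` Service with the policy set (fields reduced from the `router-default` Service shown in the description):

~~~
apiVersion: v1
kind: Service
metadata:
  name: router-default
  namespace: openshift-ingress
  annotations:
    cloud.google.com/load-balancer-type: Internal
spec:
  type: LoadBalancer
  # Only nodes with a local endpoint pass the cloud health check,
  # and traffic is never proxied on to another node.
  externalTrafficPolicy: Local
  selector:
    ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
  ports:
  - name: http
    port: 80
    targetPort: http
  - name: https
    port: 443
    targetPort: https
~~~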

If you want to prevent the instances from being registered, you can use the node label `node.kubernetes.io/exclude-from-external-load-balancers: ""`, but this will exclude the node from ALL services that use load balancers; there is no way to set this just for a single Service object.
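
A sketch of applying that label, with an illustrative node name:

~~~
# Exclude a node from backend registration for ALL LoadBalancer Services
$ oc label node <node-name> node.kubernetes.io/exclude-from-external-load-balancers=""

# Remove the label to have the node registered again
$ oc label node <node-name> node.kubernetes.io/exclude-from-external-load-balancers-
~~~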

This isn't a bug and, for the most part, is actually beneficial. For example, when you have 3 pods across 6 nodes with all of those nodes registered, if the pods move between nodes they are available on the load balancer faster than if we had to register and deregister targets every time the pods moved.

To be extra clear, this is a core Kubernetes design, in use across all of GCP and other platforms too. This is not something we will be able to change.

Comment 4 Joel Speed 2021-06-14 09:42:34 UTC
The customer has closed their support case after receiving the feedback above; closing this one out too.