Bug 1905748 - with sharded ingresscontrollers, all shards reload when any endpoint changes
Summary: with sharded ingresscontrollers, all shards reload when any endpoint changes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Andrew McDermott
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks: 1918194 1918197
 
Reported: 2020-12-09 00:20 UTC by Dan Yocum
Modified: 2021-02-24 15:41 UTC
CC: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:41:14 UTC
Target Upstream Version:


Attachments: none


Links:
- GitHub: openshift/router pull 243 (closed) - "Bug 1905748: Prevent unnecessary reloads in router shards" (last updated 2021-02-18 03:46:25 UTC)
- Red Hat Product Errata: RHSA-2020:5633 (last updated 2021-02-24 15:41:35 UTC)

Description Dan Yocum 2020-12-09 00:20:55 UTC
Description of problem:
We have a number of ingresscontrollers setup to handle different routes:

NAME         AGE
crcshard-0   159d
crcshard-1   159d
crcshard-2   159d
crcshard-3   159d
crcshard-4   159d
crcshard-5   159d
default      286d
public       138d

Each of these has a different routeSelector, but no namespaceSelector.  Each route in our clusters matches one, and only one, of these routeSelectors (handled by a custom webhook/operator).

What we are seeing is constant reloading (every 5 seconds, which appears to be the minimum interval):

I1208 12:10:16.145539       1 router.go:536] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I1208 12:10:21.144016       1 router.go:536] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I1208 12:10:26.131853       1 router.go:536] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I1208 12:10:31.186205       1 router.go:536] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I1208 12:10:36.145065       1 router.go:536] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I1208 12:10:41.167243       1 router.go:536] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I1208 12:10:46.134906       1 router.go:536] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I1208 12:10:51.165807       1 router.go:536] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"

What's more, all of the router controllers do this in unison, even though most of their endpoints do not change in that interval (though likely one or two do).

It appears that `registerInformerEventHandlers()` and `HandleEndpoints()`, in cases where namespaceSelector is nil, cause `commitAndReload()` to be hit on all routers even if no changes were made in that cycle.  We confirmed that the router configs are identical before and after reloads in most cases, but the reloads (and config rewrites) keep coming.
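The behaviour described above can be sketched as a filtering decision the router could make before committing a reload. The types and field names below are hypothetical simplifications for illustration, not the actual openshift/router code (which uses k8s.io/api types); the point is that with a nil namespaceSelector the namespace filter always passes, so a shard must additionally check that the changed endpoints back a route it has admitted:

```go
package main

import "fmt"

// Endpoints is a simplified stand-in for the Kubernetes Endpoints object
// (illustrative only; the real router consumes k8s.io/api/core/v1 types).
type Endpoints struct {
	Namespace string
	Name      string
}

// Shard models one router shard's filtering state. In this bug, the shards
// had a routeSelector but a nil namespaceSelector, so the namespace check
// below always passed and every endpoints event reached every shard.
type Shard struct {
	Name               string
	AdmittedNamespaces map[string]bool // nil models "no namespaceSelector"
	AdmittedServices   map[string]bool // "namespace/name" of services backing admitted routes
}

// ShouldReload sketches the kind of check that prevents unnecessary reloads:
// only reload when the changed endpoints actually back a route this shard
// has admitted, rather than on any endpoints event in the cluster.
func (s *Shard) ShouldReload(ep Endpoints) bool {
	if s.AdmittedNamespaces != nil && !s.AdmittedNamespaces[ep.Namespace] {
		return false // namespaceSelector excludes this namespace
	}
	return s.AdmittedServices[ep.Namespace+"/"+ep.Name]
}

func main() {
	shard := &Shard{
		Name:             "crcshard-0",
		AdmittedServices: map[string]bool{"ns1/web": true},
	}
	// An endpoint backing an admitted route triggers a reload...
	fmt.Println(shard.ShouldReload(Endpoints{Namespace: "ns1", Name: "web"})) // true
	// ...but an unrelated endpoint in another namespace does not.
	fmt.Println(shard.ShouldReload(Endpoints{Namespace: "ns3", Name: "other"})) // false
}
```

The fix that eventually merged (openshift/router pull 243, linked above) prevents these unnecessary reloads in router shards; the sketch only illustrates the shape of the decision, not its implementation.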

In addition, because of the very frequent reloading, the `balance` configuration of each route becomes biased.  Regardless of `leastconn` or `roundrobin`, the `chash` tree within HAProxy appears to get reset on reload, which makes the router severely favor the first pod in the configuration.


Version-Release number of selected component (if applicable):

OpenShift 4.5.16

How reproducible:

Always

Steps to Reproduce:
1. Create several router shards
2. Observe that all shards reload every 5s even when their endpoints do not change

Actual results:

All shards are reloaded

Expected results:

Shards are only reloaded when an endpoint changes

Additional info:

Comment 1 Andrew McDermott 2020-12-10 17:14:58 UTC
There is this existing BZ which is broadly the same issue: https://bugzilla.redhat.com/show_bug.cgi?id=1839989

As this is really an enhancement to the current design, it is now captured in the following RFE:

  https://issues.redhat.com/browse/NE-391

Comment 2 Dan Yocum 2020-12-11 19:49:49 UTC
Re-opening this BZ.

That RFE refers to an old v3.11 BZ and doesn't address the issue the customer is experiencing.

They have 6 router shards.  When a single endpoint changes (create/delete/migrate), *ALL* the routers reload, not just the router serving that endpoint.  This isn't an HAProxy issue; it's a k8s issue.

The customer has dug into the code and this is what they have to say:

"This [issue] is a result of the "Kind: endpoints/endpointslice" changing in k8s, not haproxy noticing dead backends."

Comment 7 Andrew McDermott 2021-01-13 11:08:28 UTC
(In reply to Dan Yocum from comment #0)
> Description of problem:
> We have a number of ingresscontrollers setup to handle different routes:
> 
> NAME         AGE
> crcshard-0   159d
> crcshard-1   159d
> crcshard-2   159d
> crcshard-3   159d
> crcshard-4   159d
> crcshard-5   159d
> default      286d
> public       138d
> 
> Each of these has different routeSelector, but no namespaceSelector.  Each
> of the routes in our clusters match one, and only one of these
> routeSelectors (handled by custom webhook/operator). 

Could you please attach the YAML output for all of these ingresscontrollers:

$ oc get ingresscontrollers --all-namespaces -o yaml

Comment 18 Hongan Li 2021-01-25 09:46:09 UTC
Tested with 4.7.0-0.nightly-2021-01-22-134922 and passed.

1. Create route shards with two additional custom ingresscontrollers: one using a namespace label and one using a route label.

spec:
  namespaceSelector:
    matchLabels:
      namespace: router-test

spec:
  routeSelector:
    matchLabels:
      route: router-test

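For reference, a complete IngressController manifest carrying such a routeSelector might look like the following (the name, domain, and label values here are illustrative, not taken from the test environment):

```yaml
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: routelabel
  namespace: openshift-ingress-operator
spec:
  domain: routelabel.apps.example.com   # illustrative domain
  routeSelector:
    matchLabels:
      route: router-test
```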
2. Create three projects, each with pods, services, and routes. ns1 is labelled "namespace=router-test", and route2 in ns2 is labelled "route=router-test".

3. Scale the pods in ns3 up/down to change endpoints: no reload in either of the labelled router pods.

4. Scale the pods in ns2 up/down: no reload in the router pod with the namespace label.

5. Scale the pods in ns1 up/down: no reload in the router pod with the route label.


logs:
$ oc -n openshift-ingress logs router-nslabel-6b9c5d77b-l25mj | tail -n2
I0125 09:15:24.735305       1 router.go:578] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0125 09:25:34.356529       1 router.go:578] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"

$ oc -n openshift-ingress logs router-routelabel-ff4dfdd4-bfbtg | tail -n2
I0125 09:17:10.457487       1 router.go:578] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0125 09:23:26.001079       1 router.go:578] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"

Comment 21 errata-xmlrpc 2021-02-24 15:41:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

