Bug 1479295 - Router sharding causes routes in new namespaces detection to be delayed
Status: CLOSED NEXTRELEASE
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Assigned To: Ravi Sankar
QA Contact: zhaozhanqi
Keywords: NeedsTestCase
Duplicates: 1479452 1486322
Depends On:
Blocks:
 
Reported: 2017-08-08 05:57 EDT by Jaspreet Kaur
Modified: 2017-10-17 14:12 EDT (History)
14 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-10-17 14:12:00 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker ID Priority Status Summary Last Updated
Origin (Github) 16039 None None None 2017-09-13 15:55 EDT

Description Jaspreet Kaur 2017-08-08 05:57:07 EDT
3. What is the nature and description of the request?

When using NAMESPACE_LABEL based router sharding, I created a default project template to make sure new namespaces get the correct label. When I create a new namespace/project, it does get the correct label, but the route does not work for 10-15 minutes (the pods are up a few seconds after namespace creation).
In an ideal situation, I would expect the routes to work right after my application is up and running. However, the default resync interval is 10 minutes, which is quite high.

4. Why does the customer need this? (List the business requirements here)

A router-sharded production environment with approximately 1500 routes. Fix the auto-sync of new namespaces created with NAMESPACE_LABEL so that new routes are created as soon as new namespaces are created.

5. How would the customer like to achieve this? (List the functional requirements here) 
The resync interval should be reduced to a value that is safe for production environments with at least 2000 routes, without adversely affecting CPU, network, and memory metrics.
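For context, the sharding setup described above is typically configured on the 3.x HAProxy router along the following lines. This is a sketch, not taken from this report: the shard label key/value `router=shard1` and the project name `myproject` are illustrative examples.

```shell
# Restrict an existing router deployment to namespaces carrying a given label.
# NAMESPACE_LABELS is the router env var behind the NAMESPACE_LABEL sharding
# discussed in this bug.
oc set env dc/router NAMESPACE_LABELS="router=shard1" -n default

# Label a new project so its routes are admitted by that router shard.
oc label namespace myproject router=shard1
```

With the bug described here, routes in a freshly labeled namespace may not be served until the router's next full resync.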
Comment 2 Snir 2017-08-08 07:39:53 EDT
This BZ was opened as a continuation of the following Bugzilla report: https://bugzilla.redhat.com/show_bug.cgi?id=1355711.
The issue, and the way it was closed as WONTFIX, is a bit confusing: regular routes in OpenShift are admitted immediately onto all routers,
so why should there be any discrimination between regular routes and routers on the one hand, and sharded (namespace-labeled) routes and sharded routers on the other?
In our view, a new route in a new namespace should be recognized immediately by a preexisting router even in a sharded environment, rather than waiting for a full resync, which can be up to 10 minutes away by default.
Comment 3 Phil Cameron 2017-08-08 10:55:06 EDT
ccoleman@redhat.com, eparis@redhat.com
What would you like to do about this? It is a duplicate of 1355711, which is closed as WONTFIX.
Comment 4 Phil Cameron 2017-08-09 11:19:32 EDT
Also, duplicate of: https://bugzilla.redhat.com/show_bug.cgi?id=1479452
Comment 5 Rajat Chopra 2017-08-09 14:20:41 EDT
*** Bug 1479452 has been marked as a duplicate of this bug. ***
Comment 9 Snir 2017-08-15 03:31:27 EDT
Hi, may I ask what your plans are for this case? We are about to upgrade our OpenShift environment, and it depends on router sharding.
Comment 10 Ravi Sankar 2017-08-17 12:44:56 EDT
We are working on a fix so that a sharded router based on namespace or project labels notices routes immediately, just like the behavior you observe on a non-sharded router. We hope to get the fix into one of the 3.6.x releases.
Comment 12 Ben Bennett 2017-09-11 11:47:49 EDT
*** Bug 1486322 has been marked as a duplicate of this bug. ***
Comment 17 Ravi Sankar 2017-10-16 20:36:24 EDT
The fix https://github.com/openshift/origin/pull/16039 has merged in origin and will be available in the 3.7.1 release.
Earlier releases can work around the issue by setting the router 'resync-interval' to a lower value.
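A sketch of that workaround, assuming the 3.x HAProxy router where the resync interval can be passed as a container argument on the router deployment config. The `2m` value and the `default` namespace are illustrative examples, not recommendations from this report.

```shell
# Workaround for pre-3.7.1 releases: lower the router's resync interval
# (default 10 minutes) by appending a --resync-interval argument to the
# router container via a JSON patch on its deployment config.
oc patch dc/router -n default --type=json \
  -p '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--resync-interval=2m"}]'
```

Note the trade-off raised in the original request: a shorter interval increases router and apiserver load, so it should be validated against CPU, network, and memory metrics in environments with thousands of routes.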
