Bug 1479295

Summary: Router sharding delays detection of routes in new namespaces
Product: OpenShift Container Platform Reporter: Jaspreet Kaur <jkaur>
Component: Networking Assignee: Ravi Sankar <rpenta>
Networking sub component: router QA Contact: zhaozhanqi <zzhao>
Status: CLOSED NEXTRELEASE Docs Contact:
Severity: high    
Priority: high CC: aos-bugs, bbennett, bjarolim, bmeng, ccoleman, eparis, jmencak, jokerman, mmccomas, nnosenzo, pcameron, rchopra, rpenta, snirk4triel, vwalek
Version: 3.5.0   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-10-17 18:12:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jaspreet Kaur 2017-08-08 09:57:07 UTC
3. What is the nature and description of the request?

When using NAMESPACE_LABEL-based router sharding, I created a default template to make sure new namespaces get the correct label. When I create a new namespace/project it gets the correct label, but the route does not work for 10-15 minutes (the pods are up a few seconds after namespace creation).
Ideally, I would expect the routes to work as soon as my application is up and running. However, the default resync interval is 10 minutes, which is quite high.

4. Why does the customer need this? (List the business requirements here)

Router-sharded production environment with approximately 1500 routes. Fix the auto-sync of new namespaces created with NAMESPACE_LABEL so that new routes work as soon as the new namespaces are created.

5. How would the customer like to achieve this? (List the functional requirements here) 
The resync interval should be reduced to a value that is safe for production environments with at least 2000 routes, and the change should not affect CPU, network, or memory metrics.

Comment 2 Snir 2017-08-08 11:39:53 UTC
This BZ was opened as a continuation of the following Bugzilla bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1355711.
The issue, and the way it was closed as WONTFIX, is a bit confusing, since regular routes in OpenShift are admitted immediately onto all routers.
Why should there be any difference between regular routes on regular routers and sharded (namespace-labeled) routes on sharded routers?
In our view, a new route in a new namespace should be recognized immediately by a preexisting router even in a sharded environment, and should not have to wait for a full resync, which is up to 10 minutes away by default.

Comment 3 Phil Cameron 2017-08-08 14:55:06 UTC
ccoleman, eparis
What would you like to do about this? It is a duplicate of 1355711, which is closed as WONTFIX.

Comment 4 Phil Cameron 2017-08-09 15:19:32 UTC
Also, duplicate of: https://bugzilla.redhat.com/show_bug.cgi?id=1479452

Comment 5 Rajat Chopra 2017-08-09 18:20:41 UTC
*** Bug 1479452 has been marked as a duplicate of this bug. ***

Comment 9 Snir 2017-08-15 07:31:27 UTC
Hi, may I ask what your plans are for this case? We are about to upgrade our OpenShift environment, and the upgrade depends on router sharding.

Comment 10 Ravi Sankar 2017-08-17 16:44:56 UTC
We are working on a fix so that a sharded router based on namespace or project labels notices routes immediately, just like the behavior you observe on a non-sharded router. We hope to get the fix into one of the 3.6.x releases.

Comment 12 Ben Bennett 2017-09-11 15:47:49 UTC
*** Bug 1486322 has been marked as a duplicate of this bug. ***

Comment 17 Ravi Sankar 2017-10-17 00:36:24 UTC
The fix, https://github.com/openshift/origin/pull/16039, has been merged in origin and will be available in the 3.7.1 release.
Earlier releases can use the workaround of setting the router 'resync-interval' to a lower value.
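
As a rough illustration of why lowering the resync interval shortens the delay, the Python sketch below models a router that discovers newly labeled namespaces only on a periodic resync tick. The function name and the tick model (ticks at 0, R, 2R, ...) are assumptions made for illustration; this is not the router's actual implementation.

```python
def seconds_until_next_resync(created_at: float, resync_interval: float) -> float:
    """Return the wait from namespace creation until the next resync tick.

    Simplified, hypothetical model: resync ticks occur at 0, R, 2R, ...
    and a namespace created exactly on a tick is picked up by the
    following tick.
    """
    if resync_interval <= 0:
        raise ValueError("resync_interval must be positive")
    return resync_interval - (created_at % resync_interval)


if __name__ == "__main__":
    # Compare the worst-case wait for the 10-minute interval mentioned in
    # this report with two shorter, purely illustrative intervals.
    for interval in (600, 120, 30):
        worst = seconds_until_next_resync(0, interval)
        print(f"resync every {interval}s: up to {worst:.0f}s before the route is served")
```

Under this model the worst-case wait equals the resync interval itself, so a shorter interval bounds the delay more tightly, at the cost of more frequent full refreshes of the route list.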