Bug 1486322 - Router image is not getting all routes
Keywords:
Status: CLOSED DUPLICATE of bug 1479295
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ravi Sankar
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-08-29 13:11 UTC by Vladislav Walek
Modified: 2022-08-04 22:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-11 15:47:49 UTC
Target Upstream Version:
Embargoed:



Description Vladislav Walek 2017-08-29 13:11:09 UTC
Description of problem:

The customer has configured two routers and uses the NAMESPACE_LABELS environment variable to shard the routes by namespace. They also configured the ROUTER_ALLOWED_DOMAINS environment variable on each router.
However, when the router reloads, it does not get any routes. The logs show only:

E0825 16:48:36.851076       1 host_admitter.go:121] Route example/route not admitted: host not in the allowed list of domains
E0825 16:48:36.851095       1 controller.go:169] host not in the allowed list of domains
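
For readers unfamiliar with the message above, here is a minimal sketch of the kind of check it implies. It is not the router's actual code. It assumes ROUTER_ALLOWED_DOMAINS holds a comma-separated list and that a host is accepted when it equals an allowed domain or is a subdomain of one. The function name and the example domains are hypothetical.

```python
def host_in_allowed_domains(host, allowed_domains):
    """Return True if host equals an allowed domain or is a subdomain of one.

    Simplified illustration only; the matching rule is an assumption.
    """
    host = host.lower().rstrip(".")
    allowed = {d.lower().strip().rstrip(".") for d in allowed_domains if d.strip()}
    # Walk up the DNS labels: a.b.example.com -> b.example.com -> example.com -> com
    while host:
        if host in allowed:
            return True
        _, _, host = host.partition(".")
    return False


# Hypothetical value, split the way a comma-separated env variable would be.
allowed = "apps.example.com, internal.example.org".split(",")

print(host_in_allowed_domains("route.apps.example.com", allowed))
print(host_in_allowed_domains("route.other.example.com", allowed))
```

Under these assumptions, a check like this always gives the same answer for the same host. That fits the report's reading that the intermittent behavior comes from which routes the router sees, not from the domain configuration.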

The problem is that the router is not loading all the routes. In some cases it gets a route and in others it does not.
For some reason the router was reloading after 1 second (based on the "Router reloaded" message).

The resync time was set to 10m by setting the --resync-interval option on the openshift-router.
However, even after that, the router is still not getting all the routes.

The labels are OK. The configuration is also OK, because the route is sometimes admitted and sometimes not. It looks as though the router does not download all the routes.
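
One way to quantify how intermittent the admission is would be to tally the rejections per route from saved router log lines. The sketch below is only an illustration: the regular expression is modeled on the single host_admitter.go line quoted in this report, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the host_admitter.go line quoted in this report (assumption:
# other router versions may format the message differently).
NOT_ADMITTED = re.compile(r"Route (\S+) not admitted: (.+)$")


def tally_rejections(log_lines):
    """Count 'not admitted' messages per (route, reason) pair."""
    counts = Counter()
    for line in log_lines:
        match = NOT_ADMITTED.search(line)
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts


sample = [
    "E0825 16:48:36.851076       1 host_admitter.go:121] Route example/route not admitted: host not in the allowed list of domains",
    "E0825 16:48:36.851095       1 controller.go:169] host not in the allowed list of domains",
]
for (route, reason), n in tally_rejections(sample).items():
    print(f"{route}: {n} x {reason}")
```

Comparing the tallies across several reloads would show whether the same routes are rejected every time or only sometimes.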

Version-Release number of selected component (if applicable):
OpenShift Container Platform 3.4

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 3 Ravi Sankar 2017-08-31 23:23:16 UTC
I have tested this use case on the latest 3.4 release (v3.4.1.44.18) and also on the latest code on master. ROUTER_ALLOWED_DOMAINS and NAMESPACE_LABELS on the router worked as expected, with known caveats.

Looking at the logs, the router configuration looks good.
I suspect one of the following could have caused the issue:

(1) Unlike a non-sharded router, a sharded router can take up to 20 minutes to refresh its state with a 10-minute resync interval. The resync updates the namespaces every 10 minutes, but the route and endpoint resources are also resynced every 10 minutes, so the router can need 2 resync cycles to catch up (existing issue: https://bugzilla.redhat.com/show_bug.cgi?id=1479295).
This might explain why, after 10 minutes, you sometimes see the routes and sometimes do not.
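
The arithmetic behind the 20-minute figure, as a small sketch. The helper name is hypothetical, and the "two cycles" factor is taken from the explanation above, not from the router code.

```python
from datetime import timedelta


def worst_case_refresh(resync_interval, cycles=2):
    """Upper bound on the catch-up delay if it takes `cycles` resync periods."""
    return resync_interval * cycles


print(worst_case_refresh(timedelta(minutes=10)))  # 0:20:00
```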
To validate this, you can try:
- Create a route that matches the namespace labels and whose host is in the allowed domains.
- Check whether the route works (it may not, because of this bug).
- Scale the router down to 0 (oc scale dc/<router-name> --replicas=0).
- Scale the router back up to 2 (oc scale dc/<router-name> --replicas=2).
- Check the route again. If it works now, you are hitting this issue.
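
The scale-down and scale-up steps above could be scripted along these lines. This sketch only builds and prints the two oc scale commands from the steps and does not run them. The router name "router-shard-a" is a placeholder, not a name taken from this report.

```python
import shlex


def restart_commands(router_name, replicas=2):
    """Build the scale-down and scale-up commands from the steps above."""
    target = f"dc/{router_name}"
    return [
        ["oc", "scale", target, "--replicas=0"],
        ["oc", "scale", target, f"--replicas={replicas}"],
    ]


# "router-shard-a" is a placeholder deployment config name.
for cmd in restart_commands("router-shard-a"):
    print(shlex.join(cmd))
```

To actually run them, each list could be passed to subprocess.run(cmd, check=True) on a machine where oc is logged in to the cluster. Checking the route before and after remains a manual step.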

(2) There are a few issues related to the router event queue, such as
 https://github.com/openshift/ose/pull/669 and https://github.com/openshift/ose/pull/559, that could cause this behavior. These fixes were backported to 3.4; check whether your 3.4 version includes them.

Comment 4 Ben Bennett 2017-09-11 15:47:49 UTC

*** This bug has been marked as a duplicate of bug 1479295 ***

