| Summary: | [3.4] Router doesn't immediately load existing routes on pod redeployment | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jaspreet Kaur <jkaur> | |
| Component: | Networking | Assignee: | Maru Newby <mnewby> | |
| Networking sub component: | router | QA Contact: | zhaozhanqi <zzhao> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | medium | |||
| Priority: | high | CC: | ahoness, aos-bugs, bbennett, bmeng, byount, clichybi, erich, hongli, ichavero, javier.ramirez, mchappel, mnewby, rabdulra, stwalter, vwalek | |
| Version: | 3.2.0 | Keywords: | Performance | |
| Target Milestone: | --- | |||
| Target Release: | 3.4.z | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: |
Cause: The router wouldn't reload HAProxy after the initial sync if the last item of the initial list of any of the watched resources didn't reach the router to trigger the commit. This could be caused by a route being rejected for any reason (e.g. specifying a host claimed by another namespace).
Consequence: The router could be left in its initial state (without any routes configured) until another commit-triggering event occurred (e.g. a watch event).
Fix: The router always reloads after initial sync.
Result: Routes are available after the initial sync.
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 1415276 (view as bug list) | Environment: | ||
| Last Closed: | 2017-01-31 20:18:47 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | 1382388 | |||
| Bug Blocks: | 1415276 | |||
|
Description
Jaspreet Kaur
2016-10-11 11:46:50 UTC
I think this is related to https://bugzilla.redhat.com/show_bug.cgi?id=1387714

Maru: Can you see if we can do something about not loading the router until we have processed the routes? My only concern about stalling is that not putting up something in a timely manner may interfere with health checks and then we may get killed and then loop.

@bbennett: The liveness/readiness check for a template router targets the haproxy stats port. This precludes starting the router pod without starting haproxy, and it's not possible to change the liveness/readiness probes for a running pod. However, it should be possible to configure haproxy to avoid binding ports for http/tls traffic when it initially starts. Binding could be delayed until the route state had been read.

*** Bug 1387714 has been marked as a duplicate of this bug. ***

Hi, have a similar case but customer is saying that after the second reload. Thanks

I think I found the cause of this issue, separate from the port binding issue. The github PR has a fix.

verified in 3.4.1.0 and the issue has been fixed.

# oc logs router-2-badgl
I0122 05:18:00.655645 1 router.go:456] Router reloaded:
- Checking HAProxy /healthz on port 1936 ...
- HAProxy port 1936 health check ok : 0 retry attempt(s).
I0122 05:18:00.655951 1 router.go:221] Router is only using routes in namespaces matching team=red
E0122 05:18:00.697077 1 controller.go:169] a route in another namespace holds test-edge.example.com and is older than secured-edge-route
I0122 05:18:00.747442 1 router.go:456] Router reloaded:
- Checking HAProxy /healthz on port 1936 ...
- HAProxy port 1936 health check ok : 0 retry attempt(s).

*** Bug 1381584 has been marked as a duplicate of this bug. ***

*** Bug 1402488 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0218 |
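
The cause and fix summarized in the Doc Text can be illustrated with a toy model. The Go sketch below is not the router's actual code: the `event` and `router` types, the function names, and the hostnames other than test-edge.example.com are hypothetical and chosen only for illustration. It assumes the pre-fix behaviour was roughly "commit when the last item of the initial list reaches the router", and contrasts that with "always commit once the initial sync is done".

```go
package main

import "fmt"

// event is a hypothetical stand-in for one route from the initial list.
type event struct {
	host     string
	rejected bool // e.g. the host is already claimed by another namespace
	last     bool // true for the final item of the initial list
}

// router is a toy model, not the OpenShift router's real type.
type router struct {
	routes  []string
	reloads int
}

// commit stands in for writing the config and reloading HAProxy.
func (r *router) commit() { r.reloads++ }

// syncLastItemCommit models the assumed pre-fix behaviour: the commit is
// triggered only when the last listed item actually reaches the router.
func syncLastItemCommit(r *router, events []event) {
	for _, e := range events {
		if e.rejected {
			continue // a rejected route never reaches the router
		}
		r.routes = append(r.routes, e.host)
		if e.last {
			r.commit()
		}
	}
}

// syncAlwaysCommit models the fix: always reload after the initial sync.
func syncAlwaysCommit(r *router, events []event) {
	for _, e := range events {
		if e.rejected {
			continue
		}
		r.routes = append(r.routes, e.host)
	}
	r.commit()
}

// initialList returns a list whose last item is rejected.
func initialList() []event {
	return []event{
		{host: "a.example.com"},
		{host: "b.example.com"},
		{host: "test-edge.example.com", rejected: true, last: true},
	}
}

func main() {
	before := &router{}
	syncLastItemCommit(before, initialList())

	after := &router{}
	syncAlwaysCommit(after, initialList())

	// prints "last-item commit: 0 reload(s); always commit: 1 reload(s)"
	fmt.Printf("last-item commit: %d reload(s); always commit: %d reload(s)\n",
		before.reloads, after.reloads)
}
```

In this model both variants accept the same two routes, but only the second one reloads, which mirrors the reported symptom of a router staying in its initial state until some later event triggers a commit.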