+++ This bug was initially created as a clone of Bug #1415112 +++

Description of problem:
If NAMESPACE_LABELS is set on the router first, then adding a label to a namespace or removing a label from it does not cause the router configuration to be reloaded. But if the router's NAMESPACE_LABELS is changed last, the configuration is reloaded.

Version-Release number of selected component (if applicable):
openshift v3.5.0.6+87f6173
kubernetes v1.5.2+43a9be4
etcd 3.1.0-rc.0

How reproducible:
always

Steps to Reproduce:
1. create project, pod, service and route.
# oc new-project u1p1
# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/caddy-docker.json
# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/edge/service_unsecure.json
# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/edge/route_edge.json
2. add label to the project.
# oc label namespace u1p1 team=red
3. create pod, service, route in another project but without label.
4. add NAMESPACE_LABELS to router
# oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:default:router
# oc env dc/router NAMESPACE_LABELS=team=red
5. check the router configuration
# oc rsh router-2-xxxxx cat haproxy.config
6. remove the label from the project.
# oc label namespace u1p1 team-
7. check the router configuration again.

Actual results:
In step 5, the router configuration is reloaded and only the route in the labelled namespace is shown. But in step 7, the router configuration is not reloaded; the route is still there even though the label has been removed.

Expected results:
The router configuration should be reloaded after the namespace label changes (in step 7).

Additional info:

--- Additional comment from Maru Newby on 2017-01-24 12:35:12 EST ---

The router only updates namespaces on an interval (controllable via --resync-interval and defaulting to 10m).
Changes to namespace labelling will not be reflected in the routes served by a given router instance until the interval is hit or the instance is restarted.

--- Additional comment from hongli on 2017-03-14 06:39:38 EDT ---

Retested in latest OCP 3.5.0.50 and found the router was not updated within 10m, so this is not the same as bug 1355711; reopening this one.

I wonder if the default 10m resync-interval has been removed by the PR: https://github.com/openshift/origin/pull/12242/

--- Additional comment from Maru Newby on 2017-03-15 11:12:28 EDT ---

(In reply to hongli from comment #2)
> Retested in latest OCP 3.5.0.50 and found the router was not updated within
> 10m, so this is not the same as bug 1355711; reopening this one.
>
> I wonder if the default 10m resync-interval has been removed by the
> PR: https://github.com/openshift/origin/pull/12242/

The resync interval was not removed. That PR only prevents reloads if the route state has not changed, but a change in the set of namespaces a router targets should still result in a state change.

It would be helpful to increase the logging verbosity on the router and provide those logs. There is logging around the updating of watched namespaces.

--- Additional comment from hongli on 2017-03-16 06:23:28 EDT ---

Hi Maru, the reproduction steps just use a different order from the original bug (run step 4 first). The steps are as follows:

1. add NAMESPACE_LABELS to router
2. create project, pod, service and route.
3. add label to the project.

Waited more than 10 min but the route is not reloaded. Below are the router logs; "forcing resync" is observed at 10:04 but the route is not reloaded.

I0316 09:54:41.882547 1 router.go:390] Writing the router state
I0316 09:54:41.883141 1 router.go:395] Writing the router config
I0316 09:54:41.915215 1 router.go:400] Reloading the router
I0316 09:54:41.977765 1 reaper.go:24] Signal received: child exited
I0316 09:54:41.977804 1 reaper.go:32] Reaped process with pid 53
I0316 09:54:42.001333 1 router.go:475] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I0316 09:54:42.001410 1 reaper.go:24] Signal received: child exited
I0316 09:55:46.704353 1 controller.go:305] Processing Route: u1p1/secured-edge-route -> service-unsecure
I0316 09:55:46.704373 1 controller.go:306] Alias: test-edge.example.com
I0316 09:55:46.704378 1 controller.go:307] Path:
I0316 09:55:46.704382 1 controller.go:308] Event: ADDED
I0316 09:55:46.704391 1 router.go:129] host test-edge.example.com admitted
I0316 10:01:32.208521 1 reflector.go:392] github.com/openshift/origin/pkg/router/template/service_lookup.go:30: Watch close - *api.Service total 1 items received
I0316 10:04:27.335563 1 controller.go:150] Updating watched namespaces: map[u1p1:{}]
I0316 10:04:36.772510 1 reflector.go:273] github.com/openshift/origin/pkg/router/template/service_lookup.go:30: forcing resync
I0316 10:04:37.146518 1 reflector.go:273] github.com/openshift/origin/pkg/router/controller/factory/factory.go:75: forcing resync
I0316 10:04:37.146630 1 reflector.go:273] github.com/openshift/origin/pkg/router/controller/factory/factory.go:68: forcing resync
I0316 10:04:37.146657 1 controller.go:305] Processing Route: default/docker-registry -> docker-registry
I0316 10:04:37.146662 1 controller.go:306] Alias: docker-registry-default.0316-yny.qe.rhcloud.com
I0316 10:04:37.146667 1 controller.go:307] Path:
I0316 10:04:37.146670 1 controller.go:308] Event:
I0316 10:04:37.146681 1 router.go:129] host docker-registry-default.0316-yny.qe.rhcloud.com admitted
I0316 10:04:37.146698 1 plugin.go:151] Processing 1 Endpoints for Name: service-unsecure ()
I0316 10:04:37.146706 1 plugin.go:154] Subset 0 : api.EndpointSubset{Addresses:[]api.EndpointAddress{api.EndpointAddress{IP:"10.2.2.21", Hostname:"", NodeName:(*string)(0xc42011f630), TargetRef:(*api.ObjectReference)(0xc42046b730)}}, NotReadyAddresses:[]api.EndpointAddress(nil), Ports:[]api.EndpointPort{api.EndpointPort{Name:"http", Port:8080, Protocol:"TCP"}}}
I0316 10:09:14.213867 1 reflector.go:392] github.com/openshift/origin/pkg/router/template/service_lookup.go:30: Watch close - *api.Service total 0 items received

--- Additional comment from Maru Newby on 2017-03-18 12:05:56 EDT ---

(In reply to hongli from comment #4)
> Hi Maru, the reproduction steps just use a different order from the original
> bug (run step 4 first). The steps are as follows:
>
> 1. add NAMESPACE_LABELS to router
> 2. create project, pod, service and route.
> 3. add label to the project.
>
> Waited more than 10 min but the route is not reloaded. Below are the router
> logs; "forcing resync" is observed at 10:04 but the route is not reloaded.

There are 2 types of resync - namespace and everything else. The 2 use different mechanisms, and namespace sync is intended to be triggered just before the sync of route data. Does the text 'Updating watched namespaces' appear in the log?

--- Additional comment from Maru Newby on 2017-03-18 12:14:53 EDT ---

(In reply to Maru Newby from comment #5)
> (In reply to hongli from comment #4)
> > Hi Maru, the reproduction steps just use a different order from the original
> > bug (run step 4 first). The steps are as follows:
> >
> > 1. add NAMESPACE_LABELS to router
> > 2. create project, pod, service and route.
> > 3. add label to the project.
> >
> > Waited more than 10 min but the route is not reloaded. Below are the router
> > logs; "forcing resync" is observed at 10:04 but the route is not reloaded.
>
> There are 2 types of resync - namespace and everything else. The 2 use
> different mechanisms, and namespace sync is intended to be triggered just
> before the sync of route data. Does the text 'Updating watched namespaces'
> appear in the log?

Never mind, I see it. So the namespace sync is being triggered. I don't see an event for the route in the resync though; is that the end of the log?

--- Additional comment from Maru Newby on 2017-03-18 19:23:21 EDT ---

PR is up. This issue looks to have been latent for a long time, good catch.
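
For anyone re-testing, the reordered reproduction from comment #4 (NAMESPACE_LABELS set on the router before the project is labelled) can be sketched as a small shell script. This is only a sketch built from the commands already listed in the description. DRY_RUN, ROUTER_POD, WAIT_SECONDS, BASE, run and repro_steps are names made up for this sketch and are not part of oc or the router. The pod name is the placeholder from the report and has to be replaced with the real router pod, and the 660 second wait is an assumption meant to be slightly longer than the 10m default resync interval. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the reordered reproduction (comment #4): router selector first,
# namespace label last. All variable and function names are illustrative.

DRY_RUN="${DRY_RUN:-1}"                     # 1 = print commands only (default), 0 = execute them
ROUTER_POD="${ROUTER_POD:-router-2-xxxxx}"  # placeholder from the report; set to the real pod name
WAIT_SECONDS="${WAIT_SECONDS:-660}"         # assumed: a bit more than the 10m default resync interval
BASE="https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_steps() {
    # 1. add NAMESPACE_LABELS to the router first
    run oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:default:router
    run oc env dc/router NAMESPACE_LABELS=team=red

    # 2. create project, pod, service and route
    run oc new-project u1p1
    run oc create -f "$BASE/caddy-docker.json"
    run oc create -f "$BASE/edge/service_unsecure.json"
    run oc create -f "$BASE/edge/route_edge.json"

    # 3. add the label to the project last
    run oc label namespace u1p1 team=red

    # Wait past the resync interval, then check whether the route shows up.
    run sleep "$WAIT_SECONDS"
    run oc rsh "$ROUTER_POD" cat haproxy.config
}

repro_steps
```

Running it as is prints the nine commands in order. With DRY_RUN=0 and ROUTER_POD set to the current router pod, the final command dumps haproxy.config so the presence or absence of the u1p1 route can be checked by hand.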
PR https://github.com/openshift/ose/pull/669
*** Bug 1434707 has been marked as a duplicate of this bug. ***
Today's env is OCP 3.4.1.11 and the issue can still be reproduced there; will verify ASAP once a v3.4.1.12 env is ready.
Verified in 3.4.1.12 (atomic-openshift-3.4.1.12-1.git.0.57d7e1d.el7.x86_64) and the issue has been fixed. The router is reloaded after 10 minutes and curl is ok.

[root@host-8-174-62 ~]# oc logs router-3-l36r1
I0330 01:36:16.503068 1 router.go:456] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I0330 01:36:16.503153 1 router.go:221] Router is only using routes in namespaces matching team=red
I0330 01:36:16.552631 1 router.go:456] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I0330 01:36:21.230074 1 router.go:456] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I0330 01:46:16.546185 1 router.go:456] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I0330 01:46:16.583721 1 router.go:456] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0865