Bug 1332510

Summary: [networking_public_53] Error about "unable to find namespaces for router" appears in the router log when adding the NAMESPACE_LABELS to the router
Product: OKD Reporter: Meng Bo <bmeng>
Component: Routing Assignee: Phil Cameron <pcameron>
Status: CLOSED CURRENTRELEASE QA Contact: zhaozhanqi <zzhao>
Severity: low Docs Contact:
Priority: medium    
Version: 3.x CC: aloughla, aos-bugs, atragler, mleitner, ramr, rkhan
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-12-09 21:50:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Meng Bo 2016-05-03 11:12:05 UTC
Description of problem:
Create a router on an existing cluster, then add NAMESPACE_LABELS to the router dc to filter routes by namespace.

Errors appear in the router pod log when running with --loglevel=4.


Version-Release number of selected component (if applicable):
haproxy-router image id: e5a68f1889fc
openshift v1.3.0-alpha.0-267-gcd62e58

How reproducible:
always

Steps to Reproduce:
1. Create router in the existing cluster
# oadm policy add-scc-to-user hostnetwork -z router
# oadm router
2. Try to add env to router to filter the routes by namespace labels
# oc env dc/router NAMESPACE_LABELS=team=red
3. Check the router pod log after the new deployment completes
# oc logs -f router-2-pod

Actual results:
Errors about "unable to find namespaces for router" appear, and routes in any namespace/project are served by the router.

[root@master ~]# oc logs router-18-h2bqk -f
I0503 11:03:26.423797       1 router.go:117] Creating a new template router, writing to /var/lib/containers/router
I0503 11:03:26.440443       1 router.go:119] Router will use default/router service to identify peers
I0503 11:03:26.440494       1 router.go:159] Template router will coalesce reloads within 5 seconds of each other
I0503 11:03:26.440502       1 router.go:164] Reading persisted state
I0503 11:03:26.451375       1 router.go:168] Committing state
I0503 11:03:26.451577       1 merged_client_builder.go:102] No kubeconfig could be created, falling back to service account.
I0503 11:03:26.452020       1 router.go:153] Router is only using routes in namespaces matching team=blue
I0503 11:03:26.452108       1 controller.go:41] Running router controller
I0503 11:03:26.452198       1 router.go:225] Writing the router state
I0503 11:03:26.453282       1 router.go:230] Writing the router config
I0503 11:03:26.454336       1 router.go:235] Reloading the router
E0503 11:03:26.482601       1 controller.go:63] unable to find namespaces for router: User "system:serviceaccount:default:router" cannot list all namespaces in the cluster
I0503 11:03:27.598173       1 router.go:310] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
E0503 11:03:36.484912       1 controller.go:63] unable to find namespaces for router: User "system:serviceaccount:default:router" cannot list all namespaces in the cluster
E0503 11:03:46.487355       1 controller.go:63] unable to find namespaces for router: User "system:serviceaccount:default:router" cannot list all namespaces in the cluster
E0503 11:03:56.489611       1 controller.go:63] unable to find namespaces for router: User "system:serviceaccount:default:router" cannot list all namespaces in the cluster
E0503 11:04:06.491644       1 controller.go:63] unable to find namespaces for router: User "system:serviceaccount:default:router" cannot list all namespaces in the cluster
I0503 11:04:16.491817       1 controller.go:66] Unable to update list of namespaces


Expected results:
There should be no errors, and only routes in the matching namespaces/projects should be served by the router.

Additional info:

Comment 1 Ram Ranganathan 2016-05-06 02:48:27 UTC
The error occurs because the user "system:serviceaccount:default:router" cannot list all namespaces in the cluster.

You will need to give the router service account in the default namespace permission to do that.
For example:
$ oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:default:router


And if you run the router with namespace/project labels: 
$ oadm router ...
$ oc env dc/router NAMESPACE_LABELS="router=r1"

Example test:
$ oc label namespace default "router=r1"
$ # create routes in default namespace.
$ # example oc create -f route1.yaml
$ echo "route1 should be available via the router"

$ oc new-project p1
$ # create routes in project p1
$ # and these should not show up
$ # example oc create -f route2.yaml
$ echo "route2 should not be available via the router"

If you now label namespace p1 with "router=r1", for example: $ oc label namespace p1 "router=r1"

the routes in p1 should show up in the router.

Note that removing the label from the namespace won't take effect immediately (the router does not see the update), so you need to redeploy or start a new router pod to see the effect of the removed label.
For example:
$ oc scale dc/router --replicas=0 && oc scale dc/router --replicas=1
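
The setup steps in this comment can be collected into one helper, as a minimal sketch. The names `run` and `setup_label_filtered_router` and the `DRY_RUN` variable are hypothetical and not part of `oc` or `oadm`; the commands themselves are the ones quoted above, and the service account is assumed to be `default:router`. With `DRY_RUN=1` set, the commands are only printed and nothing is changed on the cluster.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative assumptions.

# Print the command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: label selector for the router (e.g. router=r1)
# $2: namespace to label so that its routes are served (e.g. default)
setup_label_filtered_router() {
    selector=$1
    namespace=$2
    # Allow the router service account to list namespaces.
    run oadm policy add-cluster-role-to-user cluster-reader \
        system:serviceaccount:default:router &&
    # Restrict the router to namespaces carrying the label.
    run oc env dc/router "NAMESPACE_LABELS=$selector" &&
    # Label the namespace whose routes should be served.
    run oc label namespace "$namespace" "$selector"
}
```

For example, `DRY_RUN=1 setup_label_filtered_router router=r1 default` prints the three commands without running them.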

Comment 2 Meng Bo 2016-05-06 03:11:39 UTC
@ramr

Thanks for the instructions. I know the errors must be caused by a permission issue.

But I could not find this in any of our docs, the client output, or the router logs.

I think this is a bug, since we should let the user know that the oadm policy add-cluster-role-to-user step is required to make NAMESPACE_LABELS work.
This is similar to how, when oadm router fails the first time, it guides the user to add the router service account to the hostnetwork privileged scc.

It would be even better if we could do this automatically when creating the router, the same way we create the router service account and the router-router-role cluster role binding.
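
The kind of guidance asked for here can be approximated on the client side, as a rough sketch. `hint_for_router_log` is a hypothetical helper, not an existing `oc` or router feature; it reads router log text on standard input, looks for the error string from the description, and prints the command from comment 1 as a hint.

```shell
#!/bin/sh
# Sketch only: hint_for_router_log is a hypothetical helper, not part of oc.

# Reads router log text on stdin. If the namespace-listing error from this
# bug is present, prints a remediation hint and returns 0; otherwise
# prints nothing and returns 1.
hint_for_router_log() {
    if grep -q 'unable to find namespaces for router'; then
        echo 'hint: the router service account cannot list namespaces; try:'
        echo '  oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:default:router'
        return 0
    fi
    return 1
}
```

It could be used as `oc logs router-2-pod | hint_for_router_log`, with the pod name taken from the reproduction steps.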

Comment 3 Ram Ranganathan 2016-05-06 06:44:18 UTC
@bmeng I guess we can update the docs for this release and add this to the default set of permissions on the router service account in the next release. Does that sound good?
The docs are still being written, so bear with us for a bit. Thx

Comment 4 Meng Bo 2016-05-06 06:58:11 UTC
@ramr Thanks, I think we should at least describe it clearly in the documentation.

Comment 5 Phil Cameron 2016-06-03 13:03:40 UTC
openshift-docs PR2199 addresses this. Looking for feedback.

Comment 6 Phil Cameron 2016-06-10 15:10:41 UTC
Ben Bennet, Ram R, tnguyen-rh, all provided feedback on the doc changes. I made the changes and it is back out for review.

Please feel free to review the changes on github.

Comment 7 openshift-github-bot 2016-06-29 08:16:54 UTC
Commit pushed to master at https://github.com/openshift/openshift-docs

https://github.com/openshift/openshift-docs/commit/3c1d243c064978eaf607cde689da89d2235ff446
Merge pull request #2199 from pecameron/bz1332510

bz1332510 - unable to find namespaces for router
fixes bug 1332510

Comment 8 Meng Bo 2016-09-02 06:40:57 UTC
Close this bug since the doc has been updated.