Bug 1382388 - [3.5] Router does not load existing routes
Summary: [3.5] Router does not load existing routes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ---
: ---
Assignee: Ivan Chavero
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks: 1267746 1383663 1415276
TreeView+ depends on / blocked
 
Reported: 2016-10-06 14:27 UTC by Alexander Koksharov
Modified: 2022-08-04 22:20 UTC (History)
15 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-12 19:07:04 UTC
Target Upstream Version:
Embargoed:
sdodson: needinfo-
ichavero: needinfo-


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Origin (Github) 11768 0 None None None 2017-01-06 15:32:22 UTC
Origin (Github) 12199 0 None None None 2017-01-06 15:32:37 UTC
Red Hat Knowledge Base (Solution) 2851251 0 None None None 2017-01-20 00:15:24 UTC
Red Hat Product Errata RHBA-2017:0884 0 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.5 RPM Release Advisory 2017-04-12 22:50:07 UTC

Description Alexander Koksharov 2016-10-06 14:27:55 UTC
Description of problem:

Router is set up with NAMESPACE_LABELS. However it does not automatically load existing routes. Adding new route or modifying existing one does trigger an update.



Version-Release number of selected component (if applicable):
3.3

How reproducible:

root@master1 # oc get dc router-shard -o yaml | grep -A1 NAMESPACE_LABELS
        - name: NAMESPACE_LABELS
          value: name=cake


root@master1 # oc get namespace cake --show-labels 
NAME      STATUS    AGE       LABELS
cake      Active    4d        name=cake

root@master1 # oc get pods
NAME                      READY     STATUS    RESTARTS   AGE
docker-registry-4-e7hhc   1/1       Running   0          4d
router-3-hay8s            1/1       Running   0          4d
router-shard-3-p5vy0      1/1       Running   0          15h

root@master1 # oc logs router-shard-3-p5vy0
I1004 11:18:07.792643       1 router.go:153] Router is only using routes in namespaces matching name=cake
I1004 11:18:07.901543       1 router.go:404] Router reloaded:
 - Checking HAProxy /healthz on port 1937 ...
 - HAProxy port 1937 health check ok : 0 retry attempt(s).

root@master1 # oc get route -n cake
NAME          HOST/PORT                                   PATH      SERVICES      PORT       TERMINATION
hello-world   hello-world-cake.apps.alko.lab ... 1 more             hello-world   8080-tcp   

root@master1 # oc exec router-shard-3-p5vy0 -- ls -l
total 76
-rw-r--r--. 1 root root  2035 Sep 22 19:55 default_pub_keys.pem
-rw-r--r--. 1 root root  1736 Sep 22 19:55 error-page-503.http
-rw-r--r--. 1 root root 28245 Sep 22 19:55 haproxy-config.template
-rwxrwxrwx. 1 root root  9160 Oct  4 11:18 haproxy.config
-rwxrwxrwx. 1 root root     2 Oct  4 11:18 os_edge_http_be.map
-rwxrwxrwx. 1 root root     2 Oct  4 11:18 os_edge_http_expose.map
-rwxrwxrwx. 1 root root     2 Oct  4 11:18 os_edge_http_redirect.map
-rwxrwxrwx. 1 root root     2 Oct  4 11:18 os_http_be.map
-rwxrwxrwx. 1 root root     2 Oct  4 11:18 os_reencrypt.map
-rwxrwxrwx. 1 root root     2 Oct  4 11:18 os_sni_passthrough.map
-rwxrwxrwx. 1 root root     2 Oct  4 11:18 os_tcp_be.map

root@master1 # oc edit route hello-world -n cake
route "hello-world" edited

root@master1 # oc exec router-shard-3-p5vy0 -- ls -l
total 76
-rw-r--r--. 1 root root  2035 Sep 22 19:55 default_pub_keys.pem
-rw-r--r--. 1 root root  1736 Sep 22 19:55 error-page-503.http
-rw-r--r--. 1 root root 28245 Sep 22 19:55 haproxy-config.template
-rwxrwxrwx. 1 root root 10138 Oct  5 03:16 haproxy.config
-rwxrwxrwx. 1 root root     4 Oct  5 03:16 os_edge_http_be.map
-rwxrwxrwx. 1 root root     4 Oct  5 03:16 os_edge_http_expose.map
-rwxrwxrwx. 1 root root     4 Oct  5 03:16 os_edge_http_redirect.map
-rwxrwxrwx. 1 root root    53 Oct  5 03:16 os_http_be.map
-rwxrwxrwx. 1 root root     4 Oct  5 03:16 os_reencrypt.map
-rwxrwxrwx. 1 root root     4 Oct  5 03:16 os_sni_passthrough.map
-rwxrwxrwx. 1 root root     4 Oct  5 03:16 os_tcp_be.map


root@master1 # oc project cake
Now using project "cake" on server "https://master.usersys.redhat.com:8443".

root@master1 # oc get svc
NAME          CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
hello-world   172.30.217.215   <none>        8080/TCP   4d

root@master1 # oc expose service hello-world --name=test-route --hostname=test-route.apps.usersys.redhat.com
route "test-route" exposed

root@master1 # oc get routes 
NAME          HOST/PORT                                       PATH      SERVICES      PORT       TERMINATION
hello-world   hello-world-cake.apps.alko.lab ... 1 more                 hello-world   8080-tcp   
test-route    test-route.apps.usersys.redhat.com ... 1 more             hello-world   8080-tcp   

root@master1 # oc project default
Now using project "default" on server "https://master.usersys.redhat.com:8443".

[05/10 09:20:14][~]
root@master1 # oc exec router-shard-3-p5vy0 -- ls -l
total 76
-rw-r--r--. 1 root root  2035 Sep 22 19:55 default_pub_keys.pem
-rw-r--r--. 1 root root  1736 Sep 22 19:55 error-page-503.http
-rw-r--r--. 1 root root 28245 Sep 22 19:55 haproxy-config.template
-rwxrwxrwx. 1 root root 11115 Oct  5 03:20 haproxy.config
-rwxrwxrwx. 1 root root     6 Oct  5 03:20 os_edge_http_be.map
-rwxrwxrwx. 1 root root     6 Oct  5 03:20 os_edge_http_expose.map
-rwxrwxrwx. 1 root root     6 Oct  5 03:20 os_edge_http_redirect.map
-rwxrwxrwx. 1 root root   107 Oct  5 03:20 os_http_be.map
-rwxrwxrwx. 1 root root     6 Oct  5 03:20 os_reencrypt.map
-rwxrwxrwx. 1 root root     6 Oct  5 03:20 os_sni_passthrough.map
-rwxrwxrwx. 1 root root     6 Oct  5 03:20 os_tcp_be.map

[05/10 09:20:21][~]
root@master1 #



Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 zhaozhanqi 2016-10-10 09:42:29 UTC
This bug should be duplicate with https://bugzilla.redhat.com/show_bug.cgi?id=1355711

Comment 3 Mark Chappell 2016-10-10 12:53:54 UTC
I'm not sure this is a duplicate of #1355711 the issue here (which I think I'm also seeing) is more that the initial HAProxy configuration isn't being loaded when the HAProxy pods are started.

Comment 4 Jaspreet Kaur 2016-10-11 12:07:24 UTC
rpm -qa | grep openshift
atomic-openshift-clients-3.2.1.7-1.git.0.2702170.el7.x86_64
atomic-openshift-master-3.2.1.7-1.git.0.2702170.el7.x86_64
tuned-profiles-atomic-openshift-node-3.2.1.7-1.git.0.2702170.el7.x86_64
atomic-openshift-node-3.2.1.7-1.git.0.2702170.el7.x86_64
atomic-openshift-sdn-ovs-3.2.1.7-1.git.0.2702170.el7.x86_64
atomic-openshift-3.2.1.7-1.git.0.2702170.el7.x86_64


[root@master0 ~]# oc version
oc v3.2.1.7
kubernetes v1.2.0-36-g4a3f9c5

Comment 5 Ben Bennett 2016-10-12 17:40:31 UTC
Weibin: Can you please try to duplicate this?

Comment 6 Weibin Liang 2016-10-12 18:18:50 UTC
Ben: sure, it look it is openshift 3.2, let me install it.

Comment 7 Weibin Liang 2016-10-13 20:34:55 UTC
Load openshift 3.2.1.7 and follow above testing steps, but can not reproduce this problem. 


[root@dhcp-41-180 ~]# oc version
oc v3.2.1.7
kubernetes v1.2.0-36-g4a3f9c5
[root@dhcp-41-180 ~]# 

[root@dhcp-41-180 ~]# oadm policy add-scc-to-user privileged -z user1
[root@dhcp-41-180 ~]# oadm router router-shard --replicas=0  --service-account=user1 -n default --host-network=true
Router "router-shard" service exists
[root@dhcp-41-180 ~]# oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:default:user1
[root@dhcp-41-180 ~]# oc env dc/router-shard NAMESPACE_LABELS="name=cake"
deploymentconfig "router-shard" updated
[root@dhcp-41-180 ~]# oc scale dc/router-shard --replicas=1
deploymentconfig "router-shard" scaled
[root@dhcp-41-180 ~]# 
[root@dhcp-41-180 ~]# 
[root@dhcp-41-180 ~]# oc new-project cake
Now using project "cake" on server "https://dhcp-41-180.bos.redhat.com:8443".

You can add applications to this project with the 'new-app' command. For example, try:

    $ oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-hello-world.git

to build a new hello-world application in Ruby.
[root@dhcp-41-180 ~]# oc label namespace cake "name=cake"
namespace "cake" labeled
[root@dhcp-41-180 ~]# oc create -f https://raw.githubusercontent.com/weliang1/Openshift_Networking/master/OSE3.3/hello-openshift-twopods.json
route "hello-route" created
service "hello-service" created
pod "hello-pod-1" created
pod "hello-pod-2" created
[root@dhcp-41-180 ~]# oc project default
Now using project "default" on server "https://dhcp-41-180.bos.redhat.com:8443".
[root@dhcp-41-180 ~]# oc get dc router-shard -o yaml | grep -A1 NAMESPACE_LABELS
        - name: NAMESPACE_LABELS
          value: name=cake
[root@dhcp-41-180 ~]# oc get namespace cake --show-labels 
NAME      STATUS    AGE       LABELS
cake      Active    50s       name=cake
[root@dhcp-41-180 ~]# oc get pods
NAME                    READY     STATUS             RESTARTS   AGE
router-shard-1-deploy   0/1       DeadlineExceeded   0          9m
router-shard-2-ld3tp    1/1       Running            0          5m
[root@dhcp-41-180 ~]# oc logs router-shard-2-ld3tp
I1013 16:14:39.431084       1 router.go:153] Router is only using routes in namespaces matching name=cake
I1013 16:14:39.499742       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
E1013 16:14:40.061737       1 status.go:195] Unable to write router status - please ensure you reconcile your system policy or grant this router access to update route status: User "system:serviceaccount:default:user1" cannot update routes/status in project "cake"
I1013 16:14:44.452732       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1013 16:14:53.181918       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1013 16:14:58.270230       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
E1013 16:17:34.439091       1 status.go:195] Unable to write router status - please ensure you reconcile your system policy or grant this router access to update route status: User "system:serviceaccount:default:user1" cannot update routes/status in project "cake"
I1013 16:17:34.464080       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1013 16:19:02.209927       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1013 16:19:04.463085       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
E1013 16:19:33.980511       1 status.go:195] Unable to write router status - please ensure you reconcile your system policy or grant this router access to update route status: User "system:serviceaccount:default:user1" cannot update routes/status in project "cake"
I1013 16:19:34.003572       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1013 16:19:34.477546       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1013 16:19:45.908169       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1013 16:19:50.997966       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
[root@dhcp-41-180 ~]# oc get route -n cake
NAME          HOST/PORT             PATH      SERVICE         TERMINATION   LABELS
hello-route   hello-openshift.com             hello-service                 name=hello-route-1
[root@dhcp-41-180 ~]# oc exec router-shard-2-ld3tp -- ls -l
total 68
-rwxrwxrwx. 1 root root  2042 Jul 13 13:50 default_pub_keys.pem
-rwxrwxrwx. 1 root root  1736 Jul 13 14:31 error-page-503.http
-rwxrwxrwx. 1 root root 16899 Jul 13 14:31 haproxy-config.template
-rwxrwxrwx. 1 root root  9827 Oct 13 16:19 haproxy.config
-rwxrwxrwx. 1 root root     6 Oct 13 16:19 os_edge_http_be.map
-rwxrwxrwx. 1 root root     6 Oct 13 16:19 os_edge_http_expose.map
-rwxrwxrwx. 1 root root     6 Oct 13 16:19 os_edge_http_redirect.map
-rwxrwxrwx. 1 root root    44 Oct 13 16:19 os_http_be.map
-rwxrwxrwx. 1 root root     6 Oct 13 16:19 os_reencrypt.map
-rwxrwxrwx. 1 root root     6 Oct 13 16:19 os_sni_passthrough.map
-rwxrwxrwx. 1 root root     6 Oct 13 16:19 os_tcp_be.map
[root@dhcp-41-180 ~]# curl --resolve hello-openshift.com:80:10.18.41.10 http://hello-openshift.com
Hello OpenShift!

##### Log into host which router pod created to check route information:
[root@dhcp-41-10 ~]# cid=$(docker ps | egrep "openshift3/(origin|ose)-haproxy-router" | awk '{print $1}')
[root@dhcp-41-10 ~]# sudo nsenter -m -u -n -i -p -t $(docker inspect --format "{{ .State.Pid }}" "$cid")
[root@dhcp-41-10 /]# cd /var/lib/haproxy/conf
[root@dhcp-41-10 conf]# more /var/lib/haproxy/conf/os_http_be.map
hello-openshift.com cake_hello-route
####


[root@dhcp-41-180 ~]# oc project cake
Now using project "cake" on server "https://dhcp-41-180.bos.redhat.com:8443".
[root@dhcp-41-180 ~]# oc get svc
NAME            CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
hello-service   172.30.57.12   <none>        8080/TCP   3m
[root@dhcp-41-180 ~]# oc expose service hello-service --name=test-route --hostname=test-route.apps.usersys.redhat.com
route "test-route" exposed
[root@dhcp-41-180 ~]# oc get routes 
NAME          HOST/PORT                            PATH      SERVICE         TERMINATION   LABELS
hello-route   hello-openshift.com                            hello-service                 name=hello-route-1
test-route    test-route.apps.usersys.redhat.com             hello-service                 
[root@dhcp-41-180 ~]# oc project default
Now using project "default" on server "https://dhcp-41-180.bos.redhat.com:8443".
[root@dhcp-41-180 ~]# oc exec router-shard-2-ld3tp -- ls -l
total 68
-rwxrwxrwx. 1 root root  2042 Jul 13 13:50 default_pub_keys.pem
-rwxrwxrwx. 1 root root  1736 Jul 13 14:31 error-page-503.http
-rwxrwxrwx. 1 root root 16899 Jul 13 14:31 haproxy-config.template
-rwxrwxrwx. 1 root root 10763 Oct 13 16:23 haproxy.config
-rwxrwxrwx. 1 root root     8 Oct 13 16:23 os_edge_http_be.map
-rwxrwxrwx. 1 root root     8 Oct 13 16:23 os_edge_http_expose.map
-rwxrwxrwx. 1 root root     8 Oct 13 16:23 os_edge_http_redirect.map
-rwxrwxrwx. 1 root root    98 Oct 13 16:23 os_http_be.map
-rwxrwxrwx. 1 root root     8 Oct 13 16:23 os_reencrypt.map
-rwxrwxrwx. 1 root root     8 Oct 13 16:23 os_sni_passthrough.map
-rwxrwxrwx. 1 root root     8 Oct 13 16:23 os_tcp_be.map
[root@dhcp-41-180 ~]# curl --resolve test-route.apps.usersys.redhat.com:80:10.18.41.10 http://test-route.apps.usersys.redhat.com
Hello OpenShift!
[root@dhcp-41-180 ~]# 

##### Log into host which router pod created to check route information:
[root@dhcp-41-10 ~]# cid=$(docker ps | egrep "openshift3/(origin|ose)-haproxy-router" | awk '{print $1}')
[root@dhcp-41-10 ~]# sudo nsenter -m -u -n -i -p -t $(docker inspect --format "{{ .State.Pid }}" "$cid")
[root@dhcp-41-10 /]# cd /var/lib/haproxy/conf
[root@dhcp-41-10 conf]# more /var/lib/haproxy/conf/os_http_be.map
test-route.apps.usersys.redhat.com cake_test-route
hello-openshift.com cake_hello-route
[root@dhcp-41-10 conf]# 
#####

Comment 9 Weibin Liang 2016-10-17 20:11:29 UTC
Mark, I think this time I replicate the same issue, here is router logs:

####
[root@dhcp-41-180 ~]# oc logs router-shard-3-ws0tf -n default
I1017 15:36:10.039821       1 router.go:120] Creating a new template router, writing to /var/lib/containers/router
I1017 15:36:10.041004       1 router.go:122] Router will use default/router-shard service to identify peers
I1017 15:36:10.041044       1 router.go:191] Template router will coalesce reloads within 5 seconds of each other
I1017 15:36:10.041051       1 router.go:161] Reading persisted state
I1017 15:36:10.041231       1 router.go:165] Committing state
I1017 15:36:10.041275       1 router.go:236] Writing the router state
I1017 15:36:10.042100       1 merged_client_builder.go:102] No kubeconfig could be created, falling back to service account.
I1017 15:36:10.042522       1 router.go:153] Router is only using routes in namespaces matching name=cake
I1017 15:36:10.042588       1 controller.go:47] Running router controller
I1017 15:36:10.043627       1 router.go:241] Writing the router config
I1017 15:36:10.044076       1 router.go:246] Reloading the router
I1017 15:36:10.078884       1 router.go:628] Updating skip commit to: %!s(bool=true)
I1017 15:36:10.078904       1 controller.go:69] Updating watched namespaces: map[]
I1017 15:36:10.119416       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1017 15:36:10.119464       1 router.go:225] Skipping router commit until last sync has been processed
I1017 15:36:10.119479       1 reaper.go:17] Launching reaper
I1017 15:36:10.122253       1 controller.go:69] Updating watched namespaces: map[]
I1017 15:36:10.122270       1 router.go:225] Skipping router commit until last sync has been processed
I1017 15:38:12.789046       1 router.go:628] Updating skip commit to: %!s(bool=false)
I1017 15:38:12.789071       1 controller.go:97] Processing Route: hello-route -> hello-service
I1017 15:38:12.789076       1 controller.go:98]            Alias: hello-openshift.com
I1017 15:38:12.789079       1 controller.go:99]            Event: ADDED
I1017 15:38:22.871565       1 controller.go:97] Processing Route: test-route -> hello-service
I1017 15:38:22.871599       1 controller.go:98]            Alias: test-route.apps.usersys.redhat.com
I1017 15:38:22.871604       1 controller.go:99]            Event: ADDED
I1017 15:46:00.126201       1 controller.go:69] Updating watched namespaces: map[cake:{}]
I1017 15:46:11.046511       1 router.go:628] Updating skip commit to: %!s(bool=true)
I1017 15:46:11.046912       1 controller.go:97] Processing Route: hello-route -> hello-service
I1017 15:46:11.046934       1 controller.go:98]            Alias: hello-openshift.com
I1017 15:46:11.046938       1 controller.go:99]            Event: MODIFIED
I1017 15:46:11.046963       1 unique_host.go:157] Route cake/hello-service claims hello-openshift.com
I1017 15:46:11.046998       1 status.go:162] has last touch <nil> for cake/hello-route
I1017 15:46:11.047040       1 status.go:252] admit: admitting route by updating status: hello-route (true): hello-openshift.com
E1017 15:46:11.061318       1 status.go:195] Unable to write router status - please ensure you reconcile your system policy or grant this router access to update route status: User "system:serviceaccount:default:user1" cannot update routes/status in project "cake"
I1017 15:46:11.061342       1 plugin.go:176] Creating new frontend for key: cake/hello-service
I1017 15:46:11.061357       1 plugin.go:180] Modifying routes for cake/hello-service
I1017 15:46:11.061379       1 router.go:225] Skipping router commit until last sync has been processed
I1017 15:46:11.061403       1 plugin.go:140] Processing 1 Endpoints for Name: hello-service (MODIFIED)
I1017 15:46:11.061430       1 plugin.go:143]   Subset 0 : api.EndpointSubset{Addresses:[]api.EndpointAddress{api.EndpointAddress{IP:"10.1.0.2", TargetRef:(*api.ObjectReference)(0xc2083fc2a0)}, api.EndpointAddress{IP:"10.1.0.3", TargetRef:(*api.ObjectReference)(0xc2083fc3f0)}}, NotReadyAddresses:[]api.EndpointAddress(nil), Ports:[]api.EndpointPort{api.EndpointPort{Name:"", Port:8080, Protocol:"TCP"}}}
I1017 15:46:11.061495       1 plugin.go:152] Modifying endpoints for cake/hello-service
I1017 15:46:11.061510       1 router.go:225] Skipping router commit until last sync has been processed
I1017 15:46:11.061522       1 router.go:628] Updating skip commit to: %!s(bool=false)
I1017 15:46:11.061527       1 controller.go:97] Processing Route: test-route -> hello-service
I1017 15:46:11.061529       1 controller.go:98]            Alias: test-route.apps.usersys.redhat.com
I1017 15:46:11.061532       1 controller.go:99]            Event: MODIFIED
I1017 15:46:11.061538       1 unique_host.go:157] Route cake/hello-service claims test-route.apps.usersys.redhat.com
I1017 15:46:11.061542       1 status.go:162] has last touch <nil> for cake/test-route
I1017 15:46:11.061562       1 status.go:252] admit: admitting route by updating status: test-route (true): test-route.apps.usersys.redhat.com
E1017 15:46:11.063024       1 status.go:195] Unable to write router status - please ensure you reconcile your system policy or grant this router access to update route status: User "system:serviceaccount:default:user1" cannot update routes/status in project "cake"
I1017 15:46:11.063036       1 plugin.go:180] Modifying routes for cake/hello-service
I1017 15:46:11.063068       1 router.go:236] Writing the router state
I1017 15:46:11.063302       1 router.go:241] Writing the router config
I1017 15:46:11.064597       1 router.go:246] Reloading the router
I1017 15:46:11.160973       1 reaper.go:24] Signal received: child exited
I1017 15:46:11.161044       1 reaper.go:32] Reaped process with pid 28
I1017 15:46:11.169444       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1017 15:46:11.169474       1 reaper.go:24] Signal received: child exited
[root@dhcp-41-180 ~]# curl --resolve test-route.apps.usersys.redhat.com:80:10.18.41.10 http://test-route.apps.usersys.redhat.com
Hello OpenShift!
[root@dhcp-41-180 ~]#  curl --resolve hello-openshift.com:80:10.18.41.10  http://hello-openshift.com
Hello OpenShift!
[root@dhcp-41-180 ~]# 
####

Look at the line "I1017 15:46:00.126201       1 controller.go:69] Updating watched namespaces: map[cake:{}]", it means even admin create new routes in namspace/cake, but it will get update 10 minutes late, and two curl will work after 10 minutes.

I think this is same issue as this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1355711 which mentioned the router watches for changes to routes and endpoints, but only queries namespace state on startup and every subsequent sync interval which is 10m by default.

If I update --resync-interval=3m, updating watched namespaces occur after 3m and curl also works after 3m.

Here is the logs from setting --resync-interval=3m:
####
[root@dhcp-41-180 ~]# oc logs router-shard-3-z3b3a -n default
I1017 16:02:30.254236       1 router.go:120] Creating a new template router, writing to /var/lib/containers/router
I1017 16:02:30.255707       1 router.go:122] Router will use default/router-shard service to identify peers
I1017 16:02:30.255748       1 router.go:191] Template router will coalesce reloads within 5 seconds of each other
I1017 16:02:30.255753       1 router.go:161] Reading persisted state
I1017 16:02:30.256006       1 router.go:165] Committing state
I1017 16:02:30.256060       1 router.go:236] Writing the router state
I1017 16:02:30.256737       1 merged_client_builder.go:102] No kubeconfig could be created, falling back to service account.
I1017 16:02:30.257160       1 router.go:153] Router is only using routes in namespaces matching name=cake
I1017 16:02:30.257262       1 controller.go:47] Running router controller
I1017 16:02:30.258913       1 router.go:241] Writing the router config
I1017 16:02:30.259276       1 router.go:246] Reloading the router
I1017 16:02:30.276747       1 router.go:628] Updating skip commit to: %!s(bool=true)
I1017 16:02:30.276803       1 controller.go:69] Updating watched namespaces: map[]
I1017 16:02:30.326290       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1017 16:02:30.326336       1 router.go:225] Skipping router commit until last sync has been processed
I1017 16:02:30.326349       1 reaper.go:17] Launching reaper
I1017 16:02:30.329129       1 controller.go:69] Updating watched namespaces: map[]
I1017 16:02:30.329145       1 router.go:225] Skipping router commit until last sync has been processed
I1017 16:02:49.489496       1 router.go:628] Updating skip commit to: %!s(bool=false)
I1017 16:02:49.489517       1 controller.go:97] Processing Route: hello-route -> hello-service
I1017 16:02:49.489521       1 controller.go:98]            Alias: hello-openshift.com
I1017 16:02:49.489524       1 controller.go:99]            Event: ADDED
I1017 16:02:50.803688       1 controller.go:97] Processing Route: test-route -> hello-service
I1017 16:02:50.803721       1 controller.go:98]            Alias: test-route.apps.usersys.redhat.com
I1017 16:02:50.803726       1 controller.go:99]            Event: ADDED
I1017 16:05:20.333160       1 controller.go:69] Updating watched namespaces: map[cake:{}]
I1017 16:05:31.263393       1 router.go:628] Updating skip commit to: %!s(bool=true)
I1017 16:05:31.263431       1 controller.go:97] Processing Route: hello-route -> hello-service
I1017 16:05:31.263440       1 controller.go:98]            Alias: hello-openshift.com
I1017 16:05:31.263444       1 controller.go:99]            Event: MODIFIED
I1017 16:05:31.263489       1 unique_host.go:157] Route cake/hello-service claims hello-openshift.com
I1017 16:05:31.263526       1 status.go:162] has last touch <nil> for cake/hello-route
I1017 16:05:31.263585       1 status.go:252] admit: admitting route by updating status: hello-route (true): hello-openshift.com
E1017 16:05:31.289823       1 status.go:195] Unable to write router status - please ensure you reconcile your system policy or grant this router access to update route status: User "system:serviceaccount:default:user1" cannot update routes/status in project "cake"
I1017 16:05:31.289852       1 plugin.go:176] Creating new frontend for key: cake/hello-service
I1017 16:05:31.289866       1 plugin.go:180] Modifying routes for cake/hello-service
I1017 16:05:31.289888       1 router.go:225] Skipping router commit until last sync has been processed
I1017 16:05:31.289985       1 controller.go:97] Processing Route: test-route -> hello-service
I1017 16:05:31.289991       1 controller.go:98]            Alias: test-route.apps.usersys.redhat.com
I1017 16:05:31.289993       1 controller.go:99]            Event: MODIFIED
I1017 16:05:31.290001       1 unique_host.go:157] Route cake/hello-service claims test-route.apps.usersys.redhat.com
I1017 16:05:31.290006       1 status.go:162] has last touch <nil> for cake/test-route
I1017 16:05:31.290053       1 status.go:252] admit: admitting route by updating status: test-route (true): test-route.apps.usersys.redhat.com
E1017 16:05:31.291834       1 status.go:195] Unable to write router status - please ensure you reconcile your system policy or grant this router access to update route status: User "system:serviceaccount:default:user1" cannot update routes/status in project "cake"
I1017 16:05:31.291854       1 plugin.go:180] Modifying routes for cake/hello-service
I1017 16:05:31.291866       1 router.go:225] Skipping router commit until last sync has been processed
I1017 16:05:31.291917       1 router.go:628] Updating skip commit to: %!s(bool=false)
I1017 16:05:31.291930       1 plugin.go:140] Processing 1 Endpoints for Name: hello-service (MODIFIED)
I1017 16:05:31.291953       1 plugin.go:143]   Subset 0 : api.EndpointSubset{Addresses:[]api.EndpointAddress{api.EndpointAddress{IP:"10.1.0.2", TargetRef:(*api.ObjectReference)(0xc20860e070)}, api.EndpointAddress{IP:"10.1.0.3", TargetRef:(*api.ObjectReference)(0xc20860e230)}}, NotReadyAddresses:[]api.EndpointAddress(nil), Ports:[]api.EndpointPort{api.EndpointPort{Name:"", Port:8080, Protocol:"TCP"}}}
I1017 16:05:31.292022       1 plugin.go:152] Modifying endpoints for cake/hello-service
I1017 16:05:31.292061       1 router.go:236] Writing the router state
I1017 16:05:31.292269       1 router.go:241] Writing the router config
I1017 16:05:31.293491       1 router.go:246] Reloading the router
I1017 16:05:31.404642       1 reaper.go:24] Signal received: child exited
I1017 16:05:31.404713       1 reaper.go:32] Reaped process with pid 28
I1017 16:05:31.413302       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1017 16:05:31.413338       1 reaper.go:24] Signal received: child exited
[root@dhcp-41-180 ~]# curl --resolve test-route.apps.usersys.redhat.com:80:10.18.41.10 http://test-route.apps.usersys.redhat.com
Hello OpenShift!
[root@dhcp-41-180 ~]# curl --resolve hello-openshift.com:80:10.18.41.10  http://hello-openshift.com
Hello OpenShift!
[root@dhcp-41-180 ~]# 
####

Comment 10 Ben Bennett 2016-10-24 19:57:51 UTC

*** This bug has been marked as a duplicate of bug 1355711 ***

Comment 11 Alexander Koksharov 2016-11-15 10:27:16 UTC
I think there is a misunderstanding. Our problem is as follows: If a router pod gets restarted routes do not get picked up. Steps to reproduce the issue:

1) create a route within a project with NAMESPACE_LABELS
2) create a router shard with NAMESPACE_LABELS
3) restart the router shard
4) look at the configured maps in the router pod: oc exec router-11-y3q7q -- ls -l
     >> all map files are of size 2

Our issue does not involve newly created namespaces.

Comment 14 Ben Bennett 2016-11-15 15:26:33 UTC
Ram: Something is funny with it not updating the watched namespaces until 3 minutes after the router starts, yet it knows immediately that it is limited.  Is it possible that they are missing some permissions and that's causing trouble?  They clearly can't write route statuses, but because the service existed their 'oadm router' command failed... so it is possible that they aren't running what they expect.


I1017 16:02:30.257160       1 router.go:153] Router is only using routes in namespaces matching name=cake
I1017 16:02:30.257262       1 controller.go:47] Running router controller
I1017 16:02:30.258913       1 router.go:241] Writing the router config
I1017 16:02:30.259276       1 router.go:246] Reloading the router
I1017 16:02:30.276747       1 router.go:628] Updating skip commit to: %!s(bool=true)
I1017 16:02:30.276803       1 controller.go:69] Updating watched namespaces: map[]
I1017 16:02:30.326290       1 router.go:321] Router reloaded:
 - Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
I1017 16:02:30.326336       1 router.go:225] Skipping router commit until last sync has been processed
I1017 16:02:30.326349       1 reaper.go:17] Launching reaper
I1017 16:02:30.329129       1 controller.go:69] Updating watched namespaces: map[]
I1017 16:02:30.329145       1 router.go:225] Skipping router commit until last sync has been processed
I1017 16:02:49.489496       1 router.go:628] Updating skip commit to: %!s(bool=false)
I1017 16:02:49.489517       1 controller.go:97] Processing Route: hello-route -> hello-service
I1017 16:02:49.489521       1 controller.go:98]            Alias: hello-openshift.com
I1017 16:02:49.489524       1 controller.go:99]            Event: ADDED
I1017 16:02:50.803688       1 controller.go:97] Processing Route: test-route -> hello-service
I1017 16:02:50.803721       1 controller.go:98]            Alias: test-route.apps.usersys.redhat.com
I1017 16:02:50.803726       1 controller.go:99]            Event: ADDED
I1017 16:05:20.333160       1 controller.go:69] Updating watched namespaces: map[cake:{}]

Comment 15 Ram Ranganathan 2016-11-16 01:43:23 UTC
@Ben status update failing due to perms / not updating the status is fine - it could well be an older version of the OpenShift API with a newer router image? Is that the case here - Alexander/Mark? Similar to another customer that 
And as per Alexander's last comment, its happening at startup - I can't reproduce it locally. I see no delay in the routes being picked up (well a few seconds 3/4 but that's expected behavior). 

@Alexander/Mark is it possible to get detailed logs with loglevel=4
Basically the router dc needs to be edited to have something like:

$ oc edit dc/router
   command:
     - /usr/bin/openshift-router             
     - --loglevel=4

$ #  note: command should be at the same level as 
$ #        image: openshift/origin-haproxy-router:...  tag.

I think Mark already posted some log snippets with loglevel=4 but having the entire
log all might help along with env list for the router, and the label info on one or more of the matching namespaces.

$ oc get namespace $insert-names-here -o yaml > /tmp/namespaces-$names.yaml
$ #  example oc get namespace foo bar baz -o yaml > /tmp/namespaces-$names.yaml

$ oc env dc/router --list
$ oc logs $(oc get pods  | grep router | awk '{print $1}')  > /tmp/routerpod.log

Thx

Comment 32 Ben Bennett 2017-01-27 19:31:55 UTC
This is the 3.5 work, setting it to modified since that code has landed.  The backports are being tracked under other bzs.

Comment 33 Troy Dawson 2017-01-31 20:29:02 UTC
This has been merged into ocp and is in OCP v3.5.0.12 or newer.

Comment 34 zhaozhanqi 2017-02-03 08:38:58 UTC
Tested this bug on 
openshift v3.5.0.14+20b49d0
kubernetes v1.5.2+43a9be4
etcd 3.1.0

with image registry.ops.openshift.com/openshift3/ose-haproxy-router   v3.5.0.14           887231f497c5

steps:

1. oc new-project u2p4
2. oc label namespace u2p4 team=red
3. oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/edge/service_unsecure.json
4. oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/caddy-docker.json
5. oc expose svc service-unsecure
6.  oadm router
7. oc env dc/router NAMESPACE_LABELS=team=red
8. Check the route of u2p4 
   curl service-unsecure-u2p4.0203-4h9.qe.rhcloud.com
   Hello-OpenShift-1 http-8080

Comment 44 errata-xmlrpc 2017-04-12 19:07:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0884


Note You need to log in before you can comment on or make changes to this bug.