My hunch is that this is because the ELB in front of the router is obscuring the source address. @Phil, can you take a look to see whether src honors the proxy protocol? If so, we need to update the docs to tell people the whitelist will only work if proxy protocol is in use.
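For context, haproxy only reports the original client address in src when the frontend bind line accepts the PROXY protocol; a minimal sketch (frontend name and port are illustrative, not taken from the router's actual config):

```
frontend public
    # Without accept-proxy, src is the address of the ELB itself;
    # with accept-proxy, src is taken from the PROXY protocol header
    # that the load balancer prepends to the connection.
    bind :80 accept-proxy
    default_backend be_http:p1:route1
```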
@yandu, it works fine in my env with the 3.9.121 image; curl from both master and node works.

[root@ip-172-18-8-94 ~]# oc annotate route route1 --overwrite haproxy.router.openshift.io/ip_whitelist='172.18.8.94 172.18.3.148'
route "route1" annotated
[root@ip-172-18-8-94 ~]# curl --resolve hello-openshift.com:80:172.18.3.148 http://hello-openshift.com
Hello OpenShift!
[root@ip-172-18-8-94 ~]# oc project default
Now using project "default" on server "https://ip-172-18-8-94.ec2.internal:8443".
[root@ip-172-18-8-94 ~]# oc rsh router-1-cqgf1
sh-4.2$ grep -B6 white haproxy.config
# Plain http backend
backend be_http:p1:route1
  mode http
  option redispatch
  option forwardfor
  balance leastconn
  acl whitelist src 172.18.8.94 172.18.3.148
  tcp-request content reject if !whitelist
sh-4.2$ exit
exit

Additional info:
When oc env dc/router ROUTER_USE_PROXY_PROTOCOL=false, both curls work.
When oc env dc/router ROUTER_USE_PROXY_PROTOCOL=true, both curls fail.
Does route1-d1.0623-8o4.qe.rhcloud.com point to the specific node running the router, or is it a Google load balancer that points to that node? If route1-d1.0623-8o4.qe.rhcloud.com points directly to the router, then I'm not sure why weibin's test works and yandu's fails. If route1-d1.0623-8o4.qe.rhcloud.com does go through a Google TCP Proxy Load Balancer, then we can explain what happened: the load balancer needs to be configured with "proxy protocol" (https://cloud.google.com/compute/docs/load-balancing/tcp-ssl/tcp-proxy#proxy-protocol), and the router needs to run with ROUTER_USE_PROXY_PROTOCOL=true. Both sides must always match: proxy protocol must be enabled on both the LB and the router, or disabled on both. If they do not match, NOTHING will work. If there is no Google load balancer involved, can you give us login details for the reproducer? I wonder if some form of NAT could exist that might cause the issue you are describing, but I'm not certain how to prove that without trial and error.
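As a side note, you can hand-test a router that expects the PROXY protocol by sending the v1 header yourself; a rough sketch (the client/destination addresses and ports below are made up for illustration):

```shell
# Build a PROXY protocol v1 header announcing a (hypothetical) client address.
SRC=10.18.41.193
DST=172.18.3.148
HDR=$(printf 'PROXY TCP4 %s %s 41356 80' "$SRC" "$DST")
echo "$HDR"
# To use it, prepend the header to a plain HTTP request, e.g.:
#   { printf '%s\r\n' "$HDR"
#     printf 'GET / HTTP/1.0\r\nHost: hello-openshift.com\r\n\r\n'
#   } | nc 172.18.3.148 80
# Recent curl builds can also send the header themselves with
# curl --haproxy-protocol (if your curl is new enough to have it).
```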
I think @weibin may have misunderstood the bug. I could also curl successfully from the master/node. It only fails when we curl from our own local VM, and most normal users will use their own VMs to connect to OCP and access the route, since they don't have permission to log in to the master/node. You can refer to the detailed info in step 5 and step 6 above.
@yandu Ohhhh, OK. So the problem is likely that the router is seeing the source as 127.0.0.1? Can you add 127.0.0.1 to the whitelist? If that works, it would prove my hypothesis.
Docs for PROXY Protocol https://docs.openshift.com/container-platform/3.5/install_config/router/proxy_protocol.html
@yandu, I updated my testing env as you described above, and now I cannot curl from my local VM when I add my VM's IP (10.18.41.193) to the whitelist. On the AWS router, run tcpdump to capture the HTTP packets while curling from the local VM, and you will see the source IP in the packets is 66.187.233.206 (not 10.18.41.193). If you put 66.187.233.206 in the whitelist, curl from the local VM works again. The test case failed because the wrong IP address was included in the whitelist.
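For anyone repeating this check, the address the router actually sees can be read straight off the capture; a small sketch (the tcpdump line below is a fabricated sample in tcpdump's default format, not real capture output):

```shell
# Fabricated sample line from something like: tcpdump -n -i eth0 port 80
LINE='12:00:01.000000 IP 66.187.233.206.41356 > 172.18.3.148.80: Flags [P.], length 78'
# Field 3 is src-ip.src-port; strip the trailing .port to get the address.
SRC_IP=$(echo "$LINE" | awk '{print $3}' | sed 's/\.[0-9]*$//')
echo "$SRC_IP"   # this is the address that must appear in the whitelist
```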