Bug 1275003
| Summary: | router reports 503 for seemingly properly identified route when endpoints accessible | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Erik M Jacobs <ejacobs> |
| Component: | Networking | Assignee: | Michail Kargakis <mkargaki> |
| Networking sub component: | router | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
| Severity: | urgent | ||
| Priority: | unspecified | CC: | aos-bugs, ejacobs, pweil, sdodson |
| Version: | 3.1.0 | ||
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-11-23 14:26:31 UTC | Type: | Bug |
All endpoints are reachable, in case anyone was wondering (to be thorough):
[root@ose3-master training]# curl 10.1.1.6:8080
Hello OpenShift!
[root@ose3-master training]# curl 10.1.1.7:8080
Hello OpenShift!
[root@ose3-master training]# curl 10.1.2.2:8080
Hello OpenShift!
[root@ose3-master training]# oc exec router-2-2ty6y -it -- bash
[root@ose3-master conf]# curl 10.1.1.6:8080
Hello OpenShift!
[root@ose3-master conf]# curl 10.1.1.7:8080
Hello OpenShift!
[root@ose3-master conf]# curl 10.1.2.2:8080
Hello OpenShift!
Inside the router:
[root@ose3-master /]# find -name '*.map'
./var/lib/haproxy/conf/os_http_be.map
./var/lib/haproxy/conf/os_sni_passthrough.map
./var/lib/haproxy/conf/os_reencrypt.map
./var/lib/haproxy/conf/os_edge_http_be.map
./var/lib/haproxy/conf/os_tcp_be.map
./usr/share/groff/1.22.2/font/devps/generate/dingbats.map
[root@ose3-master /]# cd /var/lib/haproxy/conf/
[root@ose3-master conf]# cat os_http_be.map
hello-service-demo.cloudapps.example.com demo_hello-service
----
Yes, everything above ---- is whitespace/blank
[root@ose3-master training]# oc logs router-1-vyia4
I1024 16:35:35.754370 1 router.go:122] Router is including routes in all namespaces
Logs don't indicate anything...
Should I not see something like:
backend be_http_tim-test-2_nodejs-example-http
mode http
option redispatch
balance leastconn
timeout check 5000ms
cookie OPENSHIFT_tim-test-2_nodejs-example-http_SERVERID insert indirect nocache httponly
server 10.1.3.6:8080 10.1.3.6:8080 check inter 5000ms cookie 10.1.3.6:8080
@Erik
QE cannot reproduce this issue. Here are the steps:
1. Create 3 pods with the label 'name: hello-nginx-docker'
# oc get pod -n zzhao
NAME READY STATUS RESTARTS AGE
hello-nginx-docker 1/1 Running 0 19m
hello-nginx-docker-2 1/1 Running 0 18m
hello-nginx-docker-3 1/1 Running 0 17m
2. Create a service that maps to those three pods
[root@openshift-140 ~]# oc get endpoints -n zzhao
NAME ENDPOINTS AGE
hello-nginx 10.1.1.37:80,10.1.1.38:80,10.1.1.39:80 18m
3. oc expose svc hello-nginx
# oc get route
NAME HOST/PORT PATH SERVICE LABELS TLS TERMINATION
hello-nginx hello-nginx-zzhao.ose-appoxza.com.cn hello-nginx name=hello-nginx
4. curl the route
# curl hello-nginx-zzhao.ose-appoxza.com.cn
Hello World
I suspect your haproxy.conf is wrong. Here:
<------snip---->
cookie OPENSHIFT_tim-test-2_nodejs-example-http_SERVERID insert indirect nocache httponly
server 10.1.3.6:8080 10.1.3.6:8080 check inter 5000ms cookie 10.1.3.6:8080
<-----snip---->
I don't know why the pod IP is '10.1.3.6'; normally these should be the IPs of your three pods.
You can compare with my side:
######haproxy.conf######
backend be_http_zzhao_hello-nginx
mode http
option redispatch
option forwardfor
balance leastconn
timeout check 5000ms
http-request set-header X-Forwarded-Host %[req.hdr(host)]
http-request set-header X-Forwarded-Port %[dst_port]
http-request set-header X-Forwarded-Proto https if { ssl_fc }
cookie OPENSHIFT_zzhao_hello-nginx_SERVERID insert indirect nocache httponly
http-request set-header X-Forwarded-Proto http
http-request set-header Forwarded for=%[src],host=%[req.hdr(host)],proto=%[req.hdr(X-Forwarded-Proto)]
server 10.1.1.37:80 10.1.1.37:80 check inter 5000ms cookie 10.1.1.37:80
server 10.1.1.38:80 10.1.1.38:80 check inter 5000ms cookie 10.1.1.38:80
server 10.1.1.39:80 10.1.1.39:80 check inter 5000ms cookie 10.1.1.39:80
#######################################
Could you double-check that your service works? I suspect the service is not working.
Thanks.
There is no file 'haproxy.conf' in my router:
[root@ose3-master /]# find -name '*.map*'
./var/lib/haproxy/conf/os_http_be.map
./var/lib/haproxy/conf/os_sni_passthrough.map
./var/lib/haproxy/conf/os_reencrypt.map
./var/lib/haproxy/conf/os_edge_http_be.map
./var/lib/haproxy/conf/os_tcp_be.map
./usr/share/groff/1.22.2/font/devps/generate/dingbats.map
[root@ose3-master /]# find -name 'haproxy.conf'
There is this file:
./var/lib/haproxy/conf/haproxy.config
It has the following:
##-------------- app level backends ----------------
backend be_http_demo_hello-service
mode http
option redispatch
option forwardfor
balance leastconn
timeout check 5000ms
http-request set-header X-Forwarded-Host %[req.hdr(host)]
http-request set-header X-Forwarded-Port %[dst_port]
http-request set-header X-Forwarded-Proto https if { ssl_fc }
cookie OPENSHIFT_demo_hello-service_SERVERID insert indirect nocache httponly
http-request set-header X-Forwarded-Proto http
http-request set-header Forwarded for=%[src],host=%[req.hdr(host)],proto=%[req.hdr(X-Forwarded-Proto)]
###
There are no IPs listed.
The service works:
[root@ose3-master training]# oc get service -n demo
NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
hello-service 172.30.176.104 <none> 8888/TCP name=hello-openshift 3m
[root@ose3-master training]# curl 172.30.176.104:8888
Hello OpenShift!
[root@ose3-master training]# oc get endpoints hello-service -n demo
NAME ENDPOINTS AGE
hello-service 10.1.1.2:8080,10.1.2.2:8080,10.1.2.3:8080 4m
Note the router can reach the pods still:
[root@ose3-master training]# oc exec router-2-4rjez -it -- curl 10.1.2.3:8080
Hello OpenShift!
Something appears to be wrong with the router obtaining the endpoints?
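The symptom described above, a backend section in haproxy.config with no `server` lines, can be checked mechanically. The following is only a minimal sketch: it assumes the config layout shown in this report, and the function name `backends_without_servers` and the shortened sample text are illustrative, not part of the router.

```python
# HAProxy keywords that start a new configuration section.
SECTION_KEYWORDS = {"global", "defaults", "frontend", "backend", "listen"}


def backends_without_servers(config_text):
    """Return the names of backend sections that contain no `server` lines."""
    server_counts = {}  # backend name -> number of server lines seen
    current = None      # backend currently being scanned, if any
    for raw in config_text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        if words[0] in SECTION_KEYWORDS:
            # A new section starts; only track it if it is a named backend.
            is_backend = words[0] == "backend" and len(words) > 1
            current = words[1] if is_backend else None
            if current is not None:
                server_counts.setdefault(current, 0)
        elif words[0] == "server" and current is not None:
            server_counts[current] += 1
    return [name for name, count in server_counts.items() if count == 0]


# Shortened copies of the two backends quoted in this report.
SAMPLE_CONFIG = """\
##-------------- app level backends ----------------
backend be_http_demo_hello-service
  mode http
  balance leastconn
  cookie OPENSHIFT_demo_hello-service_SERVERID insert indirect nocache httponly
backend be_http_zzhao_hello-nginx
  mode http
  balance leastconn
  server 10.1.1.37:80 10.1.1.37:80 check inter 5000ms cookie 10.1.1.37:80
"""

print(backends_without_servers(SAMPLE_CONFIG))
# prints ['be_http_demo_hello-service']
```

A backend in that list has no server to receive traffic, which is consistent with the "No server is available to handle this request" 503 page.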
That's weird; I have never seen this kind of issue before. Can you try deleting the service and re-creating it, preferably with a port other than '8888'?
I found that /var/lib/containers/router/routes.json is a little different: 'PreferPort' is nil on my side.
"ServiceAliasConfigs": {
"demo_hello-service": {
"Host": "hello-service-demo.cloudapps.example.com",
"Path": "",
"TLSTermination": "",
"Certificates": null,
"Status": "",
"PreferPort": "8888" #### here it is nil on my side
I will try it again (I had to rebuild with 3.0.2), but I do not think the port is the issue. I am using the exact same JSON objects to define the pods and service with 3.0.2 as I am using with 3.1:
https://github.com/thoraxe/training/blob/31-fixes/content/hello-service-pods.json
https://github.com/thoraxe/training/blob/31-fixes/content/hello-service.json
Followed by:
oc expose service hello-service -l name=hello-openshift
The thing that is strange to me is that the router's JSON file (/var/lib/containers/router/routes.json) has the endpoints listed, but neither the map file nor the conf file has the endpoints. Again, the two JSON definitions (pods, service) + oc expose work in 3.0.2 on port 8888.
Michalis,
It looks like this is being caused by a service definition that uses a port that is not the same as the endpoint ports. The service in question is exposing 8888; the endpoints are serving on 8080. When the oc expose command is run, it sets the target port to 8888, which results in no endpoints being chosen when handling the route in the router plugin.
Related: https://github.com/openshift/origin/pull/5067
(In reply to Paul Weil from comment #8)
> Michalis,
>
> It looks like this is being caused by having a service definition that is
> using a port that is not the same as the endpoint ports.
Does that mean the service port must be the same as the 'target port' in the future? Or has the fix not been merged into AEP 3.1 yet? I reproduced this issue on AEP 3.1:
oc v3.0.2.903-29-g49953d6
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
(In reply to zhaozhanqi from comment #9)
> (In reply to Paul Weil from comment #8)
> > Michalis,
> >
> > It looks like this is being caused by having a service definition that is
> > using a port that is not the same as the endpoint ports.
>
> Does that mean the service port must be the same as the 'target port' in
> the future? Or has the fix not been merged into AEP 3.1 yet? I reproduced
> this issue on AEP 3.1:
>
> oc v3.0.2.903-29-g49953d6
> kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
No, Michalis has fixed it; it just has not merged yet.
Commit pushed to master at https://github.com/openshift/origin
https://github.com/openshift/origin/commit/af969529ab4af1776cd78dda61d142ee10d09cf9
Bug 1275003: expose: Set route port based on service target port
Route should be using the port used from the endpoints of the service and not the port the service is exposing.
This bug has been fixed:
oc v3.0.2.905
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
# oc get endpoints
NAME ENDPOINTS AGE
hello-nginx 10.1.2.114:8080 2h
"Path": "",
"TLSTermination": "",
"Certificates": null,
"Status": "saved",
"PreferPort": "8080",
"InsecureEdgeTerminationPolicy": ""
}
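The change described in the commit message (take the route port from the service's target port rather than from the port the service exposes) can be illustrated with a small model. This is a sketch under stated assumptions, not the code from the commit: the function names, the `SERVICE_PORT` dictionary standing in for the service definition, and the matching rule in `matching_endpoints` are all hypothetical. The port numbers and endpoint addresses come from this report.

```python
# Hypothetical stand-in for the port entry of the hello-service definition:
# the service exposes 8888 while the pods listen on 8080.
SERVICE_PORT = {"port": 8888, "targetPort": 8080}

# Endpoints reported by `oc get endpoints hello-service -n demo`.
ENDPOINTS = [("10.1.1.6", "8080"), ("10.1.1.7", "8080"), ("10.1.2.2", "8080")]


def route_port_before_fix(service_port):
    """Model of the old behaviour: the route port is the exposed service port."""
    return str(service_port["port"])


def route_port_after_fix(service_port):
    """Model of the fixed behaviour: the route port is the service target port.

    Falls back to the exposed port when no target port is set, which is how
    Kubernetes defaults targetPort.
    """
    return str(service_port.get("targetPort", service_port["port"]))


def matching_endpoints(endpoints, prefer_port):
    """Assumed selection rule: keep endpoints whose port equals prefer_port.

    An empty prefer_port keeps every endpoint.
    """
    if not prefer_port:
        return list(endpoints)
    return [ep for ep in endpoints if ep[1] == prefer_port]


for label, choose in (("before", route_port_before_fix),
                      ("after", route_port_after_fix)):
    port = choose(SERVICE_PORT)
    chosen = matching_endpoints(ENDPOINTS, port)
    print(f"{label} fix: route port {port}, {len(chosen)} endpoint(s) selected")
# prints:
# before fix: route port 8888, 0 endpoint(s) selected
# after fix: route port 8080, 3 endpoint(s) selected
```

Under this model, the "before" case leaves the backend with no servers (hence the 503), while the "after" case matches the verified output above, where "PreferPort" is "8080".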
This fix is available in OpenShift Enterprise 3.1.
[joe@ose3-master ~]$ oc version
oc v3.0.2.903
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
atomic-openshift-3.0.2.903-0.git.0.a4ff36b.el7aos.x86_64
atomic-openshift-clients-3.0.2.903-0.git.0.a4ff36b.el7aos.x86_64
atomic-openshift-master-3.0.2.903-0.git.0.a4ff36b.el7aos.x86_64
atomic-openshift-node-3.0.2.903-0.git.0.a4ff36b.el7aos.x86_64
atomic-openshift-sdn-ovs-3.0.2.903-0.git.0.a4ff36b.el7aos.x86_64
tuned-profiles-atomic-openshift-node-3.0.2.903-0.git.0.a4ff36b.el7aos.x86_64
master node can reach other node with pod:
[root@ose3-master training]# oc get endpoints hello-service -n demo
NAME ENDPOINTS AGE
hello-service 10.1.1.6:8080,10.1.1.7:8080,10.1.2.2:8080 10m
[root@ose3-master training]# curl 10.1.2.2:8080
Hello OpenShift!
[root@ose3-master
router configuration:
[root@ose3-master training]# oc exec router-2-2ty6y -- cat /var/lib/containers/router/routes.json
{
  "default/kubernetes": {
    "Name": "default/kubernetes",
    "EndpointTable": [
      {
        "ID": "192.168.133.2:8443",
        "IP": "192.168.133.2",
        "Port": "8443",
        "TargetName": "192.168.133.2",
        "PortName": "https"
      }
    ],
    "ServiceAliasConfigs": {}
  },
  "default/router": {
    "Name": "default/router",
    "EndpointTable": [
      {
        "ID": "192.168.133.2:80",
        "IP": "192.168.133.2",
        "Port": "80",
        "TargetName": "router-2-2ty6y",
        "PortName": "80-tcp"
      }
    ],
    "ServiceAliasConfigs": {}
  },
  "demo/hello-service": {
    "Name": "demo/hello-service",
    "EndpointTable": [
      {
        "ID": "10.1.1.6:8080",
        "IP": "10.1.1.6",
        "Port": "8080",
        "TargetName": "hello-openshift-1",
        "PortName": ""
      },
      {
        "ID": "10.1.1.7:8080",
        "IP": "10.1.1.7",
        "Port": "8080",
        "TargetName": "hello-openshift-3",
        "PortName": ""
      },
      {
        "ID": "10.1.2.2:8080",
        "IP": "10.1.2.2",
        "Port": "8080",
        "TargetName": "hello-openshift-2",
        "PortName": ""
      }
    ],
    "ServiceAliasConfigs": {
      "demo_hello-service": {
        "Host": "hello-service-demo.cloudapps.example.com",
        "Path": "",
        "TLSTermination": "",
        "Certificates": null,
        "Status": "",
        "PreferPort": "8888"
      }
    }
  }
}
router reports 503:
[root@ose3-master training]# curl hello-service-demo.cloudapps.example.com
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
router reports 503 inside router:
[root@ose3-master training]# oc exec router-2-2ty6y -it -- bash
[root@ose3-master conf]# curl hello-service-demo.cloudapps.example.com
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
router can reach pod on other node, so 503 isn't related to endpoint:
[root@ose3-master conf]# curl 10.1.2.2:8080
Hello OpenShift!
What's wrong here?
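The routes.json dump in this description already contains the mismatch that the comments later identify: "PreferPort" is "8888" while every endpoint in the table listens on "8080". A minimal sketch of a check for that condition follows. It assumes the routes.json layout shown here and assumes that a route matches an endpoint when "PreferPort" equals the endpoint's "Port" or "PortName". The function name `aliases_without_endpoints` is hypothetical, and the embedded JSON is a trimmed copy of the dump.

```python
import json

# Trimmed copy of the routes.json content from this report.
ROUTES_JSON = """
{
  "default/router": {
    "Name": "default/router",
    "EndpointTable": [
      {"ID": "192.168.133.2:80", "IP": "192.168.133.2", "Port": "80",
       "TargetName": "router-2-2ty6y", "PortName": "80-tcp"}
    ],
    "ServiceAliasConfigs": {}
  },
  "demo/hello-service": {
    "Name": "demo/hello-service",
    "EndpointTable": [
      {"ID": "10.1.1.6:8080", "IP": "10.1.1.6", "Port": "8080",
       "TargetName": "hello-openshift-1", "PortName": ""},
      {"ID": "10.1.1.7:8080", "IP": "10.1.1.7", "Port": "8080",
       "TargetName": "hello-openshift-3", "PortName": ""},
      {"ID": "10.1.2.2:8080", "IP": "10.1.2.2", "Port": "8080",
       "TargetName": "hello-openshift-2", "PortName": ""}
    ],
    "ServiceAliasConfigs": {
      "demo_hello-service": {
        "Host": "hello-service-demo.cloudapps.example.com",
        "Path": "", "TLSTermination": "", "Certificates": null,
        "Status": "", "PreferPort": "8888"
      }
    }
  }
}
"""


def aliases_without_endpoints(routes):
    """List (service unit, alias, PreferPort) triples that match no endpoint."""
    problems = []
    for unit_name, unit in routes.items():
        endpoints = unit.get("EndpointTable") or []
        # Assumed rule: PreferPort may name either a port number or a port name.
        known = {ep["Port"] for ep in endpoints}
        known |= {ep["PortName"] for ep in endpoints if ep.get("PortName")}
        for alias, cfg in (unit.get("ServiceAliasConfigs") or {}).items():
            prefer = cfg.get("PreferPort") or ""
            # An empty PreferPort is treated as "use every endpoint".
            if prefer and prefer not in known:
                problems.append((unit_name, alias, prefer))
    return problems


print(aliases_without_endpoints(json.loads(ROUTES_JSON)))
# prints [('demo/hello-service', 'demo_hello-service', '8888')]
```

To run it against a live router, the JSON text could be taken from the `oc exec ... cat /var/lib/containers/router/routes.json` command shown above instead of the embedded string.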