Description of problem:

A project with 2 routes points to a single service with 2 named ports: one for a health check and the other for application traffic. During startup/deployment, when the health check is up but the application is still starting, traffic on the application route is sent to the health check container, resulting in 404s.

The pod has 2 containers: the application, and a sidecar for add-on services that includes the health check. Requests to project-lb.public.com target the app port, which points to the application container. While the application container is still starting, a number of requests return 404 errors. From what we can see, once the application comes up, traffic stops being sent incorrectly to the health check container. With debug logging enabled on the health check container, we can see traffic for project-lb.public.com incorrectly arriving at the health check container and returning 404s.

Health Check Route:

apiVersion: v1
items:
- apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    creationTimestamp: 2019-03-29T19:31:23Z
    name: healthcheck
    namespace: project
    resourceVersion: "65987865"
    selfLink: /apis/route.openshift.io/v1/namespaces/project/routes/healthcheck
    uid: 3c9c3612-5259-11e9-a732-005056b97b10
  spec:
    host: healthcheck.paas.com
    port:
      targetPort: hm
    to:
      kind: Service
      name: svc-act
      weight: 100
    wildcardPolicy: None
  status:
    ingress:
    - conditions:
      - lastTransitionTime: 2019-03-29T19:31:23Z
        status: "True"
        type: Admitted
      routerName: router
      wildcardPolicy: None

Application Route:

- apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    creationTimestamp: 2019-03-29T19:31:05Z
    name: application
    namespace: project
    resourceVersion: "65987845"
    selfLink: /apis/route.openshift.io/v1/namespaces/project/routes/application
    uid: 31a7d081-5259-11e9-a732-005056b97b10
  spec:
    host: project-lb.public.com
    port:
      targetPort: app
    tls:
      termination: edge
    to:
      kind: Service
      name: svc-act
      weight: 100
    wildcardPolicy: None
  status:
    ingress:
    - conditions:
      - lastTransitionTime: 2019-03-29T19:31:05Z
        status: "True"
        type: Admitted
      routerName: router
      wildcardPolicy: None

Service:

apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"svc-act","namespace":"project"},"spec":{"ports":[{"name":"app","port":8080,"targetPort":8080},{"name":"hm","port":8081,"targetPort":9011}],"selector":{"name":"project-b"}}}
    creationTimestamp: 2019-03-29T19:31:09Z
    name: svc-act
    namespace: project
    resourceVersion: "80400351"
    selfLink: /api/v1/namespaces/project/services/svc-act
    uid: 3483c483-5259-11e9-a366-005056b97ad5
  spec:
    clusterIP: 172.29.82.203
    ports:
    - name: app
      port: 8080
      protocol: TCP
      targetPort: 8080
    - name: hm
      port: 8081
      protocol: TCP
      targetPort: 9011
    selector:
      name: project-b
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}

Health check log showing traffic for project-lb.public.com coming into port 9011, which is the health check port 'hm':

2019-07-16 14:50:27.450 DEBUG 8 --- [http-nio-9011-Acceptor-0] o.apache.tomcat.util.threads.LimitLatch : Counting up[http-nio-9011-Acceptor-0] latch=1
2019-07-16 14:50:27.450 DEBUG 8 --- [http-nio-9011-exec-5] o.a.tomcat.util.net.SocketWrapperBase : Socket: [org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper@4054f4a1:org.apache.tomcat.util.net.NioChannel@699d524f:java.nio.channels.SocketChannel[connected local=/10.1.1.1:9011 remote=/10.59.28.1:37190]], Read from buffer: [0]
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.a.coyote.http11.Http11InputBuffer : Received [GET /url/info HTTP/1.1
User-Agent: curl/7.29.0
Accept: */*
X-Forwarded-For: 10.1.1.1
Host: project-lb.public.com
X-Forwarded-Host: project-lb.public.com
X-Forwarded-Port: 443
X-Forwarded-Proto: https
Forwarded: for=10.1.1.1;host=fproject-lb.public.com;proto=https
]
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.a.c.authenticator.AuthenticatorBase : Security checking request GET /url/info
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] org.apache.catalina.realm.RealmBase : No applicable constraints defined
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.a.c.authenticator.AuthenticatorBase : Not subject to any constraint
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.s.b.w.f.OrderedRequestContextFilter : Bound request context to thread: org.apache.catalina.connector.RequestFacade@12f8d3c1
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet : DispatcherServlet with name 'dispatcherServlet' processing GET request for [/url/info]
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] s.w.s.m.m.a.RequestMappingHandlerMapping : Looking up handler method for path /url/info
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] s.w.s.m.m.a.RequestMappingHandlerMapping : Did not find handler method for [/url/info]
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.s.w.s.handler.SimpleUrlHandlerMapping : Matching patterns for request [/url/info] are [/**]
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.w.s.handler.SimpleUrlHandlerMapping : URI Template variables for request [/url/info] are {}
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.w.s.handler.SimpleUrlHandlerMapping : Mapping [/url/info] to HandlerExecutionChain with handler [ResourceHttpRequestHandler [locations=[ServletContext resource [/], class path resource [META-INF/resources/], class path resource [resources/], class path resource [static/], class path resource [public/]], resolvers=[org.springframework.web.servlet.resource.PathResourceResolver@4b5a5ed1]]] and 1 interceptor
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet : Last-Modified value for [/url/info] is: -1
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet : Null ModelAndView returned to DispatcherServlet with name 'dispatcherServlet': assuming HandlerAdapter completed request handling
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet : Successfully completed request
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.b.w.f.OrderedRequestContextFilter : Cleared thread-bound request context: org.apache.catalina.connector.RequestFacade@12f8d3c1
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.a.c.c.C.[Tomcat].[localhost] : Processing ErrorPage[errorCode=0, location=/error]
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet : DispatcherServlet with name 'dispatcherServlet' processing GET request for [/error]
2019-07-16 14:50:27.453 DEBUG 8 --- [http-nio-9011-exec-5] s.w.s.m.m.a.RequestMappingHandlerMapping : Looking up handler method for path /error
2019-07-16 14:50:27.453 DEBUG 8 --- [http-nio-9011-exec-5] s.w.s.m.m.a.RequestMappingHandlerMapping : Returning handler method [public org.springframework.http.ResponseEntity<java.util.Map<java.lang.String, java.lang.Object>> org.springframework.boot.autoconfigure.web.BasicErrorController.error(javax.servlet.http.HttpServletRequest)]
2019-07-16 14:50:27.453 DEBUG 8 --- [http-nio-9011-exec-5] o.s.b.f.s.DefaultListableBeanFactory : Returning cached instance of singleton bean 'basicErrorController'
2019-07-16 14:50:27.453 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet : Last-Modified value for [/error] is: -1
2019-07-16 14:50:27.454 DEBUG 8 --- [http-nio-9011-exec-5] o.s.w.s.m.m.a.HttpEntityMethodProcessor : Written [{timestamp=Tue Jul 16 14:50:27 EDT 2019, status=404, error=Not Found, message=No message available, path=/url/info}] as "application/json" using [org.springframework.http.converter.json.MappingJackson2HttpMessageConverter@37e547da]
2019-07-16 14:50:27.454 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet : Null ModelAndView returned to DispatcherServlet with name 'dispatcherServlet': assuming HandlerAdapter completed request handling
2019-07-16 14:50:27.454 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet : Successfully completed request
2019-07-16 14:50:27.454 DEBUG 8 --- [http-nio-9011-exec-5] o.a.c.c.C.[.[.[/].[dispatcherServlet] : Disabling the response for further output
...
...

Version-Release number of selected component (if applicable):
3.11

How reproducible:
Always for this application

Steps to Reproduce:
1. Customer environment

Actual results:
Traffic is routed to the wrong container.

Expected results:
Traffic should not route to the health check container.

Additional info:
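A possible mitigation while this is being investigated (not part of the original report; the service names below are hypothetical) is to split the two named ports into separate single-port Services, so that each route's backend can only ever resolve to one container port. A minimal sketch:

# Hypothetical workaround: one Service per port, so the application route
# and the health check route never share a backend port pool.
# Names svc-act-app / svc-act-hm are assumptions; the report uses a single svc-act.
apiVersion: v1
kind: Service
metadata:
  name: svc-act-app
  namespace: project
spec:
  selector:
    name: project-b
  ports:
  - name: app
    port: 8080
    targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: svc-act-hm
  namespace: project
spec:
  selector:
    name: project-b
  ports:
  - name: hm
    port: 8081
    targetPort: 9011

Each route would then point its to: stanza at the corresponding single-port Service.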
Still working on reproducing this. In the meantime, if you are able to reproduce, could you also please provide the following after observing the problematic behavior?

1. Related resources:

   oc get -n project routes -o yaml
   oc get -n project services -o yaml
   oc get -n project endpoints -o yaml
   oc get -n project pods -o yaml

2. The haproxy.config file from a router pod's filesystem (substituting the router namespace and pod name):

   oc exec -n router_namespace router_pod_name cat haproxy.config

Thanks!
In addition to the haproxy config, can you please also provide the YAML for the router deployment itself?
Verified with atomic-openshift-3.11.135-1.git.0.b309f70; the issue has been fixed.

Test steps:

1. Create a service with two ports:

# oc get svc -n hongli
NAME    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)               AGE
mysvc   ClusterIP   172.30.28.216   <none>        27017/TCP,27443/TCP   38m

2. Create a route with the svc and one of the ports:

# oc get route -n hongli
NAME    HOST/PORT                                   PATH   SERVICES   PORT   TERMINATION   WILDCARD
mysvc   mysvc-hongli.apps.0806-xqi.qe.rhcloud.com          mysvc      8080                 None

3. Enable dynamic configuration:

# oc set env dc/router ROUTER_HAPROXY_CONFIG_MANAGER=true

4. Check the server status or curl the route:

sh-4.2$ echo "show servers state be_http:hongli:mysvc" | socat - UNIX-CONNECT:/var/lib/haproxy/run/haproxy.sock
1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord
41 be_http:hongli:mysvc 1 pod:caddy-rc-d6jxz:mysvc:10.129.0.11:8080 10.129.0.11 2 0 256 256 342 6 3 4 6 0 0 0 - 8080 -
41 be_http:hongli:mysvc 2 _dynamic-pod-1 172.4.0.4 0 5 1 1 342 1 0 0 14 0 0 0 - 8765 -
41 be_http:hongli:mysvc 3 _dynamic-pod-2 172.4.0.4 0 5 1 1 342 1 0 0 14 0 0 0 - 8765 -
41 be_http:hongli:mysvc 4 _dynamic-pod-3 172.4.0.4 0 5 1 1 342 1 0 0 14 0 0 0 - 8765 -
41 be_http:hongli:mysvc 5 _dynamic-pod-4 172.4.0.4 0 5 1 1 342 1 0 0 14 0 0 0 - 8765 -
41 be_http:hongli:mysvc 6 _dynamic-pod-5 10.129.0.13 2 4 256 1 342 6 3 4 6 0 0 0 - 8080 -

Tested with the old version, the server status may show a different port, e.g.:

41 be_http:hongli:mysvc 5 _dynamic-pod-4 10.129.0.13 2 4 256 1 342 6 3 4 6 0 0 0 - 8443 -
41 be_http:hongli:mysvc 6 _dynamic-pod-5 10.129.0.13 2 4 256 1 342 6 3 4 6 0 0 0 - 8080 -

And curling randomly returns errors like:

# curl http://mysvc-hongli.apps.0806-xqi.qe.rhcloud.com
Hello-OpenShift-1 http-8080
# curl http://mysvc-hongli.apps.0806-xqi.qe.rhcloud.com
<html><body><h1>502 Bad Gateway</h1>
The server returned an invalid or incomplete response.
</body></html>
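As a quick way to eyeball the "show servers state" output above, the srv_name and srv_port columns (fields 4 and 19) can be pulled out with awk. The sample lines below are copied from the verification output; the filter itself is just an illustration, not part of the original verification (on a live router you would pipe the socat output through the same awk command):

```shell
# Print server name and configured port for each slot in the backend.
# Active pod slots should carry the route's target port; idle _dynamic-pod-*
# slots keep haproxy's placeholder port (8765 in this output).
printf '%s\n' \
  '41 be_http:hongli:mysvc 1 pod:caddy-rc-d6jxz:mysvc:10.129.0.11:8080 10.129.0.11 2 0 256 256 342 6 3 4 6 0 0 0 - 8080 -' \
  '41 be_http:hongli:mysvc 2 _dynamic-pod-1 172.4.0.4 0 5 1 1 342 1 0 0 14 0 0 0 - 8765 -' \
| awk '{ print $4, $19 }'
```

With the fix in place, every populated slot for the backend should show the same port as the route's target port; in the broken version some slots showed the service's other port (8443 vs 8080 above).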
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2352