Bug 1752521

Summary: Route with 2 named ports sends traffic to the wrong one when the correct one is down [4.1 backport]
Product: OpenShift Container Platform
Component: Networking (sub component: router)
Version: 4.1.z
Target Release: 4.1.z
Reporter: Miciah Dashiel Butler Masters <mmasters>
Assignee: Miciah Dashiel Butler Masters <mmasters>
QA Contact: Hongan Li <hongli>
CC: aos-bugs, bbennett, dmace, hcisneir, hongli, jlee, mrobson
Status: CLOSED ERRATA
Severity: urgent
Priority: medium
Hardware: All
OS: Linux
Clone Of: 1730778
Blocks: 1756166
Bug Depends On: 1730778, 1756166
Last Closed: 2019-10-16 18:07:59 UTC

Description Miciah Dashiel Butler Masters 2019-09-16 14:22:55 UTC
+++ This bug was initially created as a clone of Bug #1730778 +++

Description of problem:

A project with two routes points to a single service with two named ports.

One is for a health check and the other is for application traffic.

During startup/deployment, when the health check is up but the application is still starting, we see traffic on the application route being sent to the health check container, resulting in 404s.

This pod has 2 containers, the application and a sidecar for add-on services which include the health check.

Requests to project-lb.public.com target the app port, which points to the application container. While the application container is still starting, clients see a number of 404 errors.

From what we can see, once the application comes up, traffic stops being sent incorrectly to the health check container.

With debug logging enabled on the health check container, we can see that traffic for project-lb.public.com is incorrectly arriving at the health check container and returning 404s.

Health Check Route:

apiVersion: v1
items:
- apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    creationTimestamp: 2019-03-29T19:31:23Z
    name: healthcheck
    namespace: project
    resourceVersion: "65987865"
    selfLink: /apis/route.openshift.io/v1/namespaces/project/routes/healthcheck
    uid: 3c9c3612-5259-11e9-a732-005056b97b10
  spec:
    host: healthcheck.paas.com
    port:
      targetPort: hm
    to:
      kind: Service
      name: svc-act
      weight: 100
    wildcardPolicy: None
  status:
    ingress:
    - conditions:
      - lastTransitionTime: 2019-03-29T19:31:23Z
        status: "True"
        type: Admitted
      routerName: router
      wildcardPolicy: None

Application Route:


- apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    creationTimestamp: 2019-03-29T19:31:05Z
    name: application
    namespace: project
    resourceVersion: "65987845"
    selfLink: /apis/route.openshift.io/v1/namespaces/project/routes/application
    uid: 31a7d081-5259-11e9-a732-005056b97b10
  spec:
    host: project-lb.public.com
    port:
      targetPort: app
    tls:
      termination: edge
    to:
      kind: Service
      name: svc-act
      weight: 100
    wildcardPolicy: None
  status:
    ingress:
    - conditions:
      - lastTransitionTime: 2019-03-29T19:31:05Z
        status: "True"
        type: Admitted
      routerName: router
      wildcardPolicy: None


Service:

apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"svc-act","namespace":"project"},"spec":{"ports":[{"name":"app","port":8080,"targetPort":8080},{"name":"hm","port":8081,"targetPort":9011}],"selector":{"name":"project-b"}}}
    creationTimestamp: 2019-03-29T19:31:09Z
    name: svc-act
    namespace: project
    resourceVersion: "80400351"
    selfLink: /api/v1/namespaces/project/services/svc-act
    uid: 3483c483-5259-11e9-a366-005056b97ad5
  spec:
    clusterIP: 172.29.82.203
    ports:
    - name: app
      port: 8080
      protocol: TCP
      targetPort: 8080
    - name: hm
      port: 8081
      protocol: TCP
      targetPort: 9011
    selector:
      name: project-b
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}


Health check log showing traffic for project-lb.public.com coming into port 9011, which is the health check port 'hm':

2019-07-16 14:50:27.450 DEBUG 8 --- [http-nio-9011-Acceptor-0] o.apache.tomcat.util.threads.LimitLatch  : Counting up[http-nio-9011-Acceptor-0] latch=1
2019-07-16 14:50:27.450 DEBUG 8 --- [http-nio-9011-exec-5] o.a.tomcat.util.net.SocketWrapperBase    : Socket: [org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper@4054f4a1:org.apache.tomcat.util.net.NioChannel@699d524f:java.nio.channels.SocketChannel[connected local=/10.1.1.1:9011 remote=/10.59.28.1:37190]], Read from buffer: [0]
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.a.coyote.http11.Http11InputBuffer      : Received [GET /url/info HTTP/1.1
User-Agent: curl/7.29.0
Accept: */*
X-Forwarded-For: 10.1.1.1
Host: project-lb.public.com
X-Forwarded-Host: project-lb.public.com
X-Forwarded-Port: 443
X-Forwarded-Proto: https
Forwarded: for=10.1.1.1;host=fproject-lb.public.com;proto=https

]
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.a.c.authenticator.AuthenticatorBase    : Security checking request GET /url/info
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] org.apache.catalina.realm.RealmBase      :   No applicable constraints defined
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.a.c.authenticator.AuthenticatorBase    :  Not subject to any constraint
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.s.b.w.f.OrderedRequestContextFilter    : Bound request context to thread: org.apache.catalina.connector.RequestFacade@12f8d3c1
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet        : DispatcherServlet with name 'dispatcherServlet' processing GET request for [/url/info]
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] s.w.s.m.m.a.RequestMappingHandlerMapping : Looking up handler method for path /url/info
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] s.w.s.m.m.a.RequestMappingHandlerMapping : Did not find handler method for [/url/info]
2019-07-16 14:50:27.451 DEBUG 8 --- [http-nio-9011-exec-5] o.s.w.s.handler.SimpleUrlHandlerMapping  : Matching patterns for request [/url/info] are [/**]
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.w.s.handler.SimpleUrlHandlerMapping  : URI Template variables for request [/url/info] are {}
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.w.s.handler.SimpleUrlHandlerMapping  : Mapping [/url/info] to HandlerExecutionChain with handler [ResourceHttpRequestHandler [locations=[ServletContext resource [/], class path resource [META-INF/resources/], class path resource [resources/], class path resource [static/], class path resource [public/]], resolvers=[org.springframework.web.servlet.resource.PathResourceResolver@4b5a5ed1]]] and 1 interceptor
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet        : Last-Modified value for [/url/info] is: -1
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet        : Null ModelAndView returned to DispatcherServlet with name 'dispatcherServlet': assuming HandlerAdapter completed request handling
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet        : Successfully completed request
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.b.w.f.OrderedRequestContextFilter    : Cleared thread-bound request context: org.apache.catalina.connector.RequestFacade@12f8d3c1
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.a.c.c.C.[Tomcat].[localhost]           : Processing ErrorPage[errorCode=0, location=/error]
2019-07-16 14:50:27.452 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet        : DispatcherServlet with name 'dispatcherServlet' processing GET request for [/error]
2019-07-16 14:50:27.453 DEBUG 8 --- [http-nio-9011-exec-5] s.w.s.m.m.a.RequestMappingHandlerMapping : Looking up handler method for path /error
2019-07-16 14:50:27.453 DEBUG 8 --- [http-nio-9011-exec-5] s.w.s.m.m.a.RequestMappingHandlerMapping : Returning handler method [public org.springframework.http.ResponseEntity<java.util.Map<java.lang.String, java.lang.Object>> org.springframework.boot.autoconfigure.web.BasicErrorController.error(javax.servlet.http.HttpServletRequest)]
2019-07-16 14:50:27.453 DEBUG 8 --- [http-nio-9011-exec-5] o.s.b.f.s.DefaultListableBeanFactory     : Returning cached instance of singleton bean 'basicErrorController'
2019-07-16 14:50:27.453 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet        : Last-Modified value for [/error] is: -1
2019-07-16 14:50:27.454 DEBUG 8 --- [http-nio-9011-exec-5] o.s.w.s.m.m.a.HttpEntityMethodProcessor  : Written [{timestamp=Tue Jul 16 14:50:27 EDT 2019, status=404, error=Not Found, message=No message available, path=/url/info}] as ""application/json"" using [org.springframework.http.converter.json.MappingJackson2HttpMessageConverter@37e547da]
2019-07-16 14:50:27.454 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet        : Null ModelAndView returned to DispatcherServlet with name 'dispatcherServlet': assuming HandlerAdapter completed request handling
2019-07-16 14:50:27.454 DEBUG 8 --- [http-nio-9011-exec-5] o.s.web.servlet.DispatcherServlet        : Successfully completed request
2019-07-16 14:50:27.454 DEBUG 8 --- [http-nio-9011-exec-5] o.a.c.c.C.[.[.[/].[dispatcherServlet]    :  Disabling the response for further output
...
...


Version-Release number of selected component (if applicable):
3.11

How reproducible:
Always for this application

Steps to Reproduce:
1. Customer environment

Actual results:
Traffic is routed to the wrong container.


Expected results:
Traffic should not route to the health check container.

Additional info:

--- Additional comment from errata-xmlrpc on 2019-08-13 14:09:19 UTC ---

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2352

Comment 10 Hongan Li 2019-10-06 13:54:02 UTC
Verified with 4.1.0-0.nightly-2019-10-04-210822; the issue has been fixed.


Note: first disable the CVO and then disable the ingress operator (disabling only the ingress operator doesn't work, since the CVO will restore it):
1. oc scale deployment/cluster-version-operator --replicas=0 -n openshift-cluster-version
2. oc scale deployment/ingress-operator --replicas=0 -n openshift-ingress-operator
3. oc set env deployment/router-default ROUTER_HAPROXY_CONFIG_MANAGER=true -n openshift-ingress
4. create your project, pod, svc with two named ports, and route
5. rsh into the router pod and check the server state:
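The objects in step 4 might look like the following minimal sketch, modeled on the original report: a service exposing two named ports and a route that selects one of them by name via spec.port.targetPort. The service/route names match the be_http:hongli:mysvc backend shown in the server-state output; the selector, host, and the 'hm' port numbers are illustrative assumptions.

```yaml
# Sketch only: service with two named ports plus a route pinned to the
# 'app' port by name. Selector, host, and hm port numbers are assumed.
apiVersion: v1
kind: Service
metadata:
  name: mysvc
  namespace: hongli
spec:
  selector:
    name: caddy-rc          # assumed label on the caddy-rc pods
  ports:
  - name: app               # application traffic
    port: 8080
    targetPort: 8080
  - name: hm                # health check
    port: 8081
    targetPort: 9011
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: mysvc
  namespace: hongli
spec:
  host: mysvc.example.com   # hypothetical host
  port:
    targetPort: app         # route traffic must stay on the 'app' port only
  to:
    kind: Service
    name: mysvc
```

With the fix verified here, the router's backend for this route should contain only endpoints for the 'app' port (8080), as the server-state output below confirms.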

sh-4.2$ echo "show servers state be_http:hongli:mysvc" | socat - UNIX-CONNECT:/var/lib/haproxy/run/haproxy.sock
1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord
39 be_http:hongli:mysvc 1 pod:caddy-rc-wd7h4:mysvc:10.128.2.12:8080 10.128.2.12 2 0 256 256 93 6 3 4 6 0 0 0 - 8080 -
39 be_http:hongli:mysvc 2 pod:caddy-rc-d75qj:mysvc:10.131.0.17:8080 10.131.0.17 2 0 256 256 93 6 3 4 6 0 0 0 - 8080 -
39 be_http:hongli:mysvc 3 _dynamic-pod-1 172.4.0.4 0 5 1 1 93 1 0 0 14 0 0 0 - 8765 -
39 be_http:hongli:mysvc 4 _dynamic-pod-2 172.4.0.4 0 5 1 1 93 1 0 0 14 0 0 0 - 8765 -
39 be_http:hongli:mysvc 5 _dynamic-pod-3 172.4.0.4 0 5 1 1 93 1 0 0 14 0 0 0 - 8765 -
39 be_http:hongli:mysvc 6 _dynamic-pod-4 172.4.0.4 0 5 1 1 93 1 0 0 14 0 0 0 - 8765 -
39 be_http:hongli:mysvc 7 _dynamic-pod-5 172.4.0.4 0 5 1 1 93 1 0 0 14 0 0 0 - 8765 -

Comment 12 errata-xmlrpc 2019-10-16 18:07:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3004