Description of problem: There was a change made to prevent health checks when there was only one endpoint for a service. But that was too aggressive. When a route has multiple services then it would disable health checks incorrectly. We need to disable the checks when there is only one backend server for the whole route. Version-Release number of selected component (if applicable): 3.7.0 pre-release How reproducible: Every time. Steps to Reproduce: 1. Make a route that selects two services 2. Make each service have one endpoint Actual results: Health checks are disabled for both servers in the haproxy config. Expected results: Health checks are performed. Additional info:
PR https://github.com/openshift/origin/pull/17077
Commits pushed to master at https://github.com/openshift/origin https://github.com/openshift/origin/commit/9835dca2c14d387a5a15ee14883b48e1f9073604 Skip health checks when there is one server that backs the route Previously we had attempted to suppress the health checks, but it was looking at the endpoints of the service, not the servers that back the route. The problem is that if we just look at endpoints, then if a route has two backing services, each with one endpoint, we would incorrectly suppress the health checks. Fixes bug 1507664 (https://bugzilla.redhat.com/show_bug.cgi?id=1507664) https://github.com/openshift/origin/commit/1bae363268fec9a0df7ac62fc1823ccc3df1cf94 Merge pull request #17077 from knobunc/fix/health-supression Automatic merge from submit-queue. Fix the "supress health checks when only one backing service" logic This reverts commit 4833eb5d9f770fe8ab2b991f1c4d114cd09bf99c and then fixes the implementation. Previously we had attempted to suppress the health checks, but it was looking at the endpoints of the service, not the servers that back the route. The problem is that if we just look at endpoints, then if a route has two backing services, each with one endpoint, we would incorrectly suppress the health checks. Fixes bug 1507664 (https://bugzilla.redhat.com/show_bug.cgi?id=1507664)
verified in v3.7.0-0.189.0 but failed, no "check inter" if route has two services and service has one endpoint. a) two services, one have two endpoints and another has one endpoint: server pod:test-rc-dsx7g:service-unsecure:10.129.0.11:8080 10.129.0.11:8080 cookie 51b8a3813373ee7b1281fc933f9d1ec3 weight 64 check inter 5000ms server pod:test-rc-7vcs2:service-unsecure:10.130.0.11:8080 10.130.0.11:8080 cookie b59a676bd37c1865c9b8c40d9db15c61 weight 64 check inter 5000ms server pod:caddy-docker-2:service-unsecure-2:10.130.0.9:8080 10.130.0.9:8080 cookie fbfdb60f74200c055330efafa409a7f5 weight 256 b) two services, each service has one endpoint: server pod:test-rc-dsx7g:service-unsecure:10.129.0.11:8080 10.129.0.11:8080 cookie 51b8a3813373ee7b1281fc933f9d1ec3 weight 128 server pod:caddy-docker-2:service-unsecure-2:10.130.0.9:8080 10.130.0.9:8080 cookie fbfdb60f74200c055330efafa409a7f5 weight 256
@hongli, I got same results yesterday with no "check inter" for the services when I create one route with three servers, each service has only one endpoint.
New PR https://github.com/openshift/origin/pull/17155
Commits pushed to master at https://github.com/openshift/origin https://github.com/openshift/origin/commit/8cb7b957e092fd73d6d84feeb16c77d68ef69d28 Fix the "supress health checks when only one backing service" logic Previously we had attempted to suppress the health checks, but it was looking at the endpoints of the service, not the servers that back the route. The problem is that if we just look at endpoints, then if a route has two backing services, each with one endpoint, we would incorrectly suppress the health checks. This is the third attempt. I had missed a case in the last one, so now the logic is moved to be computed and recorded by the router controller, but used in the template. Fixes bug 1507664 (https://bugzilla.redhat.com/show_bug.cgi?id=1507664) https://github.com/openshift/origin/commit/fc8725b6d2f46261f12862dd6185be67352fd53a Merge pull request #17155 from knobunc/bug/bz1507664-suppress-health-checks Automatic merge from submit-queue. Bug/bz1507664 suppress health checks Fix the "supress health checks when only one backing service" logic This reverts commit 9835dca2c14d387a5a15ee14883b48e1f9073604 because it is wrong, then the second commit implements the correct change. Previously we had attempted to suppress the health checks, but it was looking at the endpoints of the service, not the servers that back the route. The problem is that if we just look at endpoints, then if a route has two backing services, each with one endpoint, we would incorrectly suppress the health checks. This is the third attempt. I had missed a case in the last one, so now the logic is moved to be computed and recorded by the router controller, but used in the template. Fixes bug 1507664 (https://bugzilla.redhat.com/show_bug.cgi?id=1507664)
verified in OCP v3.7.0-0.196.0 and issue has been fixed. a) two service and each service has one endpoint server pod:test-rc-qrxjt:service-unsecure:10.128.0.9:8080 10.128.0.9:8080 cookie 5708008fda5943a37452de2ab223d0bc weight 256 check inter 5000ms server pod:caddy-docker-2:service-unsecure-2:10.128.0.8:8080 10.128.0.8:8080 cookie 0b0853154930d7393cdbcb1c3f06306a weight 256 check inter 5000ms b) one service has two endpoints server pod:test-rc-qrxjt:service-unsecure:10.128.0.9:8080 10.128.0.9:8080 cookie 5708008fda5943a37452de2ab223d0bc weight 256 check inter 5000ms server pod:test-rc-9t69p:service-unsecure:10.129.0.6:8080 10.129.0.6:8080 cookie 991bec356d21c2a6ea189d9f69e0b244 weight 256 check inter 5000ms c) one service has one endpoint server pod:test-rc-qrxjt:service-unsecure:10.128.0.9:8080 10.128.0.9:8080 cookie 5708008fda5943a37452de2ab223d0bc weight 256
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188