Bug 1405440
Summary: HAProxy is forcefully restarted due to not responding to /healthz probe when under high load

Product: OpenShift Container Platform
Component: Networking
Networking sub component: router
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Jiří Mencák <jmencak>
Assignee: Phil Cameron <pcameron>
QA Contact: zhaozhanqi <zzhao>
CC: aos-bugs, bbennett, bperkins, ccoleman, eparis, jeder, jkaur, jmencak, pcameron, ramr, tdawson
Whiteboard: aos-scalability-34
Doc Type: Bug Fix
Doc Text:
    Cause: the default maximum number of connections is too low (see openshift-docs PR 3609).
    Consequence: pod restarts.
    Fix: increase the default value.
Last Closed: 2017-04-12 19:07:47 UTC
Type: Bug
Description
Jiří Mencák, 2016-12-16 14:07:51 UTC
2k is really low. Jiri, would you please file another BZ for raising maxconn according to your research?

---

I believe our health check config for haproxy allows for 30 seconds of failed health checks to the stats listener before restarting the pod. Lengthening that delay trades reactivity to truly-failed haproxy pods for potential resilience to this issue.

```shell
# oc describe pod router-1-rrk87 -n default | grep ness
Liveness:  http-get http://localhost:1936/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://localhost:1936/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
```

If we're going to use this method as our health check for haproxy, then I think haproxy should be configured to always allow sessions to be created against the stats listener, even after maxconn is reached (we're looking for a config mix that allows this; it might need code changes to haproxy). The current state means kube will restart haproxy at the worst possible moment: when an app is under high load (hopefully for a good reason).

---

Linking the associated BZ to raise the connection limit: https://bugzilla.redhat.com/show_bug.cgi?id=1406327

---

For the stats we need to use 'stats maxconn' (and perhaps also 'stats timeout'): https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#3.1-stats%20maxconn

I'm not sure if we need to expose this as a tunable knob, or if we should just increase it.

---

I believe that increasing 'stats maxconn' would not help in this case. Check out the function "void listener_accept(int fd)" in haproxy-1.5.18/src/listener.c and look for global.maxconn within that function. The listener simply stops accepting new connections, including those targeting the stats listener (at :1936). How about changing the health check to target HAProxy's stats UNIX domain socket in stream mode?
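The 30-second figure mentioned above falls out of the probe parameters shown (period=10s, #failure=3): kubernetes restarts the container after failureThreshold consecutive probe failures, one probe every periodSeconds. A minimal sketch of that arithmetic:

```shell
# Probe parameters from the router pod's liveness probe shown above.
period_seconds=10
failure_threshold=3

# How long haproxy can fail /healthz before kube restarts the pod.
restart_window=$((period_seconds * failure_threshold))
echo "restart after ~${restart_window}s of failed health checks"
```

Lengthening either parameter widens the window, at the cost of reacting more slowly to a genuinely dead router.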
Something like this:

```yaml
exec:
  command:
    - /bin/sh
    - -c
    - echo show info | socat - UNIX-CONNECT:/var/lib/haproxy/run/haproxy.sock | grep -q "^Name:\s*HAProxy"
```

instead of:

```yaml
httpGet:
  host: localhost
  path: /healthz
  port: 1936
  scheme: HTTP
```

Gave it a quick test and it seems to be working fine. No more unnecessary HAProxy restarts.

---

If haproxy is not responding to /healthz, it is broken/not working. The whole point of haproxy is to respond to requests. I feel like the only 'bug' here is that when haproxy is so taxed that it is failing to do its one and only job, the only thing our system can do is restart the container, which likely doesn't help a lot.

Eventually, having kubernetes take other reactions might make sense. Vertical resize of haproxy? Send a page to the guy on duty? Something else?

But to me it seems like 'fixing' the base image to allow more connections is a good thing to do. Changes to the health check feel like a bad idea. haproxy is failing, so the health check should fail...

---

(In reply to Eric Paris from comment #6)
> If haproxy is not responding to /healthz it is broken/not working. The whole
> point of haproxy is to respond to requests.

At least IMO, what's wrong here is that there's no way to tell haproxy that kubernetes health checks are "special", as described in c#1. Literally the worst thing our platform can do to haproxy is restart it while under legitimate load.

> I feel like the only 'bug' here
> is that when haproxy is so taxed and is failing to do its one and only job
> the only thing our system can do is restart the container. Which likely
> doesn't help a lot.
>
> Eventually having kubernetes take other reactions might make sense. Vertical
> resize of haproxy? Send a page to the guy on duty? Something else?
>
> But to me it seems like 'fixing' the base image to allow more connections is
> a good thing to do.

Indeed.
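The exec probe described above ultimately reduces to matching the banner line of HAProxy's `show info` output. A sketch of just that matching step, with the socket output simulated so it can run without a live HAProxy (the real probe obtains the text via socat from /var/lib/haproxy/run/haproxy.sock; `[[:space:]]` is used here as a portable stand-in for GNU grep's `\s`):

```shell
# Simulated start of "show info" output from HAProxy's stats socket.
# In the real probe this comes from:
#   echo show info | socat - UNIX-CONNECT:/var/lib/haproxy/run/haproxy.sock
show_info='Name: HAProxy
Version: 1.5.18'

# The probe is healthy iff the banner line matches.
if printf '%s\n' "$show_info" | grep -q '^Name:[[:space:]]*HAProxy'; then
  status=healthy
else
  status=unhealthy
fi
echo "$status"
```

Because the stats socket is a UNIX domain socket, this path is not subject to the TCP listener's maxconn gate that starves the :1936 check, which is the point of the proposal.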
I think the increase in maxconns via https://bugzilla.redhat.com/show_bug.cgi?id=1406327 will significantly reduce the likelihood of hitting this in the real world, and is low risk. But we still need a real fix.

> Changes to the health check feels like a bad idea.
> haproxy is failing, the health check should fail...

Subtle yet important difference: it's busy, not failed. Busy is (mostly) transient; failed is forever. Jiri, would you mind kicking off a thread on the haproxy mailing list, please? Perhaps there's something we're missing. http://www.haproxy.org/#tact

---

Rejecting connections to the main ports is "healthy" behavior. Rejecting connections to the main ports because of a broken configuration is "unhealthy" behavior. What configuration can we do that allows those two to be distinguished?

---

*** Bug 1406327 has been marked as a duplicate of this bug. ***

---

There is a process maxconn, a frontend maxconn, and a server maxconn. The frontend and server limits can be set for each frontend and server. By default, all are 2000. Unless otherwise specified, the global maxconn applies everywhere.

Proposal: Add an oadm router --max-connections= option (default 20000). This will generate an environment variable, ROUTER_MAX_CONNECTIONS. The default when the env variable is not present is 20000. Does this satisfy the current needs?

---

Seems like a good start. That will set the global maxconn?

---

Hi, can someone point us to the github commit that changed the router default to 20,000, please?

---

Jeremy, PR 12716 is a fix for this bug:

oadm router --max-connections=20000

Or edit the env var ROUTER_MAX_CONNECTIONS. If you don't have the env var, the haproxy-template defaults to 20000. PTAL at the proposed fix.

---

Awesome, that looks good for covering both of these BZs. Question: where is a user supposed to set ROUTER_MAX_CONNECTIONS? In their shell environment before running the command? Somewhere in /etc/sysconfig? It wasn't immediately clear to me. Last question: is this configurable in openshift-ansible somehow?
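The three maxconn levels described above (process-wide, per-frontend, per-server) could look like this in a haproxy.cfg sketch; the section and server names are illustrative, not taken from the router's actual template:

```
global
    # Process-wide ceiling: unless overridden, applies everywhere,
    # including accepts destined for the stats listener on :1936.
    maxconn 20000

frontend public
    bind :80
    # Per-frontend limit, settable independently for each frontend.
    maxconn 20000
    default_backend app

backend app
    # Per-server limit on the server line.
    server app1 10.0.0.1:8080 check maxconn 2000
```

As noted earlier in the thread, raising only 'stats maxconn' does not help here, because listener_accept() stops accepting new connections once global.maxconn is hit.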
jeder, the DC for the router is where the ROUTER_MAX_CONNECTIONS environment variable goes. See docs PR 3609. oadm router will create it, and you can add it to the DC for existing routers. If it's not there, 20000 is used.

---

IMO, this needs to be plumbed up through the installer as well. Any chance you could look into that, to round out this new feature? What happens during upgrades? Is the default changed when a user goes from 3.4 to 3.5?

---

jeder, I don't know enough about ansible to really comment. It is not a default in any of the config files. When you create a router, oadm router <name> --max-connections=12345 will create it.

---

jeder, as for upgrades, the 3.5 router image uses "maxconn 20000" unless told otherwise (via the ROUTER_MAX_CONNECTIONS environment variable). 3.4 and earlier use 2000.

---

Perfect, thanks. So upgrades will automatically go from 2000 to 20,000. I filed an issue for the ansible work to plumb ROUTER_MAX_CONNECTIONS through: https://github.com/openshift/openshift-ansible/issues/3233

---

jeder, it appears that increasing the default causes the tests to fail on systems with limited resources. Moving to 20000 by default on upgrade will break otherwise-working systems. I changed the PR to default to 2000, which is the current default. The value can be changed as described in openshift-docs PR 3609.

---

Hmm, what resource limit are we hitting in those environments? Why not just also increase that one? What kind of environments are these, just the CI one for openshift?

---

jeder, re comment #23: I don't know for sure. The jenkins tests are failing. Rajat ran into 'sysctl fs.nr_open' and/or 'sysctl fs.file-max' being too small. See PR 12716 and also docs PR 3609. I don't know how to find the setting in the current test environment. The tests work for me on lab machines, where I have fs.nr_open = 1048576 and fs.file-max = 13094283.

---

There has got to be a way to actually debug that test failure.
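The fallback behavior described above ("if it's not there, 20000 is used") amounts to a default-if-unset rule, which can be sketched with shell parameter expansion; the value 12345 is just the illustrative number from the comment above:

```shell
# No ROUTER_MAX_CONNECTIONS in the environment: the default applies.
unset ROUTER_MAX_CONNECTIONS
maxconn_default="${ROUTER_MAX_CONNECTIONS:-20000}"
echo "maxconn $maxconn_default"

# An explicit value (e.g. set on the router DC) wins over the default.
ROUTER_MAX_CONNECTIONS=12345
maxconn_explicit="${ROUTER_MAX_CONNECTIONS:-20000}"
echo "maxconn $maxconn_explicit"
```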
Related to the RFE https://bugzilla.redhat.com/show_bug.cgi?id=1418905

---

@jeder: Phil and I dug through the log file line by line and found that the environment passed to the router config has MAX_CONNS of 0. Further investigation showed that the cluster for that test was created with 'oc cluster up'. Digging into that, we found that the defaults are set in 'oc cluster up' and passed in to create the cluster. I propose that we treat 0 as 'use the default', since the data structure will have 0 when not otherwise initialized, and 0 is a nonsensical value.

@jeder: Also, we decided to change the default to 20000, since that was not the issue, and it appears that the configurations we care about have high enough limits.

---

Ack, thank you!

---

Commit pushed to master at https://github.com/openshift/origin
https://github.com/openshift/origin/commit/119f9b1583f88f5d49fe373850c878f82c8ceb51

Make haproxy maxconn configurable

The haproxy maxconn (maximum connections) is by default 20000. This was previously 2000. This change makes it configurable through the oadm router --max-connections= option when creating a router. For existing routers the value can be set in the ROUTER_MAX_CONNECTIONS environment variable. If ROUTER_MAX_CONNECTIONS is missing, the default (20000) is used.

openshift-docs PR 3609
bug 1405440 https://bugzilla.redhat.com/show_bug.cgi?id=1405440

---

This has been merged into ocp and is in OCP v3.5.0.18 or newer.

---

Verified this bug on OCP v3.5.0.18. The 'maxconn' will be taken from `--max-connections` or 'oc env dc router ROUTER_MAX_CONNECTIONS'.

---

Commit pushed to master at https://github.com/openshift/origin
https://github.com/openshift/origin/commit/5b708a582d25b103f187207b7ac93db553192c67

Fix of BUG 1405440

Using TCPSocketAction as the liveness probe, which will not be affected by the connection limit set in HAProxy's config file. This is a TRUE fix for BUG 1405440.
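The "treat 0 as 'use the default'" rule proposed earlier in the thread can be sketched as follows; 'effective_maxconn' is a hypothetical helper for illustration, not code from the router:

```shell
# Return the maxconn value to use. A requested value of 0 (what an
# uninitialized field looks like, as seen with 'oc cluster up') or an
# empty value means "use the default", i.e. 20000.
effective_maxconn() {
  requested="$1"
  if [ -z "$requested" ] || [ "$requested" -eq 0 ]; then
    echo 20000
  else
    echo "$requested"
  fi
}

effective_maxconn 0
effective_maxconn 40000
```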
Commit pushed to master at https://github.com/openshift/origin
https://github.com/openshift/origin/commit/0d8009fa6deb5383a74639af795e80aec6ef473f

Revert "Fix of BUG 1405440"

This reverts commit 5b708a582d25b103f187207b7ac93db553192c67.

---

This BZ is in verified state but the fix has been reverted, so I think the BZ state needs to be updated (to ASSIGNED, perhaps?). https://github.com/openshift/origin/pull/13331#issue-213182468

pecameron commented: The fix to use a TCP connection check to determine whether the HAProxy process is alive doesn't work without an iptables rule for port 1936. The original test using HTTPGet works because HTTPGet supports a Host field that can be set to "localhost" when host networking is used. TCPSocketAction does not support a Host field. Rolling back the fix until a new fix is developed. bug 1430729

---

@Jeremy: I'm not convinced. The first fix (upping max connections) is still in; it's only the later change that has been reverted. Given that the new max connections change will probably resolve the problem, I am okay leaving this in VERIFIED. The other change should probably have gone in under a separate bug anyway, to avoid this kind of confusion.

---

Commit pushed to master at https://github.com/openshift/openshift-docs
https://github.com/openshift/openshift-docs/commit/82ea1c37c5cce9aaeeee19d0826e36b312e6365f

Make haproxy maxconn configurable

The haproxy maxconn (maximum connections) is by default 20000. This was previously 2000. This change makes it configurable through the oadm router --max-connections= option when creating a router. For existing routers the value can be set in the ROUTER_MAX_CONNECTIONS environment variable. If ROUTER_MAX_CONNECTIONS is missing, the default (20000) is used.

origin PR 12716
bug 1405440 https://bugzilla.redhat.com/show_bug.cgi?id=1405440

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0884
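As a closing note on the revert discussed above: the difference comes down to the probe schema. httpGet takes a host field, which matters when the router pod uses host networking, while TCPSocketAction (in the Kubernetes API version in play here) has no equivalent. A hedged side-by-side sketch of the two probe specs:

```yaml
# HTTPGet probe: the explicit host pins the check to localhost even
# with host networking, so :1936 is reachable without an iptables rule.
livenessProbe:
  httpGet:
    host: localhost
    path: /healthz
    port: 1936
    scheme: HTTP
---
# TCPSocketAction probe: no host field, so the connection target
# cannot be pinned to localhost; this is why the fix was reverted.
livenessProbe:
  tcpSocket:
    port: 1936
```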