Bug 1406327 - Raise the default global HAProxy limit of maxconn to 20000
Summary: Raise the default global HAProxy limit of maxconn to 20000
Keywords:
Status: CLOSED DUPLICATE of bug 1405440
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Phil Cameron
QA Contact: zhaozhanqi
URL:
Whiteboard: aos-scalability-34
Duplicates: 1402892
Depends On:
Blocks:
Reported: 2016-12-20 09:49 UTC by Jiří Mencák
Modified: 2022-08-04 22:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-27 16:33:55 UTC
Target Upstream Version:
Embargoed:



Description Jiří Mencák 2016-12-20 09:49:58 UTC
Description of problem:
The default global HAProxy limit (maxconn) is 2000 connections.  This is HAProxy's per-process maximum number of concurrent connections.  When the limit is reached, HAProxy stops accepting new connections, including the :1936/healthz probes.  The failed probes trigger a health-check failure and HAProxy is forcibly restarted.

Version-Release number of selected component (if applicable):
$ oc version
oc v3.4.0.34+87d9d8d
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-35-3.us-west-2.compute.internal:8443
openshift v3.4.0.34+87d9d8d
kubernetes v1.4.0+776c994

$ oc rsh router-6-zoksy
sh-4.2$ haproxy -vv
HA-Proxy version 1.5.18 2016/05/10
Copyright 2000-2016 Willy Tarreau <willy>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -DTCP_USER_TIMEOUT=18
  OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.7
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.
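
For checking the compiled-in default on other router images, the "Default settings" line of the `haproxy -vv` output above can be parsed with a few lines of Python. This is only a sketch: the function name `parse_default_maxconn` is made up for illustration, and it assumes the output keeps the `maxconn = <number>` layout shown above.

```python
import re
from typing import Optional


def parse_default_maxconn(vv_output: str) -> Optional[int]:
    """Return the default maxconn reported by `haproxy -vv`, or None if absent."""
    # Assumes the "maxconn = N" layout seen in the HAProxy 1.5.18 output above.
    match = re.search(r"^\s*maxconn\s*=\s*(\d+)", vv_output, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


# Sample taken from the output pasted in this report.
sample = """Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200
"""
print(parse_default_maxconn(sample))  # prints 2000
```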

How reproducible:
Always when trying to establish more than 2000 connections towards backend services through HAProxy.  

Steps to Reproduce:
1. Create more than 2000 concurrent connections through HAProxy.
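
A rough way to hold many connections open at once is sketched below in Python. It is not the tool used for the results in this report (the result URLs suggest wrk was used); the helper names `open_idle_connections` and `close_all` and the host name in the usage comment are hypothetical. The client's open-file limit (`ulimit -n`) has to be higher than the number of connections requested, and HAProxy may close idle connections when its own timeouts expire, so a real load generator is more representative.

```python
import socket


def open_idle_connections(host, port, count, timeout=5.0):
    """Open up to `count` TCP connections and keep them open.

    Returns the list of connected sockets. Stops at the first failure
    (connection refused, timeout, or local file-descriptor exhaustion).
    """
    conns = []
    for _ in range(count):
        try:
            conns.append(socket.create_connection((host, port), timeout=timeout))
        except OSError:
            break
    return conns


def close_all(conns):
    """Close every socket returned by open_idle_connections."""
    for conn in conns:
        conn.close()


# Example usage (hypothetical router address, adjust to your environment):
#   conns = open_idle_connections("router.example.com", 80, 2400)
#   print(len(conns), "connections established")
#   close_all(conns)
```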

Actual results:
HAProxy accepts only 2000 concurrent connections.

Expected results:
HAProxy is able to accept at least 20000 concurrent connections.

Additional info:
It is possible to establish 2000+ connections with very low request rates to backend services (for example 1 request per second per) and still HAProxy will be unnecessarily restarted: http://results.ec2.breakage.org/results/ip-172-31-27-128/pbench-user-benchmark_haproxy-0600r-1ppr-01000rps-300s-nginx-routes-2400conn-wrk_2016-12-19_16:30:01/

Results after changing the maxconn limit to 20000:
http://results.ec2.breakage.org/results/ip-172-31-27-128/20000maxconn/pbench-user-benchmark_haproxy-0600r-1ppr-01000rps-300s-nginx-routes-2400conn-20000maxconn-wrk_2016-12-19_17:11:37/
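
The report does not say how the limit was changed for this run. As one possible illustration, the Python sketch below sets `maxconn` in the `global` section of an HAProxy configuration held in a string. The function name `set_global_maxconn` and the list of section keywords are assumptions made for this example; in the OpenShift router the configuration is generated, so a manual edit of the generated file may be overwritten.

```python
import re

# Section keywords assumed for this sketch; newer HAProxy versions have more.
SECTION_KEYWORDS = {"global", "defaults", "frontend", "backend", "listen",
                    "userlist", "peers"}


def set_global_maxconn(config_text: str, value: int) -> str:
    """Return config_text with `maxconn <value>` set in the global section.

    An existing global maxconn line is replaced; if the global section has
    none, one is added at its end. Other sections are left untouched, and a
    config without a global section is returned unchanged.
    """
    out = []
    in_global = False
    done = False
    for line in config_text.splitlines():
        stripped = line.strip()
        tokens = stripped.split()
        first = tokens[0] if tokens else ""
        if first in SECTION_KEYWORDS:
            # Leaving the global section without having seen maxconn: add it.
            if in_global and not done:
                out.append(f"  maxconn {value}")
                done = True
            in_global = first == "global"
        elif in_global and re.match(r"maxconn\s+\d+", stripped):
            indent = line[: len(line) - len(line.lstrip())]
            line = f"{indent}maxconn {value}"
            done = True
        out.append(line)
    if in_global and not done:
        out.append(f"  maxconn {value}")
    return "\n".join(out) + "\n"


example = "global\n  daemon\n  maxconn 2000\n\ndefaults\n  maxconn 500\n"
print(set_global_maxconn(example, 20000))
```

Only the `maxconn` line under `global` changes in the example; the per-proxy `maxconn 500` under `defaults` is a separate limit and is deliberately left alone.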

Note that HAProxy uses only about 8% CPU during this test:
http://results.ec2.breakage.org/results/ip-172-31-27-128/20000maxconn/pbench-user-benchmark_haproxy-0600r-1ppr-01000rps-300s-nginx-routes-2400conn-20000maxconn-wrk_2016-12-19_17:11:37/1/reference-result/tools-default/ip-172-31-27-128/pidstat/cpu_usage.html

Also, please see: https://bugzilla.redhat.com/show_bug.cgi?id=1405440

Comment 1 Jiří Mencák 2017-01-04 12:45:24 UTC
May I ask why the target release was changed from 3.4.1 to 3.5.0?

Comment 3 Clayton Coleman 2017-01-18 17:41:53 UTC
Is 20k arbitrary or based on a real use case?  Why not 40?

Comment 5 Ben Bennett 2017-01-27 15:37:12 UTC
*** Bug 1402892 has been marked as a duplicate of this bug. ***

Comment 6 Ben Bennett 2017-01-27 16:33:55 UTC

*** This bug has been marked as a duplicate of bug 1405440 ***

