1406327 – Raise the default global HAProxy limit of maxconn to 20000


Keywords:
Status:	CLOSED DUPLICATE of bug 1405440
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	3.4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Phil Cameron
QA Contact:	zhaozhanqi
Docs Contact:
URL:
Whiteboard:	aos-scalability-34
Duplicates (1):	1402892
Depends On:
Blocks:

Reported:	2016-12-20 09:49 UTC by Jiří Mencák
Modified:	2022-08-04 22:20 UTC
CC List:	6 users
Fixed In Version:
Doc Type:
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-01-27 16:33:55 UTC
Target Upstream Version:
Embargoed:

Description Jiří Mencák 2016-12-20 09:49:58 UTC

Description of problem:
The default global HAProxy limit is 2000 connections.  This is HAProxy's per-process maximum number of concurrent connections.  When the limit is reached, HAProxy stops accepting new connections, including the :1936/healthz probes.  The health check then fails and HAProxy is forcibly restarted.
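
To make the failure mode concrete, here is a minimal sketch of a health probe against :1936/healthz. It assumes the probe is a plain HTTP GET that must answer 200 within a short timeout; the function name healthz_ok and the 1 second timeout are hypothetical and are not taken from the router's actual probe configuration. When HAProxy has stopped accepting connections, a probe like this would time out or be refused and report failure.

```python
import http.client


def healthz_ok(host: str, port: int = 1936, timeout: float = 1.0) -> bool:
    """Return True if GET /healthz answers HTTP 200 within `timeout` seconds.

    Illustrative only: the name, the timeout, and the plain-HTTP GET are
    assumptions, not the router's real probe settings.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/healthz")
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Refused, timed out, or malformed reply: count it as a failed probe.
        return False
    finally:
        conn.close()
```

A caller would run this periodically, for example healthz_ok("127.0.0.1"), and treat repeated False results as an unhealthy router.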

Version-Release number of selected component (if applicable):
$ oc version
oc v3.4.0.34+87d9d8d
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-35-3.us-west-2.compute.internal:8443
openshift v3.4.0.34+87d9d8d
kubernetes v1.4.0+776c994

$ oc rsh router-6-zoksy
sh-4.2$ haproxy -vv
HA-Proxy version 1.5.18 2016/05/10
Copyright 2000-2016 Willy Tarreau <willy>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -DTCP_USER_TIMEOUT=18
  OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.7
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

How reproducible:
Always, when trying to establish more than 2000 concurrent connections to backend services through HAProxy.

Steps to Reproduce:
1. Create more than 2000 concurrent connections through HAProxy.
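
One way to approximate this step is sketched below. It is not the benchmark used for the results in this report; the helper names open_connections and close_all are hypothetical. The sketch opens TCP connections one by one and holds them open. Two caveats: the client's own file descriptor limit (ulimit -n) must be higher than the number of connections requested, and a completed TCP handshake does not prove that HAProxy accepted the connection, because the kernel can queue it in the listen backlog.

```python
import socket
from typing import List


def open_connections(host: str, port: int, count: int,
                     timeout: float = 2.0) -> List[socket.socket]:
    """Try to open `count` TCP connections and keep them open.

    Returns the sockets that connected. Stops at the first failure,
    since later attempts would most likely fail the same way.
    """
    sockets: List[socket.socket] = []
    for _ in range(count):
        try:
            sockets.append(socket.create_connection((host, port), timeout=timeout))
        except OSError:
            break
    return sockets


def close_all(sockets: List[socket.socket]) -> None:
    """Close every socket returned by open_connections."""
    for sock in sockets:
        sock.close()
```

For example, conns = open_connections("router.example.com", 80, 2400) followed by len(conns) shows how many connections were established; the host name here is a placeholder. Call close_all(conns) when done.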

Actual results:
HAProxy accepts only 2000 concurrent connections.

Expected results:
HAProxy is able to accept at least 20000 concurrent connections.

Additional info:
Even when the 2000+ connections carry very low request rates to backend services (for example 1 request per second per), HAProxy is still restarted unnecessarily: http://results.ec2.breakage.org/results/ip-172-31-27-128/pbench-user-benchmark_haproxy-0600r-1ppr-01000rps-300s-nginx-routes-2400conn-wrk_2016-12-19_16:30:01/

Results after changing the maxconn limit to 20000:
http://results.ec2.breakage.org/results/ip-172-31-27-128/20000maxconn/pbench-user-benchmark_haproxy-0600r-1ppr-01000rps-300s-nginx-routes-2400conn-20000maxconn-wrk_2016-12-19_17:11:37/

Note that HAProxy used only about 8% CPU during this test:
http://results.ec2.breakage.org/results/ip-172-31-27-128/20000maxconn/pbench-user-benchmark_haproxy-0600r-1ppr-01000rps-300s-nginx-routes-2400conn-20000maxconn-wrk_2016-12-19_17:11:37/1/reference-result/tools-default/ip-172-31-27-128/pidstat/cpu_usage.html

Also, please see: https://bugzilla.redhat.com/show_bug.cgi?id=1405440

Comment 1 Jiří Mencák 2017-01-04 12:45:24 UTC

May I ask why the target release was changed from 3.4.1 to 3.5.0?

Comment 3 Clayton Coleman 2017-01-18 17:41:53 UTC

Is 20k arbitrary or based on a real use case?  Why not 40?

Comment 5 Ben Bennett 2017-01-27 15:37:12 UTC

*** Bug 1402892 has been marked as a duplicate of this bug. ***

Comment 6 Ben Bennett 2017-01-27 16:33:55 UTC


*** This bug has been marked as a duplicate of bug 1405440 ***
