Description of problem:
Whenever there is any haproxy configuration change (e.g. a new route is added, new pods are scaled up, etc.), some client connections are "reset by peer".
Version-Release number of selected component (if applicable):
OSE 3.0.2.0
rcm-img-docker01.build.eng.bos.redhat.com:5001/openshift3/ose-haproxy-router:v3.0.2.0
How reproducible:
Not easily. Mostly in a load test. Only a small fraction of client requests seem to be affected. The test machine also must not be too near, as presumably the connection reset occurs in some small window when the connection is opened. My test machine has about 175ms pings to the OSE router.
Steps to Reproduce:
1. Deploy any HTTP application (e.g. cakephp-example).
2. Create another route definition (e.g. run 'oc get route -o json > route.json' and edit the names and hostnames so they do not conflict with the cakephp route).
3. Run apache bench from a machine not too close to the OSE instances (my test machine has 175ms pings to the OSE router):
   ab -v 2 -r -n 20000 -c 64 http://cakephp-example-foo.cloudapps.example.com/ > ab.log
4. Randomly run 'oc create -f route.json && oc delete route'.
Actual results:
ab reports
   ...
   apr_socket_recv: Connection reset by peer (104)
   ...
immediately after any router configuration change.
Expected results:
No connection resets on router changes.
Additional info:
Does cakephp respond to graceful deletion correctly? What docker image and app are you running when you experience this?
Looks like we are doing the right thing in our scripts and requesting a soft reload (with -sf $old_pid). That makes haproxy bind a second daemon to the same port; once the new daemon is listening, it signals the old daemon to finish the requests it is handling but stop listening for more. UNFORTUNATELY there's a problem with the SYN packets getting put in the wrong queue, so connections get reset. The issue is well described by:
http://engineeringblog.yelp.com/2015/04/true-zero-downtime-haproxy-reloads.html
But that fix is rather intricate. Other approaches add iptables rules to drop the SYNs while haproxy reloads. The haproxy devs appear to be aware of the issue and are looking at fd passing to resolve it. So, do we want to implement a hack to handle this? Searching for 'haproxy soft reload "connection reset by peer"' gives good results.
To summarize what I have found so far:
- OpenShift 2 seems to have exhibited the same behavior. There is no difference between the way that OS2 and OS3 cause the reload to happen.
- We don't have permission (when the router is run with host-network=true and the privileged scc it is running under has: allowHostNetwork: true; allowPrivilegedContainer: true)
- This behavior is known and documented in the haproxy management.txt file: http://www.haproxy.org/download/1.6/doc/management.txt
I'm investigating what it would take to implement the iptables solution in a container. So far it looks ugly:
- Need to set the SCC to have:
    allowedCapabilities:
    - NET_ADMIN
- Need to edit the RC to have:
    spec:
      template:
        spec:
          containers:
          - securityContext:
              capabilities:
                add:
                - NET_ADMIN
- Then in the reload script for haproxy:
    iptables -I INPUT -p tcp -m multiport --dports $PORTS --syn -j DROP
    sleep 1
    /usr/sbin/haproxy -f $config_file -p $pid_file -sf $old_pid
    iptables -D INPUT -p tcp -m multiport --dports $PORTS --syn -j DROP
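For discussion, here is a minimal sketch of that reload sequence as a wrapper function, assuming a POSIX shell. The names run and reload_with_syn_drop and the DRY_RUN variable are hypothetical and do not come from the router image. The only behavior added to the steps above is that the DROP rule is removed even if haproxy fails to start, so SYNs are not left blocked.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch of a soft reload guarded by a temporary SYN-drop rule.
# Arguments: comma-separated ports, config file, pid file, old haproxy pid.
reload_with_syn_drop() {
    ports=$1; config_file=$2; pid_file=$3; old_pid=$4

    # Drop new SYNs so clients retry instead of hitting the closing listener.
    run iptables -I INPUT -p tcp -m multiport --dports "$ports" --syn -j DROP || return 1
    run sleep 1

    run /usr/sbin/haproxy -f "$config_file" -p "$pid_file" -sf "$old_pid"
    status=$?

    # Always remove the rule, even if haproxy failed to start.
    run iptables -D INPUT -p tcp -m multiport --dports "$ports" --syn -j DROP
    return "$status"
}
```

With DRY_RUN=1 set, calling reload_with_syn_drop 80,443 /tmp/haproxy.config /tmp/haproxy.pid 1234 only prints the four commands, which makes it possible to review the sequence without NET_ADMIN.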
It seems our EAP deployment is affected by this same issue. We are running OSE 3.0. To reproduce:
  for i in {1..5000}; do curl http://our.openshift.redhat.com/pnc-rest/rest/running-build-records/1193 && echo " ($(date)) \n" ; done
When we add or delete a dummy route, we get "Connection reset by peer". The error does not show up every time, but it is reproducible in at least 10% of route changes.
PR is in progress at https://github.com/openshift/origin/pull/6472
Resolved by: https://github.com/openshift/origin/pull/6472
This fix prevents traffic to haproxy from getting dropped if a client connects while the reload is in progress. You need to change your router to have an environment variable set:
  oc set env dc/router -c router DROP_SYN_DURING_RESTART=true
Once that has been set and the router has restarted, each subsequent reload puts an iptables rule in place that eats incoming SYN packets, so the hand-over does not reset connections. The downside is that it will make the reloads seem to take longer. The kernel networking team has a bug open on the root cause.
This issue can still be reproduced.
Tested on devenv_rhel_3075 with router image:
  openshift/origin-haproxy-router latest b5436007264f 44 hours ago
Steps:
1. Create hello-openshift pod/service/route
2. Use ab to stress the URL
3. Create another route during step 2
  [root@ip-172-18-0-105 ~]# ab -v 2 -r -n 2000000 -c 64 http://hello-service-default.router.default.svc.cluster.local/ >htllo.log
  Completed 200000 requests
  Completed 400000 requests
  Completed 600000 requests
  Completed 800000 requests
  Completed 1000000 requests
  apr_socket_recv: Connection reset by peer (104)
  apr_socket_recv: Connection reset by peer (104)
  apr_socket_recv: Connection reset by peer (104)
  apr_socket_recv: Connection reset by peer (104)
  apr_socket_recv: Connection reset by peer (104)
  apr_socket_recv: Connection reset by peer (104)
  Completed 1200000 requests
  Completed 1400000 requests
  Completed 1600000 requests
  Completed 1800000 requests
  Completed 2000000 requests
  Finished 2000000 requests
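To compare runs, it may help to count the resets rather than eyeball the terminal. The sketch below assumes the ab messages were captured to a file. In the transcript above they appear on the terminal although stdout is redirected, so stderr would have to be redirected too (for example with 2> ab.err). The function name count_resets is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count "Connection reset by peer" lines in a captured ab log.
# $1: path to the file holding ab's messages.
count_resets() {
    # grep -c prints 0 but exits non-zero when nothing matches; keep the count either way.
    grep -c 'Connection reset by peer' "$1" || true
}
```

Usage would look like: count_resets ab.err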
BTW: forgot to mention in the comment 13, I had set 'oc set env dc/router -c router DROP_SYN_DURING_RESTART=true' in the testing
(In reply to zhaozhanqi from comment #15)
> BTW: forgot to mention in the comment 13, I had set 'oc set env dc/router -c router DROP_SYN_DURING_RESTART=true' in the testing
Did you restart the router after setting that environment variable?
Yes, Ben. When an env variable is set on dc/router, the router is redeployed automatically.
(In reply to Ben Bennett from comment #13)
> The kernel networking team has a bug open on the root cause.
For completeness: do you have a bz id?
Doc PR https://github.com/openshift/openshift-docs/pull/1987
Kernel bug - https://bugzilla.redhat.com/show_bug.cgi?id=1203000
Can you make sure that you followed all the steps in the doc PR to get it set up? It needs to run in the privileged SCC to be able to use iptables.
@Ben Bennett I just tested this using the privileged scc when creating the router, and the issue was not reproduced. BTW, since the router uses the hostnetwork scc by default, I suspect some customers can still hit this issue when running the router under the hostnetwork scc.
@Ben Bennett It seems this issue can still be reproduced even when using the privileged scc. The weird thing: iptables cannot be initialized even as the root user.
  sh-4.2# id
  uid=0(root) gid=0(root) groups=0(root)
  sh-4.2# iptables-save
  iptables-save v1.4.21: Cannot initialize: Permission denied (you must be root)
@zhaozhanqi: You need to have CAP_NET_ADMIN... but privileged should give you that. If you are getting that error, then it is not set up correctly.
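One way to check this from inside the router container is to look at the effective capability mask of the shell. This is a rough sketch, assuming a Linux /proc filesystem and a POSIX shell. The function names are hypothetical. CAP_NET_ADMIN is capability number 12 in linux/capability.h, so the check tests bit 12 of the CapEff value.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given hex capability mask includes CAP_NET_ADMIN.
# $1: hex mask without a 0x prefix, as found in the CapEff field of /proc/<pid>/status.
has_cap_net_admin() {
    # CAP_NET_ADMIN is bit 12 of the mask.
    [ $(( (0x$1 >> 12) & 1 )) -eq 1 ]
}

# Hypothetical helper: print the effective capability mask of the current process.
current_capeff() {
    awk '/^CapEff:/ {print $2}' /proc/self/status
}
```

For example: has_cap_net_admin "$(current_capeff)" && echo "NET_ADMIN present". If this fails for uid 0, the container is probably not running with the expected security context.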
Same here... I followed the workaround suggestion but it didn't work... I'm still getting errors like 'Remote host closed connection during handshake' due to connection being dropped by router...
I'm working on this at the moment and something with the capabilities has changed since the version I tested with. I'm investigating alternatives for how we can make this work now.
Added https://github.com/openshift/origin/pull/10514 to support 'true' for DROP_SYN_DURING_RESTART (the docs stated 'true', but really only '1' was supported).
Fixed the docs with https://github.com/openshift/openshift-docs/pull/2680
For any customers on 3.2, the correct steps are:
  $ oadm policy add-scc-to-user privileged -z router
  $ oc patch dc router -p '{"spec":{"template":{"spec":{"containers":[{"name":"router","securityContext":{"privileged":true}}],"securityContext":{"runAsUser": 0}}}}}'
  $ oc set env dc/router -c router DROP_SYN_DURING_RESTART=1
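To illustrate the '1' versus 'true' mismatch, a check along these lines would accept both spellings. This is only a sketch and not the code from the PR. The function name drop_syn_enabled is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the value means "enabled".
# $1: the value of DROP_SYN_DURING_RESTART (may be empty or unset).
drop_syn_enabled() {
    case "${1:-}" in
        1|true) return 0 ;;   # both spellings mentioned in this bug
        *)      return 1 ;;   # anything else, including empty, is treated as disabled
    esac
}
```

A reload script could then guard the iptables calls with: if drop_syn_enabled "$DROP_SYN_DURING_RESTART"; then ... fi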
This has been merged into ose and is in OSE v3.3.0.23 or newer. If this is going to be backported to older versions, please let me know or clone this bugzilla for the older versions.
Checked this bug on these two versions:
1) # openshift version
   openshift v3.3.0.23-dirty
   kubernetes v1.3.0+507d3a7
   etcd 2.3.0+git
   with router image v3.3.0.23 (id=3502a6052613)
2) # openshift version
   openshift v3.2.1.13-1-gc2a90e1
   kubernetes v1.2.0-36-g4a3f9c5
   etcd 2.2.5
   with router image brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-haproxy-router v3.2.1.13 f8e807bd101b
In my testing the issue could not be reproduced. I also used the hostnetwork scc for the router (the 'privileged' scc was not specified for v3.3.0.23).
   # ab -v 2 -r -n 2000000 -c 64 http://service-unsecure-zzhao.0822-3yz.qe.rhcloud.com/ > hello.log
   Completed 200000 requests
   Completed 400000 requests
   Completed 600000 requests
   Completed 800000 requests
   Completed 1000000 requests
   Completed 1200000 requests
   Completed 1400000 requests
   Completed 1600000 requests
   Completed 1800000 requests
   Completed 2000000 requests
   Finished 2000000 requests
Verified this bug according to comment 31
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1933
This is fixed in 3.9 by https://bugzilla.redhat.com/show_bug.cgi?id=1464657