Bug 1203000
| Field | Value |
|---|---|
| Summary | RFE: Seamless server restart, modify SO_REUSEPORT |
| Product | Red Hat Enterprise Linux 7 |
| Reporter | Jesper Brouer <jbrouer> |
| Component | kernel |
| Kernel sub component | tcp |
| Assignee | Jesper Brouer <jbrouer> |
| QA Contact | xmu |
| Docs Contact | |
| Status | CLOSED WONTFIX |
| Severity | medium |
| Priority | medium |
| CC | aloughla, atragler, ccui, fwestpha, haliu, hsowa, jbrouer, jeder, jialiu, kzhang, mleitner, mpatel, network-qe, rkhan |
| Version | 7.3 |
| Keywords | FutureFeature |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Linux |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | Enhancement |
| Story Points | --- |
| Clone Of | |
| Environment | |
| Last Closed | 2016-08-26 11:33:37 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | 1151756 |
| Bug Blocks | |
Description
Jesper Brouer 2015-03-17 21:16:53 UTC
*** Bug 1030735 has been marked as a duplicate of this bug. ***

The reproducer described in bug 1030735 is fairly complicated, and involves 2x haproxy, 2x nodejs and apache-bench (ab). While troubleshooting bug 1030735 I developed two testing tools, tcp_sink [1] and tcp_sink_client [2], which made it easier for me to reproduce. These tools are available in the github repository:

  https://github.com/netoptimizer/network-testing/

[1] tcp_sink: https://github.com/netoptimizer/network-testing/blob/master/src/tcp_sink.c
[2] tcp_sink_client: https://github.com/netoptimizer/network-testing/blob/master/src/tcp_sink_client.c

Reproducer01: for issue-1 described in comment #0

(In reply to Jesper Brouer from comment #0)
> Issue-1) When removing a listen socket (e.g. app closing).
>
> The in-flight 3rd-ACK packets then cannot look up the corresponding
> request_sock. The 3rd-ACK will actually find another listen_sock, and
> try to look up the request_sock there. When that fails, it will send
> back a RST, resulting in the connect() call failing with "Connection
> reset by peer" (errno 104 / ECONNRESET).

Requires three shells.

Shell01: Start tcp_sink with a high connection count limit::

  ./tcp_sink --reuse --count 20000000

Shell02: Create a loop that restarts tcp_sink, limiting each tcp_sink instance to accepting 1000 connections::

  i=0; while (( i++ < 1000 )); do ./tcp_sink --reuse -c 1000; done

Shell03: Start a tcp_sink_client doing many connection attempts::

  ./tcp_sink_client -c 20000000 127.0.0.1

Notice the failure from tcp_sink_client::

  [...]
  count:8185 ERROR: Likely SO_REUSEPORT failed errno(104) - connect: Connection reset by peer

In this run it took 8185 connections to provoke the race with the 3WHS. Given the 1000-connection limit before each restart, this means the service managed to restart 8 times without hitting the 3WHS race condition.

Reproducer02: for issue-2 described in comment #0

(In reply to Jesper Brouer from comment #0)
> Issue-2) When adding a listen socket to the pool (e.g. new app starting)
>
> Why adding an extra listen socket is problematic is harder to explain.
> When several listen sockets exist, the selection among them (in
> __inet_lookup_listener()) is done by seeding a pseudo-random number
> generator (next_pseudo_random32()) with a hash including saddr+sport
> (for the first matching socket), and then selecting the socket based
> on a modulo-like operation (reciprocal_scale()) where the "modulo" is
> the "matches" count.
>
> Thus, adding a listen socket can "shift" this selection, and result in
> the 3rd-ACK packet getting matched against a wrong request_sock list
> in a different listen_sock.

Starting more and more LISTEN sockets should also trigger the issue. Requires two shells.

Shell01: Create a loop that starts-and-backgrounds tcp_sink 100 times, delaying each start by 1 sec::

  i=0
  while (( i++ < 100 )); do ./tcp_sink --reuse --write & echo $i; sleep 1; done
  killall tcp_sink

Shell02: Start a tcp_sink_client doing many connection attempts::

  ./tcp_sink_client -c 20000000 127.0.0.1

This also causes the issue, but it is harder to trigger (when many LISTEN sockets exist). The failure from tcp_sink_client looks like::

  [...]
  count:209044 ERROR: Likely SO_REUSEPORT failed errno(104) - connect: Connection reset by peer

For this run, as can be seen, the connection count was much higher (209044) before hitting the 3WHS race. This was because I allowed approximately 20 TCP listen sockets to start, which reduced the probability of hitting a wrong listen socket.
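For readers unfamiliar with SO_REUSEPORT itself, the sketch below shows the basic listener setup that tools like tcp_sink perform: several processes running this can bind the same addr+port, and the kernel distributes incoming connections among them. This is a minimal illustrative sketch, not the actual tcp_sink source (see the github URL above for that); the port number and backlog are arbitrary choices.

    /* Minimal SO_REUSEPORT listener sketch (illustrative only). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    int main(void)
    {
            int one = 1;
            int fd = socket(AF_INET, SOCK_STREAM, 0);
            if (fd < 0) { perror("socket"); exit(1); }

            /* SO_REUSEPORT lets multiple sockets bind the same
             * addr+port; the kernel spreads connections among them. */
            if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT,
                           &one, sizeof(one)) < 0) {
                    perror("setsockopt(SO_REUSEPORT)");
                    exit(1);
            }

            struct sockaddr_in addr;
            memset(&addr, 0, sizeof(addr));
            addr.sin_family = AF_INET;
            addr.sin_addr.s_addr = htonl(INADDR_ANY);
            addr.sin_port = htons(6666);  /* arbitrary example port */

            if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
                    perror("bind"); exit(1);
            }
            if (listen(fd, 1024) < 0) { perror("listen"); exit(1); }

            for (;;) {
                    int c = accept(fd, NULL, NULL);
                    if (c < 0) { perror("accept"); continue; }
                    close(c);  /* sink behaviour: accept and close */
            }
    }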
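The client side can be approximated in the same spirit: a tight connect() loop that flags ECONNRESET, which is how the 3WHS race surfaces in the reproducer output above. Again a sketch, not the actual tcp_sink_client source; note that depending on timing the reset may be reported by connect() itself or only by the first read/write on the new socket.

    /* Sketch of a connect() loop that detects the ECONNRESET failure
     * mode (illustrative, not the real tcp_sink_client). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <errno.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void)
    {
            struct sockaddr_in addr;
            memset(&addr, 0, sizeof(addr));
            addr.sin_family = AF_INET;
            addr.sin_port = htons(6666);  /* must match the listener */
            inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

            for (long count = 0; count < 20000000; count++) {
                    int fd = socket(AF_INET, SOCK_STREAM, 0);
                    if (fd < 0) { perror("socket"); exit(1); }
                    if (connect(fd, (struct sockaddr *)&addr,
                                sizeof(addr)) < 0) {
                            /* ECONNRESET here indicates the 3WHS race:
                             * the final ACK hit a listener without a
                             * matching request_sock and drew a RST. */
                            printf("count:%ld ERROR errno(%d) - connect: %s\n",
                                   count, errno, strerror(errno));
                            close(fd);
                            exit(1);
                    }
                    close(fd);
            }
            return 0;
    }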
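To make the issue-2 selection "shift" concrete, here is a small userspace model of the selection math quoted above. The two helpers mirror the kernel's next_pseudo_random32() and reciprocal_scale() as found in kernels of this era, but this is a simplified model of the __inet_lookup_listener() path, not kernel code, and the flow hash value is a made-up stand-in for the real saddr+sport hash. It demonstrates that for a fixed flow hash, the selected listener index changes as the "matches" count changes, which is exactly why adding or removing a listen socket can strand an in-flight 3rd-ACK on the wrong listen_sock.

    /* Simplified model of the reuseport listener selection. */
    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    /* Same linear-congruential step as the kernel's
     * next_pseudo_random32(). */
    static uint32_t next_pseudo_random32(uint32_t seed)
    {
            return seed * 1664525 + 1013904223;
    }

    /* Kernel's reciprocal_scale(): map a 32-bit value into
     * the range [0, ep_ro) without a division. */
    static uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
    {
            return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
    }

    int main(void)
    {
            /* Stand-in for the hash over saddr+sport etc. */
            uint32_t flow_hash = 0xdeadbeef;

            /* The same flow lands on a different listener as the
             * number of matching listen sockets changes -- the core
             * of issue-2. */
            for (uint32_t matches = 1; matches <= 5; matches++) {
                    uint32_t idx = reciprocal_scale(
                            next_pseudo_random32(flow_hash), matches);
                    printf("matches=%" PRIu32 " -> listener index %" PRIu32 "\n",
                           matches, idx);
            }
            return 0;
    }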