Red Hat Bugzilla – Bug 1200541
Reset socket ignored when socket state is LAST-ACK and connection state is SYN-SENT
Last modified: 2016-07-22 03:41:38 EDT
Description of problem:
Under specific conditions, when a reset is received for a socket in the LAST-ACK state, the reset is ignored.

Version-Release number of selected component (if applicable):
RHEL 6.6, 2.6.32-504.8.1.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. On the server, create an iptables entry that redirects from port X to port Y:
     iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080
2. Create a service application that listens on port Y and, after accepting a connection, sends a few bytes and then waits for user input. After getting user input it sends a few more bytes, closes the socket, and waits for more user input. Start the application.
3. On the client, create an iptables entry that drops TCP segments containing a reset sent from the client to the server for the service port, then make a connection to the server/port:
     iptables -A OUTPUT -p tcp --dport 80 --tcp-flags RST RST -j DROP; nc 172.16.1.200 80 -p 12481
4. Now kill the connection.
5. The service side of the connection moves into CLOSE-WAIT.
6. Provide user input to the server application so that it closes the socket. Since the client side is closed, the client should send a reset, but it is dropped by the client-side netfilter rules.
7. Observe the server socket in LAST-ACK state; the nf_conntrack entry shows it in TIME-WAIT.
     ipv4 2 tcp 6 119 TIME_WAIT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 [ASSURED] mark=0 secmark=0 use=2
     State Recv-Q Send-Q Local Address:Port Peer Address:Port
     LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,11sec,6)
8. Immediately after the server retransmits a packet, clear the iptables entries on the client and create a new connection with the same client port:
     iptables -F; nc 172.16.1.200 80 -p 12481
9. Observe that the nf_conntrack entry goes to SYN_SENT while the socket entry is still in LAST-ACK.
     ipv4 2 tcp 6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 [UNREPLIED] src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
     State Recv-Q Send-Q Local Address:Port Peer Address:Port
     LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,10sec,6)
10. At this point any future resets from the client are ignored until the LAST-ACK socket has timed out. If the SYN packet arrives while the nf_conntrack entry is still in TIME-WAIT the LAST-ACK socket is cleared immediately.

Actual results:
The LAST-ACK socket remains until it times out, and no new connections can be established until then.

Expected results:
The reset should clear the LAST-ACK socket regardless of what state the nf_conntrack entry shows.

Additional info:

Condition not duplicated: tcpdump shows that the server sent a packet and got a reset. Note that the connection tracking state is TIME-WAIT at the time of the reset.

  12:06:34.285335 IP 172.16.1.200.http > 172.16.1.11.12481: Flags [P.], seq 101:121, ack 2, win 114, options [nop,nop,TS val 6058249 ecr 356278671], length 20
  12:06:34.286148 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [R], seq 1012837968, win 0, length 0

  Tue Mar 10 12:06:33 MST 2015
  ipv4 2 tcp 6 114 TIME_WAIT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 [ASSURED] mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,984ms,5)

  Tue Mar 10 12:06:34 MST 2015
  ipv4 2 tcp 6 9 CLOSE src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 [ASSURED] mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port

Condition duplicated: tcpdump shows that a SYN packet arrived before a packet from the server triggered a reset from the client. The connection tracking state is SYN-SENT and the reset appears to be ignored.

  12:09:14.450348 IP 172.16.1.200.http > 172.16.1.11.12481: Flags [P.], seq 101:121, ack 2, win 114, options [nop,nop,TS val 6218414 ecr 356459936], length 20
  12:09:15.144595 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [S], seq 4124955705, win 29200, options [mss 1460,sackOK,TS val 356476182 ecr 0,nop,wscale 7], length 0
  12:09:16.146509 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [S], seq 4124955705, win 29200, options [mss 1460,sackOK,TS val 356477184 ecr 0,nop,wscale 7], length 0
  12:09:17.666297 IP 172.16.1.200.http > 172.16.1.11.12481: Flags [P.], seq 101:121, ack 2, win 114, options [nop,nop,TS val 6221630 ecr 356459936], length 20
  12:09:17.666956 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [R], seq 3817972861, win 0, length 0
  12:09:18.150429 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [S], seq 4124955705, win 29200, options [mss 1460,sackOK,TS val 356479188 ecr 0,nop,wscale 7], length 0
  12:09:22.162066 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [S], seq 4124955705, win 29200, options [mss 1460,sackOK,TS val 356483200 ecr 0,nop,wscale 7], length 0
  12:09:24.098329 IP 172.16.1.200.http > 172.16.1.11.12481: Flags [P.], seq 101:121, ack 2, win 114, options [nop,nop,TS val 6228062 ecr 356459936], length 20

  Tue Mar 10 12:09:14 MST 2015
  ipv4 2 tcp 6 119 TIME_WAIT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 [ASSURED] mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,3.184ms,4)

  Tue Mar 10 12:09:15 MST 2015
  ipv4 2 tcp 6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 [UNREPLIED] src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,2.167ms,4)

  Tue Mar 10 12:09:16 MST 2015
  ipv4 2 tcp 6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 [UNREPLIED] src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,1.150ms,4)

  Tue Mar 10 12:09:17 MST 2015
  ipv4 2 tcp 6 118 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 [UNREPLIED] src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,134ms,4)

  Tue Mar 10 12:09:18 MST 2015
  ipv4 2 tcp 6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,5.549ms,5)

  Tue Mar 10 12:09:19 MST 2015
  ipv4 2 tcp 6 118 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,4.533ms,5)

  Tue Mar 10 12:09:20 MST 2015
  ipv4 2 tcp 6 117 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,3.516ms,5)

  Tue Mar 10 12:09:21 MST 2015
  ipv4 2 tcp 6 116 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,2.500ms,5)

  Tue Mar 10 12:09:22 MST 2015
  ipv4 2 tcp 6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,1.484ms,5)

  Tue Mar 10 12:09:23 MST 2015
  ipv4 2 tcp 6 118 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,468ms,5)

  Tue Mar 10 12:09:24 MST 2015
  ipv4 2 tcp 6 117 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,12sec,6)

  Tue Mar 10 12:09:25 MST 2015
  ipv4 2 tcp 6 116 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
  State Recv-Q Send-Q Local Address:Port Peer Address:Port
  LAST-ACK 0 101 172.16.1.200:8080 172.16.1.11:12481 timer:(on,11sec,6)
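Step 2 of the reproduction steps describes the server application only in prose. A minimal Python sketch of such a service follows, assuming port 8080 (the REDIRECT target from step 1) and arbitrary payloads. The original application and its exact byte counts are not included in this report, and the names `make_listener`, `serve_once`, and `main` are hypothetical.

```python
import socket


def make_listener(host="0.0.0.0", port=8080):
    """Create a listening TCP socket (port 8080 is assumed from step 1)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock


def serve_once(listener, wait_for_input, first=b"a" * 100, second=b"b" * 20):
    """Accept one connection, send `first`, wait, send `second`, then close.

    `wait_for_input` is any zero-argument callable that blocks until the
    operator wants the server to continue. The payloads are placeholders.
    """
    conn, peer = listener.accept()
    try:
        conn.sendall(first)
        wait_for_input()  # the client is killed while the server waits here
        try:
            conn.sendall(second)
        except OSError:
            pass  # the peer may already have reset the connection
    finally:
        conn.close()  # in the reported scenario the socket lands in LAST-ACK
    return peer


def main():
    """Interactive loop: Enter sends the second payload and closes."""
    listener = make_listener()
    while True:
        peer = serve_once(listener, lambda: input("Enter to send and close: "))
        input(f"Closed connection from {peer}. Enter to accept again: ")
```

Calling `main()` on the server and then following steps 3 to 9 should leave the socket in LAST-ACK for as long as the client's reset is dropped or ignored.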
Note that setting net.netfilter.nf_conntrack_tcp_be_liberal to 1 prevents the problem. I am not sure whether this means that this is not a bug or that we just have an effective workaround.
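For checking or applying that setting from a script, here is a small sketch. It assumes the usual mapping of dotted sysctl names onto paths under /proc/sys, that the nf_conntrack module is loaded, and that the caller is root when writing. The helper names are hypothetical, and the `root` parameter exists only so the helpers can be exercised against a scratch directory.

```python
from pathlib import Path

BE_LIBERAL = "net.netfilter.nf_conntrack_tcp_be_liberal"


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under `root`."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value of the sysctl as a stripped string."""
    return sysctl_path(name, root).read_text().strip()


def write_sysctl(name, value, root="/proc/sys"):
    """Write `value` to the sysctl (needs root on a real system)."""
    sysctl_path(name, root).write_text(f"{value}\n")
```

On the server, `write_sysctl(BE_LIBERAL, 1)` would be equivalent to `sysctl -w net.netfilter.nf_conntrack_tcp_be_liberal=1`. The change does not survive a reboot unless it is also added to the system's sysctl configuration.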
Can we provide guidance to the customer on what the impact of using the nf_conntrack_tcp_be_liberal parameter might be? Is it safe to use? Could it cause any adverse effects?
Well, I assume that some part of netfilter on the server is dropping the reset, so the TCP stack never sees it. I can see in the packet trace that the reset segment arrived at the server, so I know that the client sent it.
(In reply to Jesper Brouer from comment #22)
> I think we have misguided ourselves a bit in this BZ, when trying to narrow
> down where the packets get dropped (comment #12).
>
> In comment #13 we *wrongly* thought that we had enabled logging of invalid
> conntrack entries, and we didn't see any.
>
> Via: sysctl -w net.netfilter.nf_conntrack_log_invalid=6
>
> The issue here is that we didn't enable this logging correctly.
> To enable logging, the following step is also needed:
>
>   modprobe ipt_LOG
>   echo "ipt_LOG" > /proc/sys/net/netfilter/nf_log/2

For reproducing on upstream kernels, note that the way to enable this logging has changed. I had to read the code to figure this out (this "nf_log" system is quite broken IMHO). Use this on upstream kernels:

  modprobe nf_log_ipv4
  echo "nf_log_ipv4" > /proc/sys/net/netfilter/nf_log/2
Has this already been merged into the 6.7 kernel? Customer is participating in the 6.7 High Touch Beta and could possibly test it for us. I do not see the "fixed in" field populated. Does it depend on bz1212801? Also, could we get the 6.6.Z bz clone started? I was under the impression it was already in progress. Else I could formally request it via the GSS process.
Patch(es) available on kernel-2.6.32-564.el6
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1272.html