Bug 1200541 - Reset socket ignored when socket state is LAST-ACK and connection state is SYN-SENT
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.6
Hardware: Unspecified
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Jesper Brouer
QA Contact: xmu
URL:
Whiteboard:
Depends On:
Blocks: 1075802 1172231 1212801 1227467 1227468
Reported: 2015-03-10 19:19 UTC by noah davids
Modified: 2019-07-11 08:45 UTC
CC List: 13 users

Fixed In Version: kernel-2.6.32-564.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1227467
Environment:
Last Closed: 2015-07-22 08:44:22 UTC
Target Upstream Version:


Links
System ID: Red Hat Product Errata RHSA-2015:1272
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: kernel security, bug fix, and enhancement update
Last Updated: 2015-07-22 11:56:25 UTC

Description noah davids 2015-03-10 19:19:08 UTC
Description of problem:
Under specific conditions, a reset received for a socket in the LAST-ACK state is ignored.


Version-Release number of selected component (if applicable):
RHEL 6.6, 2.6.32-504.8.1.el6.x86_64


How reproducible:
100%

Steps to Reproduce:
1. On the server, create an iptables entry that redirects from port X to port Y (80 to 8080 in this example):
    iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080

2. Create a service application that listens on port Y. After accepting a connection it sends a few bytes and then waits for user input. After getting user input it sends a few more bytes, closes the socket, and waits for more user input. Start the application.

3. On the client side, create an iptables entry that drops TCP segments containing a reset sent from the client to the server for the service port, and then make a connection to the server/port:
iptables -A OUTPUT -p tcp --dport 80 --tcp-flags RST RST -j DROP; nc 172.16.1.200 80 -p 12481

4. Now kill the connection

5. The server side of the connection moves into CLOSE-WAIT.

6. Provide user input to the server application so that it closes the socket. Since the client side is closed, the client should send a reset, but it is dropped by the client-side netfilter rules.

7. Observe that the server socket is in the LAST-ACK state while the nf_conntrack entry shows it in TIME-WAIT.

ipv4     2 tcp      6 119 TIME_WAIT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 [ASSURED] mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,11sec,6)

8. Now, immediately after the server retransmits a packet, clear the iptables entries on the client and create a new connection with the same client port:
 iptables -F;  nc 172.16.1.200 80 -p 12481

9. Observe that the nf_conntrack entry goes to SYN_SENT while the socket entry is still in LAST-ACK.

ipv4     2 tcp      6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 [UNREPLIED] src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,10sec,6)

10. At this point any future resets from the client are ignored until the LAST-ACK socket has timed out.

If the SYN packet arrives while the nf_conntrack entry is still in TIME-WAIT the LAST-ACK socket is cleared immediately.
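
The service application from step 2 is not attached to this report. A minimal Python sketch of what such an application might look like is given here; the function names, the payload sizes and the default port 8080 (port Y in step 1) are assumptions for illustration, not the reporter's actual program.

```python
import socket

# Arbitrary payloads; the report only says "a few bytes".
FIRST_CHUNK = b"A" * 100
SECOND_CHUNK = b"B" * 20


def handle_client(conn, wait=input):
    """Send a few bytes, wait, send a few more, close, then wait again."""
    conn.sendall(FIRST_CHUNK)
    wait("connected; press Enter to send more data and close: ")
    conn.sendall(SECOND_CHUNK)
    conn.close()  # this is the close requested in step 6
    wait("socket closed; press Enter to exit: ")


def serve(port=8080):
    """Accept a single connection on `port` and run handle_client on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", port))
        srv.listen(1)
        conn, _peer = srv.accept()
        handle_client(conn)
```

Calling serve(8080) on the server before step 3 would stand in for "start said application"; pressing Enter at the first prompt corresponds to step 6.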


Actual results:
The LAST-ACK socket remains until it times out and no new connections can be established until then.

Expected results:
The reset should clear the LAST-ACK socket regardless of what state the nf_conntrack entry shows.


Additional info:

Condition not duplicated: tcpdump shows that the server sent a packet and got a reset. Note that the connection tracking state is TIME-WAIT at the time of the reset.

12:06:34.285335 IP 172.16.1.200.http > 172.16.1.11.12481: Flags [P.], seq 101:121, ack 2, win 114, options [nop,nop,TS val 6058249 ecr 356278671], length 20
12:06:34.286148 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [R], seq 1012837968, win 0, length 0


Tue Mar 10 12:06:33 MST 2015
ipv4     2 tcp      6 114 TIME_WAIT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 [ASSURED] mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,984ms,5)
Tue Mar 10 12:06:34 MST 2015
ipv4     2 tcp      6 9 CLOSE src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 [ASSURED] mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 


Condition duplicated: tcpdump shows that a SYN packet arrived before a packet from the server triggered a reset from the client. The connection tracking state is SYN-SENT and the reset appears to be ignored.

12:09:14.450348 IP 172.16.1.200.http > 172.16.1.11.12481: Flags [P.], seq 101:121, ack 2, win 114, options [nop,nop,TS val 6218414 ecr 356459936], length 20
12:09:15.144595 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [S], seq 4124955705, win 29200, options [mss 1460,sackOK,TS val 356476182 ecr 0,nop,wscale 7], length 0
12:09:16.146509 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [S], seq 4124955705, win 29200, options [mss 1460,sackOK,TS val 356477184 ecr 0,nop,wscale 7], length 0
12:09:17.666297 IP 172.16.1.200.http > 172.16.1.11.12481: Flags [P.], seq 101:121, ack 2, win 114, options [nop,nop,TS val 6221630 ecr 356459936], length 20
12:09:17.666956 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [R], seq 3817972861, win 0, length 0
12:09:18.150429 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [S], seq 4124955705, win 29200, options [mss 1460,sackOK,TS val 356479188 ecr 0,nop,wscale 7], length 0
12:09:22.162066 IP 172.16.1.11.12481 > 172.16.1.200.http: Flags [S], seq 4124955705, win 29200, options [mss 1460,sackOK,TS val 356483200 ecr 0,nop,wscale 7], length 0
12:09:24.098329 IP 172.16.1.200.http > 172.16.1.11.12481: Flags [P.], seq 101:121, ack 2, win 114, options [nop,nop,TS val 6228062 ecr 356459936], length 20

Tue Mar 10 12:09:14 MST 2015
ipv4     2 tcp      6 119 TIME_WAIT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 [ASSURED] mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,3.184ms,4)
Tue Mar 10 12:09:15 MST 2015
ipv4     2 tcp      6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 [UNREPLIED] src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,2.167ms,4)
Tue Mar 10 12:09:16 MST 2015
ipv4     2 tcp      6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 [UNREPLIED] src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,1.150ms,4)
Tue Mar 10 12:09:17 MST 2015
ipv4     2 tcp      6 118 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 [UNREPLIED] src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,134ms,4)
Tue Mar 10 12:09:18 MST 2015
ipv4     2 tcp      6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,5.549ms,5)
Tue Mar 10 12:09:19 MST 2015
ipv4     2 tcp      6 118 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,4.533ms,5)
Tue Mar 10 12:09:20 MST 2015
ipv4     2 tcp      6 117 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,3.516ms,5)
Tue Mar 10 12:09:21 MST 2015
ipv4     2 tcp      6 116 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,2.500ms,5)
Tue Mar 10 12:09:22 MST 2015
ipv4     2 tcp      6 119 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,1.484ms,5)
Tue Mar 10 12:09:23 MST 2015
ipv4     2 tcp      6 118 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,468ms,5)
Tue Mar 10 12:09:24 MST 2015
ipv4     2 tcp      6 117 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,12sec,6)
Tue Mar 10 12:09:25 MST 2015
ipv4     2 tcp      6 116 SYN_SENT src=172.16.1.11 dst=172.16.1.200 sport=12481 dport=80 src=172.16.1.200 dst=172.16.1.11 sport=8080 dport=12481 mark=0 secmark=0 use=2
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port 
LAST-ACK   0      101            172.16.1.200:8080           172.16.1.11:12481  timer:(on,11sec,6)
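
The stuck condition in the listings above (nf_conntrack entry in SYN_SENT while the socket is still in LAST-ACK) can also be checked mechanically. The following is a rough Python sketch that assumes the field layout of the nf_conntrack and socket lines quoted in this report; the function names are hypothetical.

```python
def conntrack_state(line):
    """Return the TCP state from an nf_conntrack line, or None if not TCP.

    Assumed layout, as in the lines quoted in this report:
    l3proto l3num l4proto l4num timeout STATE src=... dst=...
    """
    fields = line.split()
    if len(fields) < 6 or fields[2] != "tcp":
        return None
    return fields[5]


def socket_state(line):
    """Return the first column (State) of a socket listing line."""
    fields = line.split()
    return fields[0] if fields else None


def is_stuck(conntrack_line, socket_line):
    """True when conntrack shows SYN_SENT but the socket is still LAST-ACK."""
    return (conntrack_state(conntrack_line) == "SYN_SENT"
            and socket_state(socket_line) == "LAST-ACK")
```

Fed the 12:09:14 sample (TIME_WAIT) this reports no mismatch; fed any of the samples from 12:09:15 onward it flags the condition described in step 9.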

Comment 4 noah davids 2015-03-11 14:47:13 UTC
Note that setting net.netfilter.nf_conntrack_tcp_be_liberal to 1 prevents the problem. I am not sure whether this means that this is not a bug or that we just have an effective workaround.
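
To confirm on a given host whether the setting mentioned above is in effect, the value can be read from /proc/sys, where the dots in a sysctl name map to path separators. A small Python sketch follows; the helper names are made up for illustration, and it only reads the value.

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root) / name.replace(".", "/")


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if it cannot be read
    (for example when the nf_conntrack module is not loaded)."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None
```

For example, read_sysctl("net.netfilter.nf_conntrack_tcp_be_liberal") would be expected to return "1" once the workaround has been applied.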

Comment 9 Terry Bowling 2015-03-19 19:38:57 UTC
Can we provide guidance to the customer on what the impact might be of using the nf_conntrack_tcp_be_liberal parameter?

Is it safe to use?  Could it cause any adverse effects?

Comment 11 noah davids 2015-03-19 23:29:38 UTC
Well I assume that some part of netfilter on the server is dropping the reset so that the TCP stack never sees it. I can see in the packet trace that the reset segment arrived at the server so I know that the client sent it.

Comment 28 Jesper Brouer 2015-05-07 13:12:43 UTC
(In reply to Jesper Brouer from comment #22)
> I think we have misguided ourselves a bit in this BZ, when trying to narrow
> down where the packets get dropped (comment #12).
> 
> In comment #13 we *wrongly* thought that we had enabled logging of invalid
> conntrack entries, and we didn't see any.
> 
>  Via:  sysctl -w net.netfilter.nf_conntrack_log_invalid=6
> 
> The issue here is, we didn't enable this logging correctly.
> To enable logging the following step is also needed:
> 
>  modprobe ipt_LOG
>  echo "ipt_LOG" > /proc/sys/net/netfilter/nf_log/2

For reproducing on upstream kernels, note that the way to enable this logging has changed.  I had to read the code to figure this out (this "nf_log" system is quite broken IMHO).

Use this on upstream kernels:

 modprobe nf_log_ipv4
 echo "nf_log_ipv4" > /proc/sys/net/netfilter/nf_log/2
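
The "2" in the nf_log path is the numeric protocol family, which for IPv4 equals AF_INET. A hedged Python sketch of the same echo step is shown here; the helper names are hypothetical, writing needs root, and the logger module must already be loaded via modprobe as above.

```python
import socket


def nf_log_path(family=socket.AF_INET):
    """Path of the nf_log sysctl entry for a protocol family (IPv4 -> .../2)."""
    return "/proc/sys/net/netfilter/nf_log/%d" % int(family)


def set_nf_logger(logger, family=socket.AF_INET):
    """Equivalent of: echo LOGGER > /proc/sys/net/netfilter/nf_log/N"""
    with open(nf_log_path(family), "w") as f:
        f.write(logger + "\n")
```

On an upstream kernel this would be called as set_nf_logger("nf_log_ipv4"); on the RHEL 6 kernel the logger name from the quoted comment is "ipt_LOG".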

Comment 33 Terry Bowling 2015-05-26 17:22:16 UTC
Has this already been merged into the 6.7 kernel?  Customer is participating in the 6.7 High Touch Beta and could possibly test it for us.  I do not see the "fixed in" field populated.  Does it depend on bz1212801?

Also, could we get the 6.6.Z bz clone started?  I was under the impression it was already in progress.  Else I could formally request it via the GSS process.

Comment 36 Kurt Stutsman 2015-06-02 17:24:47 UTC
Patch(es) available on kernel-2.6.32-564.el6

Comment 42 errata-xmlrpc 2015-07-22 08:44:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1272.html

