Bug 220482

Summary: LSPP: CIPSO-to-unlabeled TCP connections block
Product: Red Hat Enterprise Linux 5
Reporter: Klaus Weidner <kweidner>
Component: kernel
Assignee: Eric Paris <eparis>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 5.0
CC: eparis, iboverma, krisw, linda.knippers, paul.moore, sgrubb
Keywords: Reopened
Hardware: All
OS: Linux
Fixed In Version: RC
Doc Type: Bug Fix
Last Closed: 2007-02-14 16:38:18 UTC
Bug Blocks: 224041, 227613
Attachments:
- Wireshark/tcpdump binary packet dump
- Patch to David Miller's net-2.6 git tree

Description Klaus Weidner 2006-12-21 17:12:01 UTC
Description of problem:

When CIPSO/netlabel is active on a system, TCP transmissions to a non-CIPSO
system get stuck when sending large amounts of data.

Version-Release number of selected component (if applicable):

kernel-2.6.18-1.2840.2.1.el5.lspp.57
selinux-policy-mls-2.4.6-15.el5 (but this happens in permissive mode also)
netlabel_tools-0.17-9.el5

How reproducible: always


Steps to Reproduce:
1. Use RHEL5 beta2 snapshot3 plus lspp.57 kernel, MLS policy. System can be in
permissive mode.

2. Activate CIPSO in pass-through mode on RHEL system:

 netlabelctl cipsov4 add pass doi:1 tags:1
 netlabelctl map del default
 netlabelctl map add default protocol:cipsov4,1

3. Connect to RHEL system from external non-CIPSO-aware system using TCP, and
transfer large amount of data from RHEL to outside system. For example, activate
chargen stream on RHEL system, and run "telnet rhel5 chargen" from outside system.

4. TCP connection stalls after transmitting a couple of packets, apparently as
soon as it tries to send full-size packets.
  
Actual results: Connection gets stuck

Expected results: Data transferred normally

Additional info:

Reported to Paul Moore from HP (Cc'd) who is looking into the issue.

It is apparently MTU-related; the effect is very similar to web sites with
broken PMTU:

  http://www.netheaven.com/pmtu.html

except that I'm not seeing any ICMP packets on the line, and there's no
router in the path that would need to fragment.

Despite that, either one of the following avoids the problem:

- setting the MTU to 1488 on the non-CIPSO system to make room for the
  12-byte CIPSO option in the header. Yes, it's nonintuitive to do it on
  that system, but apparently the server picks up the packet size from
  the peer. An MTU of 1489 makes the connection stall again.

- doing "echo 1 > /proc/sys/net/ipv4/ip_no_pmtu_disc" on the CIPSO system
  (need to restart sshd to pick up the new default), which turns off PMTU
  (the packets don't have the "[DF]" flag).

My guess would be that some part of the TCP stack is expecting that it
has room for a full-size packet payload, but fails to take into account
that it needs to leave room for the CIPSO header.
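
The suspected bookkeeping error can be sketched with a few lines of arithmetic. This is an illustration of the guess above, not kernel code; the constants (20-byte base IPv4 header, 20-byte base TCP header, 12-byte CIPSO option, 1500-byte MTU) are assumptions taken from this trace:

```python
# Sketch of the MSS arithmetic suspected to be wrong. Constants are
# assumptions read off this bug report, not values from the kernel.
IP_HDR = 20      # base IPv4 header, no options
TCP_HDR = 20     # base TCP header, no options
CIPSO_OPT = 12   # CIPSO IP option size seen in this trace
MTU = 1500

# Largest TCP payload that fits once the CIPSO option is on the wire:
usable_payload = MTU - IP_HDR - CIPSO_OPT - TCP_HDR   # 1448

# What the server advertises if it forgets to account for CIPSO:
advertised_mss = MTU - IP_HDR - TCP_HDR               # 1460

# A full-size segment built from the bad MSS overruns the real limit,
# matching the observed stall on full-size packets.
print(usable_payload, advertised_mss, advertised_mss > usable_payload)
```

This also explains the MTU 1488 workaround: it forces the peer to advertise an MSS of 1448, which happens to leave exactly enough room for the 12-byte option.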

Comment 1 Klaus Weidner 2006-12-21 17:16:23 UTC
Created attachment 144201 [details]
Wireshark/tcpdump binary packet dump

Attached is a binary wireshark dump of a stuck connection (viewable with
wireshark, or "tcpdump -r FILE"), here are some comments on it.

This was using MTU 1489 on the non-CIPSO host and 1500 on the CIPSO host. The
behavior is the same when using MTU 1500 on both.

No.	Time	    Source		  Destination		Protocol Info
      1 0.000000    172.16.204.1	  172.16.204.55 	TCP	 60773
> chargen [SYN] Seq=0 Len=0 MSS=1449 TSV=39780324 TSER=0 WS=7
		# The non-CIPSO system advertised a MSS of 1449, which
		# leaves room for 40 bytes of header - it doesn't know
		# about CIPSO, so it doesn't leave room for CIPSO options
		# in its MSS calculation.

No.	Time	    Source		  Destination		Protocol Info
      2 0.001761    172.16.204.55	  172.16.204.1		TCP	
chargen > 60773 [SYN, ACK] Seq=0 Ack=1 Win=46336 Len=0 MSS=1460 TSV=7122819
TSER=39780324 WS=3

		# MSS 1460 ?! The server only leaves room for a standard
		# IP+TCP header, but doesn't figure in the space needed
		# for the CIPSO option. I thought the icsk_sync_mss()
		# call was supposed to adjust for that?

No.	Time	    Source		  Destination		Protocol Info
      3 0.001816    172.16.204.1	  172.16.204.55 	TCP	 60773
> chargen [ACK] Seq=1 Ack=1 Win=5888 Len=0 TSV=39780324 TSER=7122819

		# normal ack.

No.	Time	    Source		  Destination		Protocol Info
      4 0.013220    172.16.204.55	  172.16.204.1		TCP	
chargen > 60773 [PSH, ACK] Seq=1 Ack=1 Win=5792 Len=74 TSV=7122863
TSER=39780324

		# First data packet, everything normal so far. Note that
		# it's a packet with only a single line payload. Normal
		# small packets follow, a couple deleted, until...

No.	Time	    Source		  Destination		Protocol Info
     14 0.013878    172.16.204.55	  172.16.204.1		TCP	
chargen > 60773 [PSH, ACK] Seq=371 Ack=1 Win=5792 Len=74 TSV=7122866
TSER=39780327

No.	Time	    Source		  Destination		Protocol Info
     15 0.023093    172.16.204.1	  172.16.204.55 	TCP	 60773
> chargen [ACK] Seq=1 Ack=445 Win=5888 Len=0 TSV=39780330 TSER=7122866

		# last small packet acknowledged. Now the chargen output
		# gets stuck, 10 seconds pass. I close the "telnet"
		# session connected to chargen, and my client sends a FIN
		# to the server:

No.	Time	    Source		  Destination		Protocol Info
     16 11.571509   172.16.204.1	  172.16.204.55 	TCP	 60773
> chargen [FIN, ACK] Seq=1 Ack=445 Win=5888 Len=0 TSV=39783217 TSER=7122866

No.	Time	    Source		  Destination		Protocol Info
     17 11.774140   172.16.204.1	  172.16.204.55 	TCP	 60773
> chargen [FIN, ACK] Seq=1 Ack=445 Win=5888 Len=0 TSV=39783268 TSER=7122866

		# ... twice, not sure if that's normal. Then, the server
		# suddenly decides to send a big packet, this is after
		# the 10 second delay and FIN:

No.	Time	    Source		  Destination		Protocol Info
     18 11.943452   172.16.204.55	  172.16.204.1		TCP	
chargen > 60773 [PSH, ACK] Seq=445 Ack=2 Win=5792 Len=1425 TSV=7134305
TSER=39783268

Frame 18 (1503 bytes on wire, 1503 bytes captured)
Ethernet II, Src: Vmware_c0:ab:70 (00:0c:29:c0:ab:70), Dst: Vmware_c0:00:01
(00:50:56:c0:00:01)
Internet Protocol, Src: 172.16.204.55 (172.16.204.55), Dst: 172.16.204.1
(172.16.204.1)
Transmission Control Protocol, Src Port: chargen (19), Dst Port: 60773 (60773),
Seq: 445, Ack: 2, Len: 1425
    Source port: chargen (19)
    Destination port: 60773 (60773)
    Sequence number: 445    (relative sequence number)
    [Next sequence number: 1870    (relative sequence number)]
    Acknowledgement number: 2	 (relative ack number)
    Header length: 32 bytes
    Flags: 0x18 (PSH, ACK)
    Window size: 5792 (scaled)
    Checksum: 0x240e [correct]
    Options: (12 bytes)
	NOP
	NOP
	Timestamps: TSval 7134305, TSecr 39783268
    [SEQ/ACK analysis]
	[This is an ACK to the segment in frame: 16]
	[The RTT to ACK the segment was: 0.371943000 seconds]
Data (1425 bytes)

0000  27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36	'()*+,-./0123456
[...]
0580  3c 3d 3e 3f 40 41 42 43 44 45 46 47 48 49 4a 4b	<=>?@ABCDEFGHIJK
0590  4c						L

	# This packet is 1425 bytes data + 32 byte IP hdr + 32 byte TCP
	# hdr = 1489 bytes, so it matches the client MTU. According to
	# the timestamps, it acked the packet in 0.37 seconds, which
	# doesn't match the wall clock. Seems like the packet was stuck
	# in the kernel?
	#
	# The client ignores this packet and sends a RST, since it
	# considers the connection to be closed already?
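
The size claim in the annotation above checks out. A quick sanity check of the frame-18 arithmetic (header sizes read off the dissection: 20-byte base IP header plus the 12-byte CIPSO option, and the 32-byte TCP header shown in the capture):

```python
# Verify the frame-18 size math from the dissection above:
# 1425 bytes of data, a 32-byte IP header (20 base + 12 CIPSO option),
# and a 32-byte TCP header ("Header length: 32 bytes" = 20 base +
# NOP/NOP/timestamp options).
data_len = 1425
ip_hdr = 20 + 12   # base IPv4 header + CIPSO option
tcp_hdr = 32       # as shown in the Wireshark dissection
packet = data_len + ip_hdr + tcp_hdr
print(packet)      # 1489, i.e. exactly the client's MTU
```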

No.	Time	    Source		  Destination		Protocol Info
     19 11.943527   172.16.204.1	  172.16.204.55 	TCP	 60773
> chargen [RST] Seq=2 Len=0

Comment 3 Paul Moore 2006-12-21 20:24:34 UTC
I can confirm that this is MTU related as the sending machine issues a series of
ICMP errors (over loopback to itself) signaling that the destination is
unreachable and fragmentation is needed.  This occurs when trying to send a 1512
byte packet where the CIPSO option is 12 bytes long.

I am continuing to look into this problem; I will update this BZ when I know more.

Comment 4 Paul Moore 2007-01-03 16:10:21 UTC
Another quick update since it has been a while.

It appears that the problem is due to cipso_v4_socket_setattr() failing to 
update the TCP MSS size when adding the CIPSO option to the socket.  There is 
code in the function to perform the MSS update, however, for reasons I have 
yet to identify this code is not being executed.  I am in the process of 
tracking this down now.

Comment 5 Paul Moore 2007-01-03 17:49:53 UTC
I've identified the problem: the network stack incorrectly sets the 
inet_sock->is_icsk flag when creating new sockets.  I am in the process of 
writing/testing a patch; I'll post it here soon so others can verify the fix.

Comment 6 Paul Moore 2007-01-03 18:28:34 UTC
Created attachment 144724 [details]
Patch to David Miller's net-2.6 git tree

This patch should solve the problem.  The patch is based off David Miller's
current net-2.6 git tree, however, the patch is very simple and should apply to
other kernel versions with little fuss.  Please note that I have only done some
simple testing with this patch, but it does correct the problem (there seems to
be some urgency to get this patched *soon*).

Klaus, would you be able to verify this patch using your test setup?

Comment 7 Eric Paris 2007-01-04 18:09:35 UTC
can you please test
http://people.redhat.com/sgrubb/files/lspp/kernel-2.6.18-1.2960.6.2.el5.lspp.60.i686.rpm
and report the results as soon as possible?

Comment 8 Klaus Weidner 2007-01-04 19:25:51 UTC
I've tested the lspp.60 kernel, it fixes the issue. I didn't get any blocked TCP
connections, and I didn't notice any regressions. Thank you!

Comment 9 RHEL Program Management 2007-01-04 20:45:48 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 10 Paul Moore 2007-01-04 23:59:56 UTC
This patch was just submitted upstream:

 * http://marc.10east.com/?l=linux-netdev&m=116795502115123&w=2

Comment 11 Jay Turner 2007-01-05 20:55:12 UTC
QE ack for RHEL5.

Comment 12 Jay Turner 2007-01-10 15:50:27 UTC
Built into 2.6.18-1.3002.el5.

Comment 13 RHEL Program Management 2007-02-08 01:57:32 UTC
A package has been built which should help the problem described in 
this bug report. This report is therefore being closed with a resolution 
of CURRENTRELEASE. You may reopen this bug report if the solution does 
not work for you.


Comment 14 Issue Tracker 2007-02-08 22:54:47 UTC
----- Additional Comments From loulwa@us.ibm.com 2007-02-08 14:44 EDT -------
I am on a ppc system with latest install (01/26) with the lspp.64 kernel and
latest policy .32 version.

I don't think it is fixed from what I can see. Here is what I did:

1 - activated the chargen on the rhel system (system A) by editing
the /etc/xinetd.d/chargen-stream and enabling it. Then I restarted xinetd
2 - from a non cipso configured system (System B) I tried the following
"telnet <system> chargen"

The connection (on B) just sits there, I don't see any data transmitted at
all. Contrary to what the bug says that some data will be transferred at
first then stops.
My understanding is that the two systems won't be able to talk to each other
in the first place since one is cipso aware (A) and the other is not (B)

I was getting "ICMP parameter problem" packets sent back from system B to A
when I was looking at the tcpdump output.
(Note: there is no router between my systems that can be blamed for dropping
packets)

Here are scenarios I tried:
- Both A and B are cipso configured -> Data transmitted
- Both A and B are not cipso configured -> Data transmitted
- A is cipso configured, and B is not -> No data transmitted

Internal Status set to 'Waiting on Support'
Status set to: Waiting on Tech

This event sent from IssueTracker by araghavan
 issue 109885

Comment 17 Eric Paris 2007-02-09 16:51:06 UTC
Help me understand the setup.  Do you have system A configured to talk to System
B using CIPSO?  If so that's a configuration problem and is not a bug.  This bug
is when System A is configured to use CIPSO when talking to System C but then
also has problems talking to System B.

A step-by-step reproducer would be very helpful.

Comment 18 Linda Knippers 2007-02-09 17:34:28 UTC
Eric, isn't this one fixed?  Did you mean to update a different bz?

Comment 19 Eric Paris 2007-02-09 17:41:07 UTC
Right BZ, private comment from another partner claiming it isn't fixed.  Trying
to get to the bottom of it.  I'll talk to them to see if we can make their
comment public.

Comment 22 Eric Paris 2007-02-09 20:22:44 UTC
Made comment #14 public.  Still waiting on a response from IBM.

Comment 23 Linda Knippers 2007-02-09 20:29:17 UTC
If this is a problem, it seems unrelated to the problem that
was originally reported, which was between systems configured
with CIPSO.  Seems like it would be better to open a new bugzilla.  

Comment 24 Linda Knippers 2007-02-09 20:30:11 UTC
Oops, I misread the original bug.

Comment 26 Loulwa Salem 2007-02-09 21:51:11 UTC
(This has not been mirrored yet so it might show up again, I just wanted to 
post it so it gets to you in time).

Here is what I tried exactly .. As per Klaus's original description I am only 
trying this on two systems (A and B) where they are the latest RHEL install + 
lspp.64 and latest policy

Steps are (System : step)
  A: netlabelctl cipsov4 add pass doi:1 tags:1
  A: netlabelctl map del default
  A: netlabelctl map add default protocol:cipsov4,1
  A: activate chargen by enabling it in /etc/xinetd.d/chargen-stream
  A: restart xinetd

  B: telnet <system A> chargen

At this point, System B just sits there, I don't see any data being shown on 
the screen.

If I enable cipso on system B and then try the telnet (i.e both A and B are 
setup with cipso), I do see the chargen output coming through.

Comment 27 Eric Paris 2007-02-10 04:23:52 UTC
My apologies for sending this update via email but I don't have access to a web
browser to directly update the BZ.  If someone could enter my response into the
BZ entry I would appreciate it.

The short version of the answer is that the behavior Loulwa is describing is
most likely correct.  Assuming that both system A and B are running RHEL5/LSPP
kernels then system A (the server sending chargen packets) will send packets
with a CIPSO option attached which will be rejected by system B because it does
not have a matching CIPSO DOI configured.  You can verify this by watching for
ICMP error messages sent by system B which have an offset greater than 20 (most
likely 22 but I don't have the CIPSO spec here so I can't be sure).  If you
wanted to do the opposite direction, have system B act as the chargen server, I
think you will find that this works correctly.

Please verify the ICMP error messages using tcpdump/Wireshark/Ethereal and let
me know if this is not the case.

Thanks.

. paul moore
. linux security @ hp
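
[Editorial note] Paul's "offset greater than 20 (most likely 22)" prediction can be illustrated with the byte layout of the rejected packet. This sketch assumes a 20-byte base IPv4 header followed immediately by the CIPSO option (IP option number 134); the offsets are illustrative, not taken from the capture:

```python
# Illustrative byte offsets inside the rejected IPv4 packet, assuming
# a 20-byte base header followed directly by the CIPSO option.
BASE_IP_HDR = 20              # octets 0-19: fixed IPv4 header
CIPSO_KIND = BASE_IP_HDR      # octet 20: option type (134 = CIPSO)
CIPSO_LEN = BASE_IP_HDR + 1   # octet 21: option length
CIPSO_DOI = BASE_IP_HDR + 2   # octets 22-25: Domain of Interpretation

# An ICMP Parameter Problem pointer of 22 therefore lands on the DOI
# field: the receiver parsed the CIPSO option but had no matching DOI
# configured, which matches the "ICMP parameter problem - octet 22"
# lines in the tcpdump output later in this bug.
print(CIPSO_DOI)  # 22
```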

Comment 28 Loulwa Salem 2007-02-12 01:00:46 UTC
Thanks for the clarification Paul, that's what I thought was the right behavior 
based on the earlier comment I made (comment #14).
You are right, I was seeing ICMP error records through tcpdump.

I tried this in the other direction as you suggested. It still didn't work .. 
and here is exactly what I did and saw ..

Steps are (System : step)
  A: netlabelctl cipsov4 add pass doi:1 tags:1
  A: netlabelctl map del default
  A: netlabelctl map add default protocol:cipsov4,1
  
  B: activate chargen by enabling it in /etc/xinetd.d/chargen-stream
  B: restart xinetd

  A: telnet <system B> chargen

On system A (where I was starting the telnet) I was getting the following ..
# telnet <system B> chargen
Trying 9.3.192.178...
telnet: connect to address 9.3.192.178: Protocol error
telnet: Unable to connect to remote host: Protocol error

Trying a tcpdump on system A to see what is going on .. here is the output ..
(I replaced the system names with system A and B to make it easier)

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes

14:14:36.238678 IP (tos 0x10, ttl  64, id 60563, offset 0, flags [DF], proto: 
TCP (6), length: 72, options ( unknown (134) len 10EOL (0) len 1 )) <system A> 
> <system B> : S, cksum 0xb507 (correct), 337286932:337286932(0) win 5840 <mss 
1460,sackOK,timestamp 2456455 0,nop,wscale 7>

14:14:36.238755 IP (tos 0xd0, ttl  64, id 19859, offset 0, flags [none], 
proto: ICMP (1), length: 112, options ( unknown (134) len 10EOL (0) len 1 )) 
<system B> > <system A>: ICMP parameter problem - octet 22, length 80
        IP (tos 0x10, ttl  64, id 60563, offset 0, flags [DF], proto: TCP (6), 
length: 72, options ( unknown (134) len 10EOL (0) len 1 )) <system A> > 
<system B>.chargen:  tcp 40 [bad hdr length 0 - too short, < 20]

I think system B is still not understanding the packet coming from system A, 
since A tries to connect through telnet to B and is rejected. Does that make 
sense? From trying a few cases it seems to me the connection is only 
established when both A and B are configured the same way (cipso or no cipso).

Will look into this and update this bug further tomorrow.

Comment 29 Linda Knippers 2007-02-12 15:04:28 UTC
Updated from Paul:

I apologize, what you are seeing is still the correct behavior.  If you use an
older, non-NetLabel kernel on system B then it should work as described (this is
what I used to quickly recreate the original bug).  The reason B -> A does not
work is the same reason that A -> B does not work (see my previous comment).

Based on your comments I don't see any problems with the current fix.

. paul moore
. linux security @ hp

Comment 30 Klaus Weidner 2007-02-12 21:45:08 UTC
The system seems to be working as designed - the CIPSO system attaches the IP
options as expected, and it would be up to the receiving system to handle or
ignore them.

How about using iptables to strip off the CIPSO options to destinations that
can't handle them? This isn't really relevant to LSPP operating mode, but could
be useful as a workaround for specific configurations.

Something like:

   iptables -t mangle -A PREROUTING -j IPV4OPTSSTRIP

Comment 31 Linda Knippers 2007-02-13 15:13:48 UTC
Update from Paul:

Unfortunately, stripping the CIPSO option (or all options for that matter) from
an already crafted IPv4 packet would incur a performance penalty on the sending
side whereas right now there is almost zero NetLabel per-packet overhead; the
regular IPv4 stack does the heavy lifting already.  I have several ideas on how
to select NetLabel protocols based on IP addresses but I feel such a discussion
is outside the scope of this bug and most likely RHEL5 as well.  The current
solution of selecting NetLabel protocols based on LSM/SELinux domain may be a
departure from legacy labeled networking solutions but I feel it is in keeping
with the principles of Domain Type Enforcement.

Update from Linda:

I think we're all in agreement that the system is currently behaving
as intended.  Other suggestions should be discussed on the mailing list.

Comment 32 Eric Paris 2007-02-13 16:59:39 UTC
So all are in agreement to reclose this bug?

Comment 33 Eric Paris 2007-02-14 16:38:18 UTC
Re-closing