Bug 499887

Summary:	IPSEC doesn't work with a big SPD/SAD
Product:	Red Hat Enterprise Linux 5	Reporter:	Marc Milgram <mmilgram>
Component:	kernel	Assignee:	Neil Horman <nhorman>
Status:	CLOSED INSUFFICIENT_DATA	QA Contact:	Red Hat Kernel QE team <kernel-qe>
Severity:	medium	Docs Contact:
Priority:	low
Version:	5.1	CC:	dzickus, herbert.xu, tao, tgraf
Target Milestone:	rc
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Last Closed:	2009-10-21 19:57:58 UTC	Type:	---
Bug Blocks:	533192

Description Marc Milgram 2009-05-08 18:35:44 UTC

Description of problem:
Same description as https://trac.ipsec-tools.net/ticket/1
The ipsec-tools suite just doesn't work when the SPD and/or SAD becomes big (problems can start at around 100 tunnels, which is not so "big"!).

The main problem behind all of this is in the PFKey interface:
when userland (racoon / setkey) sends a SADB_DUMP or a SADB_X_SPDDUMP, it sends a single PFKey message, but the kernel replies with one PFKey message per entry.

Those messages are sent through a UNIX socket, and the socket's receive buffer quickly fills up.

Userland tools have no chance to drain it, as almost all kernels process the whole PFKey request before yielding the CPU back to userland.
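
The failure mode described above can be illustrated with a small simulation. This is only a sketch of the reasoning, not the kernel's PF_KEY code: the function name, the fixed message size, and the buffer size are hypothetical values chosen so the arithmetic is easy to follow, and they are not measurements from the affected system.

```python
def simulate_dump(entries, msg_size, rcvbuf_bytes):
    """Model a dump in which every reply is queued before the reader runs.

    Returns (delivered, dropped). All sizes are illustrative assumptions;
    real PF_KEY replies vary in size and the kernel also charges
    per-buffer overhead against the socket's receive buffer.
    """
    queued_bytes = 0
    delivered = 0
    dropped = 0
    for _ in range(entries):
        # The reader never gets the CPU inside this loop, so nothing is
        # ever removed from the queue while the dump is in progress.
        if queued_bytes + msg_size <= rcvbuf_bytes:
            queued_bytes += msg_size
            delivered += 1
        else:
            dropped += 1
    return delivered, dropped


if __name__ == "__main__":
    # Hypothetical numbers: 192 entries, 512 bytes per reply, 64 KiB buffer.
    print(simulate_dump(192, 512, 64 * 1024))
```

In this model the number of entries that survive depends only on the buffer size and the reply size, which is consistent with a dump that is cut off after a fixed number of entries.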



Version-Release number of selected component (if applicable):
kernel-2.6.18-53

How reproducible:
Very

Steps to Reproduce:
1. Load many rules into the SPD (e.g. 192)
2. Validate that all rules are loaded into racoon correctly
3. Validate links.
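
For step 1, a policy file of the required size can be generated rather than written by hand. The sketch below is an assumption about what such a rule set could look like, not the customer's configuration: the subnets, the gateway addresses (taken from the documentation range 192.0.2.0/24), and the function name are all hypothetical. The output uses the `spdadd` syntax from setkey(8).

```python
def spd_rules(count, local_gw="192.0.2.1", remote_gw="192.0.2.2"):
    """Return `count` outbound tunnel-mode policies in setkey syntax.

    Every address here is a placeholder; substitute the subnets and
    tunnel endpoints of the systems under test.
    """
    if not 0 <= count <= 254:
        raise ValueError("count must be between 0 and 254")
    lines = []
    for n in range(1, count + 1):
        # One policy per pair of /24 subnets, all sharing the same tunnel.
        lines.append(
            f"spdadd 10.1.{n}.0/24 10.2.{n}.0/24 any "
            f"-P out ipsec esp/tunnel/{local_gw}-{remote_gw}/require;"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(spd_rules(192), end="")
```

The output can be saved to a file and loaded with `setkey -f`, or piped to `setkey -c`.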
  
Actual results:
Not all rules are loaded (the customer indicated that only the first 170 were loaded).

Expected results:
All rules loaded.

Additional info:
This may be fixed by the following two patches:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4c563f7669c10a12354b72b518c2287ffc6ebfb3
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=83321d6b9872b94604e481a79dc2c8acbe4ece31

Comment 1 Neil Horman 2009-05-17 18:01:30 UTC

Looks like you might be right about the git commits.  Do you already have this set up to reproduce on a set of systems somewhere, or do I need to set it up myself?

Comment 2 Marc Milgram 2009-05-18 12:50:18 UTC

I don't have a setup to test this.

Comment 3 Neil Horman 2009-05-18 15:34:36 UTC

So, I don't have enough hosts to actually validate all the links, but I set up a few hundred SA and SPD entries (one 24-bit subnet's worth of each) and started racoon in the foreground with debug maxed out.  I was able to observe racoon read all of the resultant entries produced by the SPDDUMP it issues at startup, so I'm not sure what's going on here.  I agree the above commits look like they might improve performance in reading those entries, but I'm hesitant to take them just because something might be a bit slow.  That said, I am using a newer kernel that has some fixes for UNIX sockets in it (although I could have sworn setkey uses the PF_KEY address family rather than PF_UNIX; I'll need to check on that).  Anywho, can the customer try this with the latest kernel?

Comment 6 Neil Horman 2009-05-18 16:54:33 UTC

I can read the bugzilla; I'm saying I can't reproduce the problem.  I loaded 254 SPD rules and 254 associated SA rules on a system, and started racoon in the foreground with full debug.  Parsing through the output, I see that racoon reads all 254 entries from the X_SPDDUMP request.  (The results are on amd-toonie2-01.rhts.bos.redhat.com:/root/resutls if you want to see; just grep sub: results.)  Anywho, it obviously works for me on the system I'm using.  Looking at the patches above, I see how they might help, but the first looks like an ABI breaker, so it's out.  The second looks doable, but before we take it, I'd really like to see the problem occur consistently, and then cease to occur when we take the patch.  My guess is that there is a load aspect to the problem that isn't being considered here.  As an alternative, I think the second patch should apply pretty cleanly to the rhel5 kernel.  Can the customer try out a test kernel with the second of the above patches included to confirm the fix?

Comment 7 Marc Milgram 2009-05-18 17:04:13 UTC

The customer is willing to try a test kernel in order to confirm the fix.

Comment 8 Neil Horman 2009-05-18 20:41:52 UTC

Gah, the second patch is pretty nondescript, but it requires the first patch to work properly, which makes the whole thing an ABI breaker.  I'm going to try to hack something together to make this work, but I can't promise anything.

In the meantime, I expect that the customer can work around this issue (assuming the problem is what we assume it is) by setting /proc/sys/net/core/rmem_default and rmem_max to very large numbers.  If racoon doesn't explicitly reset those values on any sockets that it opens, that should prevent blocks/drops on the pf_key protocol and avoid this issue.  If so, the racoon startup script can be adjusted to ensure that it starts with a sufficiently large buffer space to make the problem avoidable.  Please relay that to the customer and let me know how that works out.
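
As a rough sketch of that workaround, the two tunables can be inspected and raised from a small helper before racoon is started. The function names are hypothetical, and the 4 MiB figure in the usage note is an arbitrary illustrative value rather than a tested recommendation. Writing to the real /proc files requires root, so the `base` parameter exists only to let the helper be pointed at a scratch directory.

```python
import os

PROC_NET_CORE = "/proc/sys/net/core"


def read_rmem(base=PROC_NET_CORE):
    """Return the current rmem_default and rmem_max values as integers."""
    values = {}
    for name in ("rmem_default", "rmem_max"):
        with open(os.path.join(base, name)) as f:
            values[name] = int(f.read().strip())
    return values


def write_rmem(size, base=PROC_NET_CORE):
    """Set both tunables to `size` bytes (needs root for the real /proc)."""
    # Raise the ceiling first, then the default used by new sockets.
    for name in ("rmem_max", "rmem_default"):
        with open(os.path.join(base, name), "w") as f:
            f.write(f"{size}\n")


if __name__ == "__main__":
    # Only report the current values here; changing them is left to the
    # caller, who must pick a size appropriate for the system.
    if os.path.isdir(PROC_NET_CORE):
        print(read_rmem())
```

Calling `write_rmem(4 * 1024 * 1024)` as root before starting racoon would apply the change until the next reboot. Whether that size is large enough depends on the number of SPD and SAD entries.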