Bug 834397

Summary:	ikev2 crashes when firing up a few conns at once
Product:	Red Hat Enterprise Linux 6	Reporter:	Avesh Agarwal <avagarwa>
Component:	openswan	Assignee:	Paul Wouters <pwouters>
Status:	CLOSED ERRATA	QA Contact:	Jaroslav Aster <jaster>
Severity:	high	Docs Contact:
Priority:	high
Version:	6.4	CC:	avagarwa, eparis, jaster, pwouters, sforsber
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-10-14 08:18:36 UTC	Type:	Bug

Description Avesh Agarwal 2012-06-21 18:36:33 UTC

Description of problem:
ikev2 crashes when firing up a few conns at once.

Version-Release number of selected component (if applicable):
2.6.32-16

How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 2 RHEL Program Management 2012-09-07 05:38:00 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unable to address this
request at this time.

Red Hat invites you to ask your support representative to
propose this request, if appropriate, in the next release of
Red Hat Enterprise Linux.

Comment 3 Paul Wouters 2014-06-21 17:05:02 UTC

This requires a backport of libreswan 6883fac8a9974afa5c2374202ff7dc05cfedf89d

Comment 5 Paul Wouters 2014-06-24 14:35:55 UTC

Yes, the crasher is fairly simple. Configure a few ikev2 connections with ikev2=insist and auto=start, then start openswan. Be sure to have more connections than CPU cores on the (virtual) machine.

Do use different IPs/IDs for the two test machines, so that it appears to be setting up connections to multiple different machines (and there is no attempt to share the parent SA).

I'll make a libreswan test case and link it here.
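
For anyone trying to follow the recipe above, here is a minimal sketch of a generator for such a configuration. It is not the libreswan test case mentioned above. The function names, the conn naming scheme and the 192.0.2.x addresses (a documentation range) are all assumptions made for illustration; only the option names (auto, authby, left, right, ikev2) come from this report.

```python
import os


def peers_needed(cpu_count=None):
    """Smallest connection count that exceeds the number of CPU cores."""
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return cores + 1


def make_conns(local_ip, peer_ips):
    """Return ipsec.conf text with one ikev2=insist conn per peer address."""
    stanzas = []
    for i, peer in enumerate(peer_ips, start=1):
        stanzas.append(
            f"conn test{i}\n"
            f"        auto=start\n"
            f"        authby=secret\n"
            f"        left={local_ip}\n"
            f"        right={peer}\n"
            f"        ikev2=insist\n"
        )
    return "\n".join(stanzas)


if __name__ == "__main__":
    n = peers_needed()
    # Placeholder addresses; each peer gets a distinct IP so that no
    # parent SA can be shared between the connections.
    peers = [f"192.0.2.{10 + i}" for i in range(n)]
    print(make_conns("192.0.2.1", peers))
```

The output would be appended to the conn section of ipsec.conf on the initiating machine; the peers would need matching stanzas and secrets, which this sketch does not produce.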

Comment 7 Jaroslav Aster 2014-07-16 13:11:04 UTC

Hi Paul,

could you provide more information or a reproducer? I am not able to reproduce this bug.

This was the configuration on the machine that was supposed to crash.

version    2.0

config setup
       protostack=netkey
       plutodebug=all

conn test1
        auto=start
        authby=secret
        left=<machine0>
        right=<machine1>
        ikev2=insist

conn test2
        auto=start
        authby=secret
        left=<machine0>
        right=<machine2>
        ikev2=insist

conn test3
        auto=start
        authby=secret
        left=<machine0>
        right=<machine3>
        ikev2=insist

conn test4
        auto=start
        authby=secret
        left=<machine0>
        right=<machine5>
        ikev2=insist


Each <machineX> had the part of this configuration related to machineX, with the auto option set to add.

# ipsec auto --status|grep 'newest ISAKMP SA'
000 "test1":   newest ISAKMP SA: #1; newest IPsec SA: #2; 
000 "test2":   newest ISAKMP SA: #3; newest IPsec SA: #7; 
000 "test3":   newest ISAKMP SA: #4; newest IPsec SA: #8; 
000 "test4":   newest ISAKMP SA: #5; newest IPsec SA: #6; 

Everything was ok, no crash.
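
A check like the one above can be scripted. The following is a rough sketch that parses status text in the format shown and reports which connections have both SAs. The regular expression is derived only from the four lines quoted here, and treating "#0" as "no SA" is an assumption about the status output, not something confirmed in this report.

```python
import re

# Matches lines such as:
# 000 "test1":   newest ISAKMP SA: #1; newest IPsec SA: #2;
STATUS_RE = re.compile(
    r'^000 "(?P<conn>[^"]+)":\s+newest ISAKMP SA: (?P<isakmp>#\d+); '
    r'newest IPsec SA: (?P<ipsec>#\d+);'
)


def established(status_text):
    """Map conn name -> (ISAKMP SA, IPsec SA) for conns that have both."""
    result = {}
    for line in status_text.splitlines():
        m = STATUS_RE.match(line)
        if not m:
            continue
        # Assumption: "#0" means no SA has been established yet.
        if m.group("isakmp") == "#0" or m.group("ipsec") == "#0":
            continue
        result[m.group("conn")] = (m.group("isakmp"), m.group("ipsec"))
    return result


SAMPLE = '''000 "test1":   newest ISAKMP SA: #1; newest IPsec SA: #2; 
000 "test2":   newest ISAKMP SA: #3; newest IPsec SA: #7; 
000 "test3":   newest ISAKMP SA: #4; newest IPsec SA: #8; 
000 "test4":   newest ISAKMP SA: #5; newest IPsec SA: #6; 
'''

if __name__ == "__main__":
    for conn, (isakmp, ipsec) in sorted(established(SAMPLE).items()):
        print(conn, isakmp, ipsec)
```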

If it were possible to test this on a single machine (loopback), that would be great.

Comment 8 Paul Wouters 2014-07-16 14:38:24 UTC

Try adding rekey=yes and then do an ipsec restart on machine0, so that machines 1-5 all reconnect to it at the same time?

Comment 9 Jaroslav Aster 2014-07-17 16:19:07 UTC

Hi Paul,

it did not help, and I still have not been able to reproduce it. I tested on s390x, ppc64, x86_64 and i386 with openswan-2.6.32-27 and -16.

Were you able to reproduce it somewhere?

Comment 10 Paul Wouters 2014-07-30 05:16:44 UTC

I can crash it using:

config setup
        nhelpers=1

conn multi
        left=machine0
        right=machine1
        authby=secret
        ikev2=insist
        ike=aes128-sha1;modp2048
        esp=aes128-sha1;modp2048
        pfs=yes
        leftsubnets={10.0.1.0/24,10.0.2.0/24,10.0.3.0/24,10.0.4.0/24}
        rightsubnets={11.0.1.0/24,11.0.2.0/24,11.0.3.0/24,11.0.4.0/24}
        auto=start

Use auto=add on machine0 and auto=start on machine1.
Then, on machine0, start openswan using:

export PLUTO_CRYPTO_HELPER_DELAY=10
service ipsec start

Then start openswan on machine1.

The nhelpers=1 setting ensures there are not too many helpers available for work.
The environment variable fakes a busy system where the crypto takes a long time.
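
To give a feel for how much work queues up behind that single helper, here is a small sketch that enumerates the subnet pairs in the "multi" conn. It assumes that every leftsubnets/rightsubnets combination becomes its own tunnel, which is how the ipsec.conf documentation describes these options; that behaviour was not verified as part of this report. The helper names are made up for illustration.

```python
from itertools import product


def parse_subnets(value):
    """Split a '{a,b,c}' subnets value into a list of CIDR strings."""
    return [s.strip() for s in value.strip("{}").split(",") if s.strip()]


def subnet_pairs(leftsubnets, rightsubnets):
    """All (left, right) subnet combinations, one per expected tunnel."""
    return list(product(leftsubnets, rightsubnets))


LEFT = parse_subnets("{10.0.1.0/24,10.0.2.0/24,10.0.3.0/24,10.0.4.0/24}")
RIGHT = parse_subnets("{11.0.1.0/24,11.0.2.0/24,11.0.3.0/24,11.0.4.0/24}")
PAIRS = subnet_pairs(LEFT, RIGHT)

if __name__ == "__main__":
    # 4 left subnets x 4 right subnets = 16 combinations
    print(len(PAIRS))
    for left, right in PAIRS:
        print(f"{left} <-> {right}")
```

Under that assumption, one conn stanza stands in for 16 tunnels that all start at once, which fits the original "a few conns at once" summary.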

Note that the abort is hard to spot among the noise and restarts. I see this with openswan 2.6.32-27.1:

ABORT at /root/rpmbuild/BUILD/openswan-2.6.32/programs/pluto/ikev2_parent.c:1555
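
Since the abort is easy to miss, a small filter can help pull it out of the log. This is only a sketch: the pattern is based on the single ABORT line quoted above, and the log text has to be supplied by the caller because the log location is not stated in this report.

```python
import re

# Based on the one line seen in this report:
# ABORT at <source file>:<line number>
ABORT_RE = re.compile(r"ABORT at (?P<file>\S+):(?P<line>\d+)")


def find_aborts(log_text):
    """Return a list of (source_file, line_number) for each ABORT found."""
    return [
        (m.group("file"), int(m.group("line")))
        for m in ABORT_RE.finditer(log_text)
    ]


EXAMPLE = (
    "some unrelated pluto debug output\n"
    "ABORT at /root/rpmbuild/BUILD/openswan-2.6.32/programs/pluto/"
    "ikev2_parent.c:1555\n"
    "more unrelated output\n"
)

if __name__ == "__main__":
    for source_file, line_number in find_aborts(EXAMPLE):
        print(source_file, line_number)
```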

Comment 14 errata-xmlrpc 2014-10-14 08:18:36 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1588.html