Bug 438826 - openswan IKEv2 hangs between intel and ppc64 machines
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: openswan
Version: 5.2
Hardware: ppc64 All
Priority: high Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Steve Grubb
QA Contact: Martin Jenner
Depends On:
Blocks: 253764
Reported: 2008-03-25 10:16 EDT by IBM Bug Proxy
Modified: 2008-05-21 11:29 EDT (History)

See Also:
Fixed In Version: RHBA-2008-0395
Doc Type: Bug Fix
Last Closed: 2008-05-21 11:29:05 EDT

Attachments
tcpdump taken from elm3b128 (270 bytes, application/octet-stream)
2008-03-25 10:16 EDT, IBM Bug Proxy
tcpdump taken from elm3a23 (270 bytes, application/octet-stream)
2008-03-25 10:16 EDT, IBM Bug Proxy
/var/log/secure from initiator (elm3b66) (45.62 KB, text/plain)
2008-03-27 03:32 EDT, IBM Bug Proxy
/var/log/secure from responder (elm3b128) (96.75 KB, text/plain)
2008-03-27 03:32 EDT, IBM Bug Proxy
/var/log/secure from initiator (eal5) (40.34 KB, text/plain)
2008-03-28 12:49 EDT, IBM Bug Proxy
/var/log/secure from responder (tim-hv4) (68.78 KB, text/plain)
2008-03-28 12:49 EDT, IBM Bug Proxy
stderr output from rpmbuild of openswan-2.6.09-1.el5.src.rpm on eal5 (21.81 KB, text/plain)
2008-03-28 14:41 EDT, IBM Bug Proxy
offset tester for gcc (871 bytes, text/plain)
2008-04-02 18:46 EDT, Paul Wouters
Initiator (eal5) logs (51.78 KB, text/plain)
2008-04-03 01:25 EDT, IBM Bug Proxy
Responder (tim-hv4) logs (43.25 KB, text/plain)
2008-04-03 01:25 EDT, IBM Bug Proxy


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
IBM Linux Technology Center | 43449 | None | None | None | Never

Description IBM Bug Proxy 2008-03-25 10:16:52 EDT
=Comment: #0=================================================
TYLER C. HICKS <tchicks@us.ibm.com> - 2008-03-24 20:37 EDT
---Problem Description---
openswan IKEv2 fails to negotiate SAs across intel (i386/x86_64) and ppc64
platforms.
 
Contact Information = Tyler Hicks <tyhicks@linux.vnet.ibm.com>
 
---Additional Hardware Info---
Requires an Intel and PPC machine.
 
---uname output---
Linux elm3a23 2.6.18-86.el5PAE #1 SMP Tue Mar 18 18:34:45 EDT 2008 i686 i686
i386 GNU/Linux
Linux elm3a69 2.6.18-86.el5 #1 SMP Tue Mar 18 18:19:59 EDT 2008 x86_64 x86_64
x86_64 GNU/Linux
Linux elm3b128 2.6.18-86.el5 #1 SMP Tue Mar 18 18:25:26 EDT 2008 ppc64 ppc64
ppc64 GNU/Linux
 
---Machine Type--
elm3a23: 8870-11X eserver xSeries 445
elm3a69: 4367-30Z IBM System x3200
elm3b128: 8842-P2C
 
---Debugger---
A debugger is not configured
 
---Steps to Reproduce---
The following configs and logs are from these machines:

elm3a23 - 9.47.66.23 - i386
elm3a69 - 9.47.66.69 - x86_64
elm3b128 - 9.47.67.128 - ppc64

openswan config file used by all machines:
-----------------------------------------------------------------
# /etc/ipsec.conf - Openswan IPsec configuration file
#
# Manual:     ipsec.conf.5
#
# Please place your own config files in /etc/ipsec.d/ ending in .conf

version 2.0     # conforms to second version of ipsec.conf specification

# basic configuration
config setup
        # Debug-logging controls:  "none" for (almost) none, "all" for lots.
        # klipsdebug=none
        plutodebug=all
        nat_traversal=yes

conn i386-x86_64
        left=9.47.66.23
        right=9.47.66.69
        ikev2=insist
        authby=secret
        auto=add

conn i386-ppc64
        left=9.47.66.23
        right=9.47.67.128
        ikev2=insist
        authby=secret
        auto=add

conn x86_64-ppc64
        left=9.47.66.69
        right=9.47.67.128
        ikev2=insist
        authby=secret
        auto=add
-----------------------------------------------------------------

openswan secrets file used by all machines:
-----------------------------------------------------------------
# /etc/ipsec.secrets

9.47.66.23 9.47.66.69 9.47.67.128 : PSK "psk"
-----------------------------------------------------------------

on all machines:
$> service ipsec start

on i386 machine:
$> ipsec auto --verbose --up i386-ppc64
002 "i386-ppc64" #1: initiating v2 parent SA
133 "i386-ppc64" #1: STATE_PARENT_I1: initiate
002 "i386-ppc64" #1: transition from state STATE_IKEv2_START to state
STATE_PARENT_I1
133 "i386-ppc64" #1: STATE_PARENT_I1: sent v2I1, expected v2R1

ipsec_auto hangs here and must be killed with ctrl-c.

contents of i386 (initiator) machine's /var/log/secure:
-----------------------------------------------------------------
Mar 24 16:22:07 elm3a23 ipsec__plutorun: Starting Pluto subsystem...
Mar 24 16:22:07 elm3a23 pluto[6455]: Starting Pluto (Openswan Version 2.6.07;
Vendor ID OEFk]{GSv\134hk) pid:6455
Mar 24 16:22:07 elm3a23 pluto[6455]: Setting NAT-Traversal port-4500 floating to off
Mar 24 16:22:07 elm3a23 pluto[6455]:    port floating activation criteria
nat_t=0/port_float=1
Mar 24 16:22:07 elm3a23 pluto[6455]:    including NAT-Traversal patch (Version
0.6c) [disabled]
Mar 24 16:22:07 elm3a23 pluto[6455]: using /dev/urandom as source of random entropy
Mar 24 16:22:07 elm3a23 pluto[6455]: ike_alg_register_enc(): Activating
OAKLEY_TWOFISH_CBC_SSH: Ok (ret=0)
Mar 24 16:22:07 elm3a23 pluto[6455]: ike_alg_register_enc(): Activating
OAKLEY_TWOFISH_CBC: Ok (ret=0)
Mar 24 16:22:07 elm3a23 pluto[6455]: ike_alg_register_enc(): Activating
OAKLEY_SERPENT_CBC: Ok (ret=0)
Mar 24 16:22:07 elm3a23 pluto[6455]: ike_alg_register_enc(): Activating
OAKLEY_AES_CBC: Ok (ret=0)
Mar 24 16:22:07 elm3a23 pluto[6455]: ike_alg_register_enc(): Activating
OAKLEY_BLOWFISH_CBC: Ok (ret=0)
Mar 24 16:22:07 elm3a23 pluto[6455]: ike_alg_register_hash(): Activating
OAKLEY_SHA2_512: Ok (ret=0)
Mar 24 16:22:07 elm3a23 pluto[6455]: ike_alg_register_hash(): Activating
OAKLEY_SHA2_256: Ok (ret=0)
Mar 24 16:22:07 elm3a23 pluto[6455]: starting up 7 cryptographic helpers
Mar 24 16:22:07 elm3a23 pluto[6463]: using /dev/urandom as source of random entropy
Mar 24 16:22:07 elm3a23 pluto[6455]: started helper pid=6463 (fd:7)
Mar 24 16:22:07 elm3a23 pluto[6455]: started helper pid=6464 (fd:8)
Mar 24 16:22:07 elm3a23 pluto[6464]: using /dev/urandom as source of random entropy
Mar 24 16:22:07 elm3a23 pluto[6465]: using /dev/urandom as source of random entropy
Mar 24 16:22:07 elm3a23 pluto[6455]: started helper pid=6465 (fd:9)
Mar 24 16:22:07 elm3a23 pluto[6466]: using /dev/urandom as source of random entropy
Mar 24 16:22:07 elm3a23 pluto[6455]: started helper pid=6466 (fd:10)
Mar 24 16:22:07 elm3a23 pluto[6467]: using /dev/urandom as source of random entropy
Mar 24 16:22:07 elm3a23 pluto[6455]: started helper pid=6467 (fd:11)
Mar 24 16:22:07 elm3a23 pluto[6468]: using /dev/urandom as source of random entropy
Mar 24 16:22:07 elm3a23 pluto[6455]: started helper pid=6468 (fd:12)
Mar 24 16:22:07 elm3a23 pluto[6469]: using /dev/urandom as source of random entropy
Mar 24 16:22:07 elm3a23 pluto[6455]: started helper pid=6469 (fd:13)
Mar 24 16:22:07 elm3a23 pluto[6455]: Using Linux 2.6 IPsec interface code on
2.6.18-86.el5PAE (experimental code)
Mar 24 16:22:07 elm3a23 pluto[6455]: Could not change to directory
'/etc/ipsec.d/cacerts': /
Mar 24 16:22:07 elm3a23 pluto[6455]: Could not change to directory
'/etc/ipsec.d/aacerts': /
Mar 24 16:22:07 elm3a23 pluto[6455]: Could not change to directory
'/etc/ipsec.d/ocspcerts': /
Mar 24 16:22:07 elm3a23 pluto[6455]: Could not change to directory
'/etc/ipsec.d/crls'
Mar 24 16:22:07 elm3a23 pluto[6455]: added connection description "i386-x86_64"
Mar 24 16:22:07 elm3a23 pluto[6455]: added connection description "i386-ppc64"
Mar 24 16:22:07 elm3a23 pluto[6455]: added connection description "x86_64-ppc64"
Mar 24 16:22:07 elm3a23 pluto[6455]: listening for IKE messages
Mar 24 16:22:07 elm3a23 pluto[6455]: adding interface eth0/eth0 9.47.66.23:500
Mar 24 16:22:07 elm3a23 pluto[6455]: adding interface lo/lo 127.0.0.1:500
Mar 24 16:22:07 elm3a23 pluto[6455]: adding interface lo/lo ::1:500
Mar 24 16:22:07 elm3a23 pluto[6455]: loading secrets from "/etc/ipsec.secrets"
Mar 24 16:22:24 elm3a23 pluto[6455]: "i386-ppc64" #1: initiating v2 parent SA
Mar 24 16:22:24 elm3a23 pluto[6455]: "i386-ppc64" #1: transition from state
STATE_IKEv2_START to state STATE_PARENT_I1
Mar 24 16:22:24 elm3a23 pluto[6455]: "i386-ppc64" #1: STATE_PARENT_I1: sent
v2I1, expected v2R1
-----------------------------------------------------------------

contents of ppc64 (responder) machine's /var/log/secure:
-----------------------------------------------------------------
Mar 24 20:22:38 elm3b128 ipsec__plutorun: Starting Pluto subsystem...
Mar 24 20:22:38 elm3b128 pluto[11975]: Starting Pluto (Openswan Version 2.6.07;
Vendor ID OEFk]{GSv\134hk) pid:11975
Mar 24 20:22:38 elm3b128 pluto[11975]: Setting NAT-Traversal port-4500 floating
to off
Mar 24 20:22:38 elm3b128 pluto[11975]:    port floating activation criteria
nat_t=0/port_float=1
Mar 24 20:22:38 elm3b128 pluto[11975]:    including NAT-Traversal patch (Version
0.6c) [disabled]
Mar 24 20:22:38 elm3b128 pluto[11975]: using /dev/urandom as source of random
entropy
Mar 24 20:22:38 elm3b128 pluto[11975]: ike_alg_register_enc(): Activating
OAKLEY_TWOFISH_CBC_SSH: Ok (ret=0)
Mar 24 20:22:38 elm3b128 pluto[11975]: ike_alg_register_enc(): Activating
OAKLEY_TWOFISH_CBC: Ok (ret=0)
Mar 24 20:22:38 elm3b128 pluto[11975]: ike_alg_register_enc(): Activating
OAKLEY_SERPENT_CBC: Ok (ret=0)
Mar 24 20:22:38 elm3b128 pluto[11975]: ike_alg_register_enc(): Activating
OAKLEY_AES_CBC: Ok (ret=0)
Mar 24 20:22:38 elm3b128 pluto[11975]: ike_alg_register_enc(): Activating
OAKLEY_BLOWFISH_CBC: Ok (ret=0)
Mar 24 20:22:38 elm3b128 pluto[11975]: ike_alg_register_hash(): Activating
OAKLEY_SHA2_512: Ok (ret=0)
Mar 24 20:22:38 elm3b128 pluto[11975]: ike_alg_register_hash(): Activating
OAKLEY_SHA2_256: Ok (ret=0)
Mar 24 20:22:38 elm3b128 pluto[11975]: starting up 1 cryptographic helpers
Mar 24 20:22:38 elm3b128 pluto[11995]: using /dev/urandom as source of random
entropy
Mar 24 20:22:38 elm3b128 pluto[11975]: started helper pid=11995 (fd:7)
Mar 24 20:22:38 elm3b128 pluto[11975]: Using Linux 2.6 IPsec interface code on
2.6.18-86.el5 (experimental code)
Mar 24 20:22:38 elm3b128 pluto[11975]: Could not change to directory
'/etc/ipsec.d/cacerts': /
Mar 24 20:22:38 elm3b128 pluto[11975]: Could not change to directory
'/etc/ipsec.d/aacerts': /
Mar 24 20:22:38 elm3b128 pluto[11975]: Could not change to directory
'/etc/ipsec.d/ocspcerts': /
Mar 24 20:22:38 elm3b128 pluto[11975]: Could not change to directory
'/etc/ipsec.d/crls'
Mar 24 20:22:38 elm3b128 pluto[11975]: added connection description "i386-x86_64"
Mar 24 20:22:38 elm3b128 pluto[11975]: added connection description "i386-ppc64"
Mar 24 20:22:38 elm3b128 pluto[11975]: added connection description "x86_64-ppc64"
Mar 24 20:22:38 elm3b128 pluto[11975]: listening for IKE messages
Mar 24 20:22:38 elm3b128 pluto[11975]: adding interface eth0/eth0 9.47.67.128:500
Mar 24 20:22:38 elm3b128 pluto[11975]: adding interface lo/lo 127.0.0.1:500
Mar 24 20:22:38 elm3b128 pluto[11975]: adding interface lo/lo ::1:500
Mar 24 20:22:38 elm3b128 pluto[11975]: loading secrets from "/etc/ipsec.secrets"
Mar 24 20:22:51 elm3b128 pluto[11975]: | found connection: i386-ppc64
-----------------------------------------------------------------

It doesn't seem like the plutodebug=all line in /etc/ipsec.conf has any effect
on what is logged.

The same behavior is seen if the ppc64 machine is the initiator and the i386
machine is the responder.

The same behavior is seen if the i386 machine is replaced with the x86_64 machine.

The same behavior is seen if using kernel-2.6.18-85.el5 on all machines.
 
---Security Component Data---
Userspace tool common name: openswan

The userspace tool has the following bit modes: both

Userspace rpm: openswan-2.6.07-2.el5
=Comment: #3=================================================
TYLER C. HICKS <tchicks@us.ibm.com> - 2008-03-24 20:56 EDT

tcpdump taken from elm3a23

=Comment: #4=================================================
TYLER C. HICKS <tchicks@us.ibm.com> - 2008-03-24 20:57 EDT

tcpdump taken from elm3b128

=Comment: #5=================================================
TYLER C. HICKS <tchicks@us.ibm.com> - 2008-03-24 21:07 EDT
It should also be noted that the i386-x86_64 connection works as expected.

Also, by removing the i386-ppc64 connection's 'ikev2=insist' line in the
/etc/ipsec.conf file, the i386-ppc64 IKEv1 negotiation is successful.  This
seems to rule out any network or configuration errors.
Comment 1 IBM Bug Proxy 2008-03-25 10:16:54 EDT
Created attachment 299030 [details]
tcpdump taken from elm3b128
Comment 2 IBM Bug Proxy 2008-03-25 10:16:56 EDT
Created attachment 299031 [details]
tcpdump taken from elm3a23
Comment 3 IBM Bug Proxy 2008-03-26 11:33:00 EDT
------- Comment From emilyr@us.ibm.com 2008-03-26 11:24 EDT-------
Bumping priority to p1 because this might potentially impact the IPv6 certification.
Comment 5 Paul Wouters 2008-03-26 16:46:19 EDT
I am missing the logs from /var/log/secure. Please attach these.
Comment 6 IBM Bug Proxy 2008-03-27 03:32:47 EDT
------- Comment From tchicks@us.ibm.com 2008-03-27 03:25 EDT-------
Hi Paul - The /var/log/secure logs from both machines are actually inline in the
original comment.  I think there may be a problem with debug logging in
openswan-2.6.07-2.el5.  I built openswan-2.6.09-2.fc9.src.rpm on 2 machines and
see the same problems with IKEv2, but the debug logging seems to work as
expected. I am attaching those logs.  The i386 machine (elm3b66) was the initiator.
---uname output---
Linux elm3b66 2.6.18-85.el5PAE #1 SMP Tue Mar 11 19:05:36 EDT 2008 i686 i686
i386 GNU/Linux
Linux elm3b128 2.6.18-85.el5 #1 SMP Tue Mar 11 18:56:12 EDT 2008 ppc64 ppc64
ppc64 GNU/Linux
---ip addresses---
elm3b66 (9.47.67.66)
elm3b128 (9.47.67.128)

---ipsec.conf---
# /etc/ipsec.conf - Openswan IPsec configuration file
#
# Manual:     ipsec.conf.5
#
# Please place your own config files in /etc/ipsec.d/ ending in .conf

version 2.0     # conforms to second version of ipsec.conf specification

# basic configuration
config setup
# Debug-logging controls:  "none" for (almost) none, "all" for lots.
# klipsdebug=none
plutodebug="all"
# For Red Hat Enterprise Linux and Fedora, leave protostack=netkey
protostack=netkey
nat_traversal=yes

conn i386-ppc64
left=9.47.67.66
right=9.47.67.128
ikev2=insist
authby=secret
auto=add

---ipsec.secrets---
9.47.67.66 9.47.67.128 : PSK "psk"
Comment 7 IBM Bug Proxy 2008-03-27 03:32:49 EDT
Created attachment 299289 [details]
/var/log/secure from initiator (elm3b66)

The proposals continue until the ipsec service is stopped.
Comment 8 IBM Bug Proxy 2008-03-27 03:32:51 EDT
Created attachment 299290 [details]
/var/log/secure from responder (elm3b128)

The proposals continue until the ipsec service is stopped.
Comment 9 Paul Wouters 2008-03-27 09:57:44 EDT
Thanks. It looks like the initiator sent a bad packet with no proper proposals.
I am a bit confused as to why this happened. Were there any warnings during
compilation? What OS is this compiled on?

I'm mostly confused that this also happens on generic intel hardware.
Comment 10 IBM Bug Proxy 2008-03-28 12:49:14 EDT
------- Comment From tchicks@us.ibm.com 2008-03-28 12:42 EDT-------
Hey Paul - Those previous tests were performed on snapshot 1.  I've now
installed snapshot 2 on two machines and see similar problems.  This is with
openswan-2.6.09-1.el5.rpm.  Here are the details:

---uname output---
Linux eal5.ltc.austin.ibm.com 2.6.18-86.el5 #1 SMP Tue Mar 18 18:19:47 EDT 2008
i686 i686 i386 GNU/Linux
Linux tim-hv4.ltc.austin.ibm.com 2.6.18-86.el5 #1 SMP Tue Mar 18 18:25:26 EDT
2008 ppc64 ppc64 ppc64 GNU/Linux

---ip addresses---
eal5.ltc.austin.ibm.com has address 9.3.190.198
tim-hv4.ltc.austin.ibm.com has address 9.3.192.210

---ipsec.conf---
# /etc/ipsec.conf - Openswan IPsec configuration file
#
# Manual:     ipsec.conf.5
#
# Please place your own config files in /etc/ipsec.d/ ending in .conf

version 2.0     # conforms to second version of ipsec.conf specification

# basic configuration
config setup
# Debug-logging controls:  "none" for (almost) none, "all" for lots.
# klipsdebug=none
plutodebug="all"
# For Red Hat Enterprise Linux and Fedora, leave protostack=netkey
protostack=netkey
nat_traversal=yes

conn i386-ppc64
left=9.3.190.198
right=9.3.192.210
ikev2=insist
authby=secret
auto=add

---ipsec.secrets---
9.3.190.198 9.3.192.210 : PSK "psk"

---recreate---
[root@eal5 ~]# ipsec auto --verbose --up i386-ppc64
002 "i386-ppc64" #1: initiating v2 parent SA
133 "i386-ppc64" #1: STATE_PARENT_I1: initiate
002 "i386-ppc64" #1: transition from state STATE_IKEv2_START to state
STATE_PARENT_I1
133 "i386-ppc64" #1: STATE_PARENT_I1: sent v2I1, expected v2R1
Comment 11 IBM Bug Proxy 2008-03-28 12:49:19 EDT
Created attachment 299492 [details]
/var/log/secure from initiator (eal5)
Comment 12 IBM Bug Proxy 2008-03-28 12:49:21 EDT
Created attachment 299493 [details]
/var/log/secure from responder (tim-hv4)
Comment 13 IBM Bug Proxy 2008-03-28 14:41:26 EDT
Created attachment 299510 [details]
stderr output from rpmbuild of openswan-2.6.09-1.el5.src.rpm on eal5

Requested compiler warnings from rpmbuild of openswan-2.6.09-1.el5.src.rpm on
snapshot 2
Comment 14 IBM Bug Proxy 2008-04-01 14:10:55 EDT
------- Comment From tchicks@us.ibm.com 2008-04-01 14:02 EDT-------
Hey Paul - I noticed that the status was still NEEDINFO, please let me know what
else is needed.  Thanks!
Comment 15 Brad Peters 2008-04-02 17:16:29 EDT
Paul,

Time is nearly up for 5.2.  Please let us know what else is needed ASAP

Thanks,

Brad
Comment 16 Paul Wouters 2008-04-02 17:42:00 EDT
We are working on it now. Things would speed up if we could get access to a ppc64 machine.
Comment 17 Paul Wouters 2008-04-02 18:45:56 EDT
Can you please compile the attached test program on the SAME machine you build
openswan on (eg same compiler, same hardware) and show us the output.

You should see the following if all is right (which we think might not be the
case on ppc64):

[paul@bofh openswan.ikev2]$ ./offtest 
0 1 isat_np
1 1 isat_critical
2 2 isat_length
4 1 isat_type
5 1 isat_res2
6 1 isat_transid
8 size
Comment 18 Paul Wouters 2008-04-02 18:46:59 EDT
Created attachment 300144 [details]
offset tester for gcc

This is to test gcc on ppc64
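
The attached offtest.c is not reproduced in this report. As a rough idea of what such a tester looks like, here is a minimal sketch that prints offset, size, and name for each member in the same shape as the expected output in comment 17. The struct is an assumed copy of the struct ikev2_trans layout from include/packet.h, and the names ikev2_trans_sketch, SHOW_FIELD, and print_layout are hypothetical, not taken from the attachment.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed copy of the struct ikev2_trans layout being tested;
 * the name ikev2_trans_sketch is hypothetical. */
struct ikev2_trans_sketch {
    uint8_t  isat_np;
    uint8_t  isat_critical;
    uint16_t isat_length;    /* Payload length */
    uint8_t  isat_type;      /* transform type */
    uint8_t  isat_res2;
    uint8_t  isat_transid;   /* ID, declared one byte wide here */
};

/* Print "<offset> <size> <name>" for one member. */
#define SHOW_FIELD(out, field)                                   \
    fprintf((out), "%zu %zu %s\n",                               \
            offsetof(struct ikev2_trans_sketch, field),          \
            sizeof(((struct ikev2_trans_sketch *)0)->field),     \
            #field)

/* Dump the whole layout, ending with the total struct size. */
void print_layout(FILE *out)
{
    SHOW_FIELD(out, isat_np);
    SHOW_FIELD(out, isat_critical);
    SHOW_FIELD(out, isat_length);
    SHOW_FIELD(out, isat_type);
    SHOW_FIELD(out, isat_res2);
    SHOW_FIELD(out, isat_transid);
    fprintf(out, "%zu size\n", sizeof(struct ikev2_trans_sketch));
}
```

Calling print_layout(stdout) from main() on each build host and comparing the output shows whether the compiler lays the struct out differently. A test like this only checks that the layout matches the declaration; it cannot tell whether the declared field widths are themselves correct.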
Comment 19 IBM Bug Proxy 2008-04-02 19:49:16 EDT
------- Comment From tchicks@us.ibm.com 2008-04-02 19:45 EDT-------
Hey Paul - Here's the results on tim-hv4:

[root@tim-hv4 openswan-2.6.09]# gcc offtest.c -o offtest
[root@tim-hv4 openswan-2.6.09]# ./offtest
0 1 isat_np
1 1 isat_critical
2 2 isat_length
4 1 isat_type
5 1 isat_res2
6 1 isat_transid
8 size
Comment 20 Paul Wouters 2008-04-02 20:12:40 EDT
hmm. so that's not the problem then.

Can you show a full build log with any potential flags, and warnings?

But I'm afraid someone will just have to go into gdb and look at what is going on
between receiving the data properly (as the log shows) and later referencing
nulls (as the log shows). For example, set a breakpoint at
ikev2_match_transform_list_parent() and see if *itl is already corrupt, and then
move up the call chain.

Having access to some throwaway ppc64 would really help us....

Comment 21 Paul Wouters 2008-04-02 20:29:00 EDT
Please try this fix to include/packet.h:

diff --git a/include/packet.h b/include/packet.h
index 31083f2..4bea335 100644
--- a/include/packet.h
+++ b/include/packet.h
@@ -706,7 +706,7 @@ struct ikev2_trans
        u_int16_t isat_length;      /* Payload length */
        u_int8_t  isat_type;        /* transform type */
        u_int8_t  isat_res2;
-       u_int8_t  isat_transid;     /* ID */
+       u_int16_t  isat_transid;     /* ID */
 };
 extern struct_desc ikev2_trans_desc;
 
We did find a few other problems that warrant a new release, but before I can
release 2.6.10 I need to read through our automated testing report. Do you have
one more day before your freeze? That is, can I release 2.6.10 tonight for you
(provided this indeed fixes your problem)?
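
The thread does not spell out why a one-byte isat_transid only breaks the ppc64 side. One plausible reading, which is an assumption and not confirmed anywhere in this report, is that the packet parser treats the transform ID as a 16-bit value and stores it in host byte order at the member's offset. The sketch below imitates that with memcpy; the struct names and helper functions are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pre-patch shape: transform ID declared as one byte (illustrative name). */
struct trans_narrow {
    uint8_t  isat_np;
    uint8_t  isat_critical;
    uint16_t isat_length;
    uint8_t  isat_type;
    uint8_t  isat_res2;
    uint8_t  isat_transid;
};

/* Post-patch shape: transform ID declared as two bytes. */
struct trans_wide {
    uint8_t  isat_np;
    uint8_t  isat_critical;
    uint16_t isat_length;
    uint8_t  isat_type;
    uint8_t  isat_res2;
    uint16_t isat_transid;
};

/* Returns 1 on a big-endian host, 0 on a little-endian one. */
int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Store a 16-bit host-order value at the offset of the 8-bit member
 * (the second byte lands in the trailing padding), then read the member. */
unsigned read_back_narrow(uint16_t id)
{
    struct trans_narrow t;
    memset(&t, 0, sizeof t);
    memcpy((unsigned char *)&t + offsetof(struct trans_narrow, isat_transid),
           &id, sizeof id);
    return t.isat_transid;
}

/* Same store, but the member really is 16 bits wide. */
unsigned read_back_wide(uint16_t id)
{
    struct trans_wide t;
    memset(&t, 0, sizeof t);
    memcpy((unsigned char *)&t + offsetof(struct trans_wide, isat_transid),
           &id, sizeof id);
    return t.isat_transid;
}
```

Under that assumption, read_back_narrow(12) gives back 12 on a little-endian host because the low byte happens to land in the member, but 0 on a big-endian host because the high byte lands there instead. read_back_wide(12) gives 12 on both, which would match the effect of the one-line patch above.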
Comment 22 Paul Wouters 2008-04-02 22:35:39 EDT
In case things are frozen, please add this as a patch to the current source.
It's a crasher when the other end sends a NOTIFY as first packet in IKEv2.

diff --git a/programs/pluto/ikev2_parent.c b/programs/pluto/ikev2_parent.c
index 3243d92..8800826 100644
--- a/programs/pluto/ikev2_parent.c
+++ b/programs/pluto/ikev2_parent.c
@@ -830,6 +830,12 @@ stf_status ikev2parent_inR1outI2(struct msg_digest *md)
 
     DBG(DBG_CONTROLMORE
        , DBG_log("ikev2 parent inR1: calculating g^{xy} in order to send I2"));
+
+    if(st->st_gr.len == 0) {
+       /* Remote end did not send us Gr in R1 - likely a NOTIFY message */
+       openswan_log("No responder Gr found in R1 packet");
+       return PAYLOAD_MALFORMED;
+    }


The fix will also be in 2.6.10, which I am releasing tonight or tomorrow.
Comment 23 IBM Bug Proxy 2008-04-03 01:17:10 EDT
------- Comment From tchicks@us.ibm.com 2008-04-03 01:10 EDT-------
Hi Paul - I applied the first patch to both machines and it did get further, but
still failed:

Apr  2 23:53:43 tim-hv4 pluto[28225]: "i386-ppc" #1: ERROR: asynchronous network
error report on eth0 (sport=500) for message to 9.3.190.198 port 500,
complainant 9.3.190.198: Connection refused [errno 111, origin ICMP type 3 code 3
(not authenticated)]

I am attaching the full logs.
Comment 24 IBM Bug Proxy 2008-04-03 01:25:15 EDT
Created attachment 300178 [details]
Initiator (eal5) logs

Logs with first patch from Paul applied.
Comment 25 IBM Bug Proxy 2008-04-03 01:25:17 EDT
Created attachment 300179 [details]
Responder (tim-hv4) logs

Logs with first patch from Paul applied.
Comment 26 IBM Bug Proxy 2008-04-03 04:33:41 EDT
------- Comment From tchicks@us.ibm.com 2008-04-03 04:27 EDT-------
We have some good news!  Using the first patch you posted, SA's are successfully
negotiated if the ppc machine is the initiator.  I should have tested this
earlier, sorry!

Also, one thing that I should note is that the pluto process crashes on the i386
box whenever it acts as the initiator.  Here's the back trace:

Program received signal SIGABRT, Aborted.
0x00997402 in __kernel_vsyscall ()
(gdb) bt
#0  0x00997402 in __kernel_vsyscall ()
#1  0x00285c50 in raise () from /lib/libc.so.6
#2  0x00287561 in abort () from /lib/libc.so.6
#3  0x009c18dc in passert_fail (pred_str=0xa5ae38 "pbs != NULL",
file_str=0xa5aa10 "/usr/src/redhat/BUILD/openswan-2.6.09/programs/pluto/ipsec_doi.c",
line_no=222)
at /usr/src/debug/openswan-2.6.09/programs/pluto/log.c:623
#4  0x009cd630 in accept_KE (dest=0x9779b3c, val_name=0xa5d280 "Gr",
gr=0xa8d06c, pbs=0x0)
at /usr/src/debug/openswan-2.6.09/programs/pluto/ipsec_doi.c:222
#5  0x009dfaae in ikev2parent_inR1outI2 (md=0x977a140)
at /usr/src/debug/openswan-2.6.09/programs/pluto/ikev2_parent.c:830
#6  0x009dd51f in process_v2_packet (mdp=0xa966c0)
at /usr/src/debug/openswan-2.6.09/programs/pluto/ikev2.c:448
#7  0x009f2871 in process_packet (mdp=0xa966c0)
at /usr/src/debug/openswan-2.6.09/programs/pluto/demux.c:167
#8  0x009f2c67 in comm_handle (ifp=0x9778aa8)
at /usr/src/debug/openswan-2.6.09/programs/pluto/demux.c:212
#9  0x009c8aa8 in call_server ()
at /usr/src/debug/openswan-2.6.09/programs/pluto/server.c:760
#10 0x009c59ea in main (argc=1146766671, argv=0x7d544d7e)
at /usr/src/debug/openswan-2.6.09/programs/pluto/plutomain.c:828
Comment 27 Herbert Xu 2008-04-03 08:49:14 EDT
This means that the v2KE payload is missing.  We should have a check for that so
it doesn't crash.  However, the real question is why pluto thinks the payload is
missing, as it should really be there.
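
A minimal sketch of the kind of check meant here: report a missing KE payload to the caller instead of asserting. This is not pluto's accept_KE(); the type pbs_sketch, the enum ke_status, and the function accept_ke_checked are hypothetical stand-ins for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parsed payload body: a cursor and a length. */
struct pbs_sketch {
    const unsigned char *cur;
    size_t left;
};

enum ke_status { KE_OK, KE_MISSING, KE_BAD_LENGTH };

/* Copy the peer's public value into dest, or say why that is not possible.
 * A NULL pbs means the packet carried no KE payload at all. */
enum ke_status accept_ke_checked(unsigned char *dest, size_t expected_len,
                                 const struct pbs_sketch *pbs)
{
    if (pbs == NULL)
        return KE_MISSING;          /* previously an assertion failure */
    if (pbs->left != expected_len)
        return KE_BAD_LENGTH;
    memcpy(dest, pbs->cur, expected_len);
    return KE_OK;
}
```

The caller can then decide what to do with KE_MISSING, for example log it and drop the packet, rather than taking the whole daemon down with SIGABRT as in the backtrace above.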

Does this happen even when you're going from i386 to x86-64?
Comment 29 Paul Wouters 2008-04-03 10:38:08 EDT
The second patch I put in here is the fix for the i386 crasher reported. The crash
means the other end sent a NOTIFY about a problem in its first packet, without
adding the KE.

Are the configurations still as posted in this bug item?

Comment 30 Paul Wouters 2008-04-03 11:31:18 EDT
We released 2.6.10 that incorporates these patches. Can you redo the tests with
that version?
Comment 31 IBM Bug Proxy 2008-04-03 11:58:18 EDT
------- Comment From tchicks@us.ibm.com 2008-04-03 11:49 EDT-------
With the 2nd patch applied to openswan-2.6.09-1.el5.src.rpm, I couldn't
negotiate SA's with any platform:

------------
Apr  3 10:26:53 tim-hv4 pluto[9671]: | processing payload: ISAKMP_NEXT_v2V (len=16)
Apr  3 10:26:53 tim-hv4 pluto[9671]: | ikev2 parent inR1: calculating g^{xy} in order to send I2
Apr  3 10:26:53 tim-hv4 pluto[9671]: packet from 9.3.190.198:500: No responder Gr found in R1 packet
Apr  3 10:26:53 tim-hv4 pluto[9671]: | complete v2 state transition with (null)
Apr  3 10:26:53 tim-hv4 pluto[9671]: | state transition function for STATE_PARENT_I1 failed: INVALID_FLAGS
------------

However, without the 2nd patch applied, I could get i386 <-> x86_64 working just
fine.

The configs are still the same as before.

I'm going to pull down 2.6.10 and give that a shot.
Comment 32 Paul Wouters 2008-04-03 12:17:45 EDT
2.6.10 will have the same problem.

Indeed, we should not be returning PAYLOAD_MALFORMED. We should not be in the
code to process the KE at all. I'm currently following up the call chain to see
where it mistakenly continues, instead of either sending a notify or waiting on
a new packet.
Comment 33 Paul Wouters 2008-04-03 12:27:40 EDT
The initiator logs show both charon and pluto. Are you sure you weren't running
a mix of openswan and strongswan?

Anyway, the root cause seems to be:

Apr  2 23:53:34 tim-hv4 charon: 09[AUD] DH group MODP_1536_BIT inacceptable,
requesting MODP_1024_BIT 

So this is going to re-send an I1 packet with a different KE.

I am not sure why 1536 is rejected over 1024 though. That seems silly.
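
For reference, a sketch of how an initiator might pick the group for the re-sent I1. It assumes the responder's rejection arrives as an IKEv2 INVALID_KE_PAYLOAD notify (type 17 in RFC 4306) whose data is the accepted DH group number as two octets in network byte order; groups 2 and 5 are the 1024-bit and 1536-bit MODP groups. The function name and signature are hypothetical and this is not pluto's code.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    NOTIFY_INVALID_KE_PAYLOAD = 17, /* RFC 4306 notify message type */
    DH_GROUP_MODP1024 = 2,
    DH_GROUP_MODP1536 = 5
};

/* Return the DH group to use for the next I1, or 0 if the notify is not
 * an INVALID_KE_PAYLOAD we can act on (wrong type, bad length, or a group
 * that is not in our own list of supported groups). */
uint16_t retry_group_from_notify(uint16_t notify_type,
                                 const unsigned char *data, size_t len,
                                 const uint16_t *supported, size_t n_supported)
{
    uint16_t wanted;
    size_t i;

    if (notify_type != NOTIFY_INVALID_KE_PAYLOAD || data == NULL || len != 2)
        return 0;

    /* Two octets, network byte order, independent of host endianness. */
    wanted = (uint16_t)((data[0] << 8) | data[1]);

    for (i = 0; i < n_supported; i++)
        if (supported[i] == wanted)
            return wanted;
    return 0;
}
```

With notify data 00 02 and a local list containing groups 5 and 2, the function returns 2, so the next I1 would carry a MODP 1024 KE payload. Checking the requested group against the local list keeps a peer from steering the initiator to a group it never offered.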
Comment 34 IBM Bug Proxy 2008-04-03 13:25:40 EDT
------- Comment From tchicks@us.ibm.com 2008-04-03 13:23 EDT-------
I can't figure out why those charon messages are in there.  I have had
strongswan installed on that machine before, but I can't find any leftover bits.
I'm going to kick off a fresh install of snapshot 3 on that machine.
Comment 35 IBM Bug Proxy 2008-04-03 13:57:46 EDT
------- Comment From tchicks@us.ibm.com 2008-04-03 13:48 EDT-------
I built the srpm with the first patch on yet another ppc machine (identical to
tim-hv4) that has never had strongswan installed on it.  The i386 machine was
able to act as the initiator and negotiate valid SA's!

There must have just been a few leftover charon pieces causing trouble on
tim-hv4.  I'm still installing a fresh copy of snapshot 3 on tim-hv4 to make
sure, but it looks like your first patch fixed this bug.
Comment 36 IBM Bug Proxy 2008-04-03 18:17:26 EDT
------- Comment From tchicks@us.ibm.com 2008-04-03 18:12 EDT-------
I've run through several negotiations and it looks like everything is working well:

----------------------------------------
[root@eal5 ~]# ipsec auto --verbose --up i386-ppc
002 "i386-ppc" #1: initiating v2 parent SA
133 "i386-ppc" #1: STATE_PARENT_I1: initiate
002 "i386-ppc" #1: transition from state STATE_IKEv2_START to state STATE_PARENT_I1
133 "i386-ppc" #1: STATE_PARENT_I1: sent v2I1, expected v2R1
002 "i386-ppc" #2: transition from state STATE_PARENT_I1 to state STATE_PARENT_I2
134 "i386-ppc" #2: STATE_PARENT_I2: sent v2I2, expected v2R2 {auth=IKEv2
cipher=aes_128 integ=sha1 prf=oakley_sha group=modp1536}
002 "i386-ppc" #2: transition from state STATE_PARENT_I2 to state STATE_PARENT_I3
002 "i386-ppc" #2: negotiated tunnel [9.3.190.198,9.3.190.198] ->
[9.3.192.210,9.3.192.210]
004 "i386-ppc" #2: STATE_PARENT_I3: PARENT SA established tunnel mode
{ESP=>0x328bd752 <0x5b5134f0 xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=none}
----------------------------------------
Comment 38 Paul Wouters 2008-04-08 13:55:17 EDT
The fix in 2.6.11 is slightly different (and better). You might want to re-run
the tests, and then close this bug report.
Comment 40 IBM Bug Proxy 2008-04-09 17:09:36 EDT
------- Comment From tchicks@us.ibm.com 2008-04-09 17:03 EDT-------
I ran the tests while using 2.6.11 and everything looks good.  I tested between
i386, x86_64, and ppc64.  No problems at all!
Comment 41 Steve Grubb 2008-04-09 17:11:38 EDT
openswan-2.6.11-1.el5 was built to resolve this problem.
Comment 44 IBM Bug Proxy 2008-04-30 16:10:40 EDT
------- Comment From tchicks@us.ibm.com 2008-04-30 16:01 EDT-------
I just realized that I had not verified this bug with an official rpm released
from RH.  I verified with openswan-2.6.12-2.el5 from snapshot #7.
Comment 46 errata-xmlrpc 2008-05-21 11:29:05 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0395.html
