Bug 1168407

Summary: pluto dumps core with an assertion failure during connection start for a PSK-using connection
Product: [Fedora] Fedora
Reporter: Chris Siebenmann <cks-rhbugzilla>
Component: libreswan
Assignee: Paul Wouters <pwouters>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 20
CC: cks-rhbugzilla, pwouters
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-03-25 22:51:02 UTC
Type: Bug
Attachments:
Description Flags
hoskey.secrets none

Description Chris Siebenmann 2014-11-26 20:25:11 UTC
Description of problem:
I am trying to set up tunnel-mode IPSec to cover a GRE tunnel with a preshared
secret key between two 64-bit Fedora 20 machines. When I do this, pluto on one
side will dump core with an assertion failure.

Version-Release number of selected component (if applicable):
libreswan-3.12-1.fc20.x86_64

How reproducible:
Always.

Pluto logs from the crash:
Nov 26 14:53:47 hawkwind pluto[20875]: loading secrets from "/etc/ipsec.secrets"
Nov 26 14:53:47 hawkwind pluto[20875]: loading secrets from "/etc/ipsec.d/hawk-gre.secrets"
Nov 26 14:53:47 hawkwind pluto[20875]: loading secrets from "/etc/ipsec.d/hostkey.secrets"
Nov 26 14:53:47 hawkwind pluto[20875]: "/etc/ipsec.d/hostkey.secrets" line 14: CKAIDNSS keyword not found where expected in RSA key
Nov 26 14:53:47 hawkwind pluto[20875]: "hawkgre" #1: initiating v2 parent SA
Nov 26 14:53:47 hawkwind pluto[20875]: "hawkgre" #1: transition from state STATE_IKEv2_START to state STATE_PARENT_I1
Nov 26 14:53:47 hawkwind pluto[20875]: "hawkgre" #1: STATE_PARENT_I1: sent v2I1, expected v2R1
Nov 26 14:53:47 hawkwind pluto[20875]: | V2 microcode entry (initiate IKE_SA_INIT) has unspecified timeout_event
Nov 26 14:53:49 hawkwind pluto[20875]: "hawkgre" #1: ASSERTION FAILED at /builddir/build/BUILD/libreswan-3.12/programs/pluto/hmac.c:85: tkey2 != NULL
Nov 26 14:53:49 hawkwind pluto[20875]: "hawkgre" #1: ABORT at /builddir/build/BUILD/libreswan-3.12/programs/pluto/hmac.c:85
Nov 26 14:53:49 hawkwind pluto[20875]: "hawkgre" #1: ABORT at /builddir/build/BUILD/libreswan-3.12/programs/pluto/hmac.c:85

My /etc/ipsec.d/hawkgre.conf (with comments removed):

conn hawkgre
        left=128.100.3.58
        right=66.96.18.208
        leftprotoport=gre
        rightprotoport=gre
        auto=start
        authby=secret
        ikev2=insist

The IP address 128.100.3.58 is an interface alias; the primary IP address
of the interface is 128.100.3.51.

Based on libreswan source code from the master repo, this appears to be
PK11_Derive() failing for some reason (which in turn is part of NSS,
which may mean that this is actually an NSS bug).

Possibly relevant:
https://www.mail-archive.com/swan@lists.libreswan.org/msg00300.html

Comment 1 Paul Wouters 2014-11-27 16:05:05 UTC
Nov 26 14:53:47 hawkwind pluto[20875]: "/etc/ipsec.d/hostkey.secrets" line 14: CKAIDNSS keyword not found where expected in RSA key

It looks like you might have copied the RSA key from a non-NSS build of openswan into libreswan. libreswan is trying to look up the private key in NSS using the CKAIDNSS, but did not find it.

For NSS migration of your keys/certs, please see:

https://libreswan.org/wiki/Using_NSS_with_libreswan#Importing_third-party_certificates_into_NSS

Of course, libreswan should not crash on this, so we will look at that as a bug. But doing this migration might resolve your issue.
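
To check whether other inherited secrets files carry the same kind of entry, a small scan along these lines may help. It is only a sketch: it assumes each RSA entry opens with a line containing ": RSA {" and closes with a line starting with "}", and that NSS-era entries carry a "CKAIDNSS:" field inside that block. The function name is made up for illustration and this is not part of libreswan.

```python
import re
import sys
from typing import List, Tuple

# Assumption: an RSA entry opens with a line containing ": RSA {" (any
# whitespace between the tokens) and closes with a line starting with "}".
RSA_OPEN = re.compile(r":\s*RSA\s*\{")


def find_rsa_entries_without_ckaid(text: str) -> List[Tuple[int, int]]:
    """Return (first_line, last_line) of each RSA entry with no CKAIDNSS field."""
    missing = []
    start = None        # line number where the current RSA entry opened
    has_ckaid = False   # whether a CKAIDNSS: field was seen in that entry
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if start is None:
            # Skip comment lines such as "# RSA 2192 bits ..."
            if not stripped.startswith("#") and RSA_OPEN.search(stripped):
                start, has_ckaid = lineno, False
        elif stripped.startswith("CKAIDNSS:"):
            has_ckaid = True
        elif stripped.startswith("}"):
            if not has_ckaid:
                missing.append((start, lineno))
            start = None
    return missing


if __name__ == "__main__":
    # Usage: python3 check_secrets.py /etc/ipsec.d/*.secrets
    for path in sys.argv[1:]:
        with open(path) as fh:
            for first, last in find_rsa_entries_without_ckaid(fh.read()):
                print(f"{path}: RSA entry at lines {first}-{last} has no CKAIDNSS field")
```

An entry flagged this way is only a candidate for migration or removal; the scan does not confirm that the key is absent from the NSS database.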

Comment 2 Chris Siebenmann 2014-11-27 16:15:43 UTC
Aha. It's not so much that I copied hostkey.secrets explicitly, it's
that I inherited it from previous versions of Fedora. This machine was
installed in 2006 and has been upgraded from Fedora version to Fedora
version ever since, so at some point OpenSWAN / libreswan / etc changed
but this old autogenerated-on-install hostkey didn't get updated.

Since it's an autogenerated host key that I don't use, I've just deleted
it. Since upgrading from old Fedora versions for so long is likely
quite rare, I suspect that this bug can be downgraded to a relatively
unimportant status.

(That this happened with a PSK but not when I switched to rsasig
authentication is probably a red herring.)

Comment 3 Paul Wouters 2014-11-27 16:26:17 UTC
Can you share the contents of ipsec.secrets, or of the included file with the problematic host key, just to confirm we understand the problem?

Comment 4 Chris Siebenmann 2014-11-27 16:49:36 UTC
Created attachment 962198
hoskey.secrets

/etc/ipsec.secrets is just the standard 'include /etc/ipsec.d/*.secrets'.

I've attached the hostkey.secrets file itself; as an autogenerated file that
I never used, its contents are not sensitive.

Comment 5 Paul Wouters 2014-11-27 18:12:09 UTC
So you tried to use a preshared key without a PSK entry, and with this old non-NSS RSA key entry present? Then PSK should have failed for not finding a proper PSK entry, and instead it somehow got stuck on the old RSA key entry?

By the way, if you are only running GRE, you can increase your effective MTU by a few bytes by using transport mode (type=transport).
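
As a sketch of the transport-mode suggestion, assuming the reporter's connection is otherwise unchanged, the only addition would be the type= line. The connection name hawkgre-transport is made up for illustration, and this variant was not tested in this report.

```
conn hawkgre-transport
        left=128.100.3.58
        right=66.96.18.208
        leftprotoport=gre
        rightprotoport=gre
        auto=start
        authby=secret
        ikev2=insist
        # assumption: overrides the default of type=tunnel
        type=transport
```

Both ends would need the same setting for the connection to establish.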

Comment 6 Chris Siebenmann 2014-11-27 18:19:02 UTC
I had my PSK in a separate /etc/ipsec.d/key.secrets file (now not needed
after I switched to RSA signatures). Both ends are F20 machines and had
the PSK, but only one end was old enough to have the hostkey.secrets
file and it was the only end that hit the assertion failure.

I use tunnel mode for GRE because of the weird effects that GRE copying
and using the underlying packet's TTL has in transport mode. For
instance, traceroute over the GRE tunnel doesn't really work all that
well; packets can vanish mysteriously in between the tunnel start point
and end point.

Comment 7 Paul Wouters 2015-03-20 04:21:38 UTC
I cannot seem to reproduce this. Using your secrets file, I get:

[root@west ~]# ipsec auto --up westnet-eastnet
002 "westnet-eastnet" #2: initiating v2 parent SA
133 "westnet-eastnet" #2: STATE_PARENT_I1: initiate
002 "westnet-eastnet" #2: transition from state STATE_IKEv2_START to state STATE_PARENT_I1
133 "westnet-eastnet" #2: STATE_PARENT_I1: sent v2I1, expected v2R1
003 "westnet-eastnet" #2: Failed to find our RSA key


which is what I would expect. I'm not sure how to try and reproduce this :(

Comment 8 Chris Siebenmann 2015-03-24 20:08:32 UTC
Given how obscure this seems to be (and also that this was reported against
Fedora 20), I suspect that it's okay to drop this bug as 'cannot reproduce,
probably weird, may be gone in the latest updates'.

Comment 9 Paul Wouters 2015-03-25 22:51:02 UTC
ok, thanks