Bug 164730 - x509 Certificate based IPSec VPN tunnels cause Kernel panic.
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: David Miller
Brian Brock
Reported: 2005-07-31 01:37 EDT by David Herselman
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2007-10-19 14:56:42 EDT

Attachments
ifup-ipsec and ifdown-ipsec patch as from previous Bugzilla case. (1.73 KB, patch)
2005-07-31 01:37 EDT, David Herselman
Description David Herselman 2005-07-31 01:37:55 EDT
Description of problem:
  Using 'ifcfg-ipsec0' to establish an inter-branch IPSec VPN tunnel
  eventually causes a kernel panic (the system can run for up to 3 days
  before this happens).

  These VPN tunnels are established between remote servers that are not
  directly on the same physical network, and required the following patches
  to 'ifup-ipsec' and 'ifdown-ipsec', as logged in an earlier Bugzilla report:

  [root@unix-01 root]# diff -uNr /mirror/linux//etc/sysconfig/network-scripts/ifup-ipsec /etc/sysconfig/network-scripts/ifup-ipsec
--- /mirror/linux//etc/sysconfig/network-scripts/ifup-ipsec     2004-04-13 18:21:13.000000000 +0200
+++ /etc/sysconfig/network-scripts/ifup-ipsec   2005-05-30 00:48:08.000000000 +0200
@@ -137,7 +137,8 @@
       [ -z "$SRCNET" ] && SRCNET="$SRC/32"
       [ -z "$DSTNET" ] && DSTNET="$DST/32"

-      ip route add to $DSTNET via $DST
+      TUNSRC=`ip -o route get to $SRCNET | sed "s|.*src \([^ ]*\).*|\1|"`
+      ip route add to $DSTNET via $TUNSRC src $TUNSRC

       /sbin/setkey -c >/dev/null 2>&1 << EOF
 delete $SRC $DST ah $SPI_AH_OUT;
@@ -191,7 +192,8 @@
       [ -z "$SRCNET" ] && SRCNET="$SRC/32"
       [ -z "$DSTNET" ] && DSTNET="$DST/32"

-      ip route add to $DSTNET via $DST
+      TUNSRC=`ip -o route get to $SRCNET | sed "s|.*src \([^ ]*\).*|\1|"`
+      ip route add to $DSTNET via $TUNSRC src $TUNSRC

       /sbin/setkey -c >/dev/null 2>&1 << EOF
 spddelete $SRCNET $DSTNET any -P out;



[root@unix-01 root]# diff -uNr /mirror/linux//etc/sysconfig/network-scripts/ifdown-ipsec /etc/sysconfig/network-scripts/ifdown-ipsec
--- /mirror/linux//etc/sysconfig/network-scripts/ifdown-ipsec   2004-04-13 18:26:12.000000000 +0200
+++ /etc/sysconfig/network-scripts/ifdown-ipsec 2005-05-29 23:22:06.000000000 +0200
@@ -62,7 +62,8 @@
       [ -z "$SRCNET" ] && SRCNET="$SRC/32"
       [ -z "$DSTNET" ] && DSTNET="$DST/32"

-      ip route del to $DSTNET via $DST
+      TUNSRC=`ip -o route get to $SRCNET | sed "s|.*src \([^ ]*\).*|\1|"`
+      ip route del to $DSTNET via $TUNSRC src $TUNSRC

       /sbin/setkey -c >/dev/null 2>&1 << EOF
        spddelete $SRCNET $DSTNET any -P out;
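
The TUNSRC line in the patches above pulls the "src" address out of the output of 'ip -o route get'. A minimal sketch of that extraction in isolation follows; it assumes a made-up sample line, since real output depends on the host's routing table, and 'extract_src' is a hypothetical helper name that is not part of ifup-ipsec.

```shell
#!/bin/sh
# Pull the "src" address out of one line of `ip -o route get` style output.
# The sed expression is the one used in the patch above.
extract_src() {
    printf '%s\n' "$1" | sed "s|.*src \([^ ]*\).*|\1|"
}

# Hypothetical sample line; real output depends on the host's routing table.
sample='192.168.4.0 dev eth1  src 192.168.4.1 '
TUNSRC=$(extract_src "$sample")
echo "TUNSRC=$TUNSRC"
```

Note that when the input has no "src" field the substitution does not match and sed passes the line through unchanged, so TUNSRC would then hold the whole line rather than an address.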



Systems are using x509 certificates to authenticate...

Configuration files:
[-------------- /etc/sysconfig/network-scripts/ifcfg-ipsec0 --------------]
TYPE=IPsec
  DEVICE=ipsec0
  ONBOOT=no
#  IKE_METHOD=PSK
#  IKE_PSK=654321
  IKE_METHOD=X509
  IKE_CERTFILE=VPNgateway
  IKE_PEER_CERTFILE=GT
  SRCGW=192.168.4.1
  DSTGW=192.168.0.1
  SRCNET=192.168.4.0/22
  DSTNET=192.168.0.0/22
  DST=`dig vpn.gt.co.za +tcp +short | tail -n 1`
[-------------- /etc/sysconfig/network-scripts/ifcfg-ipsec0 --------------]
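
Because DST above is filled in from a DNS lookup at the time the interface is brought up, a failed lookup leaves it empty. A hedged sketch of a sanity check follows; 'is_ipv4' is a hypothetical helper, not something the initscripts provide, and the fixed address (borrowed from the racoon include file name) stands in for the lookup result. The pattern only checks the dotted-quad shape and does not reject octets above 255.

```shell
#!/bin/sh
# Return success only if the argument looks like a dotted-quad IPv4 address.
# Hypothetical helper, not part of ifup-ipsec.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# In ifcfg-ipsec0 the value would come from:
#   DST=`dig vpn.gt.co.za +tcp +short | tail -n 1`
# Here a fixed value stands in for the lookup result.
DST="196.41.206.218"
if is_ipv4 "$DST"; then
    echo "DST looks usable: $DST"
else
    echo "DST is empty or not an IPv4 address; not bringing the tunnel up" >&2
fi
```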

[-------------- /etc/racoon/racoon.conf --------------]
# Racoon IKE daemon configuration file.
# See 'man racoon.conf' for a description of the format and entries.

path include "/etc/racoon";
path pre_shared_key "/etc/racoon/psk.txt";
path certificate "/etc/racoon/certs";

#log debug;

padding {
        randomize on;           # enable randomize length.
        maximum_length 20;      # maximum padding length.
        exclusive_tail on;      # extract last one octet.
        strict_check off;       # enable strict check.
}

remote anonymous {
        exchange_mode aggressive,main;
        passive on;
        doi ipsec_doi;
        generate_policy on;
        proposal_check obey;
        lifetime time 21600 sec;
#v x509 Certificates and 'rsasig' instead of 'pre_shares_key' v
        certificate_type x509 "VPNgateway.public" "VPNgateway.private";
        verify_cert on;
        my_identifier asn1dn;
        peers_identifier asn1dn;
#^ x509 Certificates and 'rsasig' instead of 'pre_shares_key' ^
        proposal {
                encryption_algorithm 3des;
                hash_algorithm sha1;
#               authentication_method pre_shared_key;
                authentication_method rsasig;
                dh_group modp1024;
        }
}

sainfo anonymous {
        pfs_group modp2048;
        lifetime time 21600 sec;
        encryption_algorithm 3des, blowfish 448, rijndael;
        authentication_algorithm hmac_sha1, hmac_md5;
        compression_algorithm deflate;
}
include "/etc/racoon/196.41.206.218.conf";
[-------------- /etc/racoon/racoon.conf --------------]

[-------------- /etc/racoon/196.41.206.218.conf --------------]
remote 196.41.206.218
{
        exchange_mode aggressive, main;
        my_identifier asn1dn;
        peers_identifier asn1dn;
        certificate_type x509 "VPNgateway.public" "VPNgateway.private";
        peers_certfile "GT.public";
        proposal {
                encryption_algorithm 3des;
                hash_algorithm sha1;
                authentication_method rsasig;
                dh_group 2;
        }
}
[-------------- /etc/racoon/196.41.206.218.conf --------------]



Log entries showing kernel panic:
Jul 27 17:24:32 unix-01 racoon: INFO: isakmp.c:1387:isakmp_open(): 10.0.0.1[500] used as isakmp port (fd=8)
Jul 27 17:24:32 unix-01 racoon: INFO: isakmp.c:1387:isakmp_open(): 192.168.4.1[500] used as isakmp port (fd=9)
Jul 27 17:24:32 unix-01 racoon: INFO: isakmp.c:1387:isakmp_open(): 127.0.0.1[500] used as isakmp port (fd=10)
Jul 27 17:24:35 unix-01 kernel: KERNEL: assertion (x->km.state == XFRM_STATE_DEAD) failed at xfrm_state.c(193)
Jul 27 17:24:35 unix-01 kernel: KERNEL: assertion (x->km.state == XFRM_STATE_DEAD) failed at xfrm_state.c(193)
Jul 27 17:24:35 unix-01 kernel: ------------[ cut here ]------------
Jul 27 17:24:35 unix-01 kernel: kernel BUG at xfrm_state.c:54!
Jul 27 17:24:35 unix-01 kernel: invalid operand: 0000
Jul 27 17:24:35 unix-01 kernel: esp4 ah4 cls_u32 sch_sfq sch_cbq ipt_TOS ipt_limit ip_nat_irc ppp_synctty ppp_async ppp_generic slhc ipt_state ipt_owner ipt_REDIRECT ipt_REJECT ipt_LOG iptab


And another:
Jul 30 05:59:07 unix-01 kernel: KERNEL: assertion (x->km.state == XFRM_STATE_DEAD) failed at xfrm_state.c(193)
<nothing else logged>
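
To see how often these messages show up on a given system, the log can be searched for the two signatures in the excerpts above. The following is only a sketch: 'count_xfrm_events' is a hypothetical helper name, and the demo runs against a temporary file holding three of the lines quoted above rather than a real log.

```shell
#!/bin/sh
# Count xfrm assertion failures and kernel BUG lines in a syslog-format file.
# count_xfrm_events is a hypothetical helper name.
count_xfrm_events() {
    grep -Ec 'assertion \(x->km\.state == XFRM_STATE_DEAD\) failed|kernel BUG at xfrm_state\.c' "$1"
}

# Demo on a temporary file holding three lines from the excerpt above.
log=$(mktemp)
cat > "$log" << 'EOF'
Jul 27 17:24:35 unix-01 kernel: KERNEL: assertion (x->km.state == XFRM_STATE_DEAD) failed at xfrm_state.c(193)
Jul 27 17:24:35 unix-01 kernel: KERNEL: assertion (x->km.state == XFRM_STATE_DEAD) failed at xfrm_state.c(193)
Jul 27 17:24:35 unix-01 kernel: kernel BUG at xfrm_state.c:54!
EOF
events=$(count_xfrm_events "$log")
echo "matching lines: $events"
rm -f "$log"
```

On a live system the function would be pointed at the kernel log file, assumed here to be /var/log/messages.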


Version-Release number of selected component (if applicable):
  Have had the same experience in 4 different companies on hardware ranging
  from Intel Pentium III, Intel Pentium 4, AMD Athlon XP and AMD Sempron
  architectures.

  Can confirm this problem also affected two previous kernels: 2.4.21-32.EL
  and 2.4.21-27.0.4.EL (older kernels weren't tested).


How reproducible:
  Always

Steps to Reproduce:
1. Establish an IPSec VPN between two remote hosts on the internet
2. Use 'standard' configuration as detailed above in the 'Description' field
3. Wait for the system to lock up...

  
Actual results:
  Kernel panic; the system only occasionally manages to log details to disk
  in time...


Expected results:
  Should have remained up/connected...


Additional info:
  Am currently testing 2 systems using PSK keys instead of certificates. They
  have only been up for a day so far, so I cannot yet tell whether this solves
  the problem...
Comment 1 David Herselman 2005-07-31 01:37:55 EDT
Created attachment 117319
ifup-ipsec and ifdown-ipsec patch as from previous Bugzilla case.
Comment 2 David Herselman 2005-08-01 05:41:24 EDT
Can confirm that a standard PSK VPN tunnel also produces the same kernel panic...

ie:
[-------------- /etc/sysconfig/network-scripts/ifcfg-ipsec0 --------------]
TYPE=IPsec
  DEVICE=ipsec0
  ONBOOT=no
  IKE_METHOD=PSK
  IKE_PSK=654321
  SRCGW=192.168.4.1
  DSTGW=192.168.0.1
  SRCNET=192.168.4.0/22
  DSTNET=192.168.0.0/22
  DST=196.41.206.218
[-------------- /etc/sysconfig/network-scripts/ifcfg-ipsec0 --------------]
Comment 4 David Herselman 2005-08-02 06:24:43 EDT
Been doing some further testing and this only appears to affect IPSec VPN
tunnels established over ADSL PPPoE sessions. We also perform some bandwidth
shaping on PPPoE connections by default, so I've turned this off for the time
being and will report back in a couple of days on whether this resolves the
problem...

The section in /etc/ppp/ip-up.local that configures the basic traffic shaping
on the ADSL connection:

PS: This is for a connection which can only send out at 260Kbps...

/sbin/tc qdisc del dev $1 root 2> /dev/null > /dev/null
/sbin/tc qdisc del dev $1 ingress 2> /dev/null > /dev/null
/sbin/tc qdisc add dev $1 root handle 1: cbq bandwidth 100mbit avpkt 1000 cell 8
/sbin/tc class add dev $1 parent 1: classid 1:1 cbq rate 260kbit weight 26kbit allot 1514 cell 8 prio 5 avpkt 1000 bounded isolated
/sbin/tc class add dev $1 parent 1:1 classid 1:10 cbq rate 260kbit weight 26kbit allot 1514 cell 8 prio 1 avpkt 1000
/sbin/tc class add dev $1 parent 1:1 classid 1:20 cbq rate 234kbit weight 23.4kbit allot 1514 cell 8 prio 2 avpkt 1000
/sbin/tc qdisc add dev $1 parent 1:10 handle 10: sfq perturb 10
/sbin/tc qdisc add dev $1 parent 1:20 handle 20: sfq perturb 10
/sbin/tc filter add dev $1 parent 1:0 protocol ip prio 10 u32 match ip tos 0x10 0xff flowid 1:10
/sbin/tc filter add dev $1 parent 1:0 protocol ip prio 11 u32 match ip protocol 1 0xff flowid 1:10
/sbin/tc filter add dev $1 parent 1: protocol ip prio 12 u32 \
        match ip protocol 6 0xff \
        match u8 0x05 0x0f at 0 \
        match u16 0x0000 0xffc0 at 2 \
        match u8 0x10 0xff at 33 \
        flowid 1:10
/sbin/tc filter add dev $1 parent 1: protocol ip prio 13 u32 match ip dst 0.0.0.0/0 flowid 1:20
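
For anyone adapting the shaping script above to a different uplink speed, the class figures can be derived from a single number. This is a sketch under an assumption: the 90% bulk rate and the rate/10 weights are ratios read off the values in the script (234 = 90% of 260, 26 = 260/10, 23.4 = 234/10), not something stated in this report. UPLINK_KBIT and the other variable names are made up for the example.

```shell
#!/bin/sh
# Derive the CBQ rate and weight values from one uplink figure.
# The 90% and rate/10 ratios are inferred from the numbers in the script above.
LC_ALL=C; export LC_ALL    # keep awk's decimal separator a dot
UPLINK_KBIT=260

bulk_rate=$(awk -v r="$UPLINK_KBIT" 'BEGIN { printf "%g", r * 9 / 10 }')
prio_weight=$(awk -v r="$UPLINK_KBIT" 'BEGIN { printf "%g", r / 10 }')
bulk_weight=$(awk -v r="$bulk_rate" 'BEGIN { printf "%g", r / 10 }')

echo "class 1:10 rate ${UPLINK_KBIT}kbit weight ${prio_weight}kbit"
echo "class 1:20 rate ${bulk_rate}kbit weight ${bulk_weight}kbit"
```

With UPLINK_KBIT=260 this reproduces the 260/26 and 234/23.4 pairs used in the class definitions.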
Comment 5 David Herselman 2005-08-16 18:49:30 EDT
Alright, this has nothing to do with bandwidth management or x509 certificates
(it also happens on PSK IPsec connections). I did, however, find the following
article, which deals with this exact issue:
http://lists.openswan.org/pipermail/users/2005-April/004540.html

The following information is from the 2.6.11.7 changelog:
<kaber at trash.net>
	[PATCH] : Do not hold state lock while checking size
	
	This patch from Herbert Xu fixes a deadlock with IPsec.
	When an ICMP frag. required is sent and the ICMP message
	needs the same SA as the packet that caused it the state
	will be locked twice.
	
	[IPSEC]: Do not hold state lock while checking size.
	
	This can elicit ICMP message output and thus result in a
	deadlock.
	
	Signed-off-by: Herbert Xu <herbert at gondor.apana.org.au>
	Signed-off-by: David S. Miller <davem at davemloft.net>
	Signed-off-by: Chris Wright <chrisw at osdl.org>
	Signed-off-by: Greg Kroah-Hartman <gregkh at suse.de>


diff -Nru a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
--- a/net/xfrm/xfrm_state.c     2005-04-07 11:58:58 -07:00
+++ b/net/xfrm/xfrm_state.c     2005-04-07 11:58:58 -07:00
@@ -609,7 +609,7 @@

         for (i = 0; i < XFRM_DST_HSIZE; i++) {
                 list_for_each_entry(x, xfrm_state_bydst+i, bydst) {
-                       if (x->km.seq == seq) {
+                       if (x->km.seq == seq && x->km.state == XFRM_STATE_ACQ) {
                                 xfrm_state_hold(x);
                                 return x;
                         }
Comment 6 David Herselman 2005-08-18 16:40:23 EDT
No longer receive 'kernel BUG at xfrm_state.c:54!' error messages but systems
are still locking up. Systems appear to be unable to write information to the
syslog file once they lock up; the last information logged on this system was
the following:

Aug 18 02:45:18 unix-01 racoon: INFO: pfkey.c:1394:pk_recvexpire(): IPsec-SA expired: AH/Tunnel 196.25.242.202->165.146.30.88 spi=200953471(0xbfa4e7f)
Aug 18 02:45:18 unix-01 racoon: INFO: pfkey.c:1394:pk_recvexpire(): IPsec-SA expired: ESP/Tunnel 196.25.242.202->165.146.30.88 spi=847614(0xceefe)
Aug 18 02:45:18 unix-01 racoon: INFO: pfkey.c:1394:pk_recvexpire(): IPsec-SA expired: AH/Tunnel 165.146.30.88->196.25.242.202 spi=18140664(0x114cdf8)
Aug 18 08:07:37 unix-01 syslogd 1.4.1: restart.
Aug 18 08:07:37 unix-01 syslog: syslogd startup succeeded
Aug 18 08:07:37 unix-01 kernel: klogd 1.4.1, log source = /proc/kmsg started.
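
Since the machine logs nothing once it hangs, the last line written before each syslogd restart is the closest available marker for when a lockup began. A sketch of pulling those lines out follows; 'last_before_restart' is a hypothetical helper name and the demo uses a temporary file with lines from the excerpt above. A clean reboot restarts syslogd as well, so each hit is only a candidate.

```shell
#!/bin/sh
# Print the last line logged before each syslogd restart.
# last_before_restart is a hypothetical helper name.
last_before_restart() {
    awk '/syslogd [0-9.]+: restart/ { if (prev != "") print prev } { prev = $0 }' "$1"
}

# Demo on a temporary file holding lines from the excerpt above.
log=$(mktemp)
cat > "$log" << 'EOF'
Aug 18 02:45:18 unix-01 racoon: INFO: pfkey.c:1394:pk_recvexpire(): IPsec-SA expired: AH/Tunnel 165.146.30.88->196.25.242.202 spi=18140664(0x114cdf8)
Aug 18 08:07:37 unix-01 syslogd 1.4.1: restart.
Aug 18 08:07:37 unix-01 syslog: syslogd startup succeeded
EOF
last=$(last_before_restart "$log")
echo "$last"
rm -f "$log"
```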


Thought it may have been related to Bugzilla 118885 
(https://bugzilla.redhat.com/bugzilla/long_list.cgi?buglist=118885) but the 
patch appears to have been applied to the code...

Still hunting...
Comment 7 David Herselman 2005-08-22 20:09:57 EDT
This only appears to affect systems that connect using PPPoE (ADSL). Re-entered
as 166531 (the original information was misleading):
  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=166531
Comment 8 RHEL Product and Program Management 2007-10-19 14:56:42 EDT
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.
