Bug 166531
Summary: | IPSec VPN Tunnels cause kernel panic when run over PPPoE (ADSL) | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | David Herselman <bbs2web> |
Component: | kernel | Assignee: | David Miller <davem> |
Status: | CLOSED WONTFIX | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Priority: | medium |
Version: | 3.0 | CC: | petrides |
Hardware: | All | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2007-10-19 18:55:38 UTC |
Description
David Herselman
2005-08-23 00:08:33 UTC
I have tried to implement a workaround whereby a cron job restarts the tunnels every 3 hours. The systems are staying up longer now, but two of them crashed this morning with the following:

syslog:

    Aug 24 18:15:00 unix-01 modprobe: modprobe: Can't locate module ripemd160
    Aug 24 18:15:00 unix-01 modprobe: modprobe: Can't locate module cast128
    Aug 24 18:15:00 unix-01 modprobe: modprobe: Can't locate module lzs
    Aug 24 18:15:01 unix-01 modprobe: modprobe: Can't locate module lzjh
    Aug 24 18:15:01 unix-01 kernel: KERNEL: assertion (x->km.state == XFRM_STATE_DEAD) failed at xfrm_state.c(193)
    Aug 25 08:00:12 unix-01 syslogd 1.4.1: restart.
    Aug 25 08:00:12 unix-01 syslog: syslogd startup succeeded
    Aug 25 08:00:12 unix-01 kernel: klogd 1.4.1, log source = /proc/kmsg started.

Screen (copied by hand):

    Kernel bug at xfrm_state.c:54!
    invalid operand : 0000
    ide_cd cdrom esp4 ah4 cls_u32 sch_sfq sch_cbq ipt_TOS (did not finish writing all of these; there were a couple more)
    CPU1
    EIP: 0060 [<c028b15a>] Not tainted
    EFLAGS:0010202
    EIP is at xfrm_state_gc_destroy [KERNEL] 0x1a (2.4.21-32.0.1.ELsmp/i686)
    (Then there were a whole bunch of numbers)
    Kernel panic: Fatal exception

The second system that crashed, also running an IPSec network-to-network VPN over PPPoE:

    Aug 25 07:46:14 unix-01 pppd[3077]: LCP terminated by peer
    Aug 25 07:46:14 unix-01 pppoe[3078]: Session 4481 terminated -- received PADT from peer
    Aug 25 07:46:14 unix-01 pppoe[3078]: Sent PADT
    Aug 25 07:46:14 unix-01 pppd[3077]: Modem hangup
    Aug 25 07:46:14 unix-01 pppd[3077]: Connection terminated.
    Aug 25 07:46:14 unix-01 pppd[3077]: Connect time 1440.2 minutes.
    Aug 25 07:46:14 unix-01 pppd[3077]: Sent 112961249 bytes, received 345804065 bytes.
    Aug 25 07:46:14 unix-01 pppd[3077]: Exit.
    Aug 25 07:46:14 unix-01 adsl-connect: ADSL connection lost; attempting re-connection.
    Aug 25 07:46:14 unix-01 /etc/hotplug/net.agent: NET unregister event not supported
    Aug 25 07:46:18 unix-01 kernel: KERNEL: assertion (x->km.state == XFRM_STATE_DEAD) failed at xfrm_state.c(193)
    Aug 25 08:14:43 unix-01 syslogd 1.4.1: restart.
    Aug 25 08:14:43 unix-01 syslog: syslogd startup succeeded
    Aug 25 08:14:43 unix-01 kernel: klogd 1.4.1, log source = /proc/kmsg started.

This sounds extremely relevant to the following kernel bug posting:
http://www.uwsg.indiana.edu/hypermail/linux/net/0307.3/0030.html

Herbert's patch from the above posting has already been applied to the current system's kernel...

Again, this only affects systems running IPSec tunnels over PPPoE connections. We switched one of the servers to its backup route (fractional T1 (diginet)) and it hasn't locked up once.

Is there any additional information I can supply to assist with resolving this problem?

We've set up a RHEL4 test server running the same config, so we'll see shortly whether this is specific to RHEL3...

An item of concern is how many people are actually doing this (especially via dynamic-IP PPPoE connections), due to:
1. The ifup-ipsec and ifdown-ipsec scripts being broken for net-to-net VPNs
2. Racoon missing an init script
3. Having to hack together a simple script to handle the changing IPs, which updates the 'DST=' entry in /etc/sysconfig/network-scripts/ifcfg-ipsec?

Could this possibly have something to do with IP addresses changing when the PPPoE connections re-establish?

No feedback from anyone out there, and I was under pressure to get this resolved... Got it working by installing kernel 2.6 from RHEL4.1 on the RHEL3 servers.
Required packages:

    kernel-2.6.9-11.EL.i686.rpm
    lvm2-2.01.08-1.0.RHEL4.i386.rpm
    depend/device-mapper-1.01.01-1.RHEL4.i386.rpm
    depend/glibc-2.3.4-2.9.i686.rpm
    depend/glibc-common-2.3.4-2.9.i386.rpm
    depend/ipsec-tools-0.3.3-6.i386.rpm
    depend/l2tpd-0.69-12jdl.i386.rpm
    depend/libselinux-1.19.1-8.i386.rpm
    depend/mkinitrd-4.2.1.3-1.i386.rpm
    depend/module-init-tools-3.1-0.pre5.3.i386.rp
    depend/nscd-2.3.4-2.9.i386.rpm

Installed like this:

    rpm -e piranha
    rpm -Uvh --nodeps lvm2-2.01.08-1.0.RHEL4.i386.rpm
    rpm -Uvh depend/*.rpm
    rpm -ivh kernel-2.6.9-11.EL.i686.rpm
    vi /etc/lilo.conf
    lilo

Didn't get to test the following patch from Bugzilla #168458:
http://sourceforge.net/mailarchive/forum.php?thread_id=3866075&forum_id=32000

This bug is filed against RHEL 3, which is in maintenance phase. During the maintenance phase, only security errata and select mission critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed.

For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/

If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.
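
For readers wondering what the "simple script to handle the changing IPs" mentioned earlier might look like, here is a minimal sketch. It is not the reporter's script, which was never posted. The function name update_dst, its return codes, and the assumption that DST= holds a bare, unquoted IPv4 address are all illustrative. How the peer's new address is discovered and how the tunnel is restarted afterwards are deliberately left out, since the report does not describe either.

```shell
#!/bin/sh
# update_dst FILE NEW_IP
# Hypothetical helper: set the DST= line in an ifcfg-ipsec* file to NEW_IP.
# Assumes DST= holds a bare, unquoted IPv4 address.
# Returns 0 if FILE was rewritten, 1 if DST was already NEW_IP, 2 on error.
update_dst() {
    file=$1
    new_ip=$2

    [ -f "$file" ] || return 2

    # Accept only digits and dots so the value is safe to splice into sed.
    case $new_ip in
        ''|*[!0-9.]*) return 2 ;;
    esac

    # Read the current value; if DST= appears more than once, the last wins.
    current=$(sed -n 's/^DST=//p' "$file" | tail -n 1)
    if [ "$current" = "$new_ip" ]; then
        return 1
    fi

    tmp=$(mktemp "${TMPDIR:-/tmp}/update_dst.XXXXXX") || return 2
    if grep -q '^DST=' "$file"; then
        sed "s/^DST=.*/DST=$new_ip/" "$file" > "$tmp"
    else
        # No DST= line yet: append one.
        { cat "$file"; echo "DST=$new_ip"; } > "$tmp"
    fi

    # Copy the contents back instead of moving the temp file, so FILE keeps
    # its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
    return 0
}
```

A caller might run something like update_dst /etc/sysconfig/network-scripts/ifcfg-ipsec0 192.0.2.10 (both the file name and the address are placeholders) and restart the tunnel only when the function returns 0. That would avoid tearing down a working tunnel when the peer address has not changed.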