Bug 228409 - LSPP: regular ipsec in upstream kernel crashes
Summary: LSPP: regular ipsec in upstream kernel crashes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Alexander Viro
QA Contact: Brian Brock
URL:
Whiteboard:
Duplicates: 231690
Depends On:
Blocks: RHEL5LSPPCertTracker 233157 234654
 
Reported: 2007-02-12 23:09 UTC by Joy Latten
Modified: 2018-10-19 22:47 UTC (History)

Fixed In Version: RHBA-2007-0959
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-11-07 19:40:06 UTC
Target Upstream Version:
Embargoed:




Links
System ID                              |Private |Priority |Status       |Summary                                                          |Last Updated
----------------------------------------------------------------------------
IBM Linux Technology Center 32010      |0       |None     |None         |None                                                             |Never
Red Hat Product Errata RHBA-2007:0959  |0       |normal   |SHIPPED_LIVE |Updated kernel packages for Red Hat Enterprise Linux 5 Update 1  |2007-11-08 00:47:37 UTC

Description Joy Latten 2007-02-12 23:09:17 UTC
Description of problem:
When running regular ipsec in upstream linux-2.6.20, after a few minutes of
sending streams of packets, the kernel crashes. 

Version-Release number of selected component (if applicable):
linux-2.6.20

How reproducible:
Happens every time.

Steps to Reproduce:
1. Configure regular ipsec on two machines, A and B.
In /etc/racoon/racoon.conf:

path include "/etc/racoon";
path pre_shared_key "/etc/racoon/psk.txt";
path certificate "/etc/racoon/certs";

remote anonymous
{
        exchange_mode main,aggressive;
        doi ipsec_doi;
        situation identity_only;

        my_identifier address;

        lifetime time 10 minutes;   # sec,min,hour
        initial_contact on;
        proposal_check obey;    # obey, strict or claim


        proposal {
                encryption_algorithm 3des;
                hash_algorithm sha1;
                authentication_method pre_shared_key ;
                dh_group 2 ;
        }
}

sainfo anonymous
{
        pfs_group 2;
        lifetime time 3 minutes ;
        encryption_algorithm 3des, blowfish 448, rijndael ;
        authentication_algorithm hmac_sha1, hmac_md5 ;
        compression_algorithm deflate ;
}

In /etc/racoon/psk.txt:
10.1.1.2                      flibbertigibbet
10.1.1.3                      flibbertigibbet

On machine A:
# echo "spdadd 10.1.1.2 10.1.1.3 any -P in ipsec esp/transport//require; spdadd
10.1.1.3 10.1.1.2 any -P out ipsec esp/transport//require;" | setkey -c

# racoon

On machine B:
# echo "spdadd 10.1.1.2 10.1.1.3 any -P out ipsec esp/transport//require; spdadd
10.1.1.3 10.1.1.2 any -P in ipsec esp/transport//require;" | setkey -c

# racoon

2. Now that ipsec is configured, do a ping to ensure connection is up.

3. Let ping run for a while (about 3 minutes, the time it takes for a re-key),
and eventually the system will crash.

64 bytes from 9.3.189.55: icmp_seq=175 ttl=63 time=0.978 ms
64 bytes from 9.3.189.55: icmp_seq=176 ttl=63 time=0.885 ms
64 bytes from 9.3.189.55: icmp_seq=177 ttl=63 time=0.793 ms
64 bytes from 9.3.189.55: icmp_seq=178 ttl=63 time=0.710 ms
64 bytes from 9.3.189.55: icmp_seq=179 ttl=63 time=0.724 ms
BUG: scheduling while atomic: swapper/0x10000200/0
Call Trace:
[C00000000FFFF860] [C00000000000F808] .show_stack+0x68/0x1b0 (unreliable)
[C00000000FFFF900] [C00000000035CB04] .schedule+0xac/0xd0c
[C00000000FFFFA10] [C000000000063070] .__cond_resched+0x24/0x50
[C00000000FFFFA90] [C00000000035D844] .cond_resched+0x48/0x60
[C00000000FFFFB10] [C0000000000DB254] .__kmalloc+0x6c/0x154
[C00000000FFFFBB0] [C0000000000A4890] .audit_log_task_context+0x88/0x128
[C00000000FFFFC50] [C000000000342A68] .xfrm_audit_log+0x148/0x36c
[C00000000FFFFDB0] [C0000000003491C8] .xfrm_timer_handler+0x22c/0x280
[C00000000FFFFE40] [C000000000077578] .run_timer_softirq+0x194/0x264
[C00000000FFFFEF0] [C0000000000716E8] .__do_softirq+0xa8/0x164
[C00000000FFFFF90] [C000000000027740] .call_do_softirq+0x14/0x24
[C000000000593910] [C00000000000C1E8] .do_softirq+0x68/0xac
[C0000000005939A0] [C0000000000717F8] .irq_exit+0x54/0x6c
[C000000000593A20] [C000000000024904] .timer_interrupt+0x478/0x4c4
[C000000000593B00] [C000000000003608] decrementer_common+0x108/0x180
--- Exception: 901 at .local_irq_restore+0x3c/0x40
    LR = .cpu_idle+0x114/0x1e0
[C000000000593DF0] [C000000000011CD4] .cpu_idle+0x108/0x1e0 (unreliable)
[C000000000593E70] [C000000000009200] .rest_init+0x44/0x5c
[C000000000593EF0] [C000000000430918] .start_kernel+0x354/0x370
[C000000000593F90] [C000000000008528] .start_here_common+0x54/0xac
Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0xc00000000ffff8a0
cpu 0x0: Vector: 400 (Instruction Access) at [c00000000ffff5f0]
    pc: c00000000ffff8a0
    lr: c00000000ffff8a0
    sp: c00000000ffff870
   msr: 8000000010009032
  current = 0xc0000000004a4840
  paca    = 0xc0000000004a5100
    pid   = 0, comm = swapper
enter ? for help
[link register   ] c00000000ffff8a0
[c00000000ffff870] c000000000010048 .__switch_to+0x12c/0x160 (unreliable)
[c00000000ffff900] ffffffffe0000000
[c00000000ffffa10] c0000000005533a0
[c00000000ffffac0] 0000000000128000
SP (1) is in userspace
0:mon> t
[link register   ] c00000000ffff8a0
[c00000000ffff870] c000000000010048 .__switch_to+0x12c/0x160 (unreliable)
[c00000000ffff900] ffffffffe0000000
[c00000000ffffa10] c0000000005533a0
[c00000000ffffac0] 0000000000128000
SP (1) is in userspace
0:mon>

Comment 1 Steve Grubb 2007-02-12 23:28:07 UTC
Al, do you think audit_log_task_context() needs to take the task and memory pool
as passed parameters?
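
The trace in the description shows a timer softirq (xfrm_timer_handler) reaching __kmalloc through audit_log_task_context(), and the allocator then attempting to reschedule. What follows is only a userspace model of that situation, not kernel code and not the patch that was eventually applied. Every name in it (model_kmalloc, MODEL_GFP_KERNEL, model_timer_handler and so on) is a hypothetical stand-in, and the context string is a placeholder. It sketches the idea of letting the caller pass the allocation mode down, assuming the callee cannot know on its own whether it may sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's sleeping and non-sleeping modes. */
enum model_gfp { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC };

/* True while the model is "inside" a softirq, where sleeping is not allowed. */
static bool model_in_atomic;
/* Number of sleeping allocations requested while atomic. */
static int model_sleep_violations;

/* Model allocator: a MODEL_GFP_KERNEL request may sleep, so flag it if atomic. */
static void *model_kmalloc(size_t size, enum model_gfp gfp)
{
    if (gfp == MODEL_GFP_KERNEL && model_in_atomic)
        model_sleep_violations++;
    return malloc(size);
}

/* Model of the context lookup; the caller decides the allocation mode. */
static char *model_task_context(enum model_gfp gfp)
{
    static const char ctx[] = "placeholder_context"; /* not a real label */
    char *buf = model_kmalloc(sizeof(ctx), gfp);

    if (buf)
        memcpy(buf, ctx, sizeof(ctx));
    return buf;
}

/* Model of a timer handler: runs atomically, returns new violations seen. */
static int model_timer_handler(enum model_gfp gfp)
{
    int before = model_sleep_violations;
    char *ctx;

    model_in_atomic = true;
    ctx = model_task_context(gfp);
    model_in_atomic = false;
    free(ctx);
    return model_sleep_violations - before;
}

/* Usage: the sleeping mode trips the check, the atomic mode does not. */
void model_gfp_demo(void)
{
    assert(model_timer_handler(MODEL_GFP_KERNEL) == 1);
    assert(model_timer_handler(MODEL_GFP_ATOMIC) == 0);
}
```

In this model, a process-context caller would pass MODEL_GFP_KERNEL and a timer handler would pass MODEL_GFP_ATOMIC; making the allocation atomic unconditionally would amount to dropping the parameter and always using the second value.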

Comment 2 Alexander Viro 2007-02-13 12:24:39 UTC
Either that, or just make that allocation GFP_ATOMIC unconditionally.
BTW, the use of getprocattr() is an atrocity wrt allocations; we
end up calculating size, calling selinux_getsecurity(), calling
security_sid_to_context() that does allocation (atomic) and puts
the string there; then we free what we'd allocated, return size, do
allocation in audit_log_task_context(), get through exactly the same
work *again* (including recalculation of size and atomic allocation),
copy string from atomically allocated into what we'd allocated in
audit_log_task_context() and free atomically allocated.  Revolting.

What we need is an analog of getprocattr (and getsecurity) that would
_not_ take buffer+len as an argument but just return whatever
security_sid_to_context() allocated and filled.  Simple and sane...
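
The difference between the two interface shapes described in this comment can be sketched in userspace C. This is an illustrative model only: model_sid_to_context, model_getprocattr and the other names are hypothetical stand-ins rather than the kernel functions themselves, the context string is a placeholder, and the allocation counter exists purely to make the extra work visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Counts allocations so the two call patterns can be compared. */
static int model_alloc_count;

static void *counted_malloc(size_t n)
{
    model_alloc_count++;
    return malloc(n);
}

/* Stand-in for the function that allocates and fills the context string. */
static int model_sid_to_context(char **out, size_t *len)
{
    static const char ctx[] = "placeholder_context"; /* not a real label */
    char *s = counted_malloc(sizeof(ctx));

    if (!s)
        return -1;
    memcpy(s, ctx, sizeof(ctx));
    *out = s;
    *len = sizeof(ctx);
    return 0;
}

/* buffer+len style: with buf == NULL it only reports the size needed. */
static long model_getprocattr(char *buf, size_t size)
{
    char *tmp;
    size_t len;
    long ret;

    if (model_sid_to_context(&tmp, &len))
        return -1;
    ret = (long)len;
    if (buf) {
        if (size >= len)
            memcpy(buf, tmp, len);
        else
            ret = -1;
    }
    free(tmp); /* the freshly built string is thrown away either way */
    return ret;
}

/* Caller pattern criticised above: ask for size, allocate, then ask again. */
static char *model_context_two_pass(void)
{
    long len = model_getprocattr(NULL, 0);
    char *buf;

    if (len < 0)
        return NULL;
    buf = counted_malloc((size_t)len);
    if (!buf)
        return NULL;
    if (model_getprocattr(buf, (size_t)len) < 0) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* Proposed shape: hand back whatever the inner function allocated. */
static char *model_context_direct(void)
{
    char *ctx;
    size_t len;

    return model_sid_to_context(&ctx, &len) ? NULL : ctx;
}

/* Usage: same string either way, three allocations versus one. */
void model_alloc_demo(void)
{
    char *a, *b;

    model_alloc_count = 0;
    a = model_context_two_pass();
    assert(a != NULL && model_alloc_count == 3);

    model_alloc_count = 0;
    b = model_context_direct();
    assert(b != NULL && model_alloc_count == 1);

    assert(strcmp(a, b) == 0);
    free(a);
    free(b);
}
```

In the model, the two-pass caller triggers three allocations and two full lookups to obtain one string, while the direct variant needs a single allocation and leaves the caller responsible for freeing the result.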

Comment 3 Eric Paris 2007-03-20 19:48:13 UTC
Patch for this problem has been found to show no problems in the LSPP kernel.

Submitted internally on 3/20. Moving to POST.

Comment 4 Eric Paris 2007-03-20 19:57:58 UTC
*** Bug 231690 has been marked as a duplicate of this bug. ***

Comment 5 RHEL Program Management 2007-03-20 20:02:07 UTC
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla 
status POST.

Comment 6 Don Zickus 2007-03-29 15:48:58 UTC
in 2.6.18-13.el5

Comment 8 IBM Bug Proxy 2007-08-01 17:35:36 UTC
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jon.thomas.com




------- Additional Comments From jon.thomas.com (prefers email at jrthomas.com)  2007-08-01 13:29 EDT -------
Joy,

Have you tested 2.6.18-13.el5 or rhel5.1beta1 to see if this is fixed?

If so, please close this bug. 

Comment 9 Joy Latten 2007-08-01 19:04:20 UTC
Yes, I recall this testing OK in the last LSPP kernel.

Comment 10 IBM Bug Proxy 2007-08-01 19:26:23 UTC
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ACCEPTED                    |CLOSED




------- Additional Comments From jon.thomas.com (prefers email at jrthomas.com)  2007-08-01 15:22 EDT -------
Closing based on the last comment.

Comment 12 errata-xmlrpc 2007-11-07 19:40:06 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0959.html


