Bug 810605

Summary: Segfault with freeradius-perl threading
Product: Red Hat Enterprise Linux 6 Reporter: Matt Ezell <matt+rh>
Component: freeradius Assignee: John Dennis <jdennis>
Status: CLOSED ERRATA QA Contact: Patrik Kis <pkis>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.2 CC: dpal, ksrot, pkis
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 846475 Environment:
Last Closed: 2012-06-20 14:06:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 846475    

Description Matt Ezell 2012-04-07 04:36:43 UTC
Description of problem:

FreeRADIUS has a thread pool that grows dynamically based on load.  If multiple threads using rlm_perl are spawned in quick succession, a segfault can occur due to parallel calls to rlm_perl_clone.

Version-Release number of selected component (if applicable):

freeradius-perl-2.1.10-5.el6.x86_64

How reproducible:

Every time multiple threads are spawned in parallel

Steps to Reproduce:
1. Install freeradius and freeradius-perl
2. Set up test users and have FreeRADIUS use an example rlm_perl module
3. Hit the server with many parallel requests, such as with radclient
  
Actual results:

Server segfaults

Expected results:

Server should be able to handle the load just fine

Additional info:

A fix has been committed upstream (commit a97f8cc93a6f282777935e9b8a49c22c3fcc4647 at https://github.com/alandekok/freeradius-server/commit/a97f8cc93a6f282777935e9b8a49c22c3fcc4647#src/modules/rlm_perl/rlm_perl.c)

Comment 2 Patrik Kis 2012-04-10 11:29:43 UTC
Hi Matt,

I'm trying to reproduce this issue based on your description, but with no luck. Could you please describe the reproduction scenario in more detail?

I basically used the default config and only added "perl" to the authorize section in sites-enabled/default, then checked in the debug logs that rlm_perl is used. I then sent about 20 parallel requests with radtest. The result: no segfault.

Comment 3 Karel Srot 2012-04-10 12:45:13 UTC
Hi Matt,
to be more specific, I have configured radiusd with rlm_perl:

$ radtest badusername passwd localhost 1 testing123
Sending Access-Request of id 138 to 127.0.0.1 port 1812
	User-Name = "badusername"
	User-Password = "passwd"
	NAS-IP-Address = 127.0.0.1
	NAS-Port = 1
rad_recv: Access-Reject packet from host 127.0.0.1 port 1812, id=138, length=56
	Reply-Message = "Denied access by rlm_perl function"

$ radtest goodusername passwd localhost 1 testing123
Sending Access-Request of id 103 to 127.0.0.1 port 1812
	User-Name = "goodusername"
	User-Password = "passwd"
	NAS-IP-Address = 127.0.0.1
	NAS-Port = 1
rad_recv: Access-Accept packet from host 127.0.0.1 port 1812, id=103, length=31
	h323-credit-amount = "100"

I tried to reproduce the bug by executing the following loop 25 times in parallel:

$ while true; do radtest goodusername passwd localhost 1 testing123 &> /dev/null ; done &

but radiusd didn't crash (I killed it after 15 minutes). 
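
A bounded variant of the parallel loop above can be sketched as a small shell helper. This is only an illustration: the function name run_parallel and the iteration count are hypothetical, and unlike the loop above it stops on its own instead of running until killed.

```shell
# run_parallel WORKERS ITERATIONS CMD [ARGS...]
# Hypothetical helper: starts WORKERS background subshells, each running
# CMD ITERATIONS times with its output discarded, then waits for all of them.
run_parallel() {
    workers=$1
    iterations=$2
    shift 2
    w=0
    while [ "$w" -lt "$workers" ]; do
        (
            i=0
            while [ "$i" -lt "$iterations" ]; do
                "$@" > /dev/null 2>&1
                i=$((i + 1))
            done
        ) &
        w=$((w + 1))
    done
    # Block until every background worker has finished.
    wait
}

# Example (needs a running radiusd and the test user shown above):
# run_parallel 25 1000 radtest goodusername passwd localhost 1 testing123
```

Whether radiusd survived the run still has to be checked separately, for example by sending one more radtest request afterwards.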

Were you able to reproduce it? How many requests count as "many parallel requests", and how quickly should it crash?

Thank you in advance.

Comment 7 Matt Ezell 2012-04-10 16:09:13 UTC
Sorry for not being more specific with the reproducer.

I've been using "radclient" to send parallel requests. I set up a large input file with 10,000 entries like:
User-Name=goodusername, User-Password=passwd
(make sure to separate these with a blank line).

I've seen failures with parallelism set to 8, but bumping this number higher may make the failures more likely.

radclient -p 8 -f path_to_radclient_input_file -s -q localhost auth testing123
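
For anyone retrying this, the input file described above could be generated with a short loop. This is a minimal sketch: the file name radclient_input.txt is a placeholder, and the entry format is copied from the example line in this comment.

```shell
# Placeholder file name; pass the same path to radclient with -f.
input_file="radclient_input.txt"
entries=10000

# Write one request per iteration, each followed by a blank line,
# since the entries must be separated by a blank line.
i=0
while [ "$i" -lt "$entries" ]; do
    printf 'User-Name=goodusername, User-Password=passwd\n\n'
    i=$((i + 1))
done > "$input_file"

# Then run the command above against the generated file:
# radclient -p 8 -f "$input_file" -s -q localhost auth testing123
```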

Adding additional computation to the part of the code where it clones the threads would help reproducibility also.  Try adding the following code to the bottom of example.pl:

sub CLONE {
        &radiusd::radlog(3, "Called clone in rlm_perl");
}

The CLONE subroutine is there to set up thread-local "things" like database connections.  The rlm_perl code I was using when I originally encountered this bug was setting up database connections.

Comment 8 John Dennis 2012-04-11 13:55:25 UTC
New builds provided.

Comment 11 Patrik Kis 2012-05-03 15:38:04 UTC
(In reply to comment #7)

Hi Matt,

I tried it again with your updates.

I'm using /etc/raddb/test.pl:
# diff /etc/raddb/test.pl /etc/raddb/example.pl 
90,94d76
< 	} elsif ($RAD_REQUEST{'User-Name'} =~ /^gooduser/i) {
<         if ($RAD_REQUEST{'User-Password'} eq "password1075") {
<             $RAD_REPLY{'Reply-Message'} = "Allow access by rlm_perl function";
<             return RLM_MODULE_OK;
<         }
197,200d178
< sub CLONE {
<         &radiusd::radlog(3, "Called clone in rlm_perl");
< }
< 

# cat aaa 
User-Name = "gooduser"
User-Password = "password1075"
NAS-IP-Address = 192.168.100.62
NAS-Port = 0

User-Name = "baduser"
User-Password = "password1075"
NAS-IP-Address = 192.168.100.62
NAS-Port = 0

# for i in `seq 1 100000`; do cat aaa >> perftest-200000.data ; done

But even running:
# radclient -p 100 -qs -f perftest-200000.data localhost auth testing123

does not cause a crash.

Any other ideas?

Comment 12 Patrik Kis 2012-05-11 09:25:01 UTC
Since the issue cannot be reproduced, it has been verified as sanity only.

Comment 13 Patrik Kis 2012-05-11 10:02:54 UTC
We were not able to reproduce the crash using the given reproducer. We have verified that the patch is applied and that freeradius doesn't crash with the reported scenario. The package has also passed all sanity and regression testing.

Comment 15 errata-xmlrpc 2012-06-20 14:06:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0881.html