Bug 140563 - Perl not properly supporting NPTL
Summary: Perl not properly supporting NPTL
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: perl
Version: 3.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Chip Turner
QA Contact: David Lawrence
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2004-11-23 17:15 UTC by Dave Maley
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-12-10 17:26:23 UTC
Target Upstream Version:
Embargoed:



Description Dave Maley 2004-11-23 17:15:49 UTC
Description of problem:
From IT #53712:
-----------------

Users here have noticed that Perl appears to correct for the old Linux
threads model by caching the PID and PPID at the start of a perl
process.  It appears this is done as an attempt to make Perl programs
more portable by making Linux seem more POSIX-compliant.

A simple recreate to demonstrate this is as follows:
#!/usr/bin/perl
print "this is the parent $$\n";
unless (fork) { #this is the child
   print "this is the child $$\n";
   unless (fork) { # this is the grandchild
       print "this is the grandchild $$\n";

       ##################################
       # THE FOLLOWING LINE DOESN'T WORK
       ##################################
       #sleep 1 until getppid == 1;

       ##################################################################
       # SO AN ALTERNATIVE TEST WHICH SHOWS THAT getppid NEVER RETURNS 1
       ##################################################################
       while (($ppid=getppid) != 1) {
           print "getppid = $ppid\n";
           print "$$: $ppid is alive?", (kill(0, $$ppid) ? "yes" : "no"), "\n";
           sleep 1;
           }

       print "this is the grandchild going bye bye\n";
       exit(0);
       }
   print "this is the child going bye bye\n";
   exit 0;
   }
print "this is the parent waiting\n";
wait;
print "this is the parent going bye bye\n";
exit (0);

There is an old Perl module, Linux::PPID, which works around this.
However, it would seem that on Linux systems with NPTL, the correct
default behavior of Perl would be to do what it does on other OSes
with POSIX (or nearly POSIX) threads.

I downloaded a copy of the Perl 5.8.5 source and I think the offending
portion is in the hints/linux.sh file:
cat > UU/usethreads.cbu <<'EOCBU'
case "$usethreads" in
$define|true|[yY]*)
       ccflags="-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS $ccflags"
       set `echo X "$libswanted "| sed -e 's/ c / pthread c /'`
       shift
       libswanted="$*"

       # Somehow at least in Debian 2.2 these manage to escape
       # the #define forest of <features.h> and <time.h> so that
       # the hasproto macro of Configure doesn't see these protos,
       # even with the -D_GNU_SOURCE.

       d_asctime_r_proto="$define"
       d_crypt_r_proto="$define"
       d_ctime_r_proto="$define"
       d_gmtime_r_proto="$define"
       d_localtime_r_proto="$define"
       d_random_r_proto="$define"

       ;;
esac
EOCBU

It appears that the THREADS_HAVE_PIDS definition is what Perl uses to
set up its caching of PIDs / PPIDs.

-Ryan

-----------------------------------------
Event posted 11-11-2004 03:59pm by gavin
-----------------------------------------
Yes, the "grandchild" process should show that its parent process is
"1" once the "child" process ends.  This looks like a problem in Perl.

But, "fork" has very little to do with threads and so I doubt that
this problem has anything to do with NPTL or LinuxThreads, but I've
been wrong before.

Also, there is a bug in the testcase that causes it to give very
confusing results.  The reference to "$ppid" in the call to kill
should only have one "$", not two.


-----------------------------------------
Event posted 11-18-2004 03:26pm by dmaley
-----------------------------------------
Discussed this issue during the weekly call today and requested that
Ryan further explain why they believe this is related to moving to
NPTL in RHEL3.  This was in response to the post above by Gavin where
he mentioned that he didn't feel this was a threading issue because
fork has very little to do with threads.  Ryan, please feel free to
correct any of this if I explain things incorrectly.

Apparently Perl implemented a way to give applications a more
POSIX-like interface under LinuxThreads, so that Perl apps would be
portable to and from Linux (LinuxThreads).  It did this by caching the
PID and PPID at the start of a perl process.  However, with NPTL,
threads no longer have their own PIDs, so this implementation doesn't
work correctly.  There's a Perl module (Linux::PPID) which apparently
prevents this non-standard behavior and reverts Perl back to normal
POSIX behavior.  However, now that we include a POSIX-compliant
threading model, and considering that this isn't Red Hat specific
(i.e. NPTL is upstream in 2.6), upstream Perl should be updated to
remove the "workaround" for LinuxThreads.  And obviously Red Hat
should incorporate this into the Perl we ship in RHEL.

LLNL is currently hitting this issue in RHEL3, and it is believed this
problem will also exist in RHEL4. 


-----------------------------------------
Event posted 11-18-2004 03:43pm by braby1
-----------------------------------------
I don't see anything wrong with Dave's explanation, but thought I'd
paraphrase it so that hopefully someone reading both Dave's entry and
mine would be able to get a good solid understanding of this.

Without NPTL, Linux gave threads new PIDs.  This meant that someone
writing a threaded app in Perl would either have to know about this
different Linux behavior, or Perl would have to hide the fact that
Linux did not follow POSIX threads.  To allow for Perl portability, it
appears that the Perl maintainers attempted to hide the LinuxThreads
behavior by caching the PID and PPID of Perl processes and passing
those on to any Perl threads.  Unfortunately, this workaround was not
perfect, and breaks examples like the one posted above.  For these
cases someone developed the Linux::PPID module that would expose the
"true" Linux PPIDs.

Now that RHEL3 and the 2.6 kernel both have NPTL, it would seem that
this behavior of caching PIDs and PPIDs to hide that the threads model
is non-POSIX should be disabled on systems with NPTL.

Looking through the make files, I think this can be done by simply not
defining THREADS_HAVE_PIDS.  Ideally, a somewhat intelligent method
for determining this at build time could be found and put into the
Perl build system. 
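
One possible shape for the build-time check suggested above, as a
minimal sketch and not the actual upstream fix: glibc reports its
thread library through "getconf GNU_LIBPTHREAD_VERSION", so a hints
script could key off that string.  The helper name threads_pid_flag
and the variable threadshavepids are hypothetical, and treating an
empty or unknown version string as LinuxThreads is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: decide whether the LinuxThreads PID workaround is needed.
# $1 is the thread library version string, e.g. "NPTL 2.3.2" or
# "linuxthreads-0.10".
threads_pid_flag() {
    case "$1" in
        NPTL*) echo "" ;;                      # NPTL: threads share one PID, no caching
        *)     echo "-DTHREADS_HAVE_PIDS" ;;   # LinuxThreads or unknown (assumption)
    esac
}

# Ask glibc which thread library it uses; the string is empty if getconf
# fails or does not know the variable.
libpthread_version=`getconf GNU_LIBPTHREAD_VERSION 2>/dev/null`
threadshavepids=`threads_pid_flag "$libpthread_version"`

echo "thread library: ${libpthread_version:-unknown}"
echo "extra ccflags: ${threadshavepids:-(none)}"
```

In the usethreads.cbu fragment quoted earlier, the result would stand
in for the hard-coded -DTHREADS_HAVE_PIDS in the ccflags line.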


Version-Release number of selected component (if applicable):
perl-5.8.0-88.9


How reproducible:
Every time


Steps to Reproduce:
1. run script provided in above description
  
Actual results:
The "grandchild" process doesn't show that its parent process is "1"
once the "child" process ends

Expected results:
The "grandchild" process should show that its parent process is "1"
once the "child" process ends

Additional info:

Comment 1 Chip Turner 2004-12-01 19:25:53 UTC
This is a good analysis of the problem.  I am hesitant to change it in
RHEL3, but I will adjust the RHEL4 beta perl to properly undefine the
THREADS_HAVE_PIDS setting, which should end the caching of PPIDs.  It
may also appear in future update (U) releases of RHEL3.

Comment 5 Ben Woodard 2004-12-07 16:31:20 UTC
What did it break? Knowing what to look for might allow us to catch
problems faster.

Comment 8 Dave Maley 2004-12-10 17:26:23 UTC
Thanks much for the additional info here.  Chip had also added this to
the IT, so I wanted to add it here for anybody else who may be interested:

"Sure.  Basically the variable used to cache the ppid, PL_ppid, is no
longer present in libperl.so.  But anything that links against
libperl.so and expects that symbol to be present will fail to start
with an undefined symbol error.  We saw this happen with mod_perl, for
instance, the night after I made this change, so it would definitely
have an impact on any software that links against perl."
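
For anyone wanting to check a given libperl.so before deploying it, a
rough sketch: list the library's dynamic symbols with "nm -D" and look
for a defined PL_ppid.  The helper name defines_symbol and the LIBPERL
variable are hypothetical, the library path varies by system and is
left for the caller to supply, and the symbol name is taken from the
quote above, so the exact exported name may differ between builds.

```shell
#!/bin/sh
# defines_symbol SYM: read `nm -D` output on stdin and succeed only if SYM
# appears as a defined symbol (an undefined reference has type "U").
defines_symbol() {
    awk -v sym="$1" '$NF == sym && $(NF-1) != "U" { found = 1 } END { exit !found }'
}

# LIBPERL is a hypothetical variable; point it at the libperl.so under test.
if [ -n "${LIBPERL:-}" ]; then
    if nm -D "$LIBPERL" | defines_symbol PL_ppid; then
        echo "PL_ppid is exported by $LIBPERL"
    else
        echo "PL_ppid is not exported by $LIBPERL"
    fi
fi
```

Running the same filter over a module built against the older perl
would show PL_ppid with type "U" if that module still expects the
symbol.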

