Bug 151127

Summary: (perl ia64 segfault) bind-libs bad parameters to strcpy()
Product: [Fedora] Fedora
Reporter: Warren Togami <wtogami>
Component: bind
Assignee: Jason Vas Dias <jvdias>
Status: CLOSED RAWHIDE
QA Contact: David Lawrence <dkl>
Severity: high
Priority: high
Version: rawhide
CC: cturner, davej, ezannoni, jakub
Target Milestone: ---   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2005-04-27 09:41:57 UTC
Bug Blocks: 136450    

Description Warren Togami 2005-03-15 05:25:38 UTC
The below failure happens only on ia64 when attempting to build perl-Net-DNS in
beehive.  It crashes reproducibly in the same place.

http://people.redhat.com/wtogami/temp/perl-Net-DNS-540395-ia64-natasha.build.redhat.com.log
Executing(%build): /bin/sh -e /usr/src/build/540398-ia64/install-tmp/rpm-tmp.17788
+ umask 022
+ cd /usr/src/build/540398-ia64/BUILD
+ cd Net-DNS-0.48
+ LANG=C
+ export LANG
+ unset DISPLAY
+ CFLAGS='-O2 -Wall -g -pipe -Wp,-D_FORTIFY_SOURCE=2'
+ perl Makefile.PL PREFIX=/usr/src/build/540398-ia64/install/usr INSTALLDIRS=vendor
Testing if you have a C compiler and the needed header files....
cc -O2 -Wall -g -pipe -Wp,-D_FORTIFY_SOURCE=2   -c -o compile.o compile.c
/usr/src/build/540398-ia64/install-tmp/rpm-tmp.17788: line 27:   462 Segmentation fault      CFLAGS="$RPM_OPT_FLAGS" perl Makefile.PL PREFIX=$RPM_BUILD_ROOT/usr INSTALLDIRS=vendor </dev/null
error: Bad exit status from /usr/src/build/540398-ia64/install-tmp/rpm-tmp.17788 (%build)

Comment 1 Chip Turner 2005-03-16 04:05:10 UTC
Trying to diagnose this locked up the build box.  To reproduce the lockup:

Install the source RPM, untar the tarball, and cd into the directory.  Run
"gdb /usr/bin/perl", then "r Makefile.PL", and enjoy the frozen screen.

Comment 2 Warren Togami 2005-03-16 04:13:16 UTC
Not sure who to assign this to.

Comment 3 Warren Togami 2005-03-16 07:34:39 UTC
<warren> jakub, chip said the kernel crash during gdb happened at the same place
that it segfaults without gdb.
<jakub> warren: try LD_PRELOAD=libSegFault.so, ulimit -c unlimited and gdb on
the core it generates or something
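
For reference, jakub's suggestion could be scripted roughly as below. This is
only a sketch: run_with_core is a made-up helper name, libSegFault.so is
assumed to be installed where the dynamic loader can find it, and the name of
the core file (core vs. core.<pid>) depends on the kernel's core_pattern
setting.

```shell
#!/bin/sh
# run_with_core: hypothetical helper, not part of any package.
# Runs the given command with core dumps enabled and glibc's
# libSegFault.so preloaded, so a crash leaves both a register/backtrace
# dump on stderr and a core file to load into gdb afterwards.
run_with_core() {
    (
        # Subshell, so the raised limit does not leak into the caller.
        ulimit -c unlimited 2>/dev/null ||
            echo "warning: could not raise the core file size limit" >&2
        LD_PRELOAD=libSegFault.so "$@"
    )
}

# Intended use for this bug (assumes the Net-DNS-0.48 source directory):
#   run_with_core perl Makefile.PL
#   gdb /usr/bin/perl core     # then "bt" at the gdb prompt
```

The helper returns the exit status of the wrapped command, so it can stand in
for the plain invocation inside a build script.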

Comment 5 Warren Togami 2005-03-16 22:40:21 UTC
Reproduced the perl segfault on boris and natasha (RHEL3 ia64 kernel) and
bullwinkle (2.6.9-1.906_EL).  bullwinkle has some other malfunction, so I have
not yet been able to see whether running perl under gdb crashes it too.

Comment 6 Warren Togami 2005-03-16 22:51:37 UTC
It doesn't crash bullwinkle.  gdb manages a backtrace, but Sopwith said not to
install debuginfo there and to reassign the bug to Chip for him to deal with.
Filing a separate bug for the RHEL3 kernel crash.

Testing if you have a C compiler and the needed header files....
Detaching after fork from child process 32311.
cc -O2 -Wall -g -pipe -Wp,-D_FORTIFY_SOURCE=2   -c -o compile.o compile.c
You have a working compiler.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2305843009222377472 (LWP 32308)]
0x20000000006b84f0 in strcpy () from /lib/tls/libc.so.6.1
(gdb) bt
#0  0x20000000006b84f0 in strcpy () from /lib/tls/libc.so.6.1
#1  0x200000000038b140 in endprotoent_r () from /usr/lib/libbind.so.3
#2  0x200000000038b420 in getprotobyname_r () from /usr/lib/libbind.so.3
#3  0x200000000026d560 in Perl_pp_gprotoent () from /usr/lib/perl5/5.8.6/ia64-linux-thread-multi/CORE/libperl.so
#4  0x200000000026d810 in Perl_pp_gpbyname () from /usr/lib/perl5/5.8.6/ia64-linux-thread-multi/CORE/libperl.so
#5  0x200000000015f200 in Perl_runops_debug () from /usr/lib/perl5/5.8.6/ia64-linux-thread-multi/CORE/libperl.so
#6  0x20000000000ad730 in perl_run () from /usr/lib/perl5/5.8.6/ia64-linux-thread-multi/CORE/libperl.so
#7  0x4000000000002370 in main ()


Comment 7 Warren Togami 2005-03-16 23:00:49 UTC
Not sure how useful this backtrace will be...

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2305843009222377472 (LWP 6408)]
0x20000000006b84f0 in ?? () from /lib/tls/libc.so.6.1
(gdb) bt
#0  0x20000000006b84f0 in ?? () from /lib/tls/libc.so.6.1
#1  0x200000000038b140 in endprotoent_r () from /usr/lib/libbind.so.3
#2  0x200000000038b420 in getprotobyname_r () from /usr/lib/libbind.so.3
#3  0x200000000026d560 in Perl_pp_gprotoent (my_perl=0x6000000000008010) at pp_sys.c:4900
#4  0x200000000026d810 in Perl_pp_gpbyname (my_perl=0x6000000000008010) at pp_sys.c:4867
#5  0x200000000015f200 in Perl_runops_debug (my_perl=0x6000000000008010) at dump.c:1449
#6  0x20000000000ad730 in perl_run (my_perl=Cannot access memory at address 0x0
) at perl.c:1935
#7  0x4000000000002370 in main (argc=Variable "argc" is not available.
) at perlmain.c:98

-bash-3.00$ rpm -qa |grep debuginfo
glibc-debuginfo-2.3.4-14
perl-debuginfo-5.8.6-4


Comment 8 Warren Togami 2005-03-17 03:27:14 UTC
bind-libs is passing bad parameters to strcpy().  Reassigning.  Setting High
priority because it is blocking other package builds.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2305843009222377472 (LWP 12304)]
0x20000000006b84f0 in ?? () from /lib/tls/libc.so.6.1
(gdb) bt
#0  0x20000000006b84f0 in ?? () from /lib/tls/libc.so.6.1
#1  0x200000000038b140 in copy_protoent (pe=0x6000000000606910, pptr=0x20000000008a41b0, buf=Variable "buf" is not available.)
    at /usr/include/bits/string3.h:123
#2  0x200000000038b420 in getprotobyname_r (name=0x6000000000515690 "tcp", pptr=0x20000000008a41b0,
    buf=0x6000000000010c60 "", buflen=4096, answerp=0x20000000008a41d8) at getprotoent_r.c:45
#3  0x200000000026d560 in Perl_pp_gprotoent (my_perl=0x6000000000008010) at pp_sys.c:4900
#4  0x200000000026d810 in Perl_pp_gpbyname (my_perl=0x6000000000008010) at pp_sys.c:4867
#5  0x200000000015f200 in Perl_runops_debug (my_perl=0x6000000000008010) at dump.c:1449
#6  0x20000000000ad730 in perl_run (my_perl=Cannot access memory at address 0x0
) at perl.c:1935
#7  0x4000000000002370 in main (argc=Variable "argc" is not available.
) at perlmain.c:98
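
The backtrace shows the crash inside a protocol lookup for "tcp"
(Perl_pp_gpbyname calling getprotobyname_r), so a much smaller test than the
full Makefile.PL run would be expected to exercise the same path. The sketch
below is an assumption, not something confirmed in this report: proto_lookup
is a made-up helper name, and whether it crashes depends on perl being the
affected libbind-linked build.

```shell
#!/bin/sh
# proto_lookup: hypothetical helper, not part of any package.
# Calls Perl's built-in getprotobyname() for the protocol named in $1.
# Exit status 0 = entry found, 1 = no such protocol; any other status
# (for example 139 for SIGSEGV) means perl itself failed.
proto_lookup() {
    perl -e 'my @p = getprotobyname($ARGV[0]); print "@p\n" if @p; exit(@p ? 0 : 1)' "$1"
}

if proto_lookup tcp; then
    echo "lookup of tcp succeeded"
else
    echo "lookup of tcp failed or crashed (exit status $?)"
fi
```

Running the same one-liner under gdb should give a backtrace without needing
the perl-Net-DNS sources at all.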


Comment 9 Jason Vas Dias 2005-03-17 15:57:46 UTC
The root problem here is that perl is linked with libbind instead of
libresolv:
      # ldd /usr/bin/perl
        libperl.so => /usr/lib/perl5/5.8.6/ia64-linux-thread-multi/CORE/libperl.so (0x2000000000050000)
        libbind.so.3 => /usr/lib/libbind.so.3 (0x2000000000368000)
        libnsl.so.1 => /lib/libnsl.so.1 (0x200000000041c000)
        libdl.so.2 => /lib/libdl.so.2 (0x2000000000458000)
        libm.so.6.1 => /lib/tls/libm.so.6.1 (0x2000000000470000)
        libcrypt.so.1 => /lib/libcrypt.so.1 (0x2000000000540000)
        libutil.so.1 => /lib/libutil.so.1 (0x2000000000588000)
        libpthread.so.0 => /lib/tls/libpthread.so.0 (0x200000000059c000)
        libc.so.6.1 => /lib/tls/libc.so.6.1 (0x20000000005d4000)
        /lib/ld-linux-ia64.so.2 (0x2000000000000000)

And it is probably NOT picking up the includes from /usr/include/bind.
That is, it is not getting the headers
     /usr/include/bind/{netdb.h,resolv.h,arpa/nameser.h}
necessary to use libbind properly, but rather
     /usr/include/{netdb.h,resolv.h,arpa/nameser.h}
which are incompatible with libbind usage.

I don't think it is a good idea to make perl use libbind.
libbind is NOT configured by /etc/nsswitch.conf, but by
/etc/irs.conf.

If perl must use libbind, then it should be modified to pick
up the proper include files from /usr/include/bind.

I think rebuilding perl to use libresolv instead of libbind should
fix this problem.

However, I'm investigating if this is really a libbind issue or if
it is because the wrong include files were used - if it is a libbind
issue I'll fix it.
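
A quick way to check whether a given perl build has this linkage is to look
for libbind in its ldd output. The following is a minimal sketch; has_lib is
a made-up helper name, and it only inspects direct ldd output, assuming the
usual "libname.so.N => path" line format shown above.

```shell
#!/bin/sh
# has_lib: hypothetical helper, not part of any package.
# Reads `ldd` output on stdin and succeeds if a line starts with the
# library name given in $1 followed by ".so" (e.g. "libbind.so.3 => ...").
has_lib() {
    grep -q "^[[:space:]]*$1\.so"
}

if ldd /usr/bin/perl 2>/dev/null | has_lib libbind; then
    echo "perl is linked against libbind"
else
    echo "perl is not linked against libbind (or ldd/perl is unavailable)"
fi
```

After a rebuild against libresolv, the same check should report that libbind
is no longer listed.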



Comment 10 Jason Vas Dias 2005-03-17 17:05:32 UTC
Changing line 1269 in perl-5.8.6/Configure should do the trick:

Line 1269:
libswanted="sfio socket bind inet nsl nm ndbm gdbm dbm db malloc dl dld ld sun"
Should be:
libswanted="sfio socket resolv inet nsl nm ndbm gdbm dbm db malloc dl dld ld sun"

Otherwise you'll need to change all the includes that include 
netdb.h, resolv.h, arpa/nameser.h to 
bind/netdb.h, bind/resolv.h, bind/arpa/nameser.h

This issue is actually noted in Perl's INSTALL file:

./INSTALL: 1841:installed may run into troubles because BIND installs its own netdb.h

Here's the patch to Configure that should fix this problem:
--- perl-5.8.6/Configure.libresolv      2005-03-17 11:54:54.000000000 -0500
+++ perl-5.8.6/Configure        2005-03-17 11:55:26.000000000 -0500
@@ -1266,7 +1266,7 @@

 : List of libraries we want.
 : If anyone needs extra -lxxx, put those in a hint file.
-libswanted="sfio socket bind inet nsl nm ndbm gdbm dbm db malloc dl dld ld sun"
+libswanted="sfio socket resolv inet nsl nm ndbm gdbm dbm db malloc dl dld ld sun"
 libswanted="$libswanted m crypt sec util c cposix posix ucb bsd BSD"
 : We probably want to search /usr/shlib before most other libraries.
 : This is only used by the lib/ExtUtils/MakeMaker.pm routine extliblist.

I'm testing this now and will let you know how it works.
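
For anyone who wants to try the change without applying the diff, the same
one-word substitution can be sketched with sed. This is an illustration, not
the packaged patch: swap_bind_for_resolv is a made-up helper name, and it
assumes the libswanted line in Configure begins exactly as quoted above.

```shell
#!/bin/sh
# swap_bind_for_resolv: hypothetical helper, not part of any package.
# $1 = path to perl's Configure script.
# Keeps a copy of the original as "$1.orig", then rewrites the file in
# place (preserving its mode) with "bind" replaced by "resolv" on the
# first libswanted line only. The second libswanted line is untouched
# because it does not start with 'libswanted="sfio socket '.
swap_bind_for_resolv() {
    cp -p "$1" "$1.orig" &&
        sed 's/^\(libswanted="sfio socket \)bind /\1resolv /' "$1.orig" > "$1"
}

# Intended use (path assumed from the patch header):
#   swap_bind_for_resolv perl-5.8.6/Configure
```

If the libswanted line differs from the quoted text, sed leaves the file
unchanged, so it is worth grepping for "resolv" afterwards.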




Comment 11 Jason Vas Dias 2005-03-17 17:59:28 UTC
I've now built both perl and perl-Net-DNS with the above patch on ia64
(natasha) and they work fine - perl passes all tests, perl-Net-DNS
builds and installs, and I can run the BIND test scripts (a heavy 
Net::DNS user). 

Shall I go ahead and build perl and perl-Net-DNS with the patch?

Is anyone getting mails from this bug report?


Comment 12 Jason Vas Dias 2005-03-17 18:14:50 UTC
OK, since it is so obviously wrong for perl to be using libbind with
the wrong include path, I'm going ahead and building perl-5.8.6-5 in
FC4 with the fix. When that is done, I'll submit perl-Net-DNS with
"BuildRequires: perl >= 3:5.8.6-5" so the new perl will be used.




Comment 13 Warren Togami 2005-03-17 18:28:17 UTC
Please don't add that BuildRequires.  I'll handle perl-Net-DNS.  Thanks for your
fix!


Comment 16 Jason Vas Dias 2005-03-17 23:56:12 UTC
This bug is now fixed with perl-5.8.6-5 and perl-Net-DNS-0.48-3