Bug 806543

Summary: perl dumps core in Socket.so
Product: [Fedora] Fedora
Reporter: Tom Lane <tgl>
Component: perl
Assignee: Petr Pisar <ppisar>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: cweyl, hhorak, iarnell, kasal, lkundrak, mmaslano, ppisar, psabata, rc040203, tcallawa
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: perl-Socket-2.001-1.fc17
Doc Type: Bug Fix
Last Closed: 2012-04-11 03:59:51 UTC
Attachments:
Description: Proposed fix
Flags: none

Description Tom Lane 2012-03-24 15:32:42 UTC
Description of problem:
mysql fails to build in either rawhide or f17, because perl dumps core while attempting to execute the regression test driver script.  The identical script behaves fine in f16.  Also, I observed no problems the last time I tried to build mysql, about a month ago, suggesting that the problem is very new.

The stack trace looks like Socket.so might be to blame.

Version-Release number of selected component (if applicable):
perl-5.14.2-211.fc17.x86_64
perl-Socket-2.000-2.fc17.x86_64

How reproducible:
Seems to be 100% if you just try to build current mysql from SRPM.  Curiously, it didn't fail when invoking the command by hand inside a mock buildroot.  Both x86 and x86_64 fail.

Steps to Reproduce:
1.  check out current mysql from git
2.  fedpkg srpm
3.  try to build the srpm in a mock buildroot for f17 or rawhide
  
Actual results:
Crash.  gdb shows this backtrace:

Program terminated with signal 11, Segmentation fault.
#0  0x00007fb78c29e22e in __memset_sse2 () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install perl-5.14.2-211.fc17.x86_64
(gdb) bt
#0  0x00007fb78c29e22e in __memset_sse2 () from /lib64/libc.so.6
#1  0x00007fb78a965a10 in XS_Socket_unpack_sockaddr_un ()
   from /usr/lib64/perl5/vendor_perl/auto/Socket/Socket.so
#2  0x00007fb78d60776a in Perl_pp_entersub ()
   from /usr/lib64/perl5/CORE/libperl.so
#3  0x00007fb78d5feb36 in Perl_runops_standard ()
   from /usr/lib64/perl5/CORE/libperl.so
#4  0x00007fb78d5a0e6b in perl_run () from /usr/lib64/perl5/CORE/libperl.so
#5  0x0000000000400d39 in ?? ()
#6  0x00007fb78c237735 in __libc_start_main () from /lib64/libc.so.6


Expected results:
should run the script successfully.

Additional info:
This is blocking me from updating mysql, so a fix would be appreciated ...

Comment 1 Tom Lane 2012-03-24 15:52:24 UTC
After some further experimentation, it seems that you can get it to fail with hand invocation; it's just somewhat less likely than when being driven by the RPM script.  Once a failure has occurred, cd to
rpmpath.../mysql-5.5.22/mysql-test and do
perl ./mysql-test-run.pl

If it's going to fail, the failure will happen almost immediately:

<mock-chroot>[mockbuild@rh3 mysql-test]$ perl ./mysql-test-run.pl 
Logging: ./mysql-test-run.pl  
120324 11:46:54 [Note] Plugin 'FEDERATED' is disabled.
MySQL Version 5.5.22
Segmentation fault (core dumped)

If it gets further than that, just hit control-C and try again.

Also, in case of interest, here is a link to a koji build showing the failure:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3928370

Comment 2 Petr Pisar 2012-03-26 09:31:00 UTC
We upgraded the Socket module because of a buffer overflow in older versions. Curiously, I put the same version (perl-Socket-2.000-1.fc16) into F16 too, in updates-testing at the time of your testing. So it should fail there too.

The backtrace reminds me of memcpy() on overlapping areas, which segfaults with SSE2-optimized glibc.

Comment 3 Tom Lane 2012-03-26 14:08:21 UTC
(In reply to comment #2)
> Curiously, I put the same version (perl-Socket-2.000-1.fc16) into F16 too, in
> updates-testing at the time of your testing. So it should fail there too.

No, because I don't have that machine subscribed to updates-testing; it's still using
perl-Socket-1.94-197.fc16.x86_64

Comment 4 Tom Lane 2012-03-26 15:58:56 UTC
I installed debuginfo packages and now see this:

Program terminated with signal 11, Segmentation fault.
#0  __memset_sse2 () at ../sysdeps/x86_64/memset.S:464
464     L(P6Q3): mov    %rdx,-0x1e(%rdi)
(gdb) bt
#0  __memset_sse2 () at ../sysdeps/x86_64/memset.S:464
#1  0x00007effd471ca10 in memset (__len=30, __ch=0, __dest=0x7fff24a3d370)
    at /usr/include/bits/string3.h:85
#2  XS_Socket_unpack_sockaddr_un (my_perl=0x604010, cv=<optimized out>) at Socket.xs:715
#3  0x00007effd73c076a in Perl_pp_entersub (my_perl=0x604010) at pp_hot.c:3046
#4  0x00007effd73b7b36 in Perl_runops_standard (my_perl=0x604010) at run.c:41
#5  0x00007effd7359e6b in S_run_body (oldscope=1, my_perl=0x604010) at perl.c:2350
#6  perl_run (my_perl=0x604010) at perl.c:2268
#7  0x0000000000400d39 in main (argc=2, argv=0x7fff24a3b4a8, env=0x7fff24a3b4c0)
    at perlmain.c:120
(gdb) f 1
#1  0x00007effd471ca10 in memset (__len=30, __ch=0, __dest=0x7fff24a3d370)
    at /usr/include/bits/string3.h:85
warning: Source file is more recent than executable.
85        return __builtin___memset_chk (__dest, __ch, __len, __bos0 (__dest));
(gdb) f 2
#2  XS_Socket_unpack_sockaddr_un (my_perl=0x604010, cv=<optimized out>) at Socket.xs:715
715               Zero(&addr+sockaddrlen, sizeof(addr)-sockaddrlen, char);

I have not looked at the code surrounding this, but gdb says that addr is of type struct sockaddr_un, which means that  "&addr + sockaddrlen" is going to add sockaddrlen times sizeof(struct sockaddr_un) to the address of addr.  Surely that should be "((char *) &addr) + sockaddrlen"?
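
To illustrate the pointer arithmetic described above, here is a minimal C sketch. It is not the Socket.xs code: zero_tail(), buggy_offset() and intended_offset() are hypothetical names made up for this example, and the offsets are computed as plain integers so that no out-of-bounds pointer is ever formed. It assumes a POSIX system that provides <sys/un.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Byte distance from &addr that the expression "&addr + n" denotes when
 * addr is a struct sockaddr_un: pointer arithmetic scales by the size of
 * the pointed-to type, so each step is one whole struct. */
static size_t buggy_offset(size_t n)
{
    return n * sizeof(struct sockaddr_un);
}

/* Byte distance for "(char *) &addr + n": each step is one byte. */
static size_t intended_offset(size_t n)
{
    return n * sizeof(char);
}

/* Hypothetical helper: zero the part of *addr that lies beyond the first
 * `used` bytes, using the byte-wise cast so the write stays inside *addr. */
static void zero_tail(struct sockaddr_un *addr, size_t used)
{
    assert(used <= sizeof(*addr));
    memset((char *) addr + used, 0, sizeof(*addr) - used);
}
```

If sizeof(struct sockaddr_un) is 110, as it usually is on Linux, then the __len=30 in the backtrace would correspond to sockaddrlen = 80. Under that assumption buggy_offset(80) is 8800 bytes past the start of addr, while intended_offset(80) is 80.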

Comment 5 Petr Pisar 2012-03-26 16:23:22 UTC
Thanks for the investigation. This line was added by the perl-Socket upgrade to initialize the unused memory. Your explanation seems correct. I will need to check what Zero() does with its last argument (char).

I got a failure in fedpkg local, and ./mysql-test-run.pl failed just now because there was no space left on the device. How much disk space does the test require?

Comment 6 Tom Lane 2012-03-26 16:51:32 UTC
(In reply to comment #5)
> I got a failure in fedpkg local, and ./mysql-test-run.pl failed just now because
> there was no space left on the device. How much disk space does the test require?

On the machine I was doing this morning's test on, the mysql build tree is occupying about 1.4G at the point where I stopped it.  I'm not sure how much more it might need to run to completion.

I think you might be able to provoke the error without so much disk space if you just install the current mysql-test RPM (with its dependencies, particularly mysql-server) and do

cd /usr/share/mysql-test
sudo -u mysql perl ./mysql-test-run

Again note that it might not fail the first time; in my manual tests it only seems to fail maybe one time in three or so.  I wonder whether the Koji environment uses different address-space-randomization rules...

Comment 7 Tom Lane 2012-03-26 16:59:11 UTC
(In reply to comment #6)
> I think you might be able to provoke the error without so much disk space if
> you just install the current mysql-test RPM (with its dependencies,
> particularly mysql-server) and do
> 
> cd /usr/share/mysql-test
> sudo -u mysql perl ./mysql-test-run

I confirmed this way will provoke the failure with current F17 RPMs.  The crash case looks like:

[tgl@rhlap mysql-test]$ sudo -u mysql perl ./mysql-test-run
Logging: ./mysql-test-run  
120326 12:57:11 [Note] Plugin 'FEDERATED' is disabled.
MySQL Version 5.5.21
[tgl@rhlap mysql-test]$ echo $?
139

If it gets further than that, just control-C and try again.
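
The 139 reported by echo $? above fits the common shell convention of reporting 128 + N when a child is killed by signal N: 139 - 128 = 11, which is SIGSEGV on Linux. A small C sketch of that decoding follows; signal_from_shell_status() and died_of_segv() are hypothetical helpers written for this example, not part of any library.

```c
#include <assert.h>
#include <signal.h>

/* Return the signal number encoded in a shell exit status under the
 * "128 + N" convention, or 0 if the status does not indicate a signal. */
static int signal_from_shell_status(int status)
{
    assert(status >= 0 && status <= 255);  /* shell statuses fit in one byte */
    return status > 128 ? status - 128 : 0;
}

/* True if the status looks like death by segmentation fault. */
static int died_of_segv(int status)
{
    return signal_from_shell_status(status) == SIGSEGV;
}
```

An ordinary non-zero exit such as 1 decodes to 0 here, meaning "not killed by a signal" under this convention.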

Comment 8 Petr Pisar 2012-03-27 07:23:12 UTC
Thanks. This way I can reproduce it. I verified your diagnosis by examining the core dump and filed a bug report upstream (https://rt.cpan.org/Public/Bug/Display.html?id=76067).

Comment 9 Petr Pisar 2012-03-27 08:06:59 UTC
Created attachment 572965
Proposed fix

Comment 10 Petr Pisar 2012-03-27 08:15:40 UTC
I think perl-Socket-2.000-3.fc18 fixes this issue in F18. I cannot reproduce it with this build anymore.

Comment 11 Fedora Update System 2012-03-27 08:18:20 UTC
perl-Socket-2.000-3.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/perl-Socket-2.000-3.fc17

Comment 12 Fedora Update System 2012-03-27 08:20:34 UTC
perl-Socket-2.000-2.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/perl-Socket-2.000-2.fc16

Comment 13 Petr Pisar 2012-03-27 10:28:04 UTC
*** Bug 806922 has been marked as a duplicate of this bug. ***

Comment 14 Fedora Update System 2012-03-28 05:54:24 UTC
Package perl-Socket-2.000-3.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing perl-Socket-2.000-3.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-4760/perl-Socket-2.000-3.fc17
then log in and leave karma (feedback).

Comment 15 Fedora Update System 2012-04-11 03:59:51 UTC
perl-Socket-2.001-1.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 16 Fedora Update System 2012-04-11 16:49:23 UTC
perl-Socket-2.001-1.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 17 Fedora Update System 2012-04-12 02:59:52 UTC
perl-Socket-2.001-1.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.