Bug 1269168

Summary: perl segfaults in Perl_mg_get()
Product: Red Hat Enterprise Linux 7
Reporter: Jindrich Novy <jindrich.novy>
Component: perl
Assignee: perl-maint-list
Status: CLOSED INSUFFICIENT_DATA
QA Contact: BaseOS QE - Apps <qe-baseos-apps>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.2
CC: mkyral, ppisar, sohnythomas
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
URL: https://rt.perl.org/Public/Bug/Display.html?id=107480
Whiteboard:
Fixed In Version:
Doc Type: Release Note
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-02-26 11:25:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1298243, 1473612
Attachments:
Description: Attaching core file in case you want to have a look.
Flags: none

Description Jindrich Novy 2015-10-06 13:46:46 UTC
Description of problem:
Our internal OS installation program, written in perl, segfaults on RHEL7. This doesn't happen on RHEL5 or RHEL6. It seems to be related to perl 5.12 or later, which likely introduced this bug.


Version-Release number of selected component (if applicable):
perl-5.16.3-286.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. run our internal OS installer on RHEL7
2.
3.

Actual results:
Segfault.

Expected results:
No segfault.

Additional info:

Output of gdb:
# gdb --args /usr/bin/perl -w -t /usr/sbin/ncm-ncd --configure spma
<snip>
Insecure dependency in require while running with -t switch at /usr/lib64/perl5/File/Glob.pm line 7.
Insecure dependency in require while running with -t switch at /usr/lib64/perl5/File/Glob.pm line 7.

Program received signal SIGSEGV, Segmentation fault.
0x00002aaaaaf6e04f in Perl_mg_get (my_perl=0x603010, sv=0x16c35e0) at mg.c:198
198		if (!(mg->mg_flags & MGf_GSKIP) && vtbl && vtbl->svt_get) {
Missing separate debuginfos, use: debuginfo-install gdbm-1.10-8.el7.x86_64 glibc-2.17-78.el7.x86_64 libdb-5.3.21-17.el7_0.1.x86_64 nss-softokn-freebl-3.16.2.3-1.el7_0.x86_64 perl-DB_File-1.830-3.el7.x86_64 perl-Encode-2.51-7.el7.x86_64 perl-JSON-XS-3.01-1.rhel7.x86_64 perl-PathTools-3.40-5.el7.x86_64 perl-Scalar-List-Utils-1.27-248.el7.x86_64 perl-Socket-2.010-3.el7.x86_64 perl-Storable-2.45-3.el7.x86_64 perl-Sys-Syslog-0.33-3.el7.x86_64 perl-Template-Toolkit-2.24-5.el7.x86_64
(gdb) l
193	    newmg = cur = head = mg = SvMAGIC(sv);
194	    while (mg) {
195		const MGVTBL * const vtbl = mg->mg_virtual;
196		MAGIC * const nextmg = mg->mg_moremagic;	/* it may delete itself */
197	
198		if (!(mg->mg_flags & MGf_GSKIP) && vtbl && vtbl->svt_get) {
199		    vtbl->svt_get(aTHX_ sv, mg);
200	
201		    /* guard against magic having been deleted - eg FETCH calling
202		     * untie */
(gdb) p mg
$1 = (MAGIC *) 0x16c3598
(gdb) p *(MAGIC*)mg
$2 = {mg_moremagic = 0x16daee0, mg_virtual = 0x2200000c00000003, mg_private = 54112, mg_type = -116 '\214', mg_flags = 1 '\001', mg_len = 0, mg_obj = 0x1888ff0, mg_ptr = 0x800900000001 <Address 0x800900000001 out of bounds>}

Comment 2 Jindrich Novy 2015-10-06 13:49:23 UTC
I'm happy to troubleshoot as you obviously don't have access to our internal code. Feel free to let me know if I can help. It's easily reproducible every time within our environment.

Comment 3 Petr Pisar 2015-10-06 15:23:12 UTC
Please contact Red Hat support to help you with debugging the code. I cannot do much without a reproducer.

Comment 4 Jindrich Novy 2015-10-06 15:36:18 UTC
I've already filed a support case:

https://access.redhat.com/support/cases/#/case/01518741

Just making sure the Perl developers know about this bug so that I can help with investigating/testing.

Comment 5 Petr Pisar 2015-10-07 06:45:05 UTC
The bug looks like a corruption of a variable's internal metadata representation when cloning it into a new variable. Very probably the original variable was clobbered somewhere else before reaching this code. Therefore it's impossible to debug it without a reproducer. Moreover, it involves some hairy interpreter code that changes with each perl version and that even the authors do not understand, as you can read in the upstream bug report. So I don't believe this issue will be fixed any time soon.

I can give you only a generic piece of advice: Minimize the reproducer, and especially avoid non-core XS modules (the modules that have a DSO counterpart and are not listed in `corelist -v v5.16.3' output). This allows you to test the reproducer against different perl builds.

Then download the upstream perl sources <git://perl5.git.perl.org/perl.git>, check out the v5.16.3 tag, and build it. You can use this long command, which works with old perls on old RHELs:

$ sh Configure -des -Dusedevel \
  -Dusethreads -Duseithreads -Dldflags="-lm -lpthread" \
  -Accflags="-I/usr/src/kernels/$(uname -r)/arch/x86/include" && \
perl -ni -we 'print unless /<(?:built-in|command)/' makefile x2p/makefile && \
( make -j5 test_prep || make -j5 )

and try the reproducer again, executing it like `./perl -Ilib ...'. If you can reproduce it with this upstream version, then try the v5.16.0 tag, which is on the `blead' git branch that contains all the major perl versions (v5.12.0 etc.) in a linear history (this is important for a smooth git bisect). Find a working major version (e.g. v5.12.0 is mentioned in the upstream report) and then bisect the history to find the first commit leading to the bug.

Don't forget to prune the git tree (git clean -dxf; git checkout -f) before building each new commit, as the build script does not track dependencies properly.

Good luck.
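
The bisect procedure described above could be automated roughly as follows. This is only a sketch under stated assumptions, not a tested recipe for this bug: the script name bisect-step.sh, the REPRODUCER path and the reduced Configure options are hypothetical, and the script has to live outside the perl checkout because `git clean -dxf' would delete it otherwise. It relies on the documented `git bisect run' convention that exit status 0 means good, 125 means skip, any other status from 1 to 127 means bad, and anything above 127 aborts the bisect, so a raw SIGSEGV status (139 as reported by the shell) has to be translated first.

```shell
#!/bin/sh
# bisect-step.sh: hypothetical helper for `git bisect run`.
# REPRODUCER is an assumed path to your minimized reproducer script.
REPRODUCER=${REPRODUCER:-/tmp/reproducer.pl}

# Translate the reproducer's exit status into a `git bisect run` verdict:
#   0   -> 0   (good: ran cleanly)
#   139 -> 1   (bad: the shell reports SIGSEGV as 128 + 11)
#   *   -> 125 (skip: failed for some unrelated reason)
bisect_verdict() {
    case "$1" in
        0)   echo 0 ;;
        139) echo 1 ;;
        *)   echo 125 ;;
    esac
}

# Prune the tree, rebuild perl, run the reproducer under -t.
bisect_step() {
    { git clean -dxfq && git checkout -f -q; } || return 125
    sh Configure -des -Dusedevel >/dev/null 2>&1 || return 125
    { make -j5 test_prep || make -j5; } >/dev/null 2>&1 || return 125
    ./perl -Ilib -w -t "$REPRODUCER" >/dev/null 2>&1
    return "$(bisect_verdict $?)"
}

# Only act when invoked as: sh bisect-step.sh run
if [ "${1:-}" = "run" ]; then
    bisect_step
    exit $?
fi
```

It would be used as `git bisect start <bad-tag> <good-tag>' followed by `git bisect run sh /path/outside/tree/bisect-step.sh run'. Commits that fail to build are skipped rather than marked bad, which keeps build breakage from being mistaken for the segfault.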

Comment 6 Jindrich Novy 2015-10-07 10:34:04 UTC
Thanks.

The segfault seems to be related to tainting warnings machinery in Perl. Without -t option the code runs as expected without segfaults.
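
One way to record this observation is to run the same script with and without -t and compare how each run ended. The following is a minimal sketch; the function names are hypothetical, and it assumes the usual shell convention that a status above 128 means the process was killed by signal (status - 128), although a program can also exit with such a value on its own.

```shell
#!/bin/sh
# Describe an exit status as reported by the shell in $?.
# Assumption: statuses above 128 are signal deaths (128 + signal number).
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Run a perl script with and without taint warnings (-t) and report both.
compare_taint() {
    script=$1
    perl -w "$script" >/dev/null 2>&1
    echo "without -t: $(describe_status $?)"
    perl -w -t "$script" >/dev/null 2>&1
    echo "with -t:    $(describe_status $?)"
}
```

Calling `compare_taint /path/to/script' prints one line per run; a SIGSEGV shows up as "killed by signal 11".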

Comment 7 Jindrich Novy 2015-10-27 11:03:10 UTC
Created attachment 1086819 [details]
Attaching core file in case you want to have a look.

Comment 8 Petr Pisar 2015-10-27 11:47:56 UTC
Which perl did you run when generating the core dump? My gdb reports /usr/lib/debug/.build-id/d7/34b2d4aaf3ad9581268cac0f1e498a914822ff, but perl-debuginfo-5.16.3-286.el7.x86_64 delivers /usr/lib/debug/.build-id/f4/4bbae67a26208e2c4c6b3fd29af70c068f2ccc.1.debug. The hashes do not match.
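
A mismatch like this can be checked on the machine that produced the core by reading the build ID of the installed binary and mapping it to the path that gdb looks for. The sketch below is illustrative only: the function names are hypothetical, it assumes binutils' readelf is installed, and it assumes the /usr/lib/debug/.build-id/<first two hex digits>/<rest> layout seen in the paths above.

```shell
#!/bin/sh
# Print the GNU build ID of an ELF file (requires readelf from binutils).
binary_build_id() {
    readelf -n "$1" | sed -n 's/.*Build ID: *//p'
}

# Map a build ID to the .build-id path layout used in the comment above.
# Debuginfo packages typically add a suffix such as ".debug" to this path.
build_id_path() {
    id=$1
    prefix=$(printf '%s' "$id" | cut -c1-2)
    rest=$(printf '%s' "$id" | cut -c3-)
    echo "/usr/lib/debug/.build-id/$prefix/$rest"
}
```

For example, `build_id_path "$(binary_build_id /usr/bin/perl)"' gives the path to compare against the one gdb printed for the core file.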

Comment 9 Jindrich Novy 2015-10-27 12:25:13 UTC
The version is perl-5.16.3-285.el7.x86_64 (latest RHEL7).

Comment 10 Petr Pisar 2015-10-27 14:31:13 UTC
It crashed after nested module loading at /usr/share/perl5/feature.pm:10 (defining "our %feature"). I cannot get the Perl call stack trace from a core dump. Putting "Carp::cluck()" before the line could help you to figure out how perl got there.

Also I noticed a /lib64/bash_ld_preload.so library in use that is unknown to me.

Comment 12 sohny thomas 2016-04-05 11:58:06 UTC
I am getting a crash at the following when running an internal perl installation script

(/opt/ulticom/snr/swinstall/nsi/bin/../lib/IO/Pty.pm:20):
20:       my ($class) = $_[0] || "IO::Pty";
  DB<3> n
IO::Pty::new(/opt/ulticom/snr/swinstall/nsi/bin/../lib/IO/Pty.pm:21):
21:       $class = ref($class) if ref($class);
  DB<3>
IO::Pty::new(/opt/ulticom/snr/swinstall/nsi/bin/../lib/IO/Pty.pm:22):
22:       @_ <= 1 or croak 'usage: new $class';
  DB<3>
IO::Pty::new(/opt/ulticom/snr/swinstall/nsi/bin/../lib/IO/Pty.pm:24):
24:       my ($ptyfd, $ttyfd, $ttyname) = pty_allocate();
  DB<3> s
IO::Pty::pty_allocate(/usr/share/perl5/vendor_perl/Carp.pm:79):
79:         my $cgc = _cgc();
  DB<3> nn

  DB<4> n
IO::Pty::pty_allocate(/usr/share/perl5/vendor_perl/Carp.pm:80):
80:         my $call_pack = $cgc ? $cgc->() : caller();
  DB<4>
IO::Pty::pty_allocate(/usr/share/perl5/vendor_perl/Carp.pm:81):
81:         if ( $Internal{$call_pack} or $CarpInternal{$call_pack} ) {
  DB<4>
IO::Pty::pty_allocate(/usr/share/perl5/vendor_perl/Carp.pm:85):
85:             local $CarpLevel = $CarpLevel + 1;
  DB<4>
IO::Pty::pty_allocate(/usr/share/perl5/vendor_perl/Carp.pm:86):
86:             return longmess_heavy(@_);
  DB<4>
Signal SEGV at /opt/ulticom/snr/swinstall/nsi/bin/../lib/Expect.pm line 81.
        Expect::new('Expect') called at /opt/ulticom/snr/swinstall/nsi/bin/../lib/Ulcm/SHELL/Expect.pm line 76
        Ulcm::SHELL::Expect::run_shell('Ulcm::SHELL::Expect=HASH(0x3b051c8)') called at /opt/ulticom/snr/swinstall/nsi/bin/../lib/Ulcm/SHELL/Expect.pm line 101
        Ulcm::SHELL::Expect::start('Ulcm::SHELL::Expect=HASH(0x3b051c8)') called at /opt/ulticom/snr/swinstall/nsi/bin/../lib/Ulcm/Utils.pm line 619
        eval {...} called at /opt/ulticom/snr/swinstall/nsi/bin/../lib/Ulcm/Utils.pm line 616
        Ulcm::Utils::utl_cmd('/bin/sync;/bin/sync') called at /opt/ulticom/snr/swinstall/nsi/bin/../lib/Ulcm/Utils.pm line 1644
        Ulcm::Utils::utl_write_xml('/opt/ulticom/snr/Logs/swnsiupg.xml', 'HASH(0x3ae4b00)') called at /opt/ulticom/snr/swinstall/nsi/bin/../lib/Ulcm/SwUpgrade.pm line 147
        Ulcm::SwUpgrade::write_state('Ulcm::PkgUpg=HASH(0x3abdce0)') called at /opt/ulticom/snr/swinstall/nsi/bin/../lib/Ulcm/PkgUpg.pm line 659
        Ulcm::PkgUpg::upgrade_first_group('Ulcm::PkgUpg=HASH(0x3abdce0)') called at /opt/ulticom/snr/swinstall/nsi/bin/swnsiupg line 258
        eval {...} called at /opt/ulticom/snr/swinstall/nsi/bin/swnsiupg line 149
Aborted (core dumped)

Comment 13 Petr Pisar 2016-04-05 13:12:30 UTC
(In reply to sohny thomas from comment #12)
> I am getting a crash at the following when running an internal perl
> installation script
> 
It would be great if you could provide a reproducer. Please ask the /opt/ulticom software vendor for help.

The trace says that Expect cannot create a PTY using IO::Pty::pty_allocate() and thus raises an exception using Carp::longmess(). This call involves long jumps with stack unwinding, and something goes wrong there and the program crashes.

Without any details I cannot even conclude whether the crash is similar to the original one reported by Jindrich Novy.

Comment 14 sohny thomas 2016-04-05 14:57:26 UTC
(In reply to Petr Pisar from comment #13)
> (In reply to sohny thomas from comment #12)
> > I am getting a crash at the following when running an internal perl
> > installation script
> > 
> It would be great if you could provide a reproducer. Please ask the
> /opt/ulticom software vendor for a help.
Oh, I am part of Ulticom. I can reproduce this on Red Hat 7, but on Red Hat 6 it works flawlessly.
I have this perl install file which makes use of the perl-IO-Tty package. From a top-down view, the only change I see is that the perl version went from 5.10 to 5.16.

Let me know if I missed something or I need to provide anything more.

Comment 15 sohny thomas 2016-04-05 15:00:28 UTC
I will try to create a reproducer perl script.

Comment 17 Petr Pisar 2016-12-12 11:12:10 UTC
Maybe a related bug <https://rt.perl.org/Public/Bug/Display.html?id=130320>.

Comment 19 Petr Pisar 2019-02-26 11:25:57 UTC
No reproducer has been provided.