Bug 1247268 - ksh-20120801-28.el6.x86_64 introduced by RHEL6u7 segfaults
Status: CLOSED DUPLICATE of bug 1247383
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: ksh
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Michal Hlavinka
QA Contact: BaseOS QE - Apps
Keywords: Regression
Depends On:
Blocks: 1172231
Reported: 2015-07-27 12:46 EDT by Jindrich Novy
Modified: 2015-08-25 04:49 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2015-08-25 04:49:54 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments
ksh-20120801-trapcom.patch (1.78 KB, patch)
2015-07-31 23:08 EDT, Paulo Andrade
ksh-20120801-std_malloc.patch (473 bytes, patch)
2015-08-06 15:26 EDT, Paulo Andrade

Description Jindrich Novy 2015-07-27 12:46:22 EDT
Description of problem:
Korn shell segfaults.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Run a sufficiently complex ksh script.

Actual results:
kernel: script..22217[22217] general protection ip:4dd304 sp:7fff1e979ee0 error:0 in ksh93[400000+15d000]

Expected results:
No segfault.

Additional info:
The faulting address (ip:4dd304) should give enough of a hint as to which patch on top of ksh-20120801-21.el6_6.3.x86_64 is causing the segfault; ksh-20120801-21.el6_6.3.x86_64 itself works flawlessly.
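
As a rough sketch of how the kernel log line can be turned into something a debugger can use: the bracketed `ksh93[400000+15d000]` gives the mapping base and size, so the faulting ip can be checked against that range and expressed as an offset. The helper name `fault_offset` is hypothetical. The gdb command in the trailing comment assumes ksh-debuginfo is installed, that the binary is not position-independent (the 0x400000 base suggests so), and that the binary lives at the path shown.

```shell
# Hypothetical helper: given the ip, mapping base and mapping size from a
# kernel fault line (all hex, without the 0x prefix), print the offset of
# the ip inside the mapping, or fail if the ip lies outside it.
fault_offset() {
    ip=$((0x$1)); base=$((0x$2)); size=$((0x$3))
    if [ "$ip" -lt "$base" ] || [ "$ip" -ge $((base + size)) ]; then
        echo "ip outside mapping" >&2
        return 1
    fi
    printf '%x\n' $((ip - base))
}

# Values taken from the log line in this report.
fault_offset 4dd304 400000 15d000

# With debuginfo available, the absolute address could then be resolved
# (assumptions: non-PIE binary, path may differ on your system):
#   gdb -batch -ex 'info line *0x4dd304' /bin/ksh93
```

The range check is there only to catch a mismatched base or size copied from a different log line.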
Comment 1 Jindrich Novy 2015-07-27 12:51:50 EDT
We use the Korn shell quite extensively, so this regression is particularly annoying. For now we have version-locked ksh to ksh-20120801-21.el6_6.3.x86_64 to avoid the crashes.
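
For anyone wanting to apply the same workaround, a minimal sketch follows. It assumes the yum versionlock plugin (package yum-plugin-versionlock) is available and that the older build is still reachable from the configured repositories. The helper names `is_known_good` and `pin_ksh` are hypothetical.

```shell
# Last build reported as working in this bug.
KNOWN_GOOD="ksh-20120801-21.el6_6.3.x86_64"

# Hypothetical helper: succeed only if the given package string is the
# known-good build.
is_known_good() {
    [ "$1" = "$KNOWN_GOOD" ]
}

# Hypothetical helper, to be run as root on an affected host: install the
# plugin, downgrade to the known-good build, then lock the package.
pin_ksh() {
    yum -y install yum-plugin-versionlock &&
    yum -y downgrade "${KNOWN_GOOD%.x86_64}" &&
    yum versionlock add ksh
}

# Example use (commented out so that sourcing this file changes nothing):
#   is_known_good "$(rpm -q ksh)" || pin_ksh
```

The lock has to be removed again (`yum versionlock delete` or `clear`) once a fixed package is released, otherwise the host stays on the old build.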

I'm happy to test if you provide an SRPM.
Comment 4 Michal Hlavinka 2015-07-28 06:58:46 EDT
(In reply to Jindrich Novy from comment #0)
> Steps to Reproduce:
> 1. Run a sufficiently complex ksh script.

We need a reproducer, both for testing and for regression tests that prevent this from happening in future releases. The description above is insufficient. Please provide a reproducer we can use. Thanks.
Comment 6 Jindrich Novy 2015-07-30 10:47:33 EDT
OK, the segfault is caused by Patch60, the "trapcom" patch.

Please revert.

It is obvious from the second stack frame:

(gdb) f 2
#2  0x0000000000453caf in sh_subshell (shp=0x76e320, t=0x2b2fe8f150b0, flags=1, comsub=3) at /usr/src/debug/ksh-20120801/src/cmd/ksh93/sh/subshell.c:740
740                                             free(shp->st.trapcom[isig]);
(gdb) l
735                     shp->st.otrap = 0;
736                     if(nsig)
737                     {
738                             for (isig = 0; isig < nsig; ++isig)
739                                     if (shp->st.trapcom[isig] && shp->st.trapcom[isig]!=Empty)
740                                             free(shp->st.trapcom[isig]);
741                             memcpy((char*)&shp->st.trapcom[0],savsig,nsig*sizeof(char*));
742                             free((void*)savsig);
743                     }
744                     shp->options = sp->options;
Comment 7 Paulo Andrade 2015-07-31 23:08:08 EDT
Created attachment 1058208

A user reported that this patch corrects the problem.
The change relative to the original ksh-20120801-trapcom.patch
is to neither strdup (cosmetic) nor free (crash)
the special Empty constant.
Comment 9 Eric Weaver 2015-08-03 05:54:12 EDT
Hi, we are also hitting this same problem with production batch jobs running ksh93. Any idea when the RPM will come out? We've never tried an SRPM from Red Hat, but if the RPM is not coming out today, we'd like to try the SRPM; how does one get it?
Comment 11 Kamil Dudka 2015-08-06 10:36:13 EDT
Jindro, could you please confirm that replacing ksh-20120801-trapcom.patch with attachment #1058208 prevents ksh from crashing in your environment?
Comment 12 Jindrich Novy 2015-08-06 14:27:29 EDT
Still segfaults with the patch applied. Within the internal allocator:

Program terminated with signal 11, Segmentation fault.
#0  bestsearch (vd=0x76cb00, size=0, wanted=<value optimized out>) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/vmbest.c:292
292                             {       if(size <= (s = SIZE(t)) )
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.166.el6_7.1.x86_64
(gdb) bt
#0  bestsearch (vd=0x76cb00, size=0, wanted=<value optimized out>) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/vmbest.c:292
#1  0x00000000004d98f8 in bestreclaim (vd=0x76cb00, wanted=0x0, c=6) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/vmbest.c:422
#2  0x00000000004da33d in bestalloc (vm=0x76cb80, size=8192, local=1) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/vmbest.c:661
#3  0x00000000004de9b7 in _ast_malloc (size=8192) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/malloc.c:521
#4  0x00002b997ee6065a in ?? ()
#5  0x00002b997e5d3440 in ?? ()
#6  0x000000000076a2c0 in ?? ()
#7  0x00002b997e56ebe8 in ?? ()
#8  0x000000000076a2c0 in ?? ()
#9  0x00002b997e5bc920 in ?? ()
#10 0x0000000000000000 in ?? ()
Comment 13 Jindrich Novy 2015-08-06 14:37:04 EDT
In /var/log/messages:

segfault at 39 ip 00000000004d95e9 sp 00007fffc26d6f40 error 4 in ksh93[400000+15d000]
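
As an aside on reading that line: if the trailing "error 4" is interpreted as an x86 page-fault error code, its low three bits say whether the page was present, whether the access was a write, and whether it came from user mode. The sketch below decodes just those bits; the helper name `decode_pf_error` is hypothetical.

```shell
# Hypothetical helper: decode the low three bits of an x86 page-fault
# error code as printed in "segfault at ... error N" kernel messages.
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read access,      1 = write access
#   bit 2: 0 = kernel mode,      1 = user mode
decode_pf_error() {
    code=$1
    if [ $((code & 4)) -ne 0 ]; then mode=user; else mode=kernel; fi
    if [ $((code & 2)) -ne 0 ]; then access=write; else access=read; fi
    if [ $((code & 1)) -ne 0 ]; then cause=protection-violation; else cause=page-not-present; fi
    echo "$mode $access $cause"
}

# Error code from the log line above.
decode_pf_error 4
```

Read this way, the line describes a user-mode read from an unmapped address (0x39), which would fit a dereference of a small bogus pointer.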

valgrind says:

==12776== Syscall param mount(type) points to unaddressable byte(s)
==12776==    at 0x55A313A: mount (in /lib64/libc-2.12.so)
==12776==    by 0x480CF7: fs3d (fs3d.c:57)
==12776==    by 0x420E0A: sh_init (init.c:1303)
==12776==    by 0x407BA1: sh_main (main.c:141)
==12776==    by 0x54D8D5C: (below main) (in /lib64/libc-2.12.so)
==12776==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
Comment 14 Jindrich Novy 2015-08-06 14:48:39 EDT
Within frame #0:

(gdb) p t
$1 = (Block_t *) 0x31

This really isn't dereferenceable.
Comment 15 Paulo Andrade 2015-08-06 15:24:15 EDT
Hi Jindrich,

Please check a backtrace with ksh-debuginfo installed.
This would allow us to know where in ksh the problem occurs.
Comment 16 Paulo Andrade 2015-08-06 15:26:06 EDT
Created attachment 1060061

To get a meaningful valgrind check, it is also
necessary to apply the attached patch and make this
change to ksh.spec:

-export CCFLAGS="$RPM_OPT_FLAGS -fno-strict-aliasing $XTRAFLAGS -DSHOPT_AUDIT"
+export CCFLAGS="-O0 -g3 -fno-strict-aliasing $XTRAFLAGS -DSHOPT_AUDIT -D_AST_std_malloc=1"
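
A minimal sketch of applying that CCFLAGS change with sed rather than by hand. It only rewrites the `export CCFLAGS=` line as in the diff above; adding the attached patch to the spec file is not covered. The helper name `debug_ccflags` and the output file name in the usage note are hypothetical.

```shell
# Hypothetical helper: print the given spec file with the CCFLAGS line
# changed as in the diff above: $RPM_OPT_FLAGS is replaced by "-O0 -g3"
# and -D_AST_std_malloc=1 is appended after -DSHOPT_AUDIT.
debug_ccflags() {
    sed -e '/^export CCFLAGS=/s/\$RPM_OPT_FLAGS/-O0 -g3/' \
        -e '/^export CCFLAGS=/s/-DSHOPT_AUDIT"/-DSHOPT_AUDIT -D_AST_std_malloc=1"/' "$1"
}
```

For example, `debug_ccflags ksh.spec > ksh-debug.spec` leaves the original spec untouched and writes the modified copy next to it.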
Comment 17 Jindrich Novy 2015-08-07 06:13:12 EDT
Hi Paulo,

Both the mock build and the local build fail with:

+ cc -O0 -g3 -fno-strict-aliasing -Wno-unknown-pragmas -Wno-missing-braces -Wno-unused-result -Wno-return-type -Wno-int-to-pointer-cast -Wno-parentheses -Wno-unused -Wno-unused-but-set-variable -Wno-cpp -DSHOPT_AUDIT -D_AST_std_malloc=1 -L. -L/builddir/build/BUILD/ksh-20120801/arch/linux.i386-64/lib -o suid_exec suid_exec.o -last -last
/usr/bin/ld: cannot find -last
collect2: ld returned 1 exit status
mamake [cmd/ksh93]: *** exit code 1 making suid_exec

Note that /builddir/build/BUILD/ksh-20120801/arch/linux.i386-64/lib contains only static libraries, and libast.a is not among them. Maybe something else needs to be tweaked so that libast.a is built?
Comment 18 Jindrich Novy 2015-08-07 07:19:44 EDT
Ok, ksh doesn't segfault with the patch in comment #16 and -D_AST_std_malloc=1.
Comment 19 Paulo Andrade 2015-08-07 08:53:06 EDT
Hi Jindrich,

If this version does not trigger the problem, one built without
the change should not fail either. Unless:

o There is a toolchain bug, as the suggested change builds with
  -O0 for easier debugging
o There is a bug in the ksh malloc. I can only think of possible
  issues if using the (not even documented anymore) alarm interface
Comment 21 Martin Andersen 2015-08-12 08:53:20 EDT
This issue hit us pretty severely this weekend. Any estimate on when the new package with the proposed patch will make it to the official repos?
Comment 22 Kamil Dudka 2015-08-25 04:27:27 EDT
Is this a duplicate of bug #1247383?
