Bug 1247268
Summary: ksh-20120801-28.el6.x86_64 introduced by RHEL6u7 segfaults
Product: Red Hat Enterprise Linux 6
Reporter: Jindrich Novy <jindrich.novy>
Component: ksh
Assignee: Michal Hlavinka <mhlavink>
Status: CLOSED DUPLICATE
QA Contact: BaseOS QE - Apps <qe-baseos-apps>
Severity: high
Docs Contact:
Priority: urgent
Version: 6.7
CC: cww, fkrska, jindrich.novy, kdudka, martin.x.andersen, pandrade, salmy, zpytela
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-08-25 08:49:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1172231
Attachments:
Description
Jindrich Novy
2015-07-27 16:46:22 UTC
We use the Korn shell quite extensively, so this regression is particularly annoying. We have currently version-locked ksh to ksh-20120801-21.el6_6.3.x86_64 to avoid the crashes. I'm happy to test if you provide an SRPM.

(In reply to Jindrich Novy from comment #0)
> Steps to Reproduce:
> 1. Run sufficiently complex ksh script.

We need a reproducer for testing, and for regression tests to prevent this from happening in future releases. The above description is insufficient. Please provide a reproducer we can use. Thanks

Ok, the segfault is caused by Patch60 - "trapcom" patch. Please revert. It is obvious from the second stack frame:

    (gdb) f 2
    #2  0x0000000000453caf in sh_subshell (shp=0x76e320, t=0x2b2fe8f150b0, flags=1, comsub=3) at /usr/src/debug/ksh-20120801/src/cmd/ksh93/sh/subshell.c:740
    740                     free(shp->st.trapcom[isig]);
    (gdb) l
    735         shp->st.otrap = 0;
    736         if(nsig)
    737         {
    738             for (isig = 0; isig < nsig; ++isig)
    739                 if (shp->st.trapcom[isig] && shp->st.trapcom[isig]!=Empty)
    740                     free(shp->st.trapcom[isig]);
    741             memcpy((char*)&shp->st.trapcom[0],savsig,nsig*sizeof(char*));
    742             free((void*)savsig);
    743         }
    744         shp->options = sp->options;

Created attachment 1058208 [details]
ksh-20120801-trapcom.patch
A user reported that this patch corrects the problem.
The change relative to the original ksh-20120801-trapcom.patch
is to neither strdup (cosmetic) nor free (crash)
the special Empty constant.
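
The rule described in the comment above (never duplicate and never free the shared Empty constant) can be illustrated with a minimal C sketch. This is not the attached patch and not ksh source: `Empty` here is a hypothetical stand-in for the shell's constant, and `save_traps` and `clear_traps` are made-up helper names used only for illustration. The sketch assumes a trap table is an array of `char *` in which an entry is NULL, the Empty sentinel, or a heap-allocated command string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the shell's shared Empty constant: a static
   object that never came from malloc(), so it must never reach free(). */
char Empty[] = "";

/* Release the heap strings in a trap table (hypothetical helper).
   NULL entries and the Empty sentinel are skipped; every slot is reset
   to NULL. Returns the number of strings actually freed. */
size_t clear_traps(char **trapcom, size_t nsig)
{
    size_t freed = 0;
    assert(trapcom != NULL || nsig == 0);
    for (size_t i = 0; i < nsig; ++i) {
        if (trapcom[i] != NULL && trapcom[i] != Empty) {
            free(trapcom[i]);
            ++freed;
        }
        trapcom[i] = NULL;
    }
    return freed;
}

/* Copy a trap table of nsig entries (hypothetical helper).
   Real commands are duplicated onto the heap; NULL and the Empty
   sentinel are copied by pointer, not duplicated. */
char **save_traps(char *const *trapcom, size_t nsig)
{
    char **copy = calloc(nsig ? nsig : 1, sizeof *copy);
    if (copy == NULL)
        return NULL;
    for (size_t i = 0; i < nsig; ++i) {
        if (trapcom[i] == NULL || trapcom[i] == Empty) {
            copy[i] = trapcom[i];
            continue;
        }
        size_t len = strlen(trapcom[i]) + 1;
        copy[i] = malloc(len);
        if (copy[i] == NULL) {
            /* Allocation failed: release what was duplicated so far. */
            clear_traps(copy, i);
            free(copy);
            return NULL;
        }
        memcpy(copy[i], trapcom[i], len);
    }
    return copy;
}
```

Because the sentinel is compared by address rather than by content, a duplicated copy of Empty would no longer be recognized as the sentinel, which is one reason to copy it by pointer. A caller would pair `save_traps` with `clear_traps` followed by `free` on the table itself.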
Hi, we are also hitting this same problem with production batch jobs running ksh93. Any idea when the RPM will come out? Also, we've never tried an SRPM from Red Hat, but if the RPM is not coming out today, we'd like to try the SRPM; how does one get the SRPM?

Jindro, could you please confirm that replacing ksh-20120801-trapcom.patch by attachment #1058208 [details] prevents ksh from crashing in your environment?

Still segfaults with the patch applied. Within the internal allocator:

    Program terminated with signal 11, Segmentation fault.
    #0  bestsearch (vd=0x76cb00, size=0, wanted=<value optimized out>) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/vmbest.c:292
    292     { if(size <= (s = SIZE(t)) )
    Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.166.el6_7.1.x86_64
    (gdb) bt
    #0  bestsearch (vd=0x76cb00, size=0, wanted=<value optimized out>) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/vmbest.c:292
    #1  0x00000000004d98f8 in bestreclaim (vd=0x76cb00, wanted=0x0, c=6) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/vmbest.c:422
    #2  0x00000000004da33d in bestalloc (vm=0x76cb80, size=8192, local=1) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/vmbest.c:661
    #3  0x00000000004de9b7 in _ast_malloc (size=8192) at /usr/src/debug/ksh-20120801/src/lib/libast/vmalloc/malloc.c:521
    #4  0x00002b997ee6065a in ?? ()
    #5  0x00002b997e5d3440 in ?? ()
    #6  0x000000000076a2c0 in ?? ()
    #7  0x00002b997e56ebe8 in ?? ()
    #8  0x000000000076a2c0 in ?? ()
    #9  0x00002b997e5bc920 in ?? ()
    #10 0x0000000000000000 in ?? ()

In /var/log/messages:

    segfault at 39 ip 00000000004d95e9 sp 00007fffc26d6f40 error 4 in ksh93[400000+15d000]

valgrind says:

    ==12776== Syscall param mount(type) points to unaddressable byte(s)
    ==12776==    at 0x55A313A: mount (in /lib64/libc-2.12.so)
    ==12776==    by 0x480CF7: fs3d (fs3d.c:57)
    ==12776==    by 0x420E0A: sh_init (init.c:1303)
    ==12776==    by 0x407BA1: sh_main (main.c:141)
    ==12776==    by 0x54D8D5C: (below main) (in /lib64/libc-2.12.so)
    ==12776==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Within frame #0:

    (gdb) p t
    $1 = (Block_t *) 0x31

This really isn't dereferenceable.

Hi Jindrich, Please check a backtrace with ksh-debuginfo installed. This would allow us to know where in ksh the problem happened.

Created attachment 1060061 [details]
ksh-20120801-std_malloc.patch
To have a meaningful valgrind check it is also
required to apply the attached patch, and this
change to ksh.spec:
-export CCFLAGS="$RPM_OPT_FLAGS -fno-strict-aliasing $XTRAFLAGS -DSHOPT_AUDIT"
+export CCFLAGS="-O0 -g3 -fno-strict-aliasing $XTRAFLAGS -DSHOPT_AUDIT -D_AST_std_malloc=1"

Hi Paulo, both the mock and the local builds fail with:

    + cc -O0 -g3 -fno-strict-aliasing -Wno-unknown-pragmas -Wno-missing-braces -Wno-unused-result -Wno-return-type -Wno-int-to-pointer-cast -Wno-parentheses -Wno-unused -Wno-unused-but-set-variable -Wno-cpp -DSHOPT_AUDIT -D_AST_std_malloc=1 -L. -L/builddir/build/BUILD/ksh-20120801/arch/linux.i386-64/lib -o suid_exec suid_exec.o -last -last
    /usr/bin/ld: cannot find -last
    collect2: ld returned 1 exit status
    mamake [cmd/ksh93]: *** exit code 1 making suid_exec

Note that builddir/build/BUILD/ksh-20120801/arch/linux.i386-64/lib contains only static libraries and no libast.a. Maybe something else needs to be tweaked so that libast.a is built?

Ok, ksh doesn't segfault with the patch in comment #16 and -D_AST_std_malloc=1.

Hi Jindrich, If this version does not trigger the problem, one built without it should not fail either. Unless:
o There is a toolchain bug, as the suggested patch was built with -O0 for easier debugging
o There is a bug in the ksh malloc. I can only think of possible issues if using the (not even documented anymore) alarm interface

This issue hit us pretty severely this weekend. Any estimate on when the new package with the proposed patch will make it to the official repos?

Is this a duplicate of bug #1247383?