Bug 503062 - ksh trashes its internal memory, can dump core
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: ksh
Version: 5.4
Hardware: x86_64 Linux
Priority: low
Severity: high
Target Milestone: rc
Assigned To: Michal Hlavinka
QA Contact: BaseOS QE
Keywords: Reopened
Reported: 2009-05-28 11:50 EDT by Jay Schuster
Modified: 2010-04-08 03:14 EDT
Fixed In Version: ksh-20100202-1.el5
Doc Type: Bug Fix
Last Closed: 2010-04-08 03:14:32 EDT


Attachments
A list of the installed RPMs. (26.01 KB, text/plain)
2010-02-03 16:18 EST, Jay Schuster
Gzipped core dump. (34.67 KB, application/octet-stream)
2010-02-04 12:35 EST, Jay Schuster

Description Jay Schuster 2009-05-28 11:50:28 EDT
Description of problem:

With the following code fragment, ksh trashes its internal memory as evidenced by a numeric variable being displayed as memory garbage.  An earlier version of ksh would actually dump core, but the latest one does not with this test case.

This bug does not happen with pdksh.

Version-Release number of selected component (if applicable):

ksh-20080202-2.el5

How reproducible:

Very reproducible.  Here's the test case:

Create this script as ./a:
--------------------------
#!/bin/ksh

trap '' 1 2
(
    trap 1 2
    b
)

RET=$?
echo "RET is '$RET'"
trap 'exit 2' 1 2
echo "RET is '$RET'"


trap '' 1 2
(
    trap 1 2
    b
)

RET=$?
echo "RET is '$RET'"
trap 'exit 2' 1 2
echo "RET is '$RET'"
--------------------------

and create this script as ./b:
--------------------------
#!/bin/ksh
exit 1
--------------------------

chmod +x ./a ./b
Make sure . is on your PATH and run: ./a

Yes, the code needs to be repeated twice.  An earlier release of the ksh RPM would trigger the bug after only the first instance.  This latest RPM (the 20080202 one) needs it done twice.

The last echo of RET prints garbage characters instead of the value.

Replacing the execution of "b" with "sleep 1; true" does not trigger the bug.


Steps to Reproduce:
1. Create ./a and ./b as given above.
2. Run: ./a
3. See garbage output in last line of output.
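
The setup steps above could be scripted roughly as follows. This is only a sketch: `make_repro` and `KSH_BIN` are names invented for this example (they are not part of the report), and the generated files are meant to match the ./a and ./b listings given earlier.

```shell
#!/bin/sh
# Sketch: write the two-script reproducer into a directory.
# KSH_BIN and make_repro are hypothetical names used only in this example.
KSH_BIN=${KSH_BIN:-/bin/ksh}

make_repro() {
    dir=$1
    # One iteration of the trap/subshell sequence from the report.
    block="trap '' 1 2
(
    trap 1 2
    b
)

RET=\$?
echo \"RET is '\$RET'\"
trap 'exit 2' 1 2
echo \"RET is '\$RET'\""
    # ./a repeats the sequence twice, as the report says is needed
    # with the 20080202 RPM.
    printf '#!%s\n\n%s\n\n\n%s\n' "$KSH_BIN" "$block" "$block" > "$dir/a"
    # ./b just exits with status 1.
    printf '#!%s\nexit 1\n' "$KSH_BIN" > "$dir/b"
    chmod +x "$dir/a" "$dir/b"
}
```

For example, `dir=$(mktemp -d); make_repro "$dir"; (cd "$dir" && PATH=$PATH:. ./a)` would run the test with `.` on the PATH for that one invocation only.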
  
Actual results:

--------------------------
RET is '1'
RET is '1'
RET is '1'
RET is 'äí'
--------------------------

Expected results:

--------------------------
RET is '1'
RET is '1'
RET is '1'
RET is '1'
--------------------------

Additional info:

The last line varies depending on whether it is run as ./a, a, ksh ./a, etc.  It really looks like memory is being overwritten somehow.
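
Because the garbage varies from run to run, it may be easier to check the output mechanically than by eye. The following is a minimal sketch, assuming the expected output is exactly the four `RET is '1'` lines shown above; `check_ret_lines` is a hypothetical helper name, not something from ksh or the report.

```shell
#!/bin/sh
# Hypothetical helper: read the reproducer's output on stdin and report
# every line that differs from the expected "RET is '1'".
check_ret_lines() {
    expected="RET is '1'"
    bad=0
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
        if [ "$line" != "$expected" ]; then
            echo "line $n differs: $line"
            bad=1
        fi
    done
    # Non-zero status means at least one line was corrupted.
    return "$bad"
}
```

It could be used as `./a | check_ret_lines`. Note that it only inspects the lines it receives, so a run that crashes before printing anything would have to be caught separately through the exit status of ./a.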
Comment 1 Michal Hlavinka 2009-06-01 04:56:51 EDT
Hi, I've tried to reproduce this but without any success.

What I did:
1)created ./a and ./b with content from description
2)chmod +x ./a ./b
3)export PATH=$PATH:.
4)tried a lot of times:
a) ./a
b) ksh ./a
c) a
d) ksh a

but everything worked as expected.

Please verify that your ksh installation is intact with:
rpm -V ksh

What platform do you use? I've used x86_64
Comment 2 Jay Schuster 2009-06-01 23:56:57 EDT
The problem is with x86_64.  rpm -V ksh shows no problems.  I'll work on a better test case.  It is system-dependent to some degree: it does not happen on all our RHEL5 machines, but on the ones where it does, the test case fails reliably.  I imagine it depends on some weird combination of the memory state (like pwd, etc.) where it is run.  An earlier ksh RPM would not only fail after just the first iteration of the traps; with some changes, I could get the shell to dump core!  I couldn't believe my eyes.

I'll report back with a more reliable test case.
Comment 3 Michal Hlavinka 2009-08-05 04:08:59 EDT
(In reply to comment #2)
> I'll report back with a more reliable test case.  

ping, any update on this?
Comment 4 Michal Hlavinka 2009-09-15 06:38:51 EDT
This bug has been in needinfo state for more than one month, so I am closing it as insufficient_data. If you suffer from this bug and can provide a reproducer, feel free to reopen.
Comment 5 Jay Schuster 2010-02-03 16:16:47 EST
I'd like to reopen this bug.  I'm sorry for the long delay -- I lost access to a RHEL5 server for a while.  The bug has not gone away.

Interestingly, changing the echo's to /bin/echo's, or to printf's, makes the bug go away.  But I can reliably core dump ksh with some variation of this script.
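
As an illustration of the printf variant mentioned above, the two echo lines could be replaced by something like the sketch below. `report_ret` is a made-up helper name, and this only sidesteps the symptom as observed in this thread; it does not address whatever is corrupting memory.

```shell
#!/bin/sh
# Hypothetical helper: print RET with printf instead of the echo builtin.
report_ret() {
    printf "RET is '%s'\n" "$1"
}

# Capture a non-zero exit status from a subshell, as ./a does with "b".
RET=0
( exit 1 ) || RET=$?
report_ret "$RET"
```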

This is on a freshly installed RHEL 5 system.  I'll attach a list of the installed RPMs.

[root@hamm ~]# rpm -q ksh
ksh-20080202-14.el5_4.2
[root@hamm ~]# rpm -V ksh
[root@hamm ~]# ./a
Segmentation fault
[root@hamm ~]# cat a
#!/bin/ksh

trap '' 1 2
(
    trap 1 2
    b
)

RET=$?
echo "RET is '$RET'"
trap 'exit 2' 1 2
echo "RET is '$RET'"
[root@hamm ~]# cat b
#!/bin/ksh
exit 1
[root@hamm ~]# echo $PATH
/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:
[root@hamm ~]# pwd
/root
[root@hamm ~]# ./a
Segmentation fault
[root@hamm ~]# cat /proc/version
Linux version 2.6.18-164.11.1.el5 (mockbuild@ls20-bc2-13.build.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Wed Jan 6 13:26:04 EST 2010
[root@hamm ~]# ls -l /bin/ksh  
lrwxrwxrwx 1 root root 21 Jan 26 13:33 /bin/ksh -> /etc/alternatives/ksh
[root@hamm ~]# ls -l /etc/alternatives/ksh
lrwxrwxrwx 1 root root 10 Feb  3 15:59 /etc/alternatives/ksh -> /bin/ksh93
[root@hamm ~]# ls -l /bin/ksh93
-rwxr-xr-x 1 root root 1165392 Dec  7 05:42 /bin/ksh93
Comment 6 Jay Schuster 2010-02-03 16:18:30 EST
Created attachment 388651
A list of the installed RPMs.

This is the list of RPMs installed on my system.  It's a fresh, current install.
Comment 7 Michal Hlavinka 2010-02-04 08:10:02 EST
That's definitely odd. I've tried many times but am still unable to reproduce your problem. Could you attach a backtrace or a core file?
Comment 8 Jay Schuster 2010-02-04 12:33:58 EST
[root@hamm ~]# gdb /bin/ksh93
GNU gdb Fedora (6.8-37.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
(no debugging symbols found)
(gdb) run ./a
Starting program: /bin/ksh93 ./a
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
Detaching after fork from child process 14553.

Program received signal SIGSEGV, Segmentation fault.
0x00000038c5a79142 in strcmp () from /lib64/libc.so.6
(gdb) bt
#0  0x00000038c5a79142 in strcmp () from /lib64/libc.so.6
#1  0x000000000048d855 in ?? ()
#2  0x000000000044dd96 in ?? ()
#3  0x0000000000423306 in ?? ()
#4  0x00000000004241e7 in ?? ()
#5  0x000000000041f916 in hypotl ()
#6  0x000000000041e565 in hypotl ()
#7  0x000000000042129a in hypotl ()
#8  0x0000000000450564 in ?? ()
#9  0x0000000000435880 in ?? ()
#10 0x0000000000407723 in hypotl ()
#11 0x0000000000406c2c in hypotl ()
#12 0x00000038c5a1d994 in __libc_start_main () from /lib64/libc.so.6
#13 0x0000000000405f39 in hypotl ()
#14 0x00007fffe99f7ba8 in ?? ()
#15 0x0000000000000000 in ?? ()
(gdb) 

[root@hamm ~]# ulimit -c unlimited
[root@hamm ~]# /bin/ksh93 ./a      
Segmentation fault (core dumped)
[root@hamm ~]# ls core*
core.14633
[root@hamm ~]# gzip core.14633 

I'll attach the gzip'd core file.
Comment 9 Jay Schuster 2010-02-04 12:35:10 EST
Created attachment 388848
Gzipped core dump.
Comment 10 Michal Hlavinka 2010-04-06 11:43:23 EDT
I've looked at the core dump and found it fails on

strcmp("KEY","") which is really odd, because this should work fine and there is no other thread.

Anyway, we've updated ksh a lot for RHEL 5.5. Could you please test whether the updated ksh (together with the updated glibc) fixes the problem for you? Thanks.
Comment 11 Jay Schuster 2010-04-07 16:22:58 EDT
The latest ksh (ksh-20100202-1.el5) does not have this problem!  Thank you!
