Bug 1414808 - ksh does not fork a subshell, breaking the wait builtin [el6]
Summary: ksh does not fork a subshell, breaking the wait builtin [el6]
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: ksh
Version: 6.8
Hardware: All
OS: Linux
unspecified
medium
Target Milestone: rc
: 6.9
Assignee: Siteshwar Vashisht
QA Contact: BaseOS QE - Apps
URL:
Whiteboard:
Depends On: 1462347
Blocks: 1461138
TreeView+ depends on / blocked
 
Reported: 2017-01-19 14:06 UTC by Paulo Andrade
Modified: 2020-12-14 08:01 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1462347 1497679 (view as bug list)
Environment:
Last Closed: 2017-11-06 13:23:49 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
SubShell.tar (10.00 KB, application/x-tar)
2017-01-19 14:06 UTC, Paulo Andrade
no flags Details
ksh-20120801-subshell-jobwait.patch (383 bytes, patch)
2017-06-16 17:59 UTC, Siteshwar Vashisht
kdudka: review+
Details | Diff

Description Paulo Andrade 2017-01-19 14:06:40 UTC
Created attachment 1242505 [details]
SubShell.tar

The "unmodified" attached reproducer is from the user,
and should be extracted and tested in /root as root.

  I suggest changing all sleep arguments to its value
divided by 10 for faster testing, and/or changing
paths to not use /root and run as another user.

  Very simple patch to correct the issue is:

"""
--- ksh-20120801/src/cmd/ksh93/sh/xec.c~	2017-01-19 11:49:13.566724372 -0200
+++ ksh-20120801/src/cmd/ksh93/sh/xec.c	2017-01-19 11:49:15.664663973 -0200
@@ -1643,7 +1643,7 @@
 
 				if((type&(FAMP|TFORK))==(FAMP|TFORK))
 				{
-					if(shp->comsub && !(shp->fdstatus[1]&IONOSEEK))
+					//if(shp->comsub && !(shp->fdstatus[1]&IONOSEEK))
 					{
 						if (!unpipe)
 							unpipe = iousepipe(shp);
"""

  From my understanding, the problem happens because,
since it apparently attempts to optimize out the fork,
it ends up wait'ing also for the watchdog_timer.ksh
process, causing a deadlock.

  Possible ways to make it work would be:
o change the subshell in subshell_test.ksh:TestSubShell()
  with a call to a script
o some construct like an eval to a string
o keep a list of the pids of the subprocesses to wait and
  wait only for them
*but* that would be just to workaround ksh not forking
for the subshell.

  I suggest comparing the code to the ksh-20100621 version.

  I am afraid there is no much information about the reason
of commenting the check in the suggested patch (checking
github ast diff it is just in the middle of a 256K+ line
single commit); if it is just an optimization, it is in error
because it breaks use of the wait builtin, that would wait all
processes in the subshell inside ( ).

  On rhel5.11, or with the patch, it should output something
like:
$ ./subshell_test.ksh
ksh Version AJM 93t+ 2010-06-21
Wed Jan 11 12:38:11 GMT 2017 Start ---------------------- subshell_test.ksh ----------------------
Wed Jan 11 12:38:11 GMT 2017 Start ---------------------- subshellA1.ksh ----------------------
Wed Jan 11 12:38:11 GMT 2017 Start ---------------------- subshellA2.ksh ----------------------
Wed Jan 11 12:38:11 GMT 2017 Start ---------------------- subshellA3.ksh ----------------------
Wed Jan 11 12:38:11 GMT 2017 Start ---------------------- watchdog_timer.ksh ----------------------
Wed Jan 11 12:39:51 GMT 2017 End ---------------------- subshellA3.ksh ----------------------
Wed Jan 11 12:41:31 GMT 2017 End ---------------------- subshellA2.ksh ----------------------
Wed Jan 11 12:43:11 GMT 2017 End ---------------------- subshellA1.ksh ----------------------
Wed Jan 11 12:43:11 GMT 2017 Start ---------------------- extra_task.ksh ----------------------
Wed Jan 11 12:44:51 GMT 2017 End ---------------------- extra_task.ksh ----------------------
Wed Jan 11 12:44:51 GMT 2017 End ---------------------- subshell_test.ksh ----------------------
<< SHELL_PROMPT >>
$ Wed Jan 11 12:44:51 GMT 2017 End ---------------------- watchdog_timer.ksh ----------------------

and not:

$ ./subshell_test.ksh
ksh Version AJM 93u+ 2012-08-01
Wed Jan 11 12:39:33 GMT 2017 Start ---------------------- subshell_test.ksh ----------------------
Wed Jan 11 12:39:33 GMT 2017 Start ---------------------- subshellA1.ksh ----------------------
Wed Jan 11 12:39:33 GMT 2017 Start ---------------------- subshellA2.ksh ----------------------
Wed Jan 11 12:39:33 GMT 2017 Start ---------------------- subshellA3.ksh ----------------------
Wed Jan 11 12:39:33 GMT 2017 Start ---------------------- watchdog_timer.ksh ----------------------
Wed Jan 11 12:41:13 GMT 2017 End ---------------------- subshellA3.ksh ----------------------
Wed Jan 11 12:42:53 GMT 2017 End ---------------------- subshellA2.ksh ----------------------
Wed Jan 11 12:44:33 GMT 2017 End ---------------------- subshellA1.ksh ----------------------
Wed Jan 11 12:46:13 GMT 2017 End ---------------------- watchdog_timer.ksh ----------------------
subshell_test.ksh has overrun its time of 6 minutes
Started by:
UID        PID  PPID  C STIME TTY          TIME CMD
<<user>>  17090  4104  0 12:39 pts/8    00:00:00 /bin/ksh ./subshell_test.ksh
Wed Jan 11 12:46:13 GMT 2017 Start ---------------------- extra_task.ksh ----------------------
Wed Jan 11 12:47:53 GMT 2017 End ---------------------- extra_task.ksh ----------------------
Wed Jan 11 12:47:53 GMT 2017 End ---------------------- subshell_test.ksh ----------------------

  The problem happens in rhel6, rhel7 and fedora, as it is specific
to ksh-20120801.

Comment 2 Kamil Dudka 2017-05-10 15:00:09 UTC
As Siteshwar pointed out, the following patch could possibly fix this bug:

    attachment #1276404 [details]

Comment 3 Siteshwar Vashisht 2017-06-14 17:30:58 UTC
This is the output I receive with patch from comment 2 :

./subshell_test.ksh 
ksh Version AJM 93u+ 2012-08-01
Wed Jun 14 09:03:43 EDT 2017 Start ---------------------- subshell_test.ksh ----------------------
Wed Jun 14 09:03:43 EDT 2017 Start ---------------------- subshellA3.ksh ----------------------
Wed Jun 14 09:03:43 EDT 2017 Start ---------------------- subshellA2.ksh ----------------------
Wed Jun 14 09:03:43 EDT 2017 Start ---------------------- subshellA1.ksh ----------------------
Wed Jun 14 09:03:43 EDT 2017 Start ---------------------- watchdog_timer.ksh ----------------------
Wed Jun 14 09:05:23 EDT 2017 End ---------------------- subshellA3.ksh ----------------------
Wed Jun 14 09:07:03 EDT 2017 End ---------------------- subshellA2.ksh ----------------------
Wed Jun 14 09:08:43 EDT 2017 End ---------------------- subshellA1.ksh ----------------------
Wed Jun 14 09:10:23 EDT 2017 End ---------------------- watchdog_timer.ksh ----------------------
subshell_test.ksh has overrun its time of 6 minutes
Started by:
UID        PID  PPID  C STIME TTY          TIME CMD
root      3387  3351  0 09:03 pts/0    00:00:00 /bin/ksh ./subshell_test.ksh
Wed Jun 14 09:10:23 EDT 2017 Start ---------------------- extra_task.ksh ----------------------
Wed Jun 14 09:12:03 EDT 2017 End ---------------------- extra_task.ksh ----------------------
Wed Jun 14 09:12:03 EDT 2017 End ---------------------- subshell_test.ksh ----------------------


So patch from comment 2 does not bring the required behavior.

Comment 4 Paulo Andrade 2017-06-14 18:12:05 UTC
I just tested it again, rebuilding ksh with the patch, and installing
it to make sure the proper ksh is run works for me:

[root@localhost SubShell]# ./subshell_test.ksh 
ksh Version AJM 93u+ 2012-08-01
qua jun 14 15:08:28 -03 2017 Start ---------------------- subshell_test.ksh ----------------------
qua jun 14 15:08:29 -03 2017 Start ---------------------- subshellA2.ksh ----------------------
qua jun 14 15:08:29 -03 2017 Start ---------------------- subshellA1.ksh ----------------------
qua jun 14 15:08:29 -03 2017 Start ---------------------- subshellA3.ksh ----------------------
qua jun 14 15:08:29 -03 2017 Start ---------------------- watchdog_timer.ksh ----------------------
qua jun 14 15:08:39 -03 2017 End ---------------------- subshellA3.ksh ----------------------
qua jun 14 15:08:49 -03 2017 End ---------------------- subshellA2.ksh ----------------------
qua jun 14 15:08:59 -03 2017 End ---------------------- subshellA1.ksh ----------------------
qua jun 14 15:08:59 -03 2017 Start ---------------------- extra_task.ksh ----------------------
qua jun 14 15:09:09 -03 2017 End ---------------------- extra_task.ksh ----------------------

without the suggested patch:
[root@localhost SubShell]# ./subshell_test.ksh 
ksh Version AJM 93u+ 2012-08-01
qua jun 14 14:54:10 -03 2017 Start ---------------------- subshell_test.ksh ----------------------
qua jun 14 14:54:10 -03 2017 Start ---------------------- subshellA1.ksh ----------------------
qua jun 14 14:54:10 -03 2017 Start ---------------------- subshellA2.ksh ----------------------
qua jun 14 14:54:10 -03 2017 Start ---------------------- subshellA3.ksh ----------------------
qua jun 14 14:54:10 -03 2017 Start ---------------------- watchdog_timer.ksh ----------------------
qua jun 14 14:54:20 -03 2017 End ---------------------- subshellA3.ksh ----------------------
qua jun 14 14:54:30 -03 2017 End ---------------------- subshellA2.ksh ----------------------
qua jun 14 14:54:40 -03 2017 End ---------------------- subshellA1.ksh ----------------------
qua jun 14 15:00:50 -03 2017 End ---------------------- watchdog_timer.ksh ----------------------
subshell_test.ksh has overrun its time of 6 minutes
Started by:
UID        PID  PPID  C STIME TTY          TIME CMD
root     29280 17697  0 14:54 pts/7    00:00:00 /bin/ksh ./subshell_test.ksh
qua jun 14 15:00:50 -03 2017 Start ---------------------- extra_task.ksh ----------------------
qua jun 14 15:01:00 -03 2017 End ---------------------- extra_task.ksh ----------------------
qua jun 14 15:01:00 -03 2017 End ---------------------- subshell_test.ksh ----------------------

Make also sure there is no other ksh binary running, as it might be using the unpatched
image.

"""
diff -up ksh-20120801/src/cmd/ksh93/sh/xec.c.orig ksh-20120801/src/cmd/ksh93/sh/xec.c
--- ksh-20120801/src/cmd/ksh93/sh/xec.c.orig	2017-06-14 14:55:41.217712389 -0300
+++ ksh-20120801/src/cmd/ksh93/sh/xec.c	2017-06-14 14:56:33.976080250 -0300
@@ -1647,7 +1647,7 @@ int sh_exec(register const Shnode_t *t,
 
 				if((type&(FAMP|TFORK))==(FAMP|TFORK))
 				{
-					if(shp->comsub && !(shp->fdstatus[1]&IONOSEEK))
+					//if(shp->comsub && !(shp->fdstatus[1]&IONOSEEK))
 					{
 						if (!unpipe)
 							unpipe = iousepipe(shp);
"""

Comment 5 Siteshwar Vashisht 2017-06-15 08:10:51 UTC
Paulo,

It seems we are talking about different patches. I tested the patch kdudka mentioned in comment 2 and you are talking about the patch you posted in description.

Comment 6 Siteshwar Vashisht 2017-06-16 17:59:20 UTC
Created attachment 1288426 [details]
ksh-20120801-subshell-jobwait.patch

Comment 7 Kamil Dudka 2017-06-19 16:25:58 UTC
Comment on attachment 1288426 [details]
ksh-20120801-subshell-jobwait.patch

Looks good to me.  It reverts the logic to how it worked prior to applying ksh-20120801-macro.patch where sh_subfork() was always called for FAMP|TFORK.

I tried reproducers for bugs #913110 and #994251, which were fixed by this patch (and its successors), and have not observed any regression.  Moreover, it seems to fix #1333829 for the case where `...` is used instead of $(...).


Note You need to log in before you can comment on or make changes to this bug.