Bug 478961

Summary: [RHEL5] tcl threads support implementation can cause scripts to hang
Product: Red Hat Enterprise Linux 5
Reporter: Tomas Smetana <tsmetana>
Component: tcl
Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED ERRATA
QA Contact: Martin Kyral <mkyral>
Severity: medium
Docs Contact:
Priority: high
Version: 5.2
CC: fhirtz, jentrena, jskarvad, mkyral, rvokal, sforsber, tao
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: tcl-8.4.13-5.el5
Doc Type: Bug Fix
Doc Text:
Previously, combining threads and fork in Tcl scripts could cause the script to hang. This was caused by a poor implementation of threading in Tcl. It is currently not possible to drop threading support or rewrite it completely, because doing so would break more things. As a fix, both a threading and a non-threading version of Tcl are now provided, and the active one can be selected via alternatives (for example, for the 64-bit Tcl on x86_64: alternatives --config tcl.x86_64). Users who need fork in their Tcl scripts and do not need threading support can now switch to the non-threading version of Tcl.
Last Closed: 2013-01-08 04:51:45 UTC
Bug Blocks: 743405, 807971    
Attachments:
GDB backtrace of the hung expect interpreter (flags: none)

Description Tomas Smetana 2009-01-06 09:27:08 UTC
Consider the following expect script:

#!/usr/bin/expect -d
if {[fork] != 0} exit
disconnect
set stty_init "-echo -opost"
spawn -noecho /usr/sbin/ntpq

When the script is run on a RHEL 5 system, the ps output contains something like:

17199 ?        00:00:00 test.ex
17200 ?        00:00:00 ntpq <defunct>

That is, the interpreter hangs, and the strace output ends with:
futex(0x7104924, FUTEX_WAIT, 3, NULL

The problem lies in Tcl's threaded notifier implementation, which does not appear to be correct. I will attach a gdb backtrace that shows what happened.

Version-Release number of selected component (if applicable):
tcl-8.4.13-3.fc6
expect-5.43.0-5.1
glibc-2.5-24

How reproducible:
Always.

Steps to Reproduce:
See above.
  
Actual results:
The expect interpreter hangs in futex waiting.

Expected results:
The interpreter correctly terminates.

Additional info:
The same problem appeared in Fedora (bug #443246) and was fixed in tcl-8.5.1-4 by disabling thread support entirely, which broke the ABI. The issue still seems to be present in the recent upstream version, and the "proper" solution may require a deeper redesign of the threading support in Tcl.

Comment 1 Tomas Smetana 2009-01-06 09:29:01 UTC
Created attachment 328259 [details]
GDB backtrace of the hung expect interpreter.

Comment 2 RHEL Program Management 2009-03-26 17:18:10 UTC
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".

Comment 5 Issue Tracker 2010-01-05 20:13:24 UTC
Event posted on 01-05-2010 03:13pm EST by fhirtz

Note that the test workaround, a rebuild with threading disabled, does
correct the demonstration case for them. This obviously isn't a
priority, but it's still something that they'd like to see addressed in
the context of the release.


This event sent from IssueTracker by fhirtz 
 issue 242872

Comment 7 Jaroslav Škarvada 2010-04-20 08:57:36 UTC
This is a well-known issue arising from the combination of threads and fork. Rebuilding without threads (as many other distributions, including Fedora, have done) fixes the issue but could affect customers who rely on threads. Fixing this in Tcl requires a complex rewrite, which upstream has not yet done.

The possible solutions:

1) Provide separate builds of Tcl with and without threads, so the customer chooses whether they need threads or fork.

2) Rebuild without threads. This could affect customers relying on threads.

3) Try to fix it. This requires a lot of time and complex modifications that would probably introduce bugs and changes in behaviour.

4) The reporter can avoid the hang by using a C wrapper to start daemons instead of calling fork in "expect".

Comment 8 Frank Hirtz 2010-05-05 21:46:29 UTC
The client in this case is going with option 1) as they do not need threading. Option 1) seems the most prudent since it carries no risks aside from requiring a bit of explaining on our side.

Comment 9 Tomas Smetana 2010-06-21 07:32:12 UTC
Hello,
  this bug seems to have even uglier consequences (e.g., the "wait" command doesn't return when the child process terminates). More customers are being hit by this problem, so please reconsider including the fix in the next minor release.

I think proposal 1 from comment #7 sounds like a safe enough option.

Thank you and regards.

Comment 15 RHEL Program Management 2010-07-19 07:54:39 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.

Comment 19 RHEL Program Management 2010-08-09 18:45:06 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 23 RHEL Program Management 2011-05-31 13:51:10 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 24 RHEL Program Management 2011-09-23 00:22:18 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 28 RHEL Program Management 2012-04-02 14:17:53 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 31 Jaroslav Škarvada 2012-06-15 09:17:43 UTC
Technical note added. Its contents are identical to the Doc Text field above.

Comment 34 errata-xmlrpc 2013-01-08 04:51:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0122.html