Bug 2332

Summary: "bad window argument" error caused by implementation of Tk_CoordsToWindow with Tcl and Tix
Product: [Retired] Red Hat Linux
Reporter: Michael McConachie <michael>
Component: tcltk
Assignee: Jens Petersen <petersen>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Version: 5.2
CC: claudiol, support
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 1999-05-11 15:47:36 UTC

Description Michael McConachie 1999-04-22 22:47:09 UTC
Our applications use RH 5.2, TCL/TK and Tix in particular.
When we're running them with a remote X server, and
sometimes even locally, the performance is unacceptable.
Even worse, the applications crash every now and then.

The culprit is the "containing" function,
which returns the TK widget the mouse cursor is over (see
.../tcl8_0/tk8.0.3/generic/tkCmds.c:1313 and
.../tcl8_0/tk8.0.3/unix/tkUnixWm.c:3425).

Tk_CoordsToWindow calls XTranslateCoordinates repeatedly,
and XTranslateCoordinates is a synchronous function
(according to the Xlib Reference Manual it "should be avoided
in most applications"). The Tix widgets call this function
all the time and so it takes up to several seconds for the
application to react to a mouse click.

The implementation of Tk_CoordsToWindow can even crash. It
calls XTranslateCoordinates first with the root window, then
with one of its children, then with one of its grandchildren
and so on. However, the X function may actually return a
window that doesn't belong to the calling application. If
that application exits during this process, the X server
reports an error (bad window argument) and the client exits
without returning from XTranslateCoordinates. This happens
to us all the time because the Tix widgets spend virtually
all of their time in Tk_CoordsToWindow and it is very likely
that one of the applications exits while another one is
inside Tk_CoordsToWindow.

Possible Solutions
==================

A. Never call "containing". It is a bad idea to call
"containing" because it is synchronous. We have identified
the few locations in TK and Tix code where it is called. It
should be possible to turn those calls into asynchronous
event handlers, since the Enter/Leave/MotionNotify events
contain the needed coordinate and window information.

B. Fix "containing". We can have "containing" process and
cache mouse events and record coordinates. That way we would
minimize the need to call the X server by doing it only when
the mouse has moved. Additionally, we must make sure not to
crash when a window is destroyed.
   B1. Catch the error in "containing" and restart the
serial query when a window disappears.
   B2. Record the full window hierarchy of the application
(at the moment only a subset of the windows are recorded)
and never query the server about windows of other
applications.
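
A rough sketch of the caching idea in option B, written at the Tcl level
rather than inside Tk_CoordsToWindow. Everything in the ptrcache namespace
is a hypothetical helper, not part of Tk or Tix. The lookup command is a
variable so the sketch can be exercised without a display; in a Tk
application it would stay at its default, "winfo containing". The sketch
only covers the "ask the server again only when the mouse has moved" part,
and it assumes the cache is invalidated whenever a window is destroyed.

```tcl
# Hypothetical helper, not part of Tk or Tix: remember the last pointer lookup.
namespace eval ptrcache {
    variable lookup {winfo containing} ;# command that does the real query
    variable lastX ""
    variable lastY ""
    variable lastWin ""
    variable valid 0                   ;# 0 means the cached answer must not be used
}

# Return the window at root coordinates x,y, reusing the cached answer
# when the coordinates are the same as in the previous call.
proc ptrcache::containing {x y} {
    variable lookup
    variable lastX
    variable lastY
    variable lastWin
    variable valid
    if {!$valid || $x != $lastX || $y != $lastY} {
        # Pointer moved (or cache was invalidated): do the expensive query once.
        set lastWin [uplevel #0 $lookup [list $x $y]]
        set lastX $x
        set lastY $y
        set valid 1
    }
    return $lastWin
}

# Forget the cached answer, e.g. from a <Destroy> binding, so that a
# stale window path is never returned.
proc ptrcache::invalidate {} {
    variable valid
    set valid 0
}
```

In a Tk application one would presumably add something like
bind all <Destroy> {+ptrcache::invalidate}. Note that the cached answer can
still go stale if windows are mapped, moved or restacked while the pointer
stays put, so this is a mitigation and not a full fix.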

Comment 1 Michael McConachie 1999-04-22 22:51:59 UTC
We have identified a pretty clean fix for our immediate problem. You
may want to communicate it to the maintainer of Tix. The fix totally
eliminates calls to the offending synchronous function "containing".
The Tix Balloon module was basically

  1. calling "containing" for each balloon object in the application,
  2. calling "containing" for each mouse motion event,
  3. calling "containing" every 200 ms just for fun.

The Balloon module doesn't have to call "containing" at all because
the callback event already supplies the window in %W. Furthermore it's
more efficient to use the <Enter> event instead of mouse motion
events. And finally, even the source code admits that the 200 ms
verification is not needed.
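
The point about %W can be sketched independently of Tix as follows. The
names pointerWin and notePointer are made up for this illustration and do
not appear in Tix; the sketch only assumes that <Enter> and <Leave> events
report the affected widget in %W, so the window under the pointer can be
tracked without any call to "winfo containing".

```tcl
# Hypothetical names (pointerWin, notePointer): not part of Tk or Tix.
set pointerWin ""

# Record the widget reported by an <Enter> or <Leave> event.
# type is "enter" or "leave", w is the widget path taken from %W.
proc notePointer {type w} {
    global pointerWin
    if {[string compare $type "enter"] == 0} {
        set pointerWin $w
    } elseif {[string compare $pointerWin $w] == 0} {
        # The pointer left the window we were tracking.
        set pointerWin ""
    }
}

# Install the bindings only when Tk is loaded, so the procs can also be
# sourced in a plain tclsh.
if {[llength [info commands bind]]} {
    bind all <Enter> {+notePointer enter %W}
    bind all <Leave> {+notePointer leave %W}
}
```

Code that previously called "winfo containing" on every motion event would
read pointerWin instead, which costs no round trip to the X server.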

Here's the diff:
======================================================================
*** /usr/lib/tix4.1/Balloon.tcl Sun Oct 11 00:15:51 1998
--- /home/athena5/sekino/Balloon.tcl    Wed Apr 21 15:22:11 1999
***************
*** 116,136 ****
  # "RAW" event bindings:
  #-------------------------------------------------------------------

! bind all <B1-Motion>      "+tixBalloon_XXMotion %X %Y 1"
! bind all <B2-Motion>      "+tixBalloon_XXMotion %X %Y 2"
! bind all <B3-Motion>      "+tixBalloon_XXMotion %X %Y 3"
! bind all <B4-Motion>      "+tixBalloon_XXMotion %X %Y 4"
! bind all <B5-Motion>      "+tixBalloon_XXMotion %X %Y 5"
! bind all <Any-Motion>             "+tixBalloon_XXMotion %X %Y 0"
! bind all <Leave>                  "+tixBalloon_XXMotion %X %Y %b"
! bind all <Button>                 "+tixBalloon_XXButton   %X %Y %b"
! bind all <ButtonRelease>    "+tixBalloon_XXButtonUp %X %Y %b"
! proc tixBalloon_XXMotion {rootX rootY b} {
      global tixBalloon

      foreach w $tixBalloon(bals) {
!       tixBalloon:XXMotion $w $rootX $rootY $b
      }
  }

--- 116,138 ----
  # "RAW" event bindings:
  #-------------------------------------------------------------------

! #bind all <B1-Motion>             "+tixBalloon_XXMotion %X %Y 1"
! #bind all <B2-Motion>             "+tixBalloon_XXMotion %X %Y 2"
! #bind all <B3-Motion>             "+tixBalloon_XXMotion %X %Y 3"
! #bind all <B4-Motion>             "+tixBalloon_XXMotion %X %Y 4"
! #bind all <B5-Motion>             "+tixBalloon_XXMotion %X %Y 5"
! #bind all <Any-Motion>            "+tixBalloon_XXMotion %X %Y 0"
! bind all <Enter>          "+tixBalloon_XXMotion %W %X %Y 0"
! bind all <Leave>                  "+tixBalloon_XXMotion %W %X %Y %b"
! #bind all <Button>                "+tixBalloon_XXButton   %X %Y %b"
! #bind all <ButtonRelease>    "+tixBalloon_XXButtonUp %X %Y %b"

! proc tixBalloon_XXMotion {cw rootX rootY b} {
      global tixBalloon

+ #    set cw [winfo containing $rootX $rootY]
      foreach w $tixBalloon(bals) {
!       tixBalloon:XXMotion $w $cw $rootX $rootY $b
      }
  }

***************
*** 182,188 ****
      return 1
  }

! proc tixBalloon:XXMotion {w rootX rootY b} {
      upvar #0 $w data

      if {$data(-state) == "none"} {
--- 184,190 ----
      return 1
  }

! proc tixBalloon:XXMotion {w cw rootX rootY b} {
      upvar #0 $w data

      if {$data(-state) == "none"} {
***************
*** 204,210 ****
        return
      }

-     set cw [winfo containing $rootX $rootY]
      if [tixBalloon:GrabBad $w $cw] {
        return
      }
--- 206,211 ----
***************
*** 345,351 ****

      set data(isActive) 1

!     after 200 tixBalloon:Verify $w
  }


--- 346,352 ----

      set data(isActive) 1

! #    after 200 tixBalloon:Verify $w
  }
=====================================================================

      --Bugtrack placed for Marko by mlm

Comment 2 Jeff Johnson 1999-05-11 15:47:59 UTC
This patch has been added to tcltk-8.0.5-30.src.rpm. Thanks

Comment 3 Need Real Name 2002-02-02 05:59:01 UTC
This patch is still being applied (tix-perf-patch), but I believe the problem
has long since been solved (AFAIK it was solved in tk8.3.x in the
TkTranslate coordinates).

Could someone confirm that this is indeed still a problem in either
tix-8.1.3 or tix-8.2.0b3? Otherwise I think it is an unneeded patch.
For completeness I've opened a bug report at tix.sourceforge.net
http://sourceforge.net/tracker/index.php?func=detail&aid=510919&group_id=5649&atid=105649
but AFAIK this bug does not need fixing.

Comment 4 Jens Petersen 2002-12-03 14:36:27 UTC
Thanks - this patch won't be in the coming tix-8.1.3 package.