Bug 161483 - Please include _Jv_CondWait patch [was: Eclipse CVS checkouts hang at final synchronization]
Product: Fedora
Classification: Fedora
Component: gcc
Hardware: i686 Linux
Priority: medium, Severity: high
Assigned To: Jakub Jelinek
Reported: 2005-06-23 13:30 EDT by Robin Green
Modified: 2007-11-30 17:11 EST (History)
Fixed In Version: 4.0.2-8.fc4
Doc Type: Bug Fix
Last Closed: 2005-11-26 06:57:14 EST

Attachments
nativifyjar.sh (580 bytes, application/x-shellscript)
2005-10-13 15:18 EDT, Andrew Overholt
all.sh (516 bytes, application/x-shellscript)
2005-10-13 15:20 EDT, Andrew Overholt
statements to help trigger the bug (3.49 KB, patch)
2005-11-15 15:16 EST, Andrew Overholt
Script to rebuild ant plugins (657 bytes, text/plain)
2005-11-24 08:49 EST, Andrew Haley

External Trackers
GNU Compiler Collection: bug 25016

Description Robin Green 2005-06-23 13:30:17 EDT
Description of problem:
A moderately-sized CVS checkout hangs forever - it gets stuck at "Updating CVS
synchronization information". This only occurs on GNU java, not on the Sun 1.4.2
VM. I don't think this is the same as bug 151832, because Andrew Overholt
reports that bug 151832 occurs on both Sun and GNU runtimes.

This bug sometimes prevents Eclipse from shutting down normally, because even
when you do Cancel job, Send to background, and then File:Quit, Eclipse still
waits indefinitely for the hung task to finish.

Version-Release number of selected component (if applicable):
eclipse-platform-3.1.0_fc-0.RC3.2 (built by me from rawhide CVS)

How reproducible:

Steps to Reproduce:
1. Switch to CVS perspective
2. Add a new server with the parameters shown here:
3. Open the server
4. Open "Versions"
5. Open org.eclipse.emf
6. Right-click on the 200506091102 build and choose Checkout

Actual results:
Files are checked out, but checkout process hangs for a long time at "Updating
CVS synchronization information".

Expected results:
Checkout should succeed in a reasonable amount of time.

Additional info:
I looked at the network traffic after it had got stuck for some minutes. netstat
shows there is a single TCP connection to eclipse.org on port 2401, but ethereal
shows that no packets were sent or received on that port for some minutes.

CVS compression level is the default.
Comment 1 Andrew Overholt 2005-07-15 16:19:04 EDT
I'm experiencing this now as well.


Comment 2 Robin Green 2005-07-22 10:48:04 EDT
On my machine, an oprofile run taken while Eclipse seemed to be hanging shows
activity concentrated in interpreted code:

# opreport -l -t 1 'image:*gcj*'
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        app name                 symbol name
3193     26.0547  libgcj.so.6.0.0          _Jv_InterpMethod::run(void*, ffi_raw*)
928       7.5724  libgcj.so.6.0.0          _Jv_CondWait(_Jv_ConditionVariable_t*, _Jv_Mutex_t*, long long, int)
880       7.1807  libgcj.so.6.0.0          java::lang::Object::wait(long long, int)
796       6.4953  libgcj.so.6.0.0          _Jv_platform_gettimeofday()
769       6.2750  libgcj.so.6.0.0          __i686.get_pc_thunk.bx
564       4.6022  libgcj.so.6.0.0          GC_mark_from
448       3.6557  libgcj.so.6.0.0          ffi_closure_raw_SYSV
412       3.3619  libgcj.so.6.0.0          java::lang::System::currentTimeMillis()
240       1.9584  org.eclipse.core.runtime_3.1.0.jar.so org::eclipse::core::internal::jobs::Semaphore::acquire(long long)
183       1.4933  libgcj.so.6.0.0          .plt
149       1.2158  libgcj.so.6.0.0          _Jv_equalUtf8Consts(_Jv_Utf8Const const*, _Jv_Utf8Const const*)
147       1.1995  libgcj.so.6.0.0          ffi_raw_call
129       1.0526  libgcj.so.6.0.0          _Jv_MonitorEnter
125       1.0200  libgcj.so.6.0.0

Unfortunately, due to the lack of a SIGQUIT handler for gij, I am not sure how
to easily determine what code is being executed. (The cacao VM has a SIGQUIT
handler, but it completes the CVS synchronization in a reasonable amount of time
[although still slowly].)
Comment 3 Robin Green 2005-07-22 18:46:44 EDT
[Removing bogus bug alias, which was inserted by a buggy firefox plugin.]
Comment 4 Robin Green 2005-07-24 10:13:45 EDT
After fixing bug 163969, I was able to get a clearer picture of what was going
on with oprofile. Again, I sampled the hang for a few minutes - I didn't wait
for it to complete.

I was doing other stuff at the time with my machine, but according to top, the
vast majority of the time was spent in gij.

However, the oprofile report for the entire system starts like this:

CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        image name               app name                 symbol name
560026   71.5807  no-vmlinux               no-vmlinux               (no symbols)
19564     2.5006  libgklayout.so           libgklayout.so           (no symbols)
18942     2.4211  anon (tgid:13508 range:0x8b2000-0x8b3000) gij     (no symbols)
12236     1.5640  libmozjs.so              libmozjs.so              (no symbols)
8833      1.1290  libxpconnect.so          libxpconnect.so          (no symbols)
8651      1.1057  anon (tgid:4986 range:0x88cb000-0x88d0000) Xorg   (no symbols)
8498      1.0862  libgcc_s.so.1            libgcc_s.so.1            __moddi3
8306      1.0616  libgcc_s.so.1            libgcc_s.so.1            __divdi3
7345      0.9388  libxpcom.so              libxpcom.so              (no symbols)
6085      0.7778  anon (tgid:4986 range:0x88dc000-0x8911000) Xorg   (no symbols)
6032      0.7710  libgcj.so.6.0.0          libgcj.so.6.0.0          _Jv_CondWait(_Jv_ConditionVariable_t*, _Jv_Mutex_t*, long long, int)
5885      0.7522  libgcj.so.6.0.0          libgcj.so.6.0.0          java::lang::Object::wait(long long, int)
5401      0.6903  anon (tgid:4986 range:0x87ce000-0x87ef000) Xorg   (no symbols)
5236      0.6692  libpython2.4.so.1.0      libpython2.4.so.1.0      (no symbols)
5160      0.6595  libpthread-2.3.90.so     libpthread-2.3.90.so
5152      0.6585  libgcj.so.6.0.0          libgcj.so.6.0.0

I suspect that gij is generating a vast number of syscalls, which is causing the
kernel to burn up most of the CPU time.

Now to isolate the gij parts of the profile:

CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        image name               app name                 symbol name
18942    38.5737  anon (tgid:13508 range:0x8b2000-0x8b3000) gij     (no symbols)
6032     12.2836  libgcj.so.6.0.0          libgcj.so.6.0.0          _Jv_CondWait(_Jv_ConditionVariable_t*, _Jv_Mutex_t*, long long, int)
5885     11.9843  libgcj.so.6.0.0          libgcj.so.6.0.0          java::lang::Object::wait(long long, int)
5152     10.4916  libgcj.so.6.0.0          libgcj.so.6.0.0
3032      6.1744  libgcj.so.6.0.0          libgcj.so.6.0.0
2182      4.4434  libgcj.so.6.0.0          libgcj.so.6.0.0
1421      2.8937  org.eclipse.core.runtime_3.1.0.jar.so org.eclipse.core.runtime_3.1.0.jar.so org::eclipse::core::internal::jobs::Semaphore::acquire(long long)
1387      2.8245  libgcj.so.6.0.0          libgcj.so.6.0.0          .plt
910       1.8531  libgcj.so.6.0.0          libgcj.so.6.0.0          java::lang::Object::wait(long long)
735       1.4968  libgcj.so.6.0.0          libgcj.so.6.0.0          GC_mark_from
318       0.6476  libgcj.so.6.0.0          libgcj.so.6.0.0          _Jv_InterpMethod::run(void*, ffi_raw*)
110       0.2240  libgcj.so.6.0.0          libgcj.so.6.0.0

Clearly, a major part of the CPU time spent in the gij process goes to locking
code. (What is that anonymous map with 38.6% of the CPU time, though? It could
just be an artifact of an oprofile reporting bug, I suppose.)
Comment 5 Robin Green 2005-07-24 10:19:22 EDT
So, could be:

1. libgcj's locking code being too slow
2. eclipse calling libgcj's locking code a ridiculously high number of times
or 3. A combination of 1 and 2.

I'm going to try and look into possibility #2.
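
To make possibility #2 concrete, here is a minimal sketch of a semaphore whose acquire loops around a timed Object.wait and recomputes the remaining time with System.currentTimeMillis(), which are the same calls that show up in the profiles above. TimedSemaphore and its waitCalls counter are hypothetical names written for this illustration; this is not the Eclipse Semaphore source, and it assumes a modern JVM.

```java
/**
 * Hypothetical counting semaphore with a timed acquire, written only to
 * illustrate the wait()/currentTimeMillis() pattern seen in the profile.
 * It is NOT the Eclipse org.eclipse.core.internal.jobs.Semaphore source.
 */
class TimedSemaphore {
    private int permits = 0;
    // Number of times acquire() actually entered Object.wait(long).
    private long waitCalls = 0;

    /** Returns true if a permit was taken within delayMillis, else false. */
    synchronized boolean acquire(long delayMillis) {
        long start = System.currentTimeMillis();
        long timeLeft = delayMillis;
        while (true) {
            if (permits > 0) {
                permits--;
                return true;
            }
            if (timeLeft <= 0) {
                return false;
            }
            waitCalls++;
            try {
                // Each pass is one trip into the runtime's condition wait.
                wait(timeLeft);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            // Recompute the remaining time after every wakeup.
            timeLeft = start + delayMillis - System.currentTimeMillis();
        }
    }

    /** Adds one permit and wakes every waiter so each re-checks the count. */
    synchronized void release() {
        permits++;
        notifyAll();
    }

    synchronized long waitCalls() {
        return waitCalls;
    }
}
```

If wait() returns much earlier than requested, a loop of this shape spins through wait() and currentTimeMillis() over and over, so a counter like waitCalls is one cheap way to tell "called too often" apart from "each call is slow".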
Comment 6 Andrew Haley 2005-07-28 14:03:12 EDT
Oddly, I can't duplicate this at all.  My checkout ends with:

    U org.eclipse.emf/tests/org.eclipse.emf.tests-feature/feature.properties
    U org.eclipse.emf/tests/org.eclipse.emf.tests-feature/feature.xml
    U org.eclipse.emf/tests/org.eclipse.emf.tests-feature/license.html
    cvs checkout: Updating
    U org.eclipse.emf/tests/org.eclipse.emf.tests-feature/rootfiles/epl-v10.html
    U org.eclipse.emf/tests/org.eclipse.emf.tests-feature/rootfiles/notice.html
ok (took 2:01.000)
Comment 7 Andrew Haley 2005-07-28 14:10:46 EDT
Ah, spoke too soon.  Exiting Eclipse doesn't terminate.

Backtrace is:

#0  0x0711d182 in java::lang::Object::wait (this=@a83e8d70,
timeout=9223372036854774393, nanos=0) at ./sysdep/locks.h:43
#1  0x07133081 in java.lang.Object.wait(long) (this=@a83e8d70,
    at ../../../libjava/java/lang/Object.java:449
#2  0xb44ea9ae in org::eclipse::core::internal::jobs::Semaphore::acquire ()
   from /usr/lib/eclipse/plugins/org.eclipse.core.runtime_3.1.0.jar.so
#3  0xb44e9996 in org::eclipse::core::internal::jobs::OrderedLock::doAcquire ()
   from /usr/lib/eclipse/plugins/org.eclipse.core.runtime_3.1.0.jar.so
#4  0xb44e96e2 in org::eclipse::core::internal::jobs::OrderedLock::acquire ()
   from /usr/lib/eclipse/plugins/org.eclipse.core.runtime_3.1.0.jar.so
#5  0xb44e966a in org::eclipse::core::internal::jobs::OrderedLock::acquire ()
   from /usr/lib/eclipse/plugins/org.eclipse.core.runtime_3.1.0.jar.so
#6  0xa88f5fda in
   from /usr/lib/eclipse/plugins/org.eclipse.team.cvs.core_3.1.0/cvs.jar.so
#7  0xa88f4a56 in
   from /usr/lib/eclipse/plugins/org.eclipse.team.cvs.core_3.1.0/cvs.jar.so
#8  0xa88f2c88 in
org::eclipse::team::internal::ccvs::core::resources::EclipseFolder::isCVSFolder ()
   from /usr/lib/eclipse/plugins/org.eclipse.team.cvs.core_3.1.0/cvs.jar.so
#9  0xa88f2faf in
org::eclipse::team::internal::ccvs::core::resources::EclipseFolder::isIgnored ()
   from /usr/lib/eclipse/plugins/org.eclipse.team.cvs.core_3.1.0/cvs.jar.so
#10 0xa88c7b7c in
org::eclipse::team::internal::ccvs::core::CVSSyncTreeSubscriber::isSupervised ()
   from /usr/lib/eclipse/plugins/org.eclipse.team.cvs.core_3.1.0/cvs.jar.so
#11 0xadce3909 in
org::eclipse::team::core::variants::ResourceVariantTreeSubscriber::members ()
   from /usr/lib/eclipse/plugins/org.eclipse.team.core_3.1.0/team.jar.so
#12 0xadcf7be4 in
() from /usr/lib/eclipse/plugins/org.eclipse.team.core_3.1.0/team.jar.so
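
One detail worth noting in frame #0 is the timeout value, which sits just below Long.MAX_VALUE. The sketch below only does arithmetic on that number: it shows how far it is from the maximum, and that adding any present-day millisecond clock value to it wraps a signed 64-bit long. Whether libgcj's _Jv_CondWait computed its deadline this way is an assumption that this thread does not confirm; the class and method names are hypothetical.

```java
/**
 * Arithmetic on the timeout seen in frame #0 of the backtrace.
 * Hypothetical helper; it does not reproduce any libgcj code.
 */
class TimeoutArithmetic {
    /** Timeout copied from frame #0 of the backtrace. */
    static final long OBSERVED_TIMEOUT_MS = 9223372036854774393L;

    /** How far the observed timeout falls short of Long.MAX_VALUE. */
    static long shortfallFromMax() {
        return Long.MAX_VALUE - OBSERVED_TIMEOUT_MS;
    }

    /** True if nowMs + timeoutMs would wrap a signed 64-bit long (nowMs >= 0). */
    static boolean deadlineOverflows(long nowMs, long timeoutMs) {
        return timeoutMs > 0 && nowMs > Long.MAX_VALUE - timeoutMs;
    }

    /** Absolute deadline clamped to Long.MAX_VALUE instead of wrapping. */
    static long saturatingDeadline(long nowMs, long timeoutMs) {
        return deadlineOverflows(nowMs, timeoutMs) ? Long.MAX_VALUE : nowMs + timeoutMs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println("shortfall from Long.MAX_VALUE (ms): " + shortfallFromMax());
        System.out.println("now + timeout overflows: " + deadlineOverflows(now, OBSERVED_TIMEOUT_MS));
        System.out.println("saturating deadline: " + saturatingDeadline(now, OBSERVED_TIMEOUT_MS));
    }
}
```

A shortfall of this size is what a caller would produce by asking for Long.MAX_VALUE milliseconds and then subtracting the time already elapsed, so any code that turns such a relative timeout into an absolute deadline needs a guard along the lines of saturatingDeadline.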
Comment 8 Andrew Overholt 2005-07-28 14:18:11 EDT
This is only on rawhide stuff.  3.1M6 that we shipped with FC4 doesn't have this
problem.

Try checking out GNU Classpath as the example.  This happens with or without the
corresponding org.eclipse.team.cvs.core_3.1.0.jar.{so,db}.
Comment 9 Thomas Fitzsimmons 2005-08-16 12:03:22 EDT
FWIW, I found that disabling:

Preferences -> CVS -> Synchronize/Compare -> Consider file contents in comparisons

avoids this deadlock.  I doubt this info will help fix the real problem but it's
a workaround for those using native eclipse.
Comment 10 Andrew Overholt 2005-09-19 14:54:59 EDT
I've been working on this.  It appears that the patch for Eclipse bug 89416 is
what brings this libgcj issue to light.  I'm trying to create a test case for
the libgcj guys, but it's a very time-consuming process.  This is the only bug
that is standing in the way of Eclipse 3.1 hitting FC4, IMO.
Comment 11 Andrew Overholt 2005-09-20 16:10:21 EDT
If I take off CVS Decorators (Window->Preferences->Team->CVS->Label Decorations)
- all of them - I can get GNU Classpath to check out.
Comment 12 Andrew Overholt 2005-10-11 07:21:10 EDT
It appears this problem goes away when using a natively-compiled (with -O2)
org.eclipse.core.resources_3.1.0.jar.  aot-compile-rpm for FC4 did not pass
-fjni when natively compiling jars.  As a result, certain jars
(o.e.c.resources, o.e.swt.gtk.linux, etc.) were not being used in their
BC-compiled form.  This can be verified by running eclipse with -vmargs
-verbose and looking in the output for "(bytecode)" (minus the quotes).

I've built a new version in rawhide (-15) which I'll test and then hopefully we
can get things back-ported to FC4 soon.
Comment 13 Bryce McKinlay 2005-10-13 14:47:03 EDT
Unfortunately it turns out that the problem isn't fixed by -O2 compilation.
Andrew has noticed that when running Eclipse using the gij -verbose option, the
problem does not occur. Our best guess at this point is that this is a timing
issue/race condition, as subtle changes to the environment (inserting debug
print statements, etc) also make the problem come and go. We've also noticed
that it happens all the time on a hyperthreaded machine, but only sometimes on a
single-threaded CPU.
Comment 14 Andrew Overholt 2005-10-13 15:17:12 EDT
Repeating what I said at [1]:

How to set things up to figure out WTF is going on with CVS checkouts in
Eclipse with libgcj -- Andrew Overholt

Last modified:  2005-10-13

. install Eclipse 3.1.1 from rawhide

  yum --enablerepo=development install eclipse-pde-devel

. start it using the Sun (or other proprietary) JVM

  mkdir -p ~/workspaces
  eclipse -vm <path to JVM>/bin/java -data ~/workspaces/cvs3.1.1

. import the source for the plug-ins we need:

  Import->External Plug-ins and Fragments
    . check "Projects with source folders"
    . next
    . pick org.eclipse.team.cvs.core, hit "Add ->"
    . hit "Required Plug-ins ->"
    . finish

. create ant build files for each

  foreach(org.eclipse.ant.core, org.eclipse.core.resources,
  	org.eclipse.core.runtime, org.eclipse.core.variables,
  	org.eclipse.team.core, org.eclipse.team.cvs.core,
    . right-click on plugin.xml
    . PDE Tools->Create Ant Build File
    . right-click on build.xml
    . Run As -> Ant Build...
    . check "build.update.jar", "build.jars" (should be checked), "clean",
    . change the order to be:  clean, build.jars, build.update.jar, refresh
    . hit Apply and then Run

. in another terminal, create a directory to store the native bits of the above

  mkdir -p ~/eclipse/cvsissues/nativebits

. get scripts to make nativifying easy (attached)

  cd ~/eclipse/cvsissues/nativebits
  wget http://overholt.ca/nativifyjar.sh
  wget http://overholt.ca/all.sh

. make the jar and db files for the above jars point to your modified ones:

  pushd /usr/lib/gcj/eclipse
    sudo mv org.eclipse.ant.core_3.1.1.jar.db{,.bak}
    sudo mv org.eclipse.core.resources_3.1.0.jar.db{,.bak}
    sudo mv org.eclipse.core.runtime_3.1.1.jar.db{,.bak}
    sudo mv org.eclipse.core.variables_3.1.0.jar.db{,.bak}
    sudo mv org.eclipse.team.core_3.1.1.jar.db{,.bak}
    sudo mv org.eclipse.team.cvs.core_3.1.1.jar.db{,.bak}
    sudo mv org.eclipse.update.configurator_3.1.0.jar.db{,.bak}
    sudo ln -s ~/eclipse/cvsissues/nativebits/org.eclipse.ant.core_3.1.1.jar.db
    sudo ln -s
    sudo ln -s ~/eclipse/cvsissues/nativebits/org.eclipse.core.runtime_3.1.1.jar.db
    sudo ln -s \
    sudo ln -s ~/eclipse/cvsissues/nativebits/org.eclipse.team.core_3.1.1.jar.db
    sudo ln -s \
    sudo ln -s \
  pushd /usr/share/eclipse/plugins
    sudo mv org.eclipse.ant.core_3.1.1.jar /tmp
    sudo mv org.eclipse.core.resources_3.1.0.jar /tmp
    sudo mv org.eclipse.core.runtime_3.1.1.jar /tmp
    sudo mv org.eclipse.core.variables_3.1.0.jar /tmp
    sudo mv org.eclipse.team.core_3.1.1.jar /tmp
    sudo mv org.eclipse.team.cvs.core_3.1.1.jar /tmp
    sudo mv org.eclipse.update.configurator_3.1.0.jar /tmp
    sudo ln -s \
    sudo ln -s \
    sudo ln -s \
    sudo ln -s \
    sudo ln -s \
    sudo ln -s \
    sudo ln -s \

. make changes in Eclipse as you see fit, re-run the Ant build for the
  encompassing plug-in each time (right-click on build.xml -> Run As -> Ant
  Build), re-run the native part for either all jars (all.sh) or just the one
  you changed:

  sudo rebuild-gcj-db


 one or more of:
  ./nativifyjar.sh org.eclipse.ant.core 3.1.1 ~/workspaces/cvs3.1.1
  ./nativifyjar.sh org.eclipse.core.resources 3.1.0 ~/workspaces/cvs3.1.1
  ./nativifyjar.sh org.eclipse.core.runtime 3.1.1 ~/workspaces/cvs3.1.1
  ./nativifyjar.sh org.eclipse.core.variables 3.1.0 ~/workspaces/cvs3.1.1
  ./nativifyjar.sh org.eclipse.team.core 3.1.1 ~/workspaces/cvs3.1.1
  ./nativifyjar.sh org.eclipse.team.cvs.core 3.1.1 ~/workspaces/cvs3.1.1
  ./nativifyjar.sh org.eclipse.update.configurator 3.1.0 ~/workspaces/cvs3.1.1

 followed by:
  sudo rebuild-gcj-db

. enable tracing if you want by editing <plugin>/.options and then tacking a
  -debug ~/workspaces/cvs3.1.1/<plugin>/.options onto the end of the eclipse
  launch command

. run native eclipse:

  rm -rf ~/workspaces/testCVS
  eclipse -data ~/workspaces/testCVS [-debug ~/workspaces/cvs3.1.1/<plugin>/.options]

. NOTE:  you should also be able to run it from within Eclipse but I can't
  figure out how to get the classes in <project>/bin (which are theoretically
  the same that are in the generated jars) to be mapped to their corresponding

  (in Eclipse, set up a run profile that runs an Eclipse application with gij
  (you can set up an installed JRE for java-gcj-compat pretty easily by using
  /usr/lib/jvm as the location in the Installed JREs dialog))

Comment 15 Andrew Overholt 2005-10-13 15:18:43 EDT
Created attachment 119942
nativifyjar.sh
Comment 16 Andrew Overholt 2005-10-13 15:20:04 EDT
Created attachment 119943
all.sh
Comment 17 Andrew Overholt 2005-10-13 15:30:56 EDT
After much looking into this, I am confident that the time is indeed being spent
in org::eclipse::core::internal::jobs::Semaphore::acquire as others above have
seen with backtraces and oprofile.

This is a synchronized method and AFAICT, there are three different threads
competing:  "Updating Change Sets for CVS Workspace", "Checking out
'classpath'", and "Decoration Calculation".  We aren't dead-locking, but the
lock acquiring and releasing in
is somehow messed up.  We're holding the locks way too long.  Figuring out why
is what I'm currently attempting to do.
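
One way to check a "holding the locks way too long" suspicion outside Eclipse is to time how long each thread keeps a shared monitor. The sketch below is a hypothetical probe (LockHoldProbe, runLocked and contend are names made up here, and it assumes Java 8 or later); it is not code from Eclipse or from the attached patch.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical probe: several named threads contend for one monitor and
 * the longest time any of them held it is recorded.
 */
class LockHoldProbe {
    private final Object lock = new Object();
    private final AtomicLong maxHoldNanos = new AtomicLong();
    private int completed = 0;

    /** Runs work while holding the lock and records the longest hold. */
    void runLocked(Runnable work) {
        synchronized (lock) {
            long start = System.nanoTime();
            try {
                work.run();
            } finally {
                long held = System.nanoTime() - start;
                maxHoldNanos.accumulateAndGet(held, Math::max);
                completed++;
            }
        }
    }

    long maxHoldNanos() {
        return maxHoldNanos.get();
    }

    int completed() {
        synchronized (lock) {
            return completed;
        }
    }

    /** Starts one thread per name; each takes the lock `rounds` times. */
    static LockHoldProbe contend(String[] names, int rounds) {
        LockHoldProbe probe = new LockHoldProbe();
        Thread[] threads = new Thread[names.length];
        for (int i = 0; i < names.length; i++) {
            threads[i] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    probe.runLocked(() -> { });
                }
            }, names[i]);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return probe;
    }

    public static void main(String[] args) {
        String[] names = {
            "Updating Change Sets for CVS Workspace",
            "Checking out 'classpath'",
            "Decoration Calculation"
        };
        LockHoldProbe probe = contend(names, 1000);
        System.out.println("lock acquisitions: " + probe.completed());
        System.out.println("longest hold (ns): " + probe.maxHoldNanos());
    }
}
```

The thread names in main are the three job names mentioned above; the empty Runnable would be replaced by whatever work is suspected of running under the lock.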

There were times when I thought we were mistakenly returning files as being
dirty when they weren't but after adding some sorting code (to compare output
between runs with the Sun JVM and with our stuff) to
I can say for sure that we are not.  Another dead end I went down was seeing if
the setTeamPrivateMember call (I can't remember where this was) was outside the
IWorkspaceRunnable and affecting things but after modifying the code to set that
at Resource creation time, the problem persisted so it's not there, either.  The
calls to System.currentTimeMillis() are not causing delays, either.

I write this all just for posterity :)
Comment 18 Andrew Overholt 2005-10-13 15:59:13 EDT
(In reply to comment #14)
> How to set things up to figure out WTF is going on with CVS checkouts in
> Eclipse with libgcj -- Andrew Overholt
> Last modified:  2005-10-13
> . install Eclipse 3.1.1 from rawhide
>   yum --enablerepo=development install eclipse-pde-devel
> . start it using the Sun (or other proprietary) JVM
>   mkdir -p ~/workspaces
>   eclipse -vm <path to JVM>/bin/java -data ~/workspaces/cvs3.1.1

Here, you're better off using another Eclipse distribution (ie. download the
upstream .tar.gz) since we're going to replace some jars and stuff later on. 
Sorry for the confusion.
Comment 19 Andrew Overholt 2005-11-15 15:16:10 EST
Created attachment 121089
statements to help trigger the bug

This patch contains some commented out statements that I added for debugging
purposes.  When I thought this was fixed on my laptop, re-adding these lines
and following the steps in my earlier comments (about how to hack on this from
within Eclipse and rebuild the jar and jar.so, etc.) exposed the problem again.
 The problem is always present on my HT x86 box.
Comment 20 Andrew Overholt 2005-11-15 15:20:17 EST
(In reply to comment #19)
> This patch contains some commented out statements that I added for debugging
> purposes.

This patch can be applied like this:

expand org.eclipse.team.cvs.core
expand src
expand org.eclipse.team.internal.ccvs.core.resources
right-click on EclipseSynchronizer.java -> Team -> Apply patch
enter the location of the patch file
change "Ignore leading path name segments" to 5
hit "Guess" on maximum fuzz factor
the entries in the patch should now not have red exclamation points beside them
hit finish

Comment 21 Andrew Haley 2005-11-24 06:46:55 EST
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25016

Comment 22 Andrew Haley 2005-11-24 08:49:03 EST
Created attachment 121453
Script to rebuild ant plugins

This script may be useful.  It quickly rebuilds an Eclipse plugin and installs
it, which makes it easy to try out experimental changes.
Comment 23 Andrew Overholt 2005-11-24 10:29:11 EST
I have verified that Andrew's patch does indeed fix this issue.  What a relief!
 Thank you very much, Andrew!
Comment 24 Jakub Jelinek 2005-11-26 06:57:14 EST
gcc-4.0.2-8.fc4 should appear in fc4 testing updates RSN (has been already built
and the update request filed).
Comment 25 Andrew Overholt 2005-11-29 10:02:24 EST
(In reply to comment #24)
> gcc-4.0.2-8.fc4 should appear in fc4 testing updates RSN (has been already built
> and the update request filed).

I have verified that this gcc build does indeed fix this bug.  Thanks!
Comment 26 Ben Konrath 2006-01-04 14:38:03 EST
Confirming this is fixed in updates-testing with the following package versions:

