Bug 121500

Summary: VMWARE: 10G JAVA TOOLS LIKE SRVCTL/DBCA/VIPCA HANG INSIDE VMWARE on RHEL3
Product: Red Hat Enterprise Linux 3 Reporter: Saar <saar.maoz>
Component: glibc Assignee: Jakub Jelinek <jakub>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0 CC: drepper
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-09-30 10:56:28 UTC Type: ---
Attachments:
Description Flags
java control-/ trace 1 none
java control-/ trace 2 none

Description Saar 2004-04-22 05:52:12 UTC
Description of problem:

On RHEL3, all the Java-based Oracle tools, such as dbca, vipca and
srvctl, hang at seemingly random points. We do not see the same hangs
on RHAS2.1 or on SLES8; on those two OSes 10g installs just fine. We
are running RHEL3 U1 on the latest VMware bits, 4.5.1.

There's a workaround: when the application hangs, press control-z,
then bg, then fg, and repeat. Typically 4 rounds of this are more than
enough to kick the application back on track.
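
For anyone who would rather script the workaround above than type it, here is a minimal sketch. It assumes that control-z followed by bg/fg amounts to delivering SIGTSTP and then SIGCONT to the hung process. The function name nudge, its arguments and the 1 second default delay are made up for illustration. It signals a single PID, whereas the terminal signals the whole foreground process group, and whether it actually unsticks the Java tools has not been verified.

```shell
# Hypothetical helper (not part of any Oracle or Red Hat tool):
# stop and resume a process a few times, imitating control-z then bg/fg.
# Usage: nudge PID [ROUNDS] [DELAY_SECONDS]
nudge() {
    pid=$1
    rounds=${2:-4}    # the report says about 4 repetitions are enough
    delay=${3:-1}     # assumed pause between signals
    i=0
    while [ "$i" -lt "$rounds" ]; do
        kill -TSTP "$pid" || return 1   # the signal control-z generates
        sleep "$delay"
        kill -CONT "$pid" || return 1   # the signal bg/fg use to resume
        sleep "$delay"
        i=$((i + 1))
    done
}
```

For example, "nudge 12345" would stop and resume PID 12345 four times, one second apart.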

I've contacted VMware and they claim the problem is most probably not
on their side. We are talking about a plain vanilla 10g install,
nothing fancy.

I've got a VM which reproduces the problem, or you can install 10g
yourself and within a few minutes you'll see what I'm talking about.

Pressing control-/ on a Java program prints the stack, so I issue:

 srvctl status database -d O10G

and it hangs, more often on slower machines than on faster CPUs. I have
two Java stack traces (control-/) but am not sure how to upload them to
bugzilla. The first run hung before printing anything; the second time
I was lucky and it hung after printing the status for one instance.
When it hangs, the only thing that releases it is control-z/bg/fg as
described above.

Both traces are very similar; it is always stuck at:

 "main" prio=1 tid=0x08052368 nid=0x19b7 runnable [bfff9000..bfffa1c8]
         at oracle.ops.mgmt.nativesystem.OCRNative.getKeyValue(Native Method)
         at oracle.ops.mgmt.rawdevice.OCR.getKeyValue(OCR.java:381)
         at oracle.ops.mgmt.rawdevice.OCR.listSubKeys(OCR.java:607)
         at oracle.ops.mgmt.rawdevice.OCRTreeHA.getDatabaseInstances(OCRTreeHA.java:310)
         - locked <0xaaf52d90> (a oracle.ops.mgmt.rawdevice.OCRTreeHA)

On the outside it may seem to hang in different places (dbca, progress,
etc.), but inside it always seems to be in
nativesystem.OCRNative.getKeyValue(Native Method).



Running with strace yields a core dump.

Setting severity to High, since the whole 10g stack is not very useful
with these random hangs. Feel free to drop the priority since it's
VMware.


Version-Release number of selected component (if applicable):

VMware Workstation 4.5.1, RHEL3 U1, kernel 2.4.21.9el

How reproducible:


Steps to Reproduce:
1. Install 10g on RHEL3.
2. Use dbca to create a DB, or use srvctl to display the status of a
database: "srvctl status db -d O10G"
  

I can also provide a vmware image that shows the problem.

Actual results:


Expected results:


Additional info:

Comment 1 Saar 2004-04-22 14:42:05 UTC
Created attachment 99632 [details]
java control-/ trace 1

Comment 2 Saar 2004-04-22 14:42:36 UTC
Created attachment 99633 [details]
java control-/ trace 2

Comment 3 Saar 2004-05-18 03:51:15 UTC
we're pretty sure this is solved by setting:

export LD_ASSUME_KERNEL=2.4.19

Lowering severity to normal, since this is a better workaround.
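
If exporting the variable for the whole session is undesirable, the setting can be limited to a single command. A minimal sketch, assuming the tools only need LD_ASSUME_KERNEL in their own environment; the wrapper name with_assume_kernel is invented for illustration, and the srvctl invocation is the one from the description.

```shell
# Hypothetical wrapper: set LD_ASSUME_KERNEL for one command only,
# instead of exporting it for the whole shell session.
with_assume_kernel() {
    LD_ASSUME_KERNEL=2.4.19 "$@"
}

# Example, using the srvctl call from the description:
#   with_assume_kernel srvctl status database -d O10G
```

Other programs started from the same shell keep the default setting, which makes it easier to tell whether the variable is what cures the hang.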

Comment 4 Jakub Jelinek 2004-05-25 15:43:56 UTC
Are you able to reproduce this with U2?
There was a problem with the pthread_cond_broadcast implementation
in U1. It was worked around in U2 (by disabling FUTEX_REQUEUE use) and
will hopefully be fixed for real, without workarounds, in U3
(the kernel part of the changes is now awaiting upstream acceptance).
Assuming it is accepted, it will be backported for U3.

Comment 5 Saar 2004-06-12 17:41:51 UTC
Programs don't hang in Update 2. This bug can probably be closed.

Comment 6 Ulrich Drepper 2004-09-30 10:56:28 UTC
Closing as per reporter's request.