Bug 1023157

Summary: java-abrt hangs during IPA test
Product: Red Hat Enterprise Linux 7 Reporter: Endi Sukma Dewata <edewata>
Component: java-1.7.0-openjdk Assignee: Pavel Tisnovsky <ptisnovs>
Status: CLOSED DUPLICATE QA Contact: BaseOS QE - Apps <qe-baseos-apps>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.0 CC: edewata, jfilak, jvanek, mkosek, nkinder, ptisnovs, spoore
Target Milestone: rc Keywords: TestBlocker
Target Release: 7.0   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-10-31 09:02:45 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 1018804, 1020711    

Description Endi Sukma Dewata 2013-10-24 18:29:14 UTC
By default RHEL 7 uses "java-abrt" instead of the plain "java" executable. During IPA testing the JVM hangs at different locations in the Java code, which probably indicates a problem in the JVM itself. Replacing "java-abrt" with "java" seems to fix the problem.

The test environment is a little complex. IPA uses CS which runs on Tomcat. The beaker tests are running AD trust tests that involve installing and uninstalling IPA and CS multiple times. The hang usually happens during Tomcat startup, but it's not very consistent. If the test is repeated in a loop, the problem usually appears within the first few iterations.

Please see the following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1018804
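
Since the hang only shows up when the test is repeated, a loop with a time limit can count how often it occurs. The following is a minimal sketch, assuming GNU coreutils timeout(1) is available; the function name count_hangs and the script name in the usage line are hypothetical and not part of the IPA test suite.

```shell
#!/bin/sh
# Run a command repeatedly under a time limit and count the iterations that hang.
# Usage: count_hangs ITERATIONS LIMIT_SECONDS COMMAND [ARGS...]
# Assumption: GNU coreutils timeout(1), which exits with status 124 on timeout.
count_hangs() {
    iterations=$1
    limit=$2
    shift 2
    hangs=0
    i=1
    while [ "$i" -le "$iterations" ]; do
        # Discard the command's output; only its exit status matters here
        timeout "$limit" "$@" >/dev/null 2>&1
        if [ $? -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
        i=$((i + 1))
    done
    # Print the number of iterations that exceeded the limit
    echo "$hangs"
}
```

For example, `count_hangs 10 600 ./run_trust_test.sh` would run a (hypothetical) test script ten times and print how many runs exceeded ten minutes.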

Comment 1 Deepak Bhole 2013-10-24 18:33:47 UTC
Assigning to author of Java-ABRT connector. Pavel, please take a look when you get a chance.

Comment 4 jiri vanek 2013-10-29 09:56:46 UTC
Hi!

In the paths below, substitute .../jvm/java-1.7.0-openjd/ with full_path_to_jdk.

As you probably noted, the java-abrt-connector touches the JDK in two places.
First, .../jvm/java-1.7.0-openjd/jre/bin/java is no longer a binary but a script that launches .../jvm/java-1.7.0-openjd/jre-abrt/bin/java.

The path was set up this way because the JVM binary has hardcoded paths around bin and lib.

Second, if java-abrt-connector is installed, the .../jvm/java-1.7.0-openjd/jre-abrt/bin/java script adds the libjavaabrt.so agent to the .../jvm/java-1.7.0-openjd/jre-abrt/bin/java arguments.

Could you please confirm whether the freeze is caused by the path change or by libjavaabrt itself?

So you need to restore the previous configuration with .../jvm/java-1.7.0-openjd/jre{,-abrt}/bin/{java,lib}, where .../jvm/java-1.7.0-openjd/jre/bin/java is the script launching .../jvm/java-1.7.0-openjd/jre-abrt/bin/java, and run your test suite both with java-abrt-connector installed and with it missing.

In both cases (wrong libjavaagent.so or wrong paths) we will need a reproducer. Also, if your tests, or the applications you are testing, depend on the .../whatever/jre/bin/java path, then they are wrong, and this bug should be reassigned to the authors of the proper components.
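
To make the two variables (relocated path versus loaded agent) easier to separate, the launcher behaviour described above can be sketched roughly as follows. This is an illustration only, not the shipped script: the function name build_java_cmd is hypothetical, the agent library location is passed in as a parameter because it is not given in this report, and the function prints the command line instead of executing it. The only JVM-specific piece is -agentpath:, the standard option for loading a native agent by full path.

```shell
#!/bin/sh
# Hypothetical sketch of a launcher like the one described in this comment.
# Usage: build_java_cmd JDK_HOME AGENT_LIB [JAVA_ARGS...]
# Prints the command line that would be run, so both cases can be compared.
build_java_cmd() {
    jdk_home=$1
    agent_lib=$2
    shift 2
    # The real JVM binary lives under jre-abrt, as described above
    real_java="$jdk_home/jre-abrt/bin/java"
    if [ -e "$agent_lib" ]; then
        # Connector present: load the native agent via the standard JVM option
        echo "$real_java" "-agentpath:$agent_lib" "$@"
    else
        # Connector missing: relocated path only, no agent
        echo "$real_java" "$@"
    fi
}
```

If the hang reproduces with the second form (relocated path, no agent), the paths are the suspect; if it only reproduces with the first form, the agent is.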

Comment 7 Scott Poore 2013-10-29 21:17:23 UTC
I'll let Endi comment on the exact paths as used by pki/tomcat there.   I will say though that he had me try something like this to test and see if it was java-abrt.

In my test script, I added this:

cd /usr/lib/jvm/java/jre/bin
mv java java.old
ln -s ../../bin/java
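
The same link workaround can be written as a reversible pair of helpers, which makes it easier to switch a test machine back and forth. This is a minimal sketch, assuming the layout JDK_HOME/bin/java and JDK_HOME/jre/bin/java; the function names bypass_wrapper and restore_wrapper are hypothetical.

```shell
#!/bin/sh
# bypass_wrapper JDK_HOME
# Move jre/bin/java aside and replace it with a symlink to ../../bin/java.
bypass_wrapper() {
    bin_dir="$1/jre/bin"
    # Refuse to run twice so an existing backup is never overwritten
    if [ -e "$bin_dir/java.old" ]; then
        echo "java.old already exists in $bin_dir" >&2
        return 1
    fi
    # The link target is relative to jre/bin, so it resolves to JDK_HOME/bin/java
    mv "$bin_dir/java" "$bin_dir/java.old" &&
        ln -s ../../bin/java "$bin_dir/java"
}

# restore_wrapper JDK_HOME
# Undo bypass_wrapper by putting the saved file back in place.
restore_wrapper() {
    bin_dir="$1/jre/bin"
    [ -e "$bin_dir/java.old" ] || return 1
    rm -f "$bin_dir/java" && mv "$bin_dir/java.old" "$bin_dir/java"
}
```

On the machines discussed here, the argument would be the JDK directory under /usr/lib/jvm.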

Now, I will also note the following:

1. I was seeing the hang occur in RHEL7 release 20131011.n.0.
  
2. I have run the tests 10 times in the last two days for release 20131018.n.0 and not seen a hang.

3. When I apply the java link workaround to 20131011.n.0 I haven't seen the issue yet.

So, was something fixed between 20131011.n.0 and 20131018.n.0 that may have resolved the issue?  Or just dumb luck that I haven't seen it again yet on the newer release?

I'm not sure yet if we have a good clean reproducer but, if necessary, I can help reproduce with IPA tests.

Thanks,
Scott

Comment 8 jiri vanek 2013-10-30 09:40:09 UTC
(In reply to Scott Poore from comment #7)
> I'll let Endi comment on the exact paths as used by pki/tomcat there.   I
> will say though that he had me try something like this to test and see if it
> was java-abrt.
> 
> In my test script, I added this:
> 
> cd /usr/lib/jvm/java/jre/bin
> mv java java.old
> ln -s ../../bin/java
> 
> Now, I will also note the following:
> 
> 1. I was seeing the hang occur in RHEL7 release 20131011.n.0.
>   
> 2. I have run the tests 10 times in the last two days for release
> 20131018.n.0 and not seen a hang.
> 
> 3. When I apply the java link workaround to 20131011.n.0 I haven't seen the
> issue yet.
> 
> So, was something fixed between 20131011.n.0 and 20131018.n.0 that may have
> resolved the issue?  Or just dumb luck that I haven't seen it again yet on
> the newer release?
> 
> I'm not sure yet if we have a good clean reproducer but, if necessary, I can
> help reproduce with IPA tests.
> 
> Thanks,
> Scott

ABRT was made a soft dependency. Could you rerun the tests with java-abrt-connector installed?

Comment 9 Jakub Filak 2013-10-30 13:11:35 UTC
Jiri, this bug report looks like a duplicate of bug #1012827.

Comment 10 Scott Poore 2013-10-30 14:48:27 UTC
Jiri, 

Ah I see.  

On the builds where it's hanging we have this installed:

abrt-java-connector-1.0.5-1.el7.x86_64

On the newer builds that is not installed.

I am running tests now so this should get installed with the newer rhel7 release.  I'll post results back here when I have them.

Thanks,
Scott

Comment 11 Endi Sukma Dewata 2013-10-30 16:31:45 UTC
Hi, just FYI when I reproduced the problem I was using the default java which on my test machine was:
/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.40-2.4.2.6.el7.x86_64/jre/bin/java
I saw that it's actually executing java-abrt with the abrt-java-connector.

As Scott said, I replaced the "java" file there with a soft link to:
/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.40-2.4.2.6.el7.x86_64/bin/java
This way it wasn't using java-abrt nor the abrt-java-connector anymore.

Unfortunately I don't have the test machine anymore but hopefully Scott will be able to get some results.

Comment 12 Scott Poore 2013-10-30 16:34:00 UTC
Jiri, Jakub,

I saw 2 of my 4 test runs hang.  So, it does appear that abrt-java-connector is involved.

Is there anything else we need to do here to confirm if this is a dup of bug #1012827 or is this enough?

Thanks,
Scott

Comment 13 Jakub Filak 2013-10-30 16:51:59 UTC
Scott, Jiri,

I am pretty sure that this bug is a dup of bug #1012827
It has exactly the same symptoms as we observed together with Pavel.

Comment 14 Endi Sukma Dewata 2013-10-30 17:34:07 UTC
Here's the stack trace from Scott's test. It looks like one of the threads is stuck waiting to enter a critical section within abrt-java-connector.

#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f5e9a27fa4b in os::PlatformEvent::park (this=0x7f5e94017e00)
    at /usr/src/debug/java-1.7.0-openjdk-1.7.0.45-2.4.3.1.el7.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:5440
#2  0x00007f5e9a167be4 in JvmtiRawMonitor::SimpleEnter (
    this=this@entry=0x7f5e94011280, Self=Self@entry=0x7f5e94017000)
    at /usr/src/debug/java-1.7.0-openjdk-1.7.0.45-2.4.3.1.el7.x86_64/openjdk/hotspot/src/share/vm/prims/jvmtiRawMonitor.cpp:147
#3  0x00007f5e9a1681f5 in JvmtiRawMonitor::raw_enter (
    this=this@entry=0x7f5e94011280, 
    __the_thread__=__the_thread__@entry=0x7f5e94017000)
    at /usr/src/debug/java-1.7.0-openjdk-1.7.0.45-2.4.3.1.el7.x86_64/openjdk/hotspot/src/share/vm/prims/jvmtiRawMonitor.cpp:308
#4  0x00007f5e9a14bd9c in JvmtiEnv::RawMonitorEnter (this=<optimized out>, 
    rmonitor=0x7f5e94011280)
    at /usr/src/debug/java-1.7.0-openjdk-1.7.0.45-2.4.3.1.el7.x86_64/openjdk/hotspot/src/share/vm/prims/jvmtiEnv.cpp:3059
#5  0x00007f5e98a7ecd7 in enter_critical_section (
    jvmti_env=jvmti_env@entry=0x7f5e94010bf0)
    at /usr/src/debug/abrt-java-connector-9214372f6635aa377954f26a7c4dc90477a14564/src/abrt-checker.c:565
#6  0x00007f5e98a7ed61 in callback_on_object_alloc (jvmti_env=0x7f5e94010bf0, 
    jni_env=<optimized out>, thread=<optimized out>, object=<optimized out>, 
    object_klass=0x7f5e94103078, size=<optimized out>)
    at /usr/src/debug/abrt-java-connector-9214372f6635aa377954f26a7c4dc90477a14564/src/abrt-checker.c:1980
#7  0x00007f5e9a15fbda in JvmtiExport::post_vm_object_alloc (
    thread=0x7f5e94017000, object=object@entry=0x7686f1be8)
    at /usr/src/debug/java-1.7.0-openjdk-1.7.0.45-2.4.3.1.el7.x86_64/openjdk/hotspot/src/share/vm/prims/jvmtiExport.cpp:2122
#8  0x00007f5e9a160c5f in JvmtiVMObjectAllocEventCollector::~JvmtiVMObjectAllocEventCollector (this=0x7f5e9b6b75f0, __in_chrg=<optimized out>)
    at /usr/src/debug/java-1.7.0-openjdk-1.7.0.45-2.4.3.1.el7.x86_64/openjdk/hotspot/src/share/vm/prims/jvmtiExport.cpp:2338
#9  0x00007f5e9a0e08a3 in JVM_GetStackAccessControlContext (
    env=0x7f5e940171d8, cls=<optimized out>)
    at /usr/src/debug/java-1.7.0-openjdk-1.7.0.45-2.4.3.1.el7.x86_64/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1281
#10 0x00007f5e90e2c4cc in ?? ()
#11 0x00007f5e9b6b7d20 in ?? ()
#12 0x00007f5e9b6b7cd8 in ?? ()
#13 0x000000000000006f in ?? ()
#14 0x0000000000000000 in ?? ()
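
When comparing several hung runs, it helps to pull out only the frames that land in the connector's source. The following is a minimal sketch, assuming the backtrace was saved to a text file in the gdb layout shown above (frame header starting with "#N", source location on a later indented "at" line); the function name frames_in_source is hypothetical.

```shell
#!/bin/sh
# frames_in_source TRACE_FILE SOURCE_NAME
# Print the numbers of the backtrace frames whose "at" line mentions SOURCE_NAME.
# Assumption: gdb layout as in the trace above, with the location on its own line.
frames_in_source() {
    awk -v src="$2" '
        # Frame header, e.g. "#5  0x... in enter_critical_section (": remember the number
        /^#[0-9]+/ { frame = substr($1, 2) }
        # Indented location line that mentions the source file: report the current frame
        index($0, src) && /^[ \t]+at / { print frame }
    ' "$1"
}
```

For the trace above, saved to a file, `frames_in_source trace.txt abrt-checker.c` would report the frames for enter_critical_section and callback_on_object_alloc.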

Comment 15 jiri vanek 2013-10-31 09:02:45 UTC
(In reply to Jakub Filak from comment #13)
> Scott, Jiri,
> 
> I am pretty sure that this bug is a dup of bug #1012827
> It has exactly the same symptoms as we observed together with Pavel.

Yes, it looks like it. Although this bug has much more information, 1012827 has all the flags set, so I am closing this one. Thanks to all who provided information!

*** This bug has been marked as a duplicate of bug 1012827 ***