Bug 824850

Summary: Crash due to Augeas during finalization
Product: [Other] RHQ Project
Reporter: Lukas Krejci <lkrejci>
Component: Agent
Assignee: RHQ Project Maintainer <rhq-maint>
Status: NEW
QA Contact: Mike Foley <mfoley>
Severity: high
Priority: unspecified
Version: 4.4
CC: hrupp, mazz, robingrindrod
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugzilla.redhat.com/show_bug.cgi?id=955783
Doc Type: Bug Fix
Type: Bug
Bug Depends On:    
Bug Blocks: 955783    

Description Lukas Krejci 2012-05-24 08:50:20 EDT
Description of problem:

More often than is comfortable, the agent crashes when I stop either the plugin container or the entire agent.

The crash report always points to Augeas being the culprit:

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.sun.jna.Function.invokeInt(I[Ljava/lang/Object;)I+0
j  com.sun.jna.Function.invoke([Ljava/lang/Object;Ljava/lang/Class;Z)Ljava/lang/Object;+315
j  com.sun.jna.Function.invoke(Ljava/lang/Class;[Ljava/lang/Object;Ljava/util/Map;)Ljava/lang/Object;+214
j  com.sun.jna.Library$Handler.invoke(Ljava/lang/Object;Ljava/lang/reflect/Method;[Ljava/lang/Object;)Ljava/lang/Object;+341
j  $Proxy147.aug_close(Lnet/augeas/jna/AugPointer;)I+16
j  net.augeas.Augeas.close()I+16
j  org.rhq.plugins.augeas.AugeasConfigurationComponent.close()V+11
j  org.rhq.plugins.augeas.AugeasConfigurationComponent.finalize()V+1
v  ~StubRoutines::call_stub
j  java.lang.ref.Finalizer.invokeFinalizeMethod(Ljava/lang/Object;)V+0
j  java.lang.ref.Finalizer.runFinalizer()V+45
j  java.lang.ref.Finalizer.access$100(Ljava/lang/ref/Finalizer;)V+1
j  java.lang.ref.Finalizer$FinalizerThread.run()V+11

This leads me to believe that somewhere we still call close() twice on an Augeas instance, which causes a segmentation fault.

Version-Release number of selected component (if applicable):
4.5.0-SNAPSHOT

How reproducible:
rather frequently

Steps to Reproduce:
1. Start the agent with the Augeas-based plugins available.
2. Import a platform and confirm that the hosts file and sudoers resources are imported, too.
3. Do several rounds of "pc stop" and "pc start" on the agent command line.
4. Start and shut down the agent repeatedly.
  
Actual results:
Sometimes, the agent crashes during step 3 or 4.

Expected results:
no crashes, please

Additional info:
Comment 2 Charles Crouch 2012-05-24 11:26:04 EDT
(9:22:33 AM) ccrouch: lkrejci_: you are running RHQ?
(9:22:48 AM) ccrouch: with plugins using augeas presumably?
(9:25:38 AM) ips: lkrejci_: i see a fix from larry checked in to java-augeas (http://git.fedorahosted.org/git/?p=java-augeas.git;a=summary)
(9:25:46 AM) ips: would that be of any help to us?
(9:26:33 AM) ips: in any case, why don't we subclass Augeas and make close() more robust?
(9:26:57 AM) ips: eg - so it is a no-op if called multiple times
(9:27:08 AM) ips: and potentially synchronize it too
(9:29:26 AM) mazz: fwiw, I REGULARLY see my agent crash on shutdown because of that stupid augeas stuff
(9:29:29 AM) lkrejci_: ccrouch: yep, that's RHQ, not JON - Apache, which is the only augeas-based JON plugin, is not involved in this actually
(9:29:39 AM) mazz: happens about once every 3 or so shutdowns
(9:30:39 AM) ccrouch: lkrejci_: please add that to the bug
(9:30:54 AM) lkrejci_: ips: some plugins don't go through our scaffolding and do weird and wonderful stuff with it
(9:31:02 AM) ccrouch: i don't see this as a blocker for JON
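
The idea raised in the chat above, making close() a no-op when it is called more than once and safe across threads, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the fix that was applied to RHQ or java-augeas. It wraps the resource rather than subclassing Augeas, and it uses an atomic flag instead of a synchronized method. NativeResource, GuardedHandle and GuardedCloseSketch are hypothetical names, and NativeResource stands in for the real native handle so that the example runs without the Augeas library.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedCloseSketch {

    /** Hypothetical stand-in for a native resource such as an Augeas handle. */
    public interface NativeResource {
        // Assumed to be unsafe to call twice, like a native close on a freed pointer.
        void release();
    }

    /** Wrapper whose close() releases the underlying resource at most once. */
    public static final class GuardedHandle implements AutoCloseable {
        private final NativeResource resource;
        private final AtomicBoolean closed = new AtomicBoolean(false);

        public GuardedHandle(NativeResource resource) {
            this.resource = resource;
        }

        @Override
        public void close() {
            // Only the caller that flips the flag from false to true releases the
            // resource; every later caller (for example a finalizer) does nothing.
            if (closed.compareAndSet(false, true)) {
                resource.release();
            }
        }

        public boolean isClosed() {
            return closed.get();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger releases = new AtomicInteger();
        GuardedHandle handle = new GuardedHandle(releases::incrementAndGet);

        // Simulate an explicit close racing with a close from another thread.
        Thread other = new Thread(handle::close);
        other.start();
        handle.close();
        other.join();
        handle.close(); // a third, late call is also harmless

        System.out.println("native releases: " + releases.get());
    }
}
```

Run as written, the sketch should report a single native release no matter how many times close() is called. Note that the atomic flag does not make a second caller wait until the first release has finished; a synchronized close() would. As pointed out in the chat, a guard like this only helps plugins that go through the shared scaffolding.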
Comment 3 Mike Foley 2012-05-29 10:41:33 EDT
Per BZ triage on 5/29/2012 (ccrouch, loleary, asantos, mfoley, myarborough), moving these to JON 3.1.1 or later.
Comment 4 John Mazzitelli 2013-09-13 16:20:07 EDT
*** Bug 787017 has been marked as a duplicate of this bug. ***