Bug 572189 - agent crash on Fedora 12
Summary: agent crash on Fedora 12
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: RHQ Project
Classification: Other
Component: Plugins
Version: 4.0.0
Hardware: i686
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Filip Drabek
QA Contact: Corey Welton
URL:
Whiteboard:
Depends On:
Blocks: jon24-apache
Reported: 2010-03-10 14:16 UTC by Joseph Marques
Modified: 2010-11-09 13:10 UTC (History)

Fixed In Version: 1.4
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-11-09 13:10:46 UTC
Embargoed:


Attachments
core dump file (62.69 KB, application/octet-stream)
2010-03-10 15:46 UTC, Joseph Marques
agent log file (48.97 KB, application/octet-stream)
2010-03-10 15:46 UTC, Joseph Marques
core dump file from aug 5th 2010 (67.95 KB, application/octet-stream)
2010-08-05 05:51 UTC, Joseph Marques
agent log file from aug 5th 2010 (49.90 KB, application/octet-stream)
2010-08-05 05:51 UTC, Joseph Marques

Description Joseph Marques 2010-03-10 14:16:45 UTC
Description of problem:

The agent crashes shortly after startup. I've attached the core dump file (hs_err_pid file) as well as the agent.log file.

Version-Release number of selected component (if applicable):

3.0.0 beta

How reproducible:

Very

Steps to Reproduce:
1. Build RHQ with all community plugins enabled
2. Start agent
3. Import resources using AD portlet
  
Actual results:

Agent crashes with core dump

Expected results:

Agent keeps running

Additional info:

I spoke to Ian and Mazz about this issue, and there is reason to believe these crashes are coming from either the virt plugin or the augeas plugin. If the issue is with the augeas plugin, it needs to be pinpointed soon because the Apache resource now uses augeas for its configuration facet.

Comment 1 Joseph Marques 2010-03-10 15:46:25 UTC
Created attachment 399117 [details]
core dump file

Comment 2 Joseph Marques 2010-03-10 15:46:44 UTC
Created attachment 399118 [details]
agent log file

Comment 3 Lukas Krejci 2010-03-10 16:01:51 UTC
The core dump identifies libjvm.so as the place of the crash, though that alone doesn't tell us much.

I see that you are using JRE 6.0_18-b07 (is this a beta version?). Have you tried a different JRE/JDK?
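
For anyone triaging the attached hs_err_pid file, the crashing frame is reported on the line that follows the "# Problematic frame:" marker in the HotSpot error log. Below is a minimal sketch for pulling that line out. The class and method names (HsErrFrame, problematicFrame) are hypothetical and are not part of RHQ; the file path is passed in by the caller.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of RHQ: extracts the crashing frame
// from the lines of a HotSpot hs_err_pid log.
public class HsErrFrame {

    /**
     * Returns the line that follows "# Problematic frame:", without its
     * leading '#', or null if the marker is not present.
     */
    public static String problematicFrame(List<String> lines) {
        for (int i = 0; i + 1 < lines.size(); i++) {
            if (lines.get(i).trim().equals("# Problematic frame:")) {
                String next = lines.get(i + 1).trim();
                return next.startsWith("#") ? next.substring(1).trim() : next;
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HsErrFrame <hs_err_pid file>");
            return;
        }
        // ISO-8859-1 maps every byte to a character, so stray bytes
        // in the log cannot make the read fail.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        String frame = problematicFrame(lines);
        System.out.println(frame != null ? frame : "no problematic frame line found");
    }
}
```

A frame inside libjvm.so only says the JVM itself faulted; the native and Java stack sections further down in the same file are usually more helpful for telling which plugin was active at the time.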

Comment 4 Joseph Marques 2010-05-10 16:05:35 UTC
Lukas, I believe "-bXX" is a build identifier used to denote the precise internal version that was released for the runtime environment as well as the hotspot.
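
To confirm which build a given JVM reports, the standard system properties java.runtime.version and java.vm.version can be printed from the JVM in question. The sketch below does that and splits off a trailing "-bXX" suffix when one is present. The class name JvmBuildInfo and the buildId helper are hypothetical, and newer JVMs use a different version format, in which case the helper simply returns an empty string.

```java
// Hypothetical helper: prints the runtime and VM versions of the current
// JVM and extracts a trailing "-bXX" build suffix if there is one.
public class JvmBuildInfo {

    /** Returns the build suffix (e.g. "b07" for "1.6.0_18-b07"), or "" if none. */
    public static String buildId(String runtimeVersion) {
        if (runtimeVersion == null) {
            return "";
        }
        int dash = runtimeVersion.lastIndexOf("-b");
        return dash >= 0 ? runtimeVersion.substring(dash + 1) : "";
    }

    public static void main(String[] args) {
        String runtime = System.getProperty("java.runtime.version");
        String vm = System.getProperty("java.vm.version");
        System.out.println("java.runtime.version = " + runtime);
        System.out.println("java.vm.version      = " + vm);
        System.out.println("build suffix         = " + buildId(runtime));
    }
}
```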

Comment 5 Corey Welton 2010-05-13 13:20:51 UTC
FWIW, I don't seem to be having this problem in the enterprise build with OpenJDK:


lrwxrwxrwx. 1 root root 0 2010-05-13 08:22 /proc/2241/exe -> /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java*

Comment 6 Charles Crouch 2010-05-18 14:05:37 UTC
Work with Joseph to try to reproduce.

Comment 7 Joseph Marques 2010-05-18 15:45:09 UTC
Yes, this is an issue with the Sun JDK, which is one of our supported JVMs on Linux. I would suggest testing with JRE 1.6.0_18-b07, but if you absolutely can't obtain that version, the latest 6.x version will have to do. However, if this issue goes unresolved, I would propose that we put out disclaimers against customers using _18. Note: there are many internet search results for "1.6.0_18-b07 crash".

Comment 8 Charles Crouch 2010-06-07 14:28:31 UTC
Setting this back to high since I'm not aware this has been reproduced.

Comment 9 Charles Crouch 2010-07-20 14:53:41 UTC
Dropping severity until we are able to reproduce.

Comment 10 Joseph Marques 2010-08-05 05:49:52 UTC
Just saw this again when rebuilding master/HEAD (commit 5911f875 @ Mon Aug 2 14:49:33 2010 -0400). Will upload the agent log and core files shortly. In this case, the crash came during an uninventory operation. Here are the messages that appeared in the server log at that time:

01:40:02,595 INFO  [ResourceManagerBean] User [org.rhq.core.domain.auth.Subject[id=2,name=rhqadmin]] is marking resource [Resource[id=10001, type=Linux, key=marques-redhat, name=marques-redhat, parent=<null>, version=Linux 2.6.32.16-141.fc12.x86_64]] for asynchronous uninventory
01:40:03,206 WARN  [ServerCommunicationsService] {Failed to truncate/delete spool for deleted agent [Agent[id=10001,name=marques-redhat,address=localhost,port=16163,remote-endpoint=socket://localhost:16163/?rhq.communications.connector.rhqtype=agent&numAcceptThreads=1&maxPoolSize=303&clientMaxPoolSize=304&socketTimeout=60000&enableTcpNoDelay=true&backlog=200,last-availability-report=1280986617975]] please manually remove the file: null}!!! missing resource message key=[Failed to truncate/delete spool for deleted agent [Agent[id=10001,name=marques-redhat,address=localhost,port=16163,remote-endpoint=socket://localhost:16163/?rhq.communications.connector.rhqtype=agent&numAcceptThreads=1&maxPoolSize=303&clientMaxPoolSize=304&socketTimeout=60000&enableTcpNoDelay=true&backlog=200,last-availability-report=1280986617975]] please manually remove the file: null] args=[java.lang.NullPointerException]
01:40:03,206 INFO  [AgentManagerBean] Removed agent: Agent[id=10001,name=marques-redhat,address=localhost,port=16163,remote-endpoint=socket://localhost:16163/?rhq.communications.connector.rhqtype=agent&numAcceptThreads=1&maxPoolSize=303&clientMaxPoolSize=304&socketTimeout=60000&enableTcpNoDelay=true&backlog=200,last-availability-report=1280986617975]

Comment 11 Joseph Marques 2010-08-05 05:51:13 UTC
Created attachment 436756 [details]
core dump file from aug 5th 2010

Comment 12 Joseph Marques 2010-08-05 05:51:46 UTC
Created attachment 436757 [details]
agent log file from aug 5th 2010

