Bug 1150624 - Agent failed to start due to XML validation error [NEEDINFO]
Summary: Agent failed to start due to XML validation error
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Agent
Version: JON 3.3.0
Hardware: x86_64
OS: Linux
Target Milestone: ER05
: JON 3.3.0
Assignee: Stefan Negrea
QA Contact: Mike Foley
Depends On:
TreeView+ depends on / blocked
Reported: 2014-10-08 14:26 UTC by Viet Nguyen
Modified: 2014-12-17 13:47 UTC (History)
3 users (show)

Clone Of:
Last Closed: 2014-12-17 13:47:46 UTC
mazz: needinfo? (vnguyen)

Attachments (Terms of Use)
json dump of the container (8.49 KB, text/plain)
2014-10-08 14:48 UTC, Viet Nguyen
no flags Details
agent config xml (61.56 KB, application/xml)
2014-10-08 14:55 UTC, Viet Nguyen
no flags Details

Description Viet Nguyen 2014-10-08 14:26:50 UTC
I'm seeing xml validation error which prevents agent start up in a Kubernetes node (managed docker container env).  The container runs fine with plain vanilla 'docker run..." command.  I suspect Kubernetes' networking may have something to do this.

Related BZs:

agent.log (RHQ_AGENT_DEBUG=on)
rhq-agent-wrapper.sh: RHQ_AGENT_WRAPPER_BIN_DIR_PATH=/opt/rhq-agent/bin
rhq-agent-wrapper.sh: Loading environment script: /opt/rhq-agent/bin/rhq-agent-env.sh
rhq-agent-wrapper.sh: pidfile will be located at /opt/rhq-agent/bin/rhq-agent.pid
Starting RHQ Agent...
rhq-agent-wrapper.sh: Start script found here: /opt/rhq-agent/bin/rhq-agent.sh
rhq-agent-wrapper.sh: Executing agent with command: '/opt/rhq-agent/bin/rhq-agent.sh'
RHQ Agent (pid 61 ) is running
2014-10-07 17:11:18,653 DEBUG [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.java-logging-start}Reconfiguring Java Logging...
2014-10-07 17:11:18,663 DEBUG [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.java-logging-done}Done reconfiguring Java Logging.
2014-10-07 17:11:18,666 DEBUG [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.creating-agent}Agent being created now
2014-10-07 17:11:18,722 DEBUG [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.loading-config-file}Loading configuration file [agent-configuration.xml]
2014-10-07 17:12:14,943 FATAL [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.start-failure}Failed to start the agent
java.util.prefs.InvalidPreferencesFormatException: org.xml.sax.SAXParseException; lineNumber: 41; columnNumber: 44; Document root element "preferences", must match DOCTYPE root "null".
	at java.util.prefs.XmlSupport.importPreferences(XmlSupport.java:216)
	at java.util.prefs.Preferences.importPreferences(Preferences.java:1259)
	at org.rhq.enterprise.agent.AgentMain.loadConfigurationFile(AgentMain.java:1329)
	at org.rhq.enterprise.agent.AgentMain.checkInitialConfiguration(AgentMain.java:3590)
	at org.rhq.enterprise.agent.AgentMain.processArguments(AgentMain.java:3365)
	at org.rhq.enterprise.agent.AgentMain.<init>(AgentMain.java:554)
	at org.rhq.enterprise.agent.AgentMain.main(AgentMain.java:423)
Caused by: org.xml.sax.SAXParseException; lineNumber: 41; columnNumber: 44; Document root element "preferences", must match DOCTYPE root "null".
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:134)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:437)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:368)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:325)
	at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.rootElementSpecified(XMLDTDValidator.java:1599)
	at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.handleStartElement(XMLDTDValidator.java:1877)
	at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(XMLDTDValidator.java:742)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1363)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(XMLDocumentScannerImpl.java:1292)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3138)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:880)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
	at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
	at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
	at java.util.prefs.XmlSupport.loadPrefsDoc(XmlSupport.java:250)
	at java.util.prefs.XmlSupport.importPreferences(XmlSupport.java:199)

Comment 1 Viet Nguyen 2014-10-08 14:48:30 UTC
Created attachment 945044 [details]
json dump of the container

Comment 2 Viet Nguyen 2014-10-08 14:55:26 UTC
Created attachment 945046 [details]
agent config xml

Comment 3 Heiko W. Rupp 2014-10-09 12:51:44 UTC
This has nothing to do with the agent configuration, but with the preferences node in ~/.java/.pref*

Can you please check if that file looks ok and if you can actually read/write it?

Comment 4 Viet Nguyen 2014-10-09 15:55:57 UTC
I see what might the problem.  The HOME env is set to / which is a read-only file system.  I tried to export HOME=/root to no avail.

Comment 5 Viet Nguyen 2014-10-09 17:13:33 UTC
More troubleshooting. Empty prefs.xml

# export RHQ_AGENT_JAVA_OPTS=-Djava.util.prefs.userRoot=/root -Djava.util.prefs.systemRoot=/opt

# rhq-agent-wrapper.sh

# ls -la /root/.java/.userPrefs/rhq-agent

total 16
drwxrwx---. 3 root root 4096 Oct  9 12:46 .
drwx------. 3 root root 4096 Oct  9 12:46 ..
drwxrwx---. 2 root root 4096 Oct  9 12:46 default
-rw-rw----. 1 root root  148 Oct  9 12:46 prefs.xml

# cat /root/.java/.userPrefs/rhq-agent/prefs.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE map SYSTEM "http://java.sun.com/dtd/preferences.dtd">
<map MAP_XML_VERSION="1.0"/>

# cat /root/.java/.userPrefs/rhq-agent/default/prefs.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE map SYSTEM "http://java.sun.com/dtd/preferences.dtd">
<map MAP_XML_VERSION="1.0"/>

Comment 6 Viet Nguyen 2014-10-10 14:56:09 UTC
More info:
java version "1.7.0_65"
OpenJDK Runtime Environment (rhel- u65-b17)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)

Comment 7 John Mazzitelli 2014-10-10 14:56:40 UTC
I can only think of one of two things - one, Java is trying to validate the .dtd by grabbing it over the network and its failing or two for some reason this Java implementation is not writing out <preferences> element when creating the initial preference node.

What if you actually put the real XML in /root/.userPrefs/rhq-agent/default/prefs.xml? I think you might be able to just copy the content of agent-configuraiton.xml right in there and then restart the agent (thus manually pre-populating the agnet config preferences). Do you still get the error?

Comment 8 John Mazzitelli 2014-10-10 15:25:07 UTC
Its a network issue. How to replicate:

1) disconnect your network OR just edit /etc/hosts and put a bogus IP for java.sun.com

2) rm -rf $HOME/.java/.userPrefs/rhq-agent

3) start the agent

You get the exception listed in the description of this BZ.

Comment 9 John Mazzitelli 2014-10-10 15:39:13 UTC
This did not happen in RHQ 4.12. Something broke since then. I think it has to do with we now use javax.xml parsing to process the agent config for obfuscating preferences. I don't know why its happening, but that has been one thing that changed in this area and could be why its breaking now.

Comment 10 John Mazzitelli 2014-10-10 16:07:33 UTC
caused by fix to bug #782915

trying to determine how to fix this issue and not break that again

Comment 11 John Mazzitelli 2014-10-10 16:13:58 UTC
assigning to stefan - he's going to take the first stab at fixing this.

Comment 12 John Mazzitelli 2014-10-10 17:48:50 UTC
cherry picked fix to 3.3 branch:

commit 3892942c5d19feb98a38a52546582304983160a3
Author: Stefan Negrea <snegrea@redhat.com>
Date:   Fri Oct 10 12:44:38 2014 -0500

    [BZ 1150624] Add additional code to prevent attempts to load DTD.
    (cherry picked from commit 3064b71e8a10c2987c9c45188975a07e2462caa7)
    Signed-off-by: John Mazzitelli <mazz@redhat.com>

commit c1dbfc7ecb33efe108143d0b81165fa7113c1369
Author: Stefan Negrea <snegrea@redhat.com>
Date:   Fri Oct 10 12:16:59 2014 -0500

    [BZ 1150624] Avoid DTD retrieval even if the DOM has it. Also avoid parsing logic to determine if 
    (cherry picked from commit e73af2af73f53511bef5752034c383112a553d98)
    Signed-off-by: John Mazzitelli <mazz@redhat.com>

Comment 14 Simeon Pinder 2014-10-21 20:24:11 UTC
Moving to ON_QA as available to test with the latest brew build:

Note You need to log in before you can comment on or make changes to this bug.