I'm seeing xml validation error which prevents agent start up in a Kubernetes node (managed docker container env). The container runs fine with plain vanilla 'docker run..." command. I suspect Kubernetes' networking may have something to do this. Related BZs: https://bugzilla.redhat.com/show_bug.cgi?id=1127873 https://bugzilla.redhat.com/show_bug.cgi?id=782915 agent.log (RHQ_AGENT_DEBUG=on) ==== rhq-agent-wrapper.sh: RHQ_AGENT_WRAPPER_BIN_DIR_PATH=/opt/rhq-agent/bin rhq-agent-wrapper.sh: Loading environment script: /opt/rhq-agent/bin/rhq-agent-env.sh rhq-agent-wrapper.sh: pidfile will be located at /opt/rhq-agent/bin/rhq-agent.pid Starting RHQ Agent... rhq-agent-wrapper.sh: Start script found here: /opt/rhq-agent/bin/rhq-agent.sh rhq-agent-wrapper.sh: Executing agent with command: '/opt/rhq-agent/bin/rhq-agent.sh' RHQ Agent (pid 61 ) is running 2014-10-07 17:11:18,653 DEBUG [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.java-logging-start}Reconfiguring Java Logging... 2014-10-07 17:11:18,663 DEBUG [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.java-logging-done}Done reconfiguring Java Logging. 2014-10-07 17:11:18,666 DEBUG [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.creating-agent}Agent being created now 2014-10-07 17:11:18,722 DEBUG [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.loading-config-file}Loading configuration file [agent-configuration.xml] 2014-10-07 17:12:14,943 FATAL [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.start-failure}Failed to start the agent java.util.prefs.InvalidPreferencesFormatException: org.xml.sax.SAXParseException; lineNumber: 41; columnNumber: 44; Document root element "preferences", must match DOCTYPE root "null". at java.util.prefs.XmlSupport.importPreferences(XmlSupport.java:216) at java.util.prefs.Preferences.importPreferences(Preferences.java:1259) at org.rhq.enterprise.agent.AgentMain.loadConfigurationFile(AgentMain.java:1329) at org.rhq.enterprise.agent.AgentMain.checkInitialConfiguration(AgentMain.java:3590) at org.rhq.enterprise.agent.AgentMain.processArguments(AgentMain.java:3365) at org.rhq.enterprise.agent.AgentMain.<init>(AgentMain.java:554) at org.rhq.enterprise.agent.AgentMain.main(AgentMain.java:423) Caused by: org.xml.sax.SAXParseException; lineNumber: 41; columnNumber: 44; Document root element "preferences", must match DOCTYPE root "null". at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198) at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:134) at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:437) at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:368) at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:325) at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.rootElementSpecified(XMLDTDValidator.java:1599) at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.handleStartElement(XMLDTDValidator.java:1877) at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(XMLDTDValidator.java:742) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1363) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(XMLDocumentScannerImpl.java:1292) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3138) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:880) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347) at java.util.prefs.XmlSupport.loadPrefsDoc(XmlSupport.java:250) at java.util.prefs.XmlSupport.importPreferences(XmlSupport.java:199)
Created attachment 945044 [details] json dump of the container
Created attachment 945046 [details] agent config xml
This has nothing to do with the agent configuration, but with the preferences node in ~/.java/.pref* Can you please check if that file looks ok and if you can actually read/write it?
I see what might the problem. The HOME env is set to / which is a read-only file system. I tried to export HOME=/root to no avail.
More troubleshooting. Empty prefs.xml =========== # export RHQ_AGENT_JAVA_OPTS=-Djava.util.prefs.userRoot=/root -Djava.util.prefs.systemRoot=/opt # rhq-agent-wrapper.sh # ls -la /root/.java/.userPrefs/rhq-agent total 16 drwxrwx---. 3 root root 4096 Oct 9 12:46 . drwx------. 3 root root 4096 Oct 9 12:46 .. drwxrwx---. 2 root root 4096 Oct 9 12:46 default -rw-rw----. 1 root root 148 Oct 9 12:46 prefs.xml # cat /root/.java/.userPrefs/rhq-agent/prefs.xml <?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE map SYSTEM "http://java.sun.com/dtd/preferences.dtd"> <map MAP_XML_VERSION="1.0"/> # cat /root/.java/.userPrefs/rhq-agent/default/prefs.xml <?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE map SYSTEM "http://java.sun.com/dtd/preferences.dtd"> <map MAP_XML_VERSION="1.0"/>
More info: --- java version "1.7.0_65" OpenJDK Runtime Environment (rhel-2.5.1.2.el7_0-x86_64 u65-b17) OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
I can only think of one of two things - one, Java is trying to validate the .dtd by grabbing it over the network and its failing or two for some reason this Java implementation is not writing out <preferences> element when creating the initial preference node. What if you actually put the real XML in /root/.userPrefs/rhq-agent/default/prefs.xml? I think you might be able to just copy the content of agent-configuraiton.xml right in there and then restart the agent (thus manually pre-populating the agnet config preferences). Do you still get the error?
Its a network issue. How to replicate: 1) disconnect your network OR just edit /etc/hosts and put a bogus IP for java.sun.com 2) rm -rf $HOME/.java/.userPrefs/rhq-agent 3) start the agent You get the exception listed in the description of this BZ.
This did not happen in RHQ 4.12. Something broke since then. I think it has to do with we now use javax.xml parsing to process the agent config for obfuscating preferences. I don't know why its happening, but that has been one thing that changed in this area and could be why its breaking now.
caused by fix to bug #782915 trying to determine how to fix this issue and not break that again
assigning to stefan - he's going to take the first stab at fixing this.
cherry picked fix to 3.3 branch: commit 3892942c5d19feb98a38a52546582304983160a3 Author: Stefan Negrea <snegrea> Date: Fri Oct 10 12:44:38 2014 -0500 [BZ 1150624] Add additional code to prevent attempts to load DTD. (cherry picked from commit 3064b71e8a10c2987c9c45188975a07e2462caa7) Signed-off-by: John Mazzitelli <mazz> commit c1dbfc7ecb33efe108143d0b81165fa7113c1369 Author: Stefan Negrea <snegrea> Date: Fri Oct 10 12:16:59 2014 -0500 [BZ 1150624] Avoid DTD retrieval even if the DOM has it. Also avoid parsing logic to determine if (cherry picked from commit e73af2af73f53511bef5752034c383112a553d98) Signed-off-by: John Mazzitelli <mazz>
Moving to ON_QA as available to test with the latest brew build: https://brewweb.devel.redhat.com//buildinfo?buildID=394734
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days