Bug 1552165

Summary: hawkular-metrics goes into CrashLoopBackOff when prometheus endpoint is disabled
Product: OpenShift Container Platform Reporter: John Sanda <jsanda>
Component: HawkularAssignee: John Sanda <jsanda>
Status: CLOSED NOTABUG QA Contact: Junqi Zhao <juzhao>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.9.0CC: aos-bugs, jsanda, xtian
Target Milestone: ---   
Target Release: 3.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-03-12 15:05:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachments:
Description Flags
metrics pods info none

Description John Sanda 2018-03-06 15:51:52 UTC
Description of problem:
There is a bug in standalone.conf. The system logging manager is only configured when the ENABLE_PROMETHEUS_ENDPOINT variable is set. When the variable is not set, the logging manager does not get set properly and start up fails with:

Could not load Logmanager "org.jboss.logmanager.LogManager"
java.lang.ClassNotFoundException: org.jboss.logmanager.LogManager
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:348)
	at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:93)
	at java.util.logging.LogManager$1.run(LogManager.java:195)
	at java.util.logging.LogManager$1.run(LogManager.java:181)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.util.logging.LogManager.<clinit>(LogManager.java:181)
	at org.jboss.modules.Main.main(Main.java:480)
WARNING: Failed to load the specified log manager class org.jboss.logmanager.LogManager
Exception in thread "main" java.lang.NoClassDefFoundError: org/jboss/logmanager/Level
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.jboss.modules.Module.run(Module.java:320)
	at org.jboss.modules.Main.main(Main.java:505)
Caused by: java.lang.ClassNotFoundException: org.jboss.logmanager.Level
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:348)
	at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:93)




Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 4 Junqi Zhao 2018-03-09 06:10:23 UTC
@John,
Tested with metrics v3.9.4-1 images and have two questions
1. ENABLE_PROMETHEUS_ENDPOINT is also in hawkular-cassandra pod, do we need to cover it.

2. Testing steps:
1) scale all metrics rc down
2) delete "ENABLE_PROMETHEUS_ENDPOINT" from hawkular-metrics rc, and scale all metrics rc up
        - name: ENABLE_PROMETHEUS_ENDPOINT
          value: "True"
3) All pods could be started up, and sanity testing is passed.
but from hawkular-metrics pod logs, I see 
"-Xbootclasspath/p:/opt/eap/jboss-modules.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/main/jboss-logmanager-2.0.3.Final-redhat-1.jar"

which from your PR, it should be jboss-logmanager-2.0.4.Final.jar, is there something wrong?

# oc logs hawkular-metrics-6d2bl | grep logmanager | grep -i final
  JAVA_OPTS:  -server -verbose:gc -Xloggc:"/opt/eap/standalone/log/gc.log" -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=3M -XX:-TraceClassUnloading -Xms1192m -Xmx1192m -XX:MetaspaceSize=96M -XX:MaxMetaspaceSize=256m -Djava.net.preferIPv4Stack=true -Djboss.modules.system.pkgs=org.jboss.logmanager,jdk.nashorn.api -Djava.awt.headless=true -Xbootclasspath/p:/opt/eap/jboss-modules.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/main/jboss-logmanager-2.0.3.Final-redhat-1.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/ext/main/jboss-logmanager-ext-1.0.0.Alpha2-redhat-1.jar -Djava.util.logging.manager=org.jboss.logmanager.LogManager -javaagent:/opt/eap/jolokia.jar=port=8778,protocol=https,caCert=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt,clientPrincipal=cn=system:master-proxy,useSslClientAuthentication=true,extraClientCheck=true,host=0.0.0.0,discoveryEnabled=false -Djava.security.egd=file:/dev/./urandom -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump
	sun.boot.class.path = /opt/eap/jboss-modules.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/main/jboss-logmanager-2.0.3.Final-redhat-1.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/ext/main/jboss-logmanager-ext-1.0.0.Alpha2-redhat-1.jar:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/resources.jar:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/rt.jar:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/sunrsasign.jar:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/jsse.jar:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/jce.jar:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/charsets.jar:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/jfr.jar:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/classes
2018-03-09 05:37:42,325 DEBUG [org.jboss.as.config] (MSC service thread 1-7) VM Arguments: -Duser.home=/home/jboss -Duser.name=jboss -D[Standalone] -verbose:gc -Xloggc:/opt/eap/standalone/log/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=3M -XX:-TraceClassUnloading -Xms1192m -Xmx1192m -XX:MetaspaceSize=96M -XX:MaxMetaspaceSize=256m -Djava.net.preferIPv4Stack=true -Djboss.modules.system.pkgs=org.jboss.logmanager,jdk.nashorn.api -Djava.awt.headless=true -Xbootclasspath/p:/opt/eap/jboss-modules.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/main/jboss-logmanager-2.0.3.Final-redhat-1.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/ext/main/jboss-logmanager-ext-1.0.0.Alpha2-redhat-1.jar -Djava.util.logging.manager=org.jboss.logmanager.LogManager -javaagent:/opt/eap/jolokia.jar=port=8778,protocol=https,caCert=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt,clientPrincipal=cn=system:master-proxy,useSslClientAuthentication=true,extraClientCheck=true,host=0.0.0.0,discoveryEnabled=false -Djava.security.egd=file:/dev/./urandom -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump -Dorg.jboss.boot.log.file=/opt/eap/standalone/log/server.log -Dlogging.configuration=file:/opt/eap/standalone/configuration/logging.properties

Comment 5 Junqi Zhao 2018-03-09 06:11 UTC
Created attachment 1406124 [details]
metrics pods info

Comment 6 Junqi Zhao 2018-03-12 08:03:22 UTC
See Comment 4, assign it back, if my comment is wrong, please change it back to ON_QA

Comment 7 John Sanda 2018-03-12 15:05:36 UTC
The upstream PR updates standalone.conf; however, the downstream, EAP version of standalone.conf is different and does not have this problem. As it turns out, the upstream changes were not applied downstream. I am going to close this ticket.