The issue is not fully fixed. The hawkular-cassandra pod still returns a "Connection refused" error, although the issue is fixed for the hawkular-metrics pod.
# oc get po -o wide
NAME                            READY     STATUS      RESTARTS   AGE       IP            NODE
hawkular-cassandra-1-kt858      1/1       Running     0          8m        10.129.0.26   qe-juzhao-310-qeos-1-nrr-1
hawkular-metrics-gxwqq          1/1       Running     0          8m        10.128.0.25   qe-juzhao-310-qeos-1-master-etcd-1
hawkular-metrics-schema-cltg2   0/1       Completed   0          9m        10.129.0.25   qe-juzhao-310-qeos-1-nrr-1
heapster-b7rzw                  1/1       Running     0          8m        10.129.0.27   qe-juzhao-310-qeos-1-nrr-1
*********************************************************************************
# curl 10.129.0.26:7575/metrics
curl: (7) Failed connect to 10.129.0.26:7575; Connection refused
This happens even though ENABLE_PROMETHEUS_ENDPOINT is set to True for the hawkular-cassandra pod:
# oc exec hawkular-cassandra-1-kt858 -- env | grep ENABLE_PROMETHEUS_ENDPOINT
ENABLE_PROMETHEUS_ENDPOINT=True
For more information, please see the attached file.
metrics version: v3.10.0-0.58.0.0
Created attachment 1447755: still shows "Connection refused error" for hawkular-cassandra pod
hawkular-metrics does a case-insensitive comparison in its script, whereas cassandra does not: its test is for the all-lowercase string "true". Unfortunately, setting ENABLE_PROMETHEUS_ENDPOINT=true in the inventory file does not work either. Setting the env var directly in the hawkular-cassandra-1 RC does work, though. Since there is a relatively easy workaround, I am bumping this to 3.11.
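To illustrate the difference described above, here is a minimal sketch of a case-sensitive and a case-insensitive check. The function names are hypothetical and the actual startup scripts in the images may be written differently; this only shows why "True" passes one test and fails the other.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from the image scripts.

# Case-sensitive check: only the exact string "true" enables the endpoint.
is_enabled_strict() {
    [ "$1" = "true" ]
}

# Case-insensitive check: normalize to lower case before comparing.
is_enabled_relaxed() {
    value=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    [ "$value" = "true" ]
}

# Show how each check treats a few spellings of the value.
for v in true True TRUE false; do
    if is_enabled_strict "$v"; then strict=on; else strict=off; fi
    if is_enabled_relaxed "$v"; then relaxed=on; else relaxed=off; fi
    echo "$v strict=$strict relaxed=$relaxed"
done
```

For the workaround, something along the lines of `oc set env rc/hawkular-cassandra-1 ENABLE_PROMETHEUS_ENDPOINT=true` should set the variable on the RC. Pods that are already running would most likely need to be recreated before they pick up the new value.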
*** Bug 1589023 has been marked as a duplicate of this bug. ***
Not sure what happened, but the issue is fixed.
# rpm -qa | grep openshift-ansible
openshift-ansible-roles-3.10.2-1.git.190.5abfddb.el7.noarch
openshift-ansible-playbooks-3.10.2-1.git.190.5abfddb.el7.noarch
openshift-ansible-docs-3.10.2-1.git.190.5abfddb.el7.noarch
openshift-ansible-3.10.2-1.git.190.5abfddb.el7.noarch
# openshift version
openshift v3.10.2
metrics version: v3.10.2-1
Created attachment 1453333: Issue is fixed
(In reply to Junqi Zhao from comment #12)
> Created attachment 1453333
> Issue is fixed
Can the status be moved to VERIFIED?
(In reply to John Sanda from comment #13)
> Can the status be moved to VERIFIED?
No, the issue is reproduced again.
# oc get pod -o wide -n openshift-infra
NAME                            READY     STATUS      RESTARTS   AGE       IP          NODE
hawkular-cassandra-1-7w2sf      1/1       Running     0          18h       10.2.2.9    ip-172-18-22-155.ec2.internal
hawkular-cassandra-2-9mtrd      1/1       Running     1          18h       10.2.10.4   ip-172-18-0-153.ec2.internal
hawkular-metrics-hbll5          1/1       Running     0          18h       10.2.6.89   ip-172-18-28-25.ec2.internal
hawkular-metrics-schema-5qzj8   0/1       Completed   0          18h       10.2.6.87   ip-172-18-28-25.ec2.internal
heapster-dtjpf                  1/1       Running     0          18h       10.2.2.10   ip-172-18-22-155.ec2.internal
hawkular-cassandra still returns "Connection refused":
# curl http://10.2.2.9:7575/metrics
curl: (7) Failed connect to 10.2.2.9:7575; Connection refused
# curl http://10.2.10.4:7575/metrics
curl: (7) Failed connect to 10.2.10.4:7575; Connection refused
$ oc get rc hawkular-cassandra-1 -o yaml | grep ENABLE_PROMETHEUS_ENDPOINT -A 2
- name: ENABLE_PROMETHEUS_ENDPOINT
  value: "True"
cassandra version: metrics-cassandra-v3.11.0-0.10.0.0
hawkular-metrics works well:
# curl http://10.2.6.89:7575/metrics
# HELP jvm_threads_current Current thread count of a JVM
# TYPE jvm_threads_current gauge
jvm_threads_current 405.0
# HELP jvm_threads_daemon Daemon thread count of a JVM
# TYPE jvm_threads_daemon gauge
jvm_threads_daemon 151.0
# HELP jvm_threads_peak Peak thread count of a JVM
# TYPE jvm_threads_peak gauge
jvm_threads_peak 422.0
# HELP jvm_threads_started_total Started thread count of a JVM
# TYPE jvm_threads_started_total counter
jvm_threads_started_total 987.0
# HELP jvm_threads_deadlocked Cycles of JVM-threads that are in deadlock waiting to acquire object monitors or ownable synchronizers
# TYPE jvm_threads_deadlocked gauge
jvm_threads_deadlocked 0.0
# HELP jvm_threads_deadlocked_monitor Cycles of JVM-threads that are in deadlock waiting to acquire object monitors
# TYPE jvm_threads_deadlocked_monitor gauge
jvm_threads_deadlocked_monitor 0.0
# HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector in seconds.
# TYPE jvm_gc_collection_seconds summary
jvm_gc_collection_seconds_count{gc="PS Scavenge",} 777.0
jvm_gc_collection_seconds_sum{gc="PS Scavenge",} 3.994
jvm_gc_collection_seconds_count{gc="PS MarkSweep",} 1.0
jvm_gc_collection_seconds_sum{gc="PS MarkSweep",} 0.125
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2732.21
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.533026610444E9
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 1549.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1048576.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.2267114496E10
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.413013504E9
# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_used gauge
jvm_memory_bytes_used{area="heap",} 6.17996568E8
jvm_memory_bytes_used{area="nonheap",} 1.86586304E8
# HELP jvm_memory_bytes_committed Committed (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_committed gauge
jvm_memory_bytes_committed{area="heap",} 1.358954496E9
jvm_memory_bytes_committed{area="nonheap",} 2.0021248E8
# HELP jvm_memory_bytes_max Max (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_max gauge
jvm_memory_bytes_max{area="heap",} 1.358954496E9
jvm_memory_bytes_max{area="nonheap",} 7.80140544E8
# HELP jvm_memory_pool_bytes_used Used bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_used gauge
jvm_memory_pool_bytes_used{pool="Code Cache",} 6.1809216E7
jvm_memory_pool_bytes_used{pool="Metaspace",} 1.11660072E8
jvm_memory_pool_bytes_used{pool="Compressed Class Space",} 1.3117016E7
jvm_memory_pool_bytes_used{pool="PS Eden Space",} 3.69711576E8
jvm_memory_pool_bytes_used{pool="PS Survivor Space",} 1819184.0
jvm_memory_pool_bytes_used{pool="PS Old Gen",} 2.46465808E8
# HELP jvm_memory_pool_bytes_committed Committed bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_committed gauge
jvm_memory_pool_bytes_committed{pool="Code Cache",} 6.2324736E7
jvm_memory_pool_bytes_committed{pool="Metaspace",} 1.21765888E8
jvm_memory_pool_bytes_committed{pool="Compressed Class Space",} 1.6121856E7
jvm_memory_pool_bytes_committed{pool="PS Eden Space",} 4.39353344E8
jvm_memory_pool_bytes_committed{pool="PS Survivor Space",} 7864320.0
jvm_memory_pool_bytes_committed{pool="PS Old Gen",} 9.11736832E8
# HELP jvm_memory_pool_bytes_max Max bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_max gauge
jvm_memory_pool_bytes_max{pool="Code Cache",} 2.5165824E8
jvm_memory_pool_bytes_max{pool="Metaspace",} 2.68435456E8
jvm_memory_pool_bytes_max{pool="Compressed Class Space",} 2.60046848E8
jvm_memory_pool_bytes_max{pool="PS Eden Space",} 4.39877632E8
jvm_memory_pool_bytes_max{pool="PS Survivor Space",} 7864320.0
jvm_memory_pool_bytes_max{pool="PS Old Gen",} 9.11736832E8
# HELP jmx_config_reload_success_total Number of times configuration have successfully been reloaded.
# TYPE jmx_config_reload_success_total counter
jmx_config_reload_success_total 0.0
# HELP jmx_scrape_duration_seconds Time this JMX scrape took, in seconds.
# TYPE jmx_scrape_duration_seconds gauge
jmx_scrape_duration_seconds 5.14454E-4
# HELP jmx_scrape_error Non-zero if this scrape failed.
# TYPE jmx_scrape_error gauge
jmx_scrape_error 0.0
# HELP jvm_classes_loaded The number of classes that are currently loaded in the JVM
# TYPE jvm_classes_loaded gauge
jvm_classes_loaded 19414.0
# HELP jvm_classes_loaded_total The total number of classes that have been loaded since the JVM has started execution
# TYPE jvm_classes_loaded_total counter
jvm_classes_loaded_total 19448.0
# HELP jvm_classes_unloaded_total The total number of classes that have been unloaded since the JVM has started execution
# TYPE jvm_classes_unloaded_total counter
jvm_classes_unloaded_total 34.0
# HELP jmx_config_reload_failure_total Number of times configuration have failed to be reloaded.
# TYPE jmx_config_reload_failure_total counter
jmx_config_reload_failure_total 0.0
# HELP jvm_info JVM version info
# TYPE jvm_info gauge
jvm_info{version="1.8.0_181-b13",vendor="Oracle Corporation",} 1.0
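The manual check in the comment above (list the pods, then curl each Cassandra pod IP on port 7575) could be scripted roughly as follows. This is only a sketch: `cassandra_pod_ips` and `probe_metrics` are hypothetical helper names, and the script assumes the IP is the sixth column of `oc get pod -o wide`, as in the output shown in this report.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and not part of oc.

# Read `oc get pod -o wide` output on stdin and print the IP column
# (assumed to be the 6th field) for pods named hawkular-cassandra-*.
cassandra_pod_ips() {
    awk '$1 ~ /^hawkular-cassandra-/ { print $6 }'
}

# Probe the Prometheus endpoint on each IP read from stdin.
# Port 7575 and the /metrics path are the ones used in this report.
probe_metrics() {
    while read -r ip; do
        if curl -s -f -o /dev/null --max-time 5 "http://${ip}:7575/metrics"; then
            echo "${ip}: OK"
        else
            echo "${ip}: FAILED"
        fi
    done
}
```

A possible invocation from a host that can reach the pod network: `oc get pod -o wide -n openshift-infra | cassandra_pod_ips | probe_metrics`.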
Tested with metrics-cassandra:v3.11.0-0.11.0.0, issue is fixed. Please change to ON_QA.
# oc get pod -n openshift-infra -o wide | grep hawkular-cassandra
hawkular-cassandra-1-k498r   1/1       Running   0          2h        10.2.12.66   preserve-sharefr2-node-infra-1
# curl http://10.2.12.66:7575/metrics
# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_used gauge
jvm_memory_bytes_used{area="heap",} 4.4331744E8
jvm_memory_bytes_used{area="nonheap",} 8.8301192E7
# HELP jvm_memory_bytes_committed Committed (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_committed gauge
jvm_memory_bytes_committed{area="heap",} 9.67114752E8
jvm_memory_bytes_committed{area="nonheap",} 9.1254784E7
# HELP jvm_memory_bytes_max Max (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_max gauge
jvm_memory_bytes_max{area="heap",} 9.67114752E8
jvm_memory_bytes_max{area="nonheap",} -1.0
# HELP jvm_memory_pool_bytes_used Used bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_used gauge
jvm_memory_pool_bytes_used{pool="Code Cache",} 3.8577408E7
jvm_memory_pool_bytes_used{pool="Metaspace",} 4.4996008E7
jvm_memory_pool_bytes_used{pool="Compressed Class Space",} 4727776.0
jvm_memory_pool_bytes_used{pool="Par Eden Space",} 1.82857384E8
jvm_memory_pool_bytes_used{pool="Par Survivor Space",} 9474640.0
jvm_memory_pool_bytes_used{pool="CMS Old Gen",} 2.50985416E8
# HELP jvm_memory_pool_bytes_committed Committed bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_committed gauge
jvm_memory_pool_bytes_committed{pool="Code Cache",} 3.9452672E7
jvm_memory_pool_bytes_committed{pool="Metaspace",} 4.6727168E7
jvm_memory_pool_bytes_committed{pool="Compressed Class Space",} 5074944.0
jvm_memory_pool_bytes_committed{pool="Par Eden Space",} 2.65945088E8
jvm_memory_pool_bytes_committed{pool="Par Survivor Space",} 3.3226752E7
jvm_memory_pool_bytes_committed{pool="CMS Old Gen",} 6.67942912E8
# HELP jvm_memory_pool_bytes_max Max bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_max gauge
jvm_memory_pool_bytes_max{pool="Code Cache",} 2.5165824E8
jvm_memory_pool_bytes_max{pool="Metaspace",} -1.0
jvm_memory_pool_bytes_max{pool="Compressed Class Space",} 1.073741824E9
jvm_memory_pool_bytes_max{pool="Par Eden Space",} 2.65945088E8
jvm_memory_pool_bytes_max{pool="Par Survivor Space",} 3.3226752E7
jvm_memory_pool_bytes_max{pool="CMS Old Gen",} 6.67942912E8
# HELP org_apache_cassandra_metrics_DroppedMessage_Count Attribute exposed for management (org.apache.cassandra.metrics<type=DroppedMessage, scope=MUTATION, name=Dropped><>Count)
# TYPE org_apache_cassandra_metrics_DroppedMessage_Count untyped
org_apache_cassandra_metrics_DroppedMessage_Count{scope="MUTATION",name="Dropped",} 0.0
org_apache_cassandra_metrics_DroppedMessage_Count{scope="READ",name="Dropped",} 0.0
org_apache_cassandra_metrics_DroppedMessage_Count{scope="RANGE_SLICE",name="Dropped",} 0.0
org_apache_cassandra_metrics_DroppedMessage_Count{scope="PAGED_RANGE",name="Dropped",} 0.0
# HELP org_apache_cassandra_metrics_ClientRequest_Count Attribute exposed for management (org.apache.cassandra.metrics<type=ClientRequest, scope=RangeSlice, name=Failures><>Count)
# TYPE org_apache_cassandra_metrics_ClientRequest_Count untyped
org_apache_cassandra_metrics_ClientRequest_Count{scope="RangeSlice",name="Failures",} 0.0
org_apache_cassandra_metrics_ClientRequest_Count{scope="Read",name="Unavailables",} 0.0
org_apache_cassandra_metrics_ClientRequest_Count{scope="Write",name="Unavailables",} 0.0
org_apache_cassandra_metrics_ClientRequest_Count{scope="Write",name="Timeouts",} 0.0
org_apache_cassandra_metrics_ClientRequest_Count{scope="RangeSlice",name="Unavailables",} 0.0
org_apache_cassandra_metrics_ClientRequest_Count{scope="Write",name="Failures",} 0.0
org_apache_cassandra_metrics_ClientRequest_Count{scope="Read",name="Failures",} 0.0
org_apache_cassandra_metrics_ClientRequest_Count{scope="RangeSlice",name="Timeouts",} 0.0
org_apache_cassandra_metrics_ClientRequest_Count{scope="Read",name="Timeouts",} 0.0
# HELP jmx_scrape_duration_seconds Time this JMX scrape took, in seconds.
# TYPE jmx_scrape_duration_seconds gauge
jmx_scrape_duration_seconds 0.048682478
# HELP jmx_scrape_error Non-zero if this scrape failed.
# TYPE jmx_scrape_error gauge
jmx_scrape_error 0.0
# HELP jmx_config_reload_failure_total Number of times configuration have failed to be reloaded.
# TYPE jmx_config_reload_failure_total counter
jmx_config_reload_failure_total 0.0
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 405.12
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.533706804225E9
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 220.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1048576.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 3.119505408E9
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.385967616E9
# HELP jvm_info JVM version info
# TYPE jvm_info gauge
jvm_info{version="1.8.0_181-b13",vendor="Oracle Corporation",} 1.0
# HELP jvm_threads_current Current thread count of a JVM
# TYPE jvm_threads_current gauge
jvm_threads_current 222.0
# HELP jvm_threads_daemon Daemon thread count of a JVM
# TYPE jvm_threads_daemon gauge
jvm_threads_daemon 198.0
# HELP jvm_threads_peak Peak thread count of a JVM
# TYPE jvm_threads_peak gauge
jvm_threads_peak 226.0
# HELP jvm_threads_started_total Started thread count of a JVM
# TYPE jvm_threads_started_total counter
jvm_threads_started_total 1407.0
# HELP jvm_threads_deadlocked Cycles of JVM-threads that are in deadlock waiting to acquire object monitors or ownable synchronizers
# TYPE jvm_threads_deadlocked gauge
jvm_threads_deadlocked 0.0
# HELP jvm_threads_deadlocked_monitor Cycles of JVM-threads that are in deadlock waiting to acquire object monitors
# TYPE jvm_threads_deadlocked_monitor gauge
jvm_threads_deadlocked_monitor 0.0
# HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector in seconds.
# TYPE jvm_gc_collection_seconds summary
jvm_gc_collection_seconds_count{gc="ParNew",} 144.0
jvm_gc_collection_seconds_sum{gc="ParNew",} 4.702
jvm_gc_collection_seconds_count{gc="ConcurrentMarkSweep",} 3.0
jvm_gc_collection_seconds_sum{gc="ConcurrentMarkSweep",} 0.216
# HELP jmx_config_reload_success_total Number of times configuration have successfully been reloaded.
# TYPE jmx_config_reload_success_total counter
jmx_config_reload_success_total 0.0
# HELP jvm_classes_loaded The number of classes that are currently loaded in the JVM
# TYPE jvm_classes_loaded gauge
jvm_classes_loaded 7141.0
# HELP jvm_classes_loaded_total The total number of classes that have been loaded since the JVM has started execution
# TYPE jvm_classes_loaded_total counter
jvm_classes_loaded_total 7182.0
# HELP jvm_classes_unloaded_total The total number of classes that have been unloaded since the JVM has started execution
# TYPE jvm_classes_unloaded_total counter
jvm_classes_unloaded_total 41.0
$ oc get rc hawkular-cassandra-1 -o yaml | grep ENABLE_PROMETHEUS_ENDPOINT -A 2
- name: ENABLE_PROMETHEUS_ENDPOINT
  value: "True"
The issue is fixed.
Images:
metrics-cassandra/images/v3.11.0-0.20.0.0
metrics-heapster/images/v3.11.0-0.20.0.0
metrics-schema-installer/images/v3.11.0-0.20.0.0
metrics-hawkular-metrics/images/v3.11.0-0.20.0.0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2652