Bug 1465220 - NPE for DropWizardReporter every minute in hawkular-metrics logs
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
: ---
: ---
Assigned To: Matt Wringe
QA Contact: Junqi Zhao
:
Depends On:
Blocks:
Reported: 2017-06-26 22:13 EDT by Mike Fiedler
Modified: 2017-08-16 15 EDT

See Also:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-10 01:28:56 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Hawkular-metrics pod log (53.02 KB, application/x-gzip)
2017-06-26 22:15 EDT, Mike Fiedler
Issue is fixed, hawkular metrics pod log (79.04 KB, text/plain)
2017-07-10 05:25 EDT, Junqi Zhao


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1716 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.6 RPM Release Advisory 2017-08-10 05:02:50 EDT

Description Mike Fiedler 2017-06-26 22:13:03 EDT
Description of problem:

Every minute I am seeing the following NPE in the hawkular-metrics log. If it is not in fact a real error, it still makes it difficult to debug other issues.

2017-06-27 02:05:09,237 ERROR [com.codahale.metrics.ScheduledReporter] (metrics-hawkular-metrics-reporter-1-thread-1) RuntimeException thrown from DropWizardReporter#report. Exception was suppressed.: java.lang.NullPointerException
        at org.hawkular.metrics.core.dropwizard.MetricNameService.createMetricName(MetricNameService.java:77)
        at org.hawkular.metrics.core.dropwizard.DropWizardReporter.getMetricId(DropWizardReporter.java:194)
        at org.hawkular.metrics.core.dropwizard.DropWizardReporter.lambda$report$2(DropWizardReporter.java:103)
        at java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet.lambda$entryConsumer$0(Collections.java:1575)
        at java.lang.Iterable.forEach(Iterable.java:75)
        at java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet.forEach(Collections.java:1580)
        at org.hawkular.metrics.core.dropwizard.DropWizardReporter.report(DropWizardReporter.java:102)
        at com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:162)
        at com.codahale.metrics.ScheduledReporter$1.run(ScheduledReporter.java:117)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)



Version-Release number of selected component (if applicable): hawkular-metrics 3.6.122

registry.ops.openshift.com/openshift3/metrics-hawkular-metrics    v3.6.122            992323342eea 


How reproducible: Always


Steps to Reproduce:
1.  Deploy metrics (inventory below)
2.  oc logs <hawkular-metrics-pod>

Actual results:

Log is full of NPEs with the above stack
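
To quantify "full of NPEs" and confirm the one-minute cadence, a saved copy of the pod log can be scanned for the suppressed-reporter message. The following is a minimal sketch, not part of the original report: the function name is hypothetical, the timestamp pattern is assumed from the single log line quoted above, and the second sample line is invented for illustration.

```python
import re
from collections import Counter

# Marker copied from the ERROR line quoted in this report.
MARKER = "RuntimeException thrown from DropWizardReporter#report"
# Timestamp prefix such as "2017-06-27 02:05:09,237"; capture up to the minute.
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} ")


def reporter_errors_per_minute(log_text):
    """Count suppressed DropWizardReporter errors, keyed by minute."""
    counts = Counter()
    for line in log_text.splitlines():
        match = STAMP.match(line)
        # Stack frames ("    at ...") have no timestamp and are skipped.
        if match and MARKER in line:
            counts[match.group(1)] += 1
    return counts


# First line is shortened from the report; the 02:06 line is hypothetical.
sample = "\n".join([
    "2017-06-27 02:05:09,237 ERROR [com.codahale.metrics.ScheduledReporter] "
    "RuntimeException thrown from DropWizardReporter#report. "
    "Exception was suppressed.: java.lang.NullPointerException",
    "        at org.hawkular.metrics.core.dropwizard.MetricNameService"
    ".createMetricName(MetricNameService.java:77)",
    "2017-06-27 02:06:09,240 ERROR [com.codahale.metrics.ScheduledReporter] "
    "RuntimeException thrown from DropWizardReporter#report. "
    "Exception was suppressed.: java.lang.NullPointerException",
])

for minute, count in sorted(reporter_errors_per_minute(sample).items()):
    print(minute, count)
```

In practice the input would be the output of `oc logs <hawkular-metrics-pod>` saved to a file and read as text. One entry per minute is consistent with the behavior described above; an empty result is what a fixed image should produce.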


Expected results:

Clean logs for normal operation


Additional info:

[oo_first_master]
192.1.0.8

[oo_first_master:vars]
openshift_deployment_type=openshift-enterprise
openshift_release=v3.6.0

openshift_metrics_install_metrics=true
openshift_metrics_hawkular_hostname=hawkular-metrics.0615-yzo.qe.rhcloud.com
openshift_metrics_project=openshift-infra
openshift_metrics_image_prefix=registry.ops.openshift.com/openshift3/
openshift_metrics_image_version=v3.6.122
openshift_metrics_cassandra_replicas=1
openshift_metrics_hawkular_replicas=1
openshift_metrics_cassandra_storage_type=pv
openshift_metrics_cassandra_pvc_size=395Gi
Comment 1 Mike Fiedler 2017-06-26 22:14:08 EDT
I found upstream issue https://issues.jboss.org/browse/HWKMETRICS-577?_sscc=t for another NPE in DropWizardReporter but the stack (and trigger) are different.
Comment 2 Mike Fiedler 2017-06-26 22:15 EDT
Created attachment 1292141
Hawkular-metrics pod log
Comment 3 John Sanda 2017-06-28 10:40:39 EDT
Lowering priority and bumping release since this does not impact any functionality used in OpenShift, and it can be disabled.
Comment 4 John Sanda 2017-06-28 11:45:18 EDT
Retargeting for 3.6 because this does impact some new, needed functionality for monitoring the health of hawkular-metrics.
Comment 5 John Sanda 2017-06-28 20:36:44 EDT
This is fixed upstream in https://issues.jboss.org/browse/HWKMETRICS-682.
Comment 7 Mike Fiedler 2017-07-06 10:01:44 EDT
This issue still exists on hawkular-metrics 3.6.136.

Moving this back to ASSIGNED for mwringe until a new image is built with the upstream fix.
Comment 8 John Sanda 2017-07-06 10:36:38 EDT
(In reply to Mike Fiedler from comment #7)
> This issue still exists on hawkular-metrics 3.6.136.
> 
> Moving this back to ASSIGNED for mwringe until a new image is build with the
> upstream fix.

I am not sure if 3.6.136 has the necessary changes. The changes went into Hawkular Metrics 0.27.1, which was published to the JBoss Nexus repository on Monday afternoon.
Comment 11 Junqi Zhao 2017-07-10 05:18:07 EDT
Tested with the latest images (v3.6.140-1) using an NFS PV; did not find the NPE in the hawkular-cassandra pod logs.

@Mike, please see my inventory file below. We don't use [oo_first_master] now, and we usually deploy metrics with the following commands:

# cd /usr/share/ansible/openshift-ansible/
# ansible-playbook -vvv -i ${INVENTORY_FILE}   playbooks/byo/openshift-cluster/openshift-metrics.yml

**************************************************************************
Images from brew
metrics-hawkular-metrics   v3.6.140-1          3a5bebd0476a        2 hours ago         1.293 GB
metrics-cassandra          v3.6.140-1          9644ec21e399        2 hours ago         573.2 MB
metrics-heapster           v3.6.140-1          5549c67d8607        2 hours ago         274.4 MB
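
Several image tags appear in this report (v3.6.122, 3.6.136, v3.6.140-1), and comparing them as plain strings is error-prone. The sketch below parses a tag into a numeric tuple and checks it against v3.6.140-1, the first build reported clean here. It is an illustration only: the helper names are hypothetical, the tag grammar is assumed from the tags quoted in this report, and the report does not establish which build first contained the fix.

```python
import re

# First image tag reported clean in this bug (comment 11); not necessarily
# the first build that contained the upstream fix.
FIRST_VERIFIED = "v3.6.140-1"


def parse_tag(tag):
    """Turn 'v3.6.140-1' into (3, 6, 140, 1); a missing build suffix counts as 0."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)(?:-(\d+))?", tag)
    if match is None:
        # Floating tags such as 'v3.6' cannot be ordered and are rejected.
        raise ValueError(f"unrecognized image tag: {tag!r}")
    major, minor, patch, build = match.groups()
    return (int(major), int(minor), int(patch), int(build or 0))


def at_least_verified(tag):
    """True if tag is the verified build or a later one."""
    return parse_tag(tag) >= parse_tag(FIRST_VERIFIED)


for tag in ("v3.6.122", "3.6.136", "v3.6.140-1"):
    print(tag, at_least_verified(tag))
```

A floating tag such as `v3.6` raises `ValueError` by design, since it does not identify a specific build.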

Inventory file:

[OSEv3:children]
masters

[masters]
${MASTER} openshift_public_hostname=${MASTER}

[OSEv3:vars]
ansible_ssh_user=root
ansible_ssh_private_key_file="~/libra.pem"
deployment_type=openshift-enterprise


# Metrics
openshift_metrics_install_metrics=true
openshift_metrics_hawkular_hostname=hawkular-metrics.${SUB_DOMAIN}
openshift_metrics_project=openshift-infra
openshift_metrics_image_prefix=${IMAGE_PREFIX}
openshift_metrics_image_version=v3.6
openshift_metrics_cassandra_replicas=1
openshift_metrics_hawkular_replicas=1
openshift_metrics_cassandra_storage_type=pv
openshift_metrics_cassandra_pvc_size=10Gi
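
Since two inventories appear in this report, it can help to see exactly which `openshift_metrics_*` variables differ between them. The following is a minimal sketch with hypothetical helper names; it uses a simple line-based scan rather than a full Ansible inventory parser, and the two sample strings are abbreviated from the inventories quoted in this bug.

```python
def metrics_vars(inventory_text, prefix="openshift_metrics_"):
    """Return {name: value} for key=value lines whose key starts with prefix."""
    found = {}
    for raw in inventory_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and [section] headers.
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep and key.startswith(prefix):
            found[key] = value
    return found


def diff_vars(left, right):
    """Return {name: (left_value, right_value)} for variables that differ."""
    names = sorted(set(left) | set(right))
    return {n: (left.get(n), right.get(n))
            for n in names if left.get(n) != right.get(n)}


# Abbreviated from the reporter's inventory (description).
reporter = """\
[oo_first_master:vars]
openshift_metrics_install_metrics=true
openshift_metrics_image_version=v3.6.122
openshift_metrics_cassandra_pvc_size=395Gi
"""

# Abbreviated from the verifier's inventory (comment 11).
verifier = """\
[OSEv3:vars]
# Metrics
openshift_metrics_install_metrics=true
openshift_metrics_image_version=v3.6
openshift_metrics_cassandra_pvc_size=10Gi
"""

changes = diff_vars(metrics_vars(reporter), metrics_vars(verifier))
for name, (old, new) in changes.items():
    print(f"{name}: {old} -> {new}")
```

Host lines such as `${MASTER} openshift_public_hostname=${MASTER}` are ignored because their key does not start with the prefix. A variable present in only one inventory shows up with `None` on the other side.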
Comment 12 Junqi Zhao 2017-07-10 05:25 EDT
Created attachment 1295746
Issue is fixed, hawkular metrics pod log
Comment 14 errata-xmlrpc 2017-08-10 01:28:56 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
