Description of problem:
Every minute I am seeing the following NPE in the hawkular-metrics log. It makes it difficult to debug other issues if this is not in fact a real error.

2017-06-27 02:05:09,237 ERROR [com.codahale.metrics.ScheduledReporter] (metrics-hawkular-metrics-reporter-1-thread-1) RuntimeException thrown from DropWizardReporter#report. Exception was suppressed.: java.lang.NullPointerException
    at org.hawkular.metrics.core.dropwizard.MetricNameService.createMetricName(MetricNameService.java:77)
    at org.hawkular.metrics.core.dropwizard.DropWizardReporter.getMetricId(DropWizardReporter.java:194)
    at org.hawkular.metrics.core.dropwizard.DropWizardReporter.lambda$report$2(DropWizardReporter.java:103)
    at java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet.lambda$entryConsumer$0(Collections.java:1575)
    at java.lang.Iterable.forEach(Iterable.java:75)
    at java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet.forEach(Collections.java:1580)
    at org.hawkular.metrics.core.dropwizard.DropWizardReporter.report(DropWizardReporter.java:102)
    at com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:162)
    at com.codahale.metrics.ScheduledReporter$1.run(ScheduledReporter.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

Version-Release number of selected component (if applicable):
hawkular-metrics 3.6.122
registry.ops.openshift.com/openshift3/metrics-hawkular-metrics   v3.6.122   992323342eea

How reproducible:
Always

Steps to Reproduce:
1. Deploy metrics (inventory below)
2. oc logs <hawkular-metrics-pod>

Actual results:
Log is full of NPEs with the above stack

Expected results:
Clean logs for normal operation

Additional info:
[oo_first_master]
192.1.0.8

[oo_first_master:vars]
openshift_deployment_type=openshift-enterprise
openshift_release=v3.6.0
openshift_metrics_install_metrics=true
openshift_metrics_hawkular_hostname=hawkular-metrics.0615-yzo.qe.rhcloud.com
openshift_metrics_project=openshift-infra
openshift_metrics_image_prefix=registry.ops.openshift.com/openshift3/
openshift_metrics_image_version=v3.6.122
openshift_metrics_cassandra_replicas=1
openshift_metrics_hawkular_replicas=1
openshift_metrics_cassandra_storage_type=pv
openshift_metrics_cassandra_pvc_size=395Gi
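The stack above shows MetricNameService.createMetricName dereferencing something that is null on every scheduled report run. As a rough illustration only (class, field, and method names below are hypothetical, not the actual Hawkular Metrics code), the failure pattern and a defensive variant look like:

```java
import java.util.HashMap;
import java.util.Map;

public class ReporterNpeSketch {
    // Hypothetical: a host name that is never initialized, so it stays null
    static String hostname;

    // Mirrors the failing pattern: dereferences hostname without a null check
    static String createMetricName(String base) {
        return hostname.toLowerCase() + "." + base;   // NPE when hostname is null
    }

    // Defensive variant: fall back to a placeholder instead of failing
    static String createMetricNameSafe(String base) {
        String host = (hostname != null) ? hostname : "unknown-host";
        return host.toLowerCase() + "." + base;
    }

    public static void main(String[] args) {
        Map<String, Long> gauges = new HashMap<>();
        gauges.put("RawDataPoints", 42L);

        boolean threw = false;
        try {
            // Same shape as the reporter: iterate registered metrics and
            // build a name for each one
            gauges.forEach((name, v) -> createMetricName(name));
        } catch (NullPointerException e) {
            threw = true;   // the failure mode seen once per report interval
        }
        System.out.println("unguarded threw NPE: " + threw);          // true
        System.out.println(createMetricNameSafe("RawDataPoints"));    // unknown-host.RawDataPoints
    }
}
```

Because the reporter runs on a fixed schedule, an uninitialized value like this produces exactly the once-per-minute error cadence reported here.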
I found upstream issue https://issues.jboss.org/browse/HWKMETRICS-577?_sscc=t for another NPE in DropWizardReporter but the stack (and trigger) are different.
Created attachment 1292141 [details] Hawkular-metrics pod log
Lowering priority and bumping release since this does not impact any functionality used in OpenShift, and it can be disabled.
Retargeting for 3.6 because this does impact some new, needed functionality for monitoring the health of hawkular-metrics.
This is fixed upstream in https://issues.jboss.org/browse/HWKMETRICS-682 .
This issue still exists on hawkular-metrics 3.6.136. Moving this back to ASSIGNED for mwringe until a new image is built with the upstream fix.
(In reply to Mike Fiedler from comment #7)
> This issue still exists on hawkular-metrics 3.6.136.
>
> Moving this back to ASSIGNED for mwringe until a new image is build with the
> upstream fix.

I am not sure if 3.6.136 has the necessary changes. The changes went into Hawkular Metrics 0.27.1, which was published to the JBoss Nexus repo on Monday afternoon.
Tested with the latest images (v3.6.140-1), using an NFS PV; did not find the NPE in the hawkular-cassandra pod logs.

@Mike, please see my inventory file. We don't use [oo_first_master] now, and we usually deploy metrics with the following commands:

# cd /usr/share/ansible/openshift-ansible/
# ansible-playbook -vvv -i ${INVENTORY_FILE} playbooks/byo/openshift-cluster/openshift-metrics.yml

**************************************************************************
Images from brew:
metrics-hawkular-metrics   v3.6.140-1   3a5bebd0476a   2 hours ago   1.293 GB
metrics-cassandra          v3.6.140-1   9644ec21e399   2 hours ago   573.2 MB
metrics-heapster           v3.6.140-1   5549c67d8607   2 hours ago   274.4 MB

Inventory file:
[OSEv3:children]
masters

[masters]
${MASTER} openshift_public_hostname=${MASTER}

[OSEv3:vars]
ansible_ssh_user=root
ansible_ssh_private_key_file="~/libra.pem"
deployment_type=openshift-enterprise

# Metrics
openshift_metrics_install_metrics=true
openshift_metrics_hawkular_hostname=hawkular-metrics.${SUB_DOMAIN}
openshift_metrics_project=openshift-infra
openshift_metrics_image_prefix=${IMAGE_PREFIX}
openshift_metrics_image_version=v3.6
openshift_metrics_cassandra_replicas=1
openshift_metrics_hawkular_replicas=1
openshift_metrics_cassandra_storage_type=pv
openshift_metrics_cassandra_pvc_size=10Gi
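For anyone re-verifying, one way to check for the regression is to count the NPE lines in a saved pod log (captured with `oc logs <hawkular-metrics-pod> > hawkular.log`). The two sample log lines below are faked so the snippet runs standalone:

```shell
# Fake two of the recurring error lines so the check can run without a cluster;
# in a real verification, hawkular.log would come from `oc logs`.
printf '%s\n' \
  '2017-06-27 02:05:09,237 ERROR RuntimeException thrown from DropWizardReporter#report. Exception was suppressed.: java.lang.NullPointerException' \
  '2017-06-27 02:06:09,240 ERROR RuntimeException thrown from DropWizardReporter#report. Exception was suppressed.: java.lang.NullPointerException' \
  > hawkular.log

# Count occurrences; a clean run on a fixed image should report 0
grep -c 'java.lang.NullPointerException' hawkular.log   # prints 2 here
```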
Created attachment 1295746 [details] Issue is fixed, hawkular metrics pod log
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716