Bug 1659453 - SNMP-based metrics are missing on undercloud [NEEDINFO]
Summary: SNMP-based metrics are missing on undercloud
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: gnocchi
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: z3
Target Release: 14.0 (Rocky)
Assignee: Martin Magr
QA Contact: Leonid Natapov
URL:
Whiteboard:
Duplicates: 1540950
Depends On: 1709277
Blocks: 1626151 1753950
 
Reported: 2018-12-14 12:37 UTC by Marek Aufart
Modified: 2020-01-24 12:28 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-24 12:28:36 UTC
Target Upstream Version:
maufart: needinfo? (iovadia)
maufart: needinfo? (jhajyahy)




Links
System ID Priority Status Summary Last Updated
OpenStack gerrit 656760 'None' MERGED Configure SNMP on undercloud 2020-01-24 12:28:51 UTC

Description Marek Aufart 2018-12-14 12:37:04 UTC
Description of problem:
Gnocchi on the undercloud should provide hardware.* metrics. These metrics are collected via SNMP from the undercloud nodes. Currently, no such metrics appear in the output of gnocchi metric list.

Version-Release number of selected component (if applicable):
OSP14


How reproducible:
always


Steps to Reproduce:
1. Deploy OSP14 with telemetry enabled on the undercloud
2. $ gnocchi metric list

Actual results:
no metrics from "hardware" category are present

Expected results:
hardware metrics should be present, e.g. hardware.cpu.util
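
The actual-versus-expected check above can be automated. The following is only a sketch: it assumes the metric listing is available as JSON (for example from gnocchi metric list -f json) and that each entry carries a "name" field. The function name hardware_metric_names and the sample data are hypothetical and are not part of gnocchi.

```python
import json


def hardware_metric_names(metric_list_json):
    """Return the sorted, unique metric names that start with 'hardware.'.

    Assumption: the input is a JSON array of objects, each with a "name"
    key, as a JSON-formatted metric listing would provide.
    """
    rows = json.loads(metric_list_json)
    names = set()
    for row in rows:
        # Metrics may have no name; treat a missing or null name as empty.
        name = row.get("name") or ""
        if name.startswith("hardware."):
            names.add(name)
    return sorted(names)


# Hypothetical sample data, for illustration only.
sample = json.dumps([
    {"id": "a", "name": "hardware.cpu.util"},
    {"id": "b", "name": "image.size"},
    {"id": "c", "name": None},
])
found = hardware_metric_names(sample)
print("hardware metrics found:", found if found else "none")
```

An empty result corresponds to the failure described in this bug.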

Additional info:
This issue was initially discussed on the rhos-mm mailing list [1], where it looked like a plain configuration issue, so we raised a docs BZ to document how to set it up [2]. After several days of debugging, we are still _not_ able to get it working, so I am raising this BZ.

This issue introduces a regression in the CloudForms integration (missing metrics for undercloud nodes).

[1] http://post-office.corp.redhat.com/archives/rhos-mm/2018-December/msg00002.html
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1655958

Comment 2 Martin Magr 2018-12-17 15:49:52 UTC
OK, maybe we missed the SNMP configuration when moving to the containerized undercloud. We have to investigate this further.

Comment 6 Martin Magr 2019-05-14 09:24:46 UTC
Following Alex's comment on patch #656760, I deployed Queens with telemetry enabled and I also get an empty metric list. I found the following log messages, which are probably related. I'm continuing the investigation. Marek, are you sure that this is a 13-to-14 regression?

2019-05-14 09:17:52.632 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.memory.used, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.632 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.system_stats.io.outgoing.blocks, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.632 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.memory.buffer, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.632 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.memory.swap.avail, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.632 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.cpu.util, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.633 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.memory.total, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.633 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.system_stats.io.incoming.blocks, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.633 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.memory.cached, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.633 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.network.ip.incoming.datagrams, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.633 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.network.ip.outgoing.datagrams, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
2019-05-14 09:17:52.633 63609 DEBUG ceilometer.polling.manager [-] Skip pollster hardware.memory.swap.total, no resources found this cycle poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175
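
To summarise DEBUG output like the above, the skipped pollsters can be pulled out of the log with a small script. This is a minimal sketch, not part of ceilometer: the function name skipped_pollsters and the regular expression are assumptions based only on the message format visible in these lines.

```python
import re

# Matches the message text seen in the log lines above; the pattern is an
# assumption derived from that sample, not from the ceilometer source.
SKIP_RE = re.compile(
    r"Skip pollster (?P<name>[\w.]+), no resources found this cycle"
)


def skipped_pollsters(log_lines):
    """Return the sorted, unique pollster names reported as skipped."""
    names = set()
    for line in log_lines:
        match = SKIP_RE.search(line)
        if match:
            names.add(match.group("name"))
    return sorted(names)


# Two lines copied from the log above, plus one unrelated line.
sample_log = [
    "2019-05-14 09:17:52.632 63609 DEBUG ceilometer.polling.manager [-] "
    "Skip pollster hardware.memory.used, no resources found this cycle "
    "poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175",
    "2019-05-14 09:17:52.632 63609 DEBUG ceilometer.polling.manager [-] "
    "Skip pollster hardware.cpu.util, no resources found this cycle "
    "poll_and_notify /usr/lib/python2.7/site-packages/ceilometer/polling/manager.py:175",
    "an unrelated log line",
]

for name in skipped_pollsters(sample_log):
    print(name)
```

If every hardware.* pollster shows up in this list, the polling agent found no resources to poll for any of them in that cycle, which matches the empty metric list.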

Comment 7 Martin Magr 2019-05-14 14:01:32 UTC
The same holds for Pike:

(undercloud) [stack@undercloud ~]$ grep telemetry undercloud.conf                                               
enable_telemetry = True
(undercloud) [stack@undercloud ~]$ ps -ef | grep ceilometer
ceilome+ 20087     1  0 13:38 ?        00:00:09 ceilometer-agent-notification: master process [/usr/bin/ceilometer-agent-notification --logfile /var/log/ceilometer/agent-notification.log]
ceilome+ 20165     1  0 13:38 ?        00:00:10 ceilometer-polling: master process [/usr/bin/ceilometer-polling --polling-namespaces central --logfile /var/log/ceilometer/central.log]
ceilome+ 20262 20087  3 13:38 ?        00:00:46 ceilometer-agent-notification: NotificationService worker(0)
ceilome+ 20279 20165  0 13:38 ?        00:00:00 ceilometer-polling: AgentManager worker(0)
stack    26553 15685  0 13:59 pts/0    00:00:00 grep --color=auto ceilometer
(undercloud) [stack@undercloud ~]$ gnocchi metric list

(undercloud) [stack@undercloud ~]$ rpm -qa openstack-ceilometer*
openstack-ceilometer-polling-9.0.8-0.20190511015415.4a82ac5.el7.noarch
openstack-ceilometer-common-9.0.8-0.20190511015415.4a82ac5.el7.noarch
openstack-ceilometer-notification-9.0.8-0.20190511015415.4a82ac5.el7.noarch
openstack-ceilometer-central-9.0.8-0.20190511015415.4a82ac5.el7.noarch
(undercloud) [stack@undercloud ~]$ 

With what version of OpenStack did you have the hardware metrics working?

Comment 8 Marek Aufart 2019-05-14 14:19:51 UTC
Let's sync with QE on this. Ido, do I remember correctly that the first failure of undercloud host metrics appeared in OSP14 (containerized undercloud) and that they were working in OSP13? (Or was some workaround needed there?)

Comment 9 Rahul Chincholkar 2019-10-02 09:47:46 UTC
Reopening as this blocks BZ #1753950

Comment 14 Stephen 2019-11-07 20:39:23 UTC
*** Bug 1540950 has been marked as a duplicate of this bug. ***

