Bug 1423496

Summary: No events in OSP11 overcloud
Product: Red Hat OpenStack
Component: openstack-panko
Version: 11.0 (Ocata)
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Marek Aufart <maufart>
Assignee: Pradeep Kilambi <pkilambi>
QA Contact: Sasha Smolyak <ssmolyak>
CC: jdanjou, maufart, pkilambi, ssmolyak
Keywords: Triaged
Target Milestone: rc
Target Release: 11.0 (Ocata)
Fixed In Version: openstack-panko-2.0.1-0.20170302220923.9a9ae04.el7ost
Type: Bug
Last Closed: 2017-05-17 20:00:31 UTC
Bug Depends On: 1416648
Attachments:
Description Flags
details_console none

Description Marek Aufart 2017-02-17 13:04:37 UTC
Created attachment 1251892: details_console

Description of problem: The Panko service returns no events, even though actions that generate events were performed.

Version-Release number of selected component (if applicable):
OSP11

How reproducible:
always

Steps to Reproduce:
1. Deploy an OSP11 overcloud with Panko enabled (-e environments/services/panko.yaml).
2. Perform OSP actions that should generate events (e.g. create a network).
3. Check Panko events with ceilometer event-list or through the API.

Actual results:
No events are returned.

Expected results:
Some events should be present.

Additional info:
See the attachment. Possibly related log entries:

[root@controller-0 ~]# tailf /var/log/ceilometer/agent-notification.log
2017-02-15 13:40:39.047 3090 WARNING oslo_config.cfg [-] Option "rabbit_userid" from group "oslo_messaging_rabbit" is deprecated for removal.  Its value may be silently ignored in the future.
2017-02-15 13:40:39.047 3090 WARNING oslo_config.cfg [-] Option "rabbit_password" from group "oslo_messaging_rabbit" is deprecated for removal.  Its value may be silently ignored in the future.
2017-02-15 13:40:39.055 3090 ERROR oslo.messaging._drivers.impl_rabbit [-] [181b602a-1329-416a-8895-2b667668ec96] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] Connection refused. Trying again in 1 seconds. Client port: None
2017-02-15 13:40:40.061 3090 ERROR oslo.messaging._drivers.impl_rabbit [-] [181b602a-1329-416a-8895-2b667668ec96] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] Connection refused. Trying again in 2 seconds. Client port: None
2017-02-15 13:40:42.068 3090 ERROR oslo.messaging._drivers.impl_rabbit [-] [181b602a-1329-416a-8895-2b667668ec96] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] Connection refused. Trying again in 4 seconds. Client port: None
2017-02-15 13:40:46.077 3090 ERROR oslo.messaging._drivers.impl_rabbit [-] [181b602a-1329-416a-8895-2b667668ec96] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] Connection refused. Trying again in 6 seconds. Client port: None
2017-02-15 13:40:52.090 3090 ERROR oslo.messaging._drivers.impl_rabbit [-] [181b602a-1329-416a-8895-2b667668ec96] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] Connection refused. Trying again in 8 seconds. Client port: None
2017-02-15 13:41:00.105 3090 INFO oslo.messaging._drivers.impl_rabbit [-] [181b602a-1329-416a-8895-2b667668ec96] Reconnected to AMQP server on controller-0.internalapi.localdomain:5672 via [amqp] client with port 45556.
2017-02-16 15:40:40.121 3090 WARNING ceilometer.pipeline [-] metering data image.size for 71443da8-8e28-4d3c-88f7-9ae2ca8242a7 @ 2017-02-16T15:40:40.111671 has no volume (volume: None), the sample will be dropped
2017-02-16 15:50:39.898 3090 WARNING ceilometer.pipeline [-] metering data image.size for 71443da8-8e28-4d3c-88f7-9ae2ca8242a7 @ 2017-02-16T15:50:39.872366 has no volume (volume: None), the sample will be dropped
^C

Comment 1 Julien Danjou 2017-02-27 14:48:04 UTC
Seems like RabbitMQ is down based on your log. What's going on?
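
One way to answer this from the log itself is to look at the most recent AMQP connection message rather than the first errors. The sketch below is only an illustration: the function name amqp_status is hypothetical, and the patterns are taken from the oslo.messaging lines quoted in the description. Run against that excerpt it would report "reconnected", because the last connection message (13:41:00) is a successful reconnect.

```shell
#!/usr/bin/env bash
# amqp_status LOGFILE  (hypothetical helper, not part of any OpenStack tool)
# Prints "reconnected", "unreachable", or "no-amqp-messages" depending on the
# most recent AMQP connection message found in LOGFILE.
amqp_status() {
    local logfile=$1 last
    # Keep only connection-state lines and look at the newest one.
    last=$(grep -E 'AMQP server on .* is unreachable|Reconnected to AMQP server' "$logfile" | tail -n 1)
    case $last in
        *"Reconnected to AMQP server"*) echo "reconnected" ;;
        *"is unreachable"*)             echo "unreachable" ;;
        *)                              echo "no-amqp-messages" ;;
    esac
}

# Usage on a controller:
#   amqp_status /var/log/ceilometer/agent-notification.log
```

A result of "reconnected" only says that the agent reached the broker again; it does not by itself show that events are being stored.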

Comment 2 Marek Aufart 2017-02-27 17:55:51 UTC
@Julien It may be a non-telemetry-related issue. I'm going to reinstall OSP11 and test again.

Comment 3 Marek Aufart 2017-02-28 13:46:06 UTC
The log no longer shows the AMQP error after a fresh deployment, but there are still no events; see http://paste.openstack.org/show/VYT7lhx1Wrg5ZWlJH3Dl/

[root@controller-0 ~]# tailf /var/log/ceilometer/agent-notification.log
2017-02-28 13:17:38.574 3102 INFO oslo.messaging._drivers.impl_rabbit [-] [f568210a-7ead-4857-83e6-9b77a6b4c891] Reconnected to AMQP server on controller-0.internalapi.localdomain:5672 via [amqp] client with port 41964.
2017-02-28 13:37:21.641 3102 WARNING ceilometer.pipeline [-] metering data image.size for a7c33c16-0d32-4b9e-a441-3dbec584e170 @ 2017-02-28T13:37:21.600643 has no volume (volume: None), the sample will be dropped
2017-02-28 13:37:21.642 3102 WARNING ceilometer.pipeline [-] metering data image.size for 952b9238-da75-4dec-a984-a43ed6c356d0 @ 2017-02-28T13:37:21.600643 has no volume (volume: None), the sample will be dropped

Comment 8 Julien Danjou 2017-02-28 17:34:29 UTC
Also https://bugs.launchpad.net/panko/+bug/1666174 needs to be fixed.

Comment 9 Julien Danjou 2017-02-28 17:36:09 UTC
I confirm that once these 3 issues are fixed it works:

[stack@undercloud-0 ~]$ ceilometer event-list
+--------------------------------------+--------------+----------------------------+-----------------------------------------------------------------+
| Message ID                           | Event Type   | Generated                  | Traits                                                          |
+--------------------------------------+--------------+----------------------------+-----------------------------------------------------------------+
| f9b0d353-d0bb-407b-9b92-8e650d16d290 | image.create | 2017-02-28T17:35:33.388423 | +-------------+--------+--------------------------------------+ |
|                                      |              |                            | |     name    |  type  |                value                 | |
|                                      |              |                            | +-------------+--------+--------------------------------------+ |
|                                      |              |                            | |  created_at | string |         2017-02-28T17:35:33Z         | |
|                                      |              |                            | |     name    | string |               imgtest2               | |
|                                      |              |                            | |  project_id | string |   5fbbc118911847be9e4488c1dac85783   | |
|                                      |              |                            | | resource_id | string | d347927e-e916-4bf3-90cd-b89b362f4b31 | |
|                                      |              |                            | |   service   | string |           image.localhost            | |
|                                      |              |                            | |    status   | string |                queued                | |
|                                      |              |                            | |   user_id   | string |   5fbbc118911847be9e4488c1dac85783   | |
|                                      |              |                            | +-------------+--------+--------------------------------------+ |
+--------------------------------------+--------------+----------------------------+-----------------------------------------------------------------+
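
For a scripted version of this check, the top-level rows of the table can be counted. This is a rough sketch under the assumption that every event row starts with a UUID-formatted Message ID, as in the output above; count_events is a hypothetical helper name, and the nested trait rows are deliberately not counted.

```shell
#!/usr/bin/env bash
# count_events  (hypothetical helper)
# Reads `ceilometer event-list` table output on stdin and prints the number of
# rows whose first column is a UUID-style Message ID.
count_events() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the exit status clean.
    grep -cE '^\| [0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12} \|' || true
}

# Usage:
#   ceilometer event-list | count_events
```

A count of 0 after performing an action such as creating an image would correspond to the failure described in this bug.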

Comment 15 Julien Danjou 2017-03-01 14:14:25 UTC
Yeah I believe you. I think it's just the patch missing. I just backported it to Ocata https://review.openstack.org/#/c/439607/ and I'll push a new release upstream ASAP.

Comment 19 Sasha Smolyak 2017-03-28 18:07:57 UTC
Summarizing all of the above:
1. Add event_dispatchers=panko to ceilometer.conf.

2. Run chcon --reference=/var/log/gnocchi/app.log /var/log/panko/panko.log on every controller, then restart httpd.

3. Manually create /var/log/panko/ceilometer-collector.log.

4. Use overcloudrc.v3.

5. On every controller, run semanage fcontext -N -a -t httpd_log_t /var/log/panko/panko.log and restart httpd.

After all these workarounds, the events are shown correctly.

On the undercloud, however, SELinux additionally needs to be set to permissive: sudo setenforce 0
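
The controller-side steps (1, 2, 3 and 5) could be collected into one script along these lines. It is a sketch, not a tested procedure, and it makes several assumptions that are not in this report: the config file is /etc/ceilometer/ceilometer.conf, the option lives in the [DEFAULT] section, crudini is installed, and the fcontext command in step 5 is invoked through semanage. The two httpd restarts are merged into one at the end. By default the script only prints the commands; set DRY_RUN=0 and run as root to execute them.

```shell
#!/usr/bin/env bash
# Dry-run by default: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper wrapping workaround steps 1, 2, 3 and 5 for one controller.
panko_controller_workaround() {
    # Step 1 (assumed file path and [DEFAULT] section).
    run crudini --set /etc/ceilometer/ceilometer.conf DEFAULT event_dispatchers panko
    # Step 3.
    run touch /var/log/panko/ceilometer-collector.log
    # Step 2: copy the SELinux context from the gnocchi log.
    run chcon --reference=/var/log/gnocchi/app.log /var/log/panko/panko.log
    # Step 5: record the file context without reloading policy (-N).
    run semanage fcontext -N -a -t httpd_log_t /var/log/panko/panko.log
    # Single httpd restart covering steps 2 and 5.
    run systemctl restart httpd
}

panko_controller_workaround
```

Step 4 is client-side (source the overcloudrc.v3 file before running ceilometer event-list) and is therefore left out of the controller script, as is the undercloud setenforce 0 change.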

Comment 21 errata-xmlrpc 2017-05-17 20:00:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245