Bug 2133055 - [RFE] Expose rabbitmq observability data via rabbitmq_prometheus plugin
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-rabbitmq
Version: 17.1 (Wallaby)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ga
Target Release: 17.1
Assignee: John Eckersberg
QA Contact: Nobody
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-10-07 16:44 UTC by Leif Madsen
Modified: 2023-11-08 14:01 UTC (History)

Fixed In Version: puppet-rabbitmq-11.0.1-1.20230428151021.63fee2c.el9ost
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-16 01:12:16 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-19256 | 0 | None | None | None | 2022-10-07 17:16:11 UTC
Red Hat Product Errata | RHEA-2023:4577 | 0 | None | None | None | 2023-08-16 01:12:53 UTC

Description Leif Madsen 2022-10-07 16:44:54 UTC
Description of problem: The release of RabbitMQ 3.8.0 added a plugin called 'rabbitmq_prometheus' which provides telemetry of the RabbitMQ server itself, and would be incredibly useful for observability of this service should the telemetry data be exposed and available for collection from a Prometheus instance.

Version-Release number of selected component (if applicable): rabbitmq 3.8.0 or later


Additional info:

    rabbitmq-plugins enable rabbitmq_prometheus
    curl -v -H "Accept:text/plain" "http://localhost:15692/metrics"
    <lots_of_metrics>
    netstat -tlnp | grep 15692
    tcp        0      0 0.0.0.0:15692           0.0.0.0:*               LISTEN      -                


This is available as far back as what we ship in RHOSP 17.0, and in theory could result in backports to RHOSP 17.1 for next-gen observability.

The effort required will be to expose this via the puppet/ansible tripleo packaging and to allow controls for enabling this functionality via director configurations.

https://github.com/rabbitmq/rabbitmq-server/tree/main/deps/rabbitmq_prometheus

Comment 2 Eric Nothen 2022-11-03 07:14:38 UTC
So the prometheus plugin is available on 16.2 as well:

~~~
[root@overcloud-controller-0 ~]# podman inspect $(podman ps -q -f name=rabbit) | jq .[].Config.Labels.version
"16.2.3"
[root@overcloud-controller-0 ~]# podman exec -ti $(podman ps -q -f name=rabbit) bash
[root@overcloud-controller-0 /]# rabbitmqctl --version
3.8.16
[root@overcloud-controller-0 /]# rabbitmq-plugins enable rabbitmq_prometheus
Enabling plugins on node rabbit@overcloud-controller-0:
rabbitmq_prometheus
The following plugins have been configured:
  rabbitmq_management
  rabbitmq_management_agent
  rabbitmq_prometheus
  rabbitmq_web_dispatch
Applying plugin configuration to rabbit@overcloud-controller-0...
The following plugins have been enabled:
  rabbitmq_prometheus

started 1 plugins.
[root@overcloud-controller-0 /]# 
~~~


As far as I can tell, the module puppet-rabbitmq supports enabling plugins [0], but the problem is that the resource is disabled when using "use_config_file_for_plugins: True" [1], which we are indeed using [2] to persistently store the list of enabled plugins on the container. 

We could expose a new boolean for this plugin (similar to the existing "admin_enable") and add it to the template file. Otherwise, we could expose an "enabled_plugins" list so that the whole list is not started from scratch in the template file, as is currently the case [3].


[0] https://forge.puppet.com/modules/puppet/rabbitmq/10.1.2/reference#rabbitmq_plugin
[1] https://forge.puppet.com/modules/puppet/rabbitmq/10.1.2/reference#use_config_file_for_plugins
[2] https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/rabbitmq/rabbitmq-container-puppet.yaml#L229
[3] https://github.com/voxpupuli/puppet-rabbitmq/blob/master/templates/enabled_plugins.erb#L3

Comment 3 Eric Nothen 2022-11-15 15:20:44 UTC
I have been testing this patch on the rabbitmq module, and it's been working fine when running overcloud deploy in my 16.2 overcloud:

~~~
[stack.lab rabbitmq]$ git diff
diff --git a/manifests/config.pp b/manifests/config.pp
index db5e98e..77336fa 100644
--- a/manifests/config.pp
+++ b/manifests/config.pp
@@ -6,6 +6,7 @@ class rabbitmq::config {
   $admin_enable                        = $rabbitmq::admin_enable
   $management_enable                   = $rabbitmq::management_enable
   $use_config_file_for_plugins         = $rabbitmq::use_config_file_for_plugins
+  $plugins                             = $rabbitmq::plugins
   $cluster_node_type                   = $rabbitmq::cluster_node_type
   $cluster_nodes                       = $rabbitmq::cluster_nodes
   $config                              = $rabbitmq::config
diff --git a/manifests/init.pp b/manifests/init.pp
index 4a50a8a..cc7b24d 100644
--- a/manifests/init.pp
+++ b/manifests/init.pp
@@ -88,6 +88,9 @@
 # @param use_config_file_for_plugins
 #   If enabled the /etc/rabbitmq/enabled_plugins config file is created,
 #   replacing the use of the rabbitmqplugins provider to enable plugins.
+# @param plugins
+#   Additional list of plugins to start, or to add to /etc/rabbitmq/enabled_plugins,
+#   if use_config_file_for_plugins is enabled.
 # @param auth_backends
 #   An array specifying authorization/authentication backend to use. Single quotes should be placed around array entries,
 #   ex. `['{foo, baz}', 'baz']` Defaults to [rabbit_auth_backend_internal], and if using LDAP defaults to [rabbit_auth_backend_internal,
@@ -302,6 +305,7 @@ class rabbitmq (
   Boolean $admin_enable                                                                            = true,
   Boolean $management_enable                                                                       = false,
   Boolean $use_config_file_for_plugins                                                             = false,
+  Array $plugins                                                                                   = [],
   Enum['ram', 'disc'] $cluster_node_type                                                           = 'disc',
   Array $cluster_nodes                                                                             = [],
   String $config                                                                                   = 'rabbitmq/rabbitmq.config.erb',
@@ -477,6 +481,14 @@ class rabbitmq (
         }
       }
     }
+    # Start anything else listed on the plugins array, if it was not started already by the other booleans
+    $plugins.each | $plugin | {
+      rabbitmq_plugin { $plugin:
+        ensure   => present,
+        notify   => Class['rabbitmq::service'],
+        provider => 'rabbitmqplugins',
+      }
+    }
   }
 
   if $admin_enable and $service_manage {
diff --git a/templates/enabled_plugins.erb b/templates/enabled_plugins.erb
index 6d1dfac..b9321bd 100644
--- a/templates/enabled_plugins.erb
+++ b/templates/enabled_plugins.erb
@@ -1,6 +1,6 @@
 % This file managed by Puppet
 % Template Path: <%= @module_name %>/templates/enabled_plugins
-<%- @_plugins = [] -%>
+<%- @_plugins = @plugins -%>
 <%- if @admin_enable or @management_enable -%>
   <%- @_plugins << 'rabbitmq_management' -%>
 <%- end -%>
@@ -16,4 +16,4 @@
     <%- @_plugins << 'rabbitmq_shovel_management' -%>
   <%- end -%>
 <%- end -%>
-[<%= @_plugins.join(',')%>].
+[<%= (@_plugins.uniq).join(',')%>].
[stack.lab rabbitmq]$ 
~~~

Then my environment file looks like this:

~~~
[stack.lab ~]$ cat templates/rabbitmq.yaml 
parameter_defaults:
  ControllerExtraConfig:
  # Setup custom list of plugins
    rabbitmq::plugins:
      - rabbitmq_management
      - rabbitmq_prometheus
      - rabbitmq_stomp

  # Create new firewall rule for port 15692 on the controllers
    tripleo::firewall::firewall_rules:
      '110 rabbitmq prometheus':
        dport:
          - 15692
~~~


The "rabbitmq::plugins" array allows me to start any plugin, so it's more flexible than the current method of listing just some booleans inside of the enabled_plugins.erb template (although I left the current code so that my change is backwards compatible). I'm explicitly enabling "rabbitmq_management" on the new plugins array to test the "uniq" part of the updated enabled_plugins template (this plugin is already enabled here [0]).

Then the second block is to allow traffic to the prometheus plugin, but this could also be done by updating /usr/share/openstack-tripleo-heat-templates/deployment/rabbitmq/rabbitmq-container-puppet.yaml.
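
The "uniq" behaviour of the updated template can be checked in isolation with plain Ruby and ERB. The following is only a minimal sketch under a few assumptions: the template text is a reduced stand-in for enabled_plugins.erb that keeps just the rabbitmq_management branch, and the `PluginList` class and `TEMPLATE` constant are hypothetical names used for illustration, not part of the module. The sketch also copies the input array with `dup` before appending to it, since `@_plugins = @plugins` makes both names refer to the same array and `<<` would then modify the caller's list too; that copy is a variation and not part of the patch above.

```ruby
require 'erb'

# Reduced stand-in for templates/enabled_plugins.erb (assumption: only the
# rabbitmq_management branch is reproduced; the real template has more).
TEMPLATE = <<~'ERB_SOURCE'
  <%- plugins = @plugins.dup -%>
  <%- if @admin_enable or @management_enable -%>
  <%- plugins << 'rabbitmq_management' -%>
  <%- end -%>
  [<%= plugins.uniq.join(',') %>].
ERB_SOURCE

# Hypothetical helper that supplies the instance variables the template reads.
class PluginList
  def initialize(plugins:, admin_enable: true, management_enable: false)
    @plugins = plugins
    @admin_enable = admin_enable
    @management_enable = management_enable
  end

  # Render with trim_mode '-' so that <%- ... -%> lines leave no blank output.
  def render
    ERB.new(TEMPLATE, trim_mode: '-').result(binding)
  end
end

requested = %w[rabbitmq_management rabbitmq_prometheus rabbitmq_stomp]
puts PluginList.new(plugins: requested).render
```

With the same three plugins as in the environment file, the rendered line lists rabbitmq_management only once even though it is both requested explicitly and added by the admin_enable boolean.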

The end result is this:

~~~
[stack.lab ~]$ ansible -i inventory.yaml -m shell -a 'cat /var/lib/config-data/puppet-generated/rabbitmq/etc/rabbitmq/enabled_plugins' -b Controller[0]
overcloud-controller-0 | CHANGED | rc=0 >>
% This file managed by Puppet
% Template Path: rabbitmq/templates/enabled_plugins
[rabbitmq_management,rabbitmq_prometheus,rabbitmq_stomp].
[stack.lab ~]$ 
[stack.lab ~]$ ansible -i inventory.yaml -m shell -a 'podman exec -ti $(podman ps -q -f name=rabbit) rabbitmq-plugins list' -b Controller[0]
overcloud-controller-0 | CHANGED | rc=0 >>
Listing plugins with pattern ".*" ...
 Configured: E = explicitly enabled; e = implicitly enabled
 | Status: * = running on rabbit@overcloud-controller-0
 |/
[  ] rabbitmq_amqp1_0                  3.8.16
[  ] rabbitmq_auth_backend_cache       3.8.16
[  ] rabbitmq_auth_backend_http        3.8.16
[  ] rabbitmq_auth_backend_ldap        3.8.16
[  ] rabbitmq_auth_backend_oauth2      3.8.16
[  ] rabbitmq_auth_mechanism_ssl       3.8.16
[  ] rabbitmq_consistent_hash_exchange 3.8.16
[  ] rabbitmq_event_exchange           3.8.16
[  ] rabbitmq_federation               3.8.16
[  ] rabbitmq_federation_management    3.8.16
[  ] rabbitmq_jms_topic_exchange       3.8.16
[E*] rabbitmq_management               3.8.16
[e*] rabbitmq_management_agent         3.8.16
[  ] rabbitmq_mqtt                     3.8.16
[  ] rabbitmq_peer_discovery_aws       3.8.16
[  ] rabbitmq_peer_discovery_common    3.8.16
[  ] rabbitmq_peer_discovery_consul    3.8.16
[  ] rabbitmq_peer_discovery_etcd      3.8.16
[  ] rabbitmq_peer_discovery_k8s       3.8.16
[E*] rabbitmq_prometheus               3.8.16
[  ] rabbitmq_random_exchange          3.8.16
[  ] rabbitmq_recent_history_exchange  3.8.16
[  ] rabbitmq_sharding                 3.8.16
[  ] rabbitmq_shovel                   3.8.16
[  ] rabbitmq_shovel_management        3.8.16
[E*] rabbitmq_stomp                    3.8.16
[  ] rabbitmq_top                      3.8.16
[  ] rabbitmq_tracing                  3.8.16
[  ] rabbitmq_trust_store              3.8.16
[e*] rabbitmq_web_dispatch             3.8.16
[  ] rabbitmq_web_mqtt                 3.8.16
[  ] rabbitmq_web_mqtt_examples        3.8.16
[  ] rabbitmq_web_stomp                3.8.16
[  ] rabbitmq_web_stomp_examples       3.8.16
[stack.lab ~]$ 
[stack.lab ~]$ curl http://overcloud-controller-0:15692/metrics 2>/dev/null | wc -l
1729
[stack.lab ~]$ curl http://overcloud-controller-0:15692/metrics 2>/dev/null | tail
erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="mbcs",usage="blocks_size"} 496
erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="mbcs_pool",usage="carriers"} 0
erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="mbcs_pool",usage="carriers_size"} 0
erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="mbcs_pool",usage="blocks"} 0
erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="mbcs_pool",usage="blocks_size"} 0
erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="sbcs",usage="carriers"} 0
erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="sbcs",usage="carriers_size"} 0
erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="sbcs",usage="blocks"} 0
erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="sbcs",usage="blocks_size"} 0
~~~
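
To get a quick overview of what the endpoint returns beyond a raw line count, the scrape output can be grouped by metric name. This is a rough sketch, assuming the output follows the usual Prometheus text format (one sample per line, comment lines starting with "#"); `samples_per_metric` is a hypothetical helper name, and the sample text is taken from the tail output above.

```ruby
# Count how many samples each metric name contributes in a text scrape.
def samples_per_metric(text)
  counts = Hash.new(0)
  text.each_line do |line|
    line = line.strip
    # Skip blank lines and comment lines such as HELP/TYPE annotations.
    next if line.empty? || line.start_with?('#')
    # The metric name is the leading identifier, before any "{labels}" or value.
    name = line[/\A[a-zA-Z_:][a-zA-Z0-9_:]*/]
    counts[name] += 1 if name
  end
  counts
end

sample = <<~'METRICS'
  erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="sbcs",usage="blocks"} 0
  erlang_vm_allocators{alloc="driver_alloc",instance_no="8",kind="sbcs",usage="blocks_size"} 0
METRICS

samples_per_metric(sample).each { |name, count| puts "#{name} #{count}" }
```

For a live check, the same function could be fed from standard input instead, for example by calling `samples_per_metric($stdin.read)` and piping the curl output into the script.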

Leif, do you think this change can make its way into a future z-stream of 16.2? Should I open a PR on the module upstream?


[0] https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/rabbitmq/rabbitmq-container-puppet.yaml#L228

Comment 4 Leif Madsen 2022-11-21 20:33:05 UTC
(In reply to Eric Nothen from comment #3)
> Leif, do you think this change can make its way on a future z-stream of
> 16.2? Should I open a PR on the module upstream?

I'm not really sure if this is appropriate for 16.2. I suppose if this landed upstream then it'll make its way in during a future import of train downstream though :)

Comment 5 Eric Nothen 2022-11-23 10:06:42 UTC
(In reply to Leif Madsen from comment #4)
>
> I'm not really sure if this is appropriate for 16.2. 

But what would be the objection? There are zero THT code changes required, and the current default behavior is kept. I'm asking so that I know whether I need to make more changes to my proposed patch.

> I suppose if this landed in upstream then it'll make its way in during 
> a future import of train downstream though :)

ok, it's on the way: https://github.com/voxpupuli/puppet-rabbitmq/pull/917

Comment 40 errata-xmlrpc 2023-08-16 01:12:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577

