Description of problem:
On OpenStack deployed by OSP-d, the autogenerated RabbitMQ config file on all controller nodes contains wrong values. Specifically, it contains the tcp_listen_options section twice: the first entry without the keepalive option and the second with keepalive enabled. AFAIK only the first entry is read and applied to the Erlang configuration. The goal is to have keepalive enabled, which AFAIK will not happen with the configuration OSP-d generates for the RabbitMQ server.

Example of rabbitmq.config after a clean deploy:

% This file managed by Puppet
% Template Path: rabbitmq/templates/rabbitmq.config
[
  {rabbit, [
    {tcp_listen_options, [binary, {packet, raw}, {reuseaddr, true}, {backlog, 128}, {nodelay, true}, {exit_on_close, false}] },
    {cluster_partition_handling, pause_minority},
    {tcp_listen_options, [binary, {packet, raw}, {reuseaddr, true}, {backlog, 128}, {nodelay, true}, {exit_on_close, false}, {keepalive, true}]},
    {default_user, <<"guest">>},
    {default_pass, <<"guest">>}
  ]},
  {kernel, [
    {inet_dist_listen_max, 35672},
    {inet_dist_listen_min, 35672}
  ]}
,
  {rabbitmq_management, [
    {listener, [
      {port, 15672}
    ]}
  ]}
].

It should instead look like this, for example:

% This file managed by Puppet
% Template Path: rabbitmq/templates/rabbitmq.config
[
  {rabbit, [
    {tcp_listen_options, [binary, {packet, raw}, {reuseaddr, true}, {backlog, 128}, {nodelay, true}, {exit_on_close, false}, {keepalive, true} ] },
    {cluster_partition_handling, pause_minority},
    {default_user, <<"guest">>},
    {default_pass, <<"guest">>}
  ]},
  {kernel, [
    {inet_dist_listen_max, 35672},
    {inet_dist_listen_min, 35672}
  ]}
,
  {rabbitmq_management, [
    {listener, [
      {port, 15672}
    ]}
  ]}
].
Version-Release number of selected component (if applicable):
$ rpm -qa | grep tripleo
openstack-tripleo-image-elements-0.9.7-2.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.7-12.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.2-1.el7ost.noarch
openstack-tripleo-heat-templates-0.8.7-12.el7ost.noarch
openstack-tripleo-0.0.7-1.el7ost.noarch
openstack-tripleo-common-0.1.1-1.el7ost.noarch
python-tripleoclient-0.1.1-2.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. grep tcp_listen_options /etc/rabbitmq/rabbitmq.config

Actual results:
2 entries

Expected results:
1 entry

Additional info:
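The grep check in the reproduction step can also be scripted, for example to run it across several controllers. The following is only a minimal sketch: the function name count_option and the shortened sample configs are made up for illustration, and dropping everything after a % is a simplification that would miscount if a % appeared inside a quoted value.

```python
import re


def count_option(config_text: str, option: str) -> int:
    """Count `{option,` tuples in Erlang config text, skipping % comments.

    Hypothetical helper for illustration; it is not part of any RabbitMQ
    or TripleO tooling.
    """
    pattern = re.compile(r"\{\s*" + re.escape(option) + r"\s*,")
    count = 0
    for line in config_text.splitlines():
        # Simplification: treat everything after the first % as a comment.
        code = line.split("%", 1)[0]
        count += len(pattern.findall(code))
    return count


# Shortened stand-ins for the two configs shown in the description.
BROKEN = """\
% This file managed by Puppet
[
  {rabbit, [
    {tcp_listen_options, [binary, {packet, raw}, {exit_on_close, false}] },
    {cluster_partition_handling, pause_minority},
    {tcp_listen_options, [binary, {packet, raw}, {exit_on_close, false}, {keepalive, true}]}
  ]}
].
"""

FIXED = """\
% This file managed by Puppet
[
  {rabbit, [
    {tcp_listen_options, [binary, {packet, raw}, {exit_on_close, false}, {keepalive, true}]},
    {cluster_partition_handling, pause_minority}
  ]}
].
"""

if __name__ == "__main__":
    for name, text in (("broken", BROKEN), ("fixed", FIXED)):
        print(name, count_option(text, "tcp_listen_options"))
```

On a real node the same function could be fed the contents of /etc/rabbitmq/rabbitmq.config; any count other than 1 for tcp_listen_options indicates the problem described here.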
The configuration is not generated as intended, which causes the cluster to take longer to recover.
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
Ping, anyone :) This looks very easy to fix so why keep this open?
Ping, again :) This looks very easy to fix, so let's just fix it :)
So the reason for this bug is the following. In puppet/services/rabbitmq.yaml we have:

  rabbitmq_config_variables:
    tcp_listen_options: '[binary, {packet, raw}, {reuseaddr, true}, {backlog, 128}, {nodelay, true}, {exit_on_close, false}, {keepalive, true}]'

This rabbitmq_config_variables entry simply populates the tcp_listen_options option in rabbitmq.config. The problem is that the Puppet module for RabbitMQ already contains the logic to set tcp_listen_options. Namely, it does the following (templates/rabbitmq.config.erb):

    {tcp_listen_options,
         [binary,
         <%- if @tcp_keepalive -%>
         {keepalive, true},
         <%- end -%>
         {packet, raw},
         {reuseaddr, true},
         <%- if @tcp_backlog -%>
         {backlog, <%= @tcp_backlog %>},
         <%- end -%>
         <%- if @tcp_sndbuf -%>
         {sndbuf, <%= @tcp_sndbuf %>},
         <%- end -%>
         <%- if @tcp_recbuf -%>
         {recbuf, <%= @tcp_recbuf %>},
         <%- end -%>
         {nodelay, true},
         {linger, {true, 0}},
         {exit_on_close, false}]
    },

So if we remove that line from rabbitmq.yaml, we need to verify that the tcp_backlog parameter is set to 128 (to keep {backlog, 128}) and that {linger, {true, 0}} is okay.
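To see what the module would emit on its own once the override is removed, the template's conditionals can be mirrored in a few lines. This is a rough Python stand-in for the ERB snippet above, not the module's code: the function name render_tcp_listen_options is hypothetical, the arguments simply reuse the template's variable names, and the output is joined onto one line rather than reproducing the template's whitespace.

```python
def render_tcp_listen_options(tcp_keepalive=False, tcp_backlog=None,
                              tcp_sndbuf=None, tcp_recbuf=None):
    """Mirror the conditionals of the ERB snippet quoted above.

    Hypothetical illustration only; the real rendering is done by the
    Puppet module's templates/rabbitmq.config.erb.
    """
    opts = ["binary"]
    if tcp_keepalive:
        opts.append("{keepalive, true}")
    opts += ["{packet, raw}", "{reuseaddr, true}"]
    if tcp_backlog:
        opts.append(f"{{backlog, {tcp_backlog}}}")
    if tcp_sndbuf:
        opts.append(f"{{sndbuf, {tcp_sndbuf}}}")
    if tcp_recbuf:
        opts.append(f"{{recbuf, {tcp_recbuf}}}")
    # These three are unconditional in the template.
    opts += ["{nodelay, true}", "{linger, {true, 0}}", "{exit_on_close, false}"]
    return "{tcp_listen_options, [" + ", ".join(opts) + "]}"


if __name__ == "__main__":
    # Assumed parameter values matching the options currently set in rabbitmq.yaml.
    print(render_tcp_listen_options(tcp_keepalive=True, tcp_backlog=128))
```

With tcp_keepalive enabled and tcp_backlog at 128, the single rendered entry carries the same options as the value in rabbitmq.yaml plus {linger, {true, 0}}, which is the one difference that still needs to be checked.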
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1245