Bug 1848705

Summary: Defaults from SERVER_ERL_ARGS are getting overwritten by tripleo config
Product: Red Hat OpenStack
Reporter: John Eckersberg <jeckersb>
Component: openstack-tripleo-heat-templates
Assignee: Peter Lemenkov <plemenko>
Status: CLOSED CURRENTRELEASE
QA Contact: pkomarov
Severity: medium
Priority: medium
Version: 16.2 (Train)
CC: apevec, jeckersb, jschluet, lhh, lmiccini, mburns, michele, pkomarov
Target Milestone: beta
Keywords: TestOnly, Triaged
Target Release: 16.2 (Train on RHEL 8.4)
Hardware: x86_64
OS: Linux
Fixed In Version: puppet-rabbitmq-10.1.2-2.20210323012953.8b9b006.el8ost.1 openstack-tripleo-heat-templates-11.4.1-2.20210323012110.c3396e2.el8ost.1 puppet-tripleo-11.5.1-2.20210323024955.4d3d23e.el8ost.1
Last Closed: 2021-10-14 15:55:48 UTC
Type: Bug

Description John Eckersberg 2020-06-18 19:01:07 UTC
Bug forked off from https://bugzilla.redhat.com/show_bug.cgi?id=1779407#c62:

The Erlang default for +zdbbl is 1024 (value in kB), so 1 MB.

RabbitMQ's environment script overrides this default to 128000, or ~128 MB.  But it does so by setting it in SERVER_ERL_ARGS, which tripleo overrides through RABBITMQ_SERVER_ERL_ARGS in rabbitmq-env.conf.  Since the value tripleo sets does not include +zdbbl, the setting falls back to the erl default of 1 MB.

I'll file a new bug about this whole situation.  We need to revisit how we're handling the environment variables.

Comment 1 Michele Baldessari 2020-06-23 10:23:28 UTC
(In reply to John Eckersberg from comment #0)

Yo John,

so in /usr/lib/rabbitmq/lib/rabbitmq_server-3.6.15/sbin/rabbitmq-env I see:
"""
DEFAULT_SCHEDULER_BIND_TYPE="db"
[ "x" = "x$RABBITMQ_SCHEDULER_BIND_TYPE" ] && RABBITMQ_SCHEDULER_BIND_TYPE=${DEFAULT_SCHEDULER_BIND_TYPE}

DEFAULT_DISTRIBUTION_BUFFER_SIZE=128000
[ "x" = "x$RABBITMQ_DISTRIBUTION_BUFFER_SIZE" ] && RABBITMQ_DISTRIBUTION_BUFFER_SIZE=${DEFAULT_DISTRIBUTION_BUFFER_SIZE}

## Common defaults
SERVER_ERL_ARGS="+P 1048576 +t 5000000 +stbt $RABBITMQ_SCHEDULER_BIND_TYPE +zdbbl $RABBITMQ_DISTRIBUTION_BUFFER_SIZE"
"""

I see that we should probably use server_additional_erl_args for our customizations instead, since that does not seem to override anything. Right now in THT we do the following in /usr/share/openstack-tripleo-heat-templates/puppet/services/rabbitmq.yaml:
rabbitmq_environment:
  RABBITMQ_SERVER_ERL_ARGS: '"+K true +P 1048576 -kernel inet_default_connect_options [{nodelay,true}]"'
  RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS: {get_param: RabbitAdditionalErlArgs}
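
The effect of that override can be reproduced with a small sketch. It assumes rabbitmq-env falls back to SERVER_ERL_ARGS only when RABBITMQ_SERVER_ERL_ARGS is empty, using the same [ "x" = "x$VAR" ] idiom as the defaults quoted above. The function name effective_server_erl_args is made up for illustration, and this is a simplified model rather than the actual script.

```shell
#!/bin/sh
# Simplified model of the default handling in rabbitmq-env.
# ASSUMPTION: the fallback for RABBITMQ_SERVER_ERL_ARGS uses the same idiom
# as the defaults quoted above. effective_server_erl_args is a hypothetical
# helper, not part of RabbitMQ.
# The body runs in a subshell so the variables do not leak to the caller.
effective_server_erl_args() (
    # $1: value of RABBITMQ_SERVER_ERL_ARGS from rabbitmq-env.conf (may be empty)
    RABBITMQ_SERVER_ERL_ARGS=$1
    RABBITMQ_SCHEDULER_BIND_TYPE="db"
    RABBITMQ_DISTRIBUTION_BUFFER_SIZE=128000
    SERVER_ERL_ARGS="+P 1048576 +t 5000000 +stbt $RABBITMQ_SCHEDULER_BIND_TYPE +zdbbl $RABBITMQ_DISTRIBUTION_BUFFER_SIZE"
    # The defaults are used only when nothing was configured.
    [ "x" = "x$RABBITMQ_SERVER_ERL_ARGS" ] && RABBITMQ_SERVER_ERL_ARGS=${SERVER_ERL_ARGS}
    printf '%s\n' "$RABBITMQ_SERVER_ERL_ARGS"
)

# Nothing configured: the defaults, including +zdbbl 128000, are kept.
effective_server_erl_args ""
# The THT value from above: the defaults are replaced, so +zdbbl is gone.
effective_server_erl_args "+K true +P 1048576 -kernel inet_default_connect_options [{nodelay,true}]"
```

Under that assumption, the second call prints only the THT string, which is why the node ends up on the erl default of 1 MB for +zdbbl.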

So initially I thought I'd go for something like below:
--- /usr/share/openstack-tripleo-heat-templates/puppet/services/rabbitmq.yaml.orig      2020-06-23 03:42:12.720890203 -0400
+++ /usr/share/openstack-tripleo-heat-templates/puppet/services/rabbitmq.yaml   2020-06-23 03:45:42.963141088 -0400
@@ -113,8 +113,12 @@
               NODE_PORT: ''
               NODE_IP_ADDRESS: ''
               RABBITMQ_NODENAME: "rabbit@%{::hostname}"
-              RABBITMQ_SERVER_ERL_ARGS: '"+K true +P 1048576 -kernel inet_default_connect_options [{nodelay,true}]"'
-              RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS: {get_param: RabbitAdditionalErlArgs}
+              RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS:
+                str_replace:
+                  template:
+                    '"+K true +P 1048576 -kernel inet_default_connect_options [{nodelay,true}] $ADDITIONALERLARGS"'
+                  params:
+                    $ADDITIONALERLARGS: {get_param: RabbitAdditionalErlArgs}
               'export ERL_EPMD_ADDRESS': "%{hiera('rabbitmq::interface')}"
             rabbitmq_kernel_variables:
               inet_dist_listen_min: '25672'

But that won't work because puppet-tripleo overrides additional_erl_args when we use tls-everywhere here:
https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/base/rabbitmq.pp#L156

I'll see if I need to tweak puppet-tripleo as well. Failing that, we can always put all the current defaults inside RABBITMQ_SERVER_ERL_ARGS. That is just a bit annoying in case they change in the future, but it is still quite doable I guess.

Comment 2 Michele Baldessari 2020-06-23 10:59:38 UTC
So we will have to go with just copying the defaults and hardcoding them in RABBITMQ_SERVER_ERL_ARGS. This is because if you configure ipv6 or tls, puppet-rabbitmq will just forcefully set RABBITMQ_SERVER_ERL_ARGS itself: https://github.com/voxpupuli/puppet-rabbitmq/blob/master/manifests/config.pp#L128-L139

So even if we only used RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS, puppet-rabbitmq would populate RABBITMQ_SERVER_ERL_ARGS in the ipv6 and/or tls case.

Comment 5 John Eckersberg 2020-06-23 18:41:55 UTC
The RabbitMQ docs explicitly say we really shouldn't set RABBITMQ_SERVER_ERL_ARGS.

From https://www.rabbitmq.com/configure.html#supported-environment-variables, under RABBITMQ_SERVER_ERL_ARGS:

"Standard parameters for the erl command used when invoking the RabbitMQ Server. This should be overridden for debugging purposes only. Overriding this variable replaces the default value. "

So... first thing is don't set RABBITMQ_SERVER_ERL_ARGS in puppet-rabbitmq when ipv6/tls are enabled, instead use RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS:

https://github.com/voxpupuli/puppet-rabbitmq/pull/841

Comment 6 John Eckersberg 2020-06-23 19:05:01 UTC
Next, we don't need to set RABBITMQ_SERVER_ERL_ARGS at all in tripleo-heat-templates!

https://review.opendev.org/#/c/737604/

I think that plus the previous comment are enough so we never touch the rmq defaults.

Comment 7 Michele Baldessari 2020-06-24 10:03:18 UTC
So I preemptively backported a puppet-tripleo fix, https://review.opendev.org/#/c/737716/ (queens) and https://review.opendev.org/#/c/737715/ (stein), which is required in case we need/want to rebase puppet-rabbitmq on older releases to a version that includes eck's https://github.com/voxpupuli/puppet-rabbitmq/pull/841.

I added https://review.opendev.org/737733 for puppet-tripleo so that we also fix the sbwt none with tls-e and put a depends-on that in https://review.opendev.org/#/c/737604/

Comment 8 Michele Baldessari 2020-06-24 10:08:42 UTC
So a test run on OSP13 TLS-E with the following:
- puppet-rabbitmq pointing to git master
- puppet-tripleo with https://review.opendev.org/737733 and https://review.opendev.org/#/c/737716/ applied
- tht with https://review.opendev.org/#/c/737604/ applied

Gave me the following:
[root@messaging-0 ~]# more /var/lib/config-data/puppet-generated/rabbitmq/etc/rabbitmq/rabbitmq-env.conf 
NODE_IP_ADDRESS=
NODE_PORT=
RABBITMQ_CTL_ERL_ARGS="-ssl_dist_opt server_certfile /etc/pki/tls/certs/rabbitmq.crt -ssl_dist_opt server_keyfile /etc/pki/tls/private/rabbitmq.key -ssl_dist_opt server_ciphers ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:AES256-GCM-SHA384:AES256-SHA256:AES128-GCM-SHA256:AES128-SHA256:DHE-DSS-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA256:DHE-DSS-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-SHA256:DHE-DSS-AES128-SHA256 -ssl_dist_opt server_secure_renegotiate true -ssl_dist_opt client_secure_renegotiate true +sbwt none -pa /usr/lib64/erlang/lib/ssl-7.3.3.2/ebin  -proto_dist inet_tls"
RABBITMQ_NODENAME=rabbit@messaging-0
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="-ssl_dist_opt server_certfile /etc/pki/tls/certs/rabbitmq.crt -ssl_dist_opt server_keyfile /etc/pki/tls/private/rabbitmq.key -ssl_dist_opt server_ciphers ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:AES256-GCM-SHA384:AES256-SHA256:AES128-GCM-SHA256:AES128-SHA256:DHE-DSS-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA256:DHE-DSS-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-SHA256:DHE-DSS-AES128-SHA256 -ssl_dist_opt server_secure_renegotiate true -ssl_dist_opt client_secure_renegotiate true +sbwt none -pa /usr/lib64/erlang/lib/ssl-7.3.3.2/ebin  -proto_dist inet_tls"
export ERL_EPMD_ADDRESS=172.17.1.147
export ERL_INETRC=/etc/rabbitmq/inetrc


root      715122  0.0  0.0  40700  1444 ?        S    09:38   0:00                  \_ /sbin/runuser -u rabbitmq -- /usr/lib/rabbitmq/bin/rabbitmq-server
42439     715139  0.0  0.0  11692  1560 ?        S    09:38   0:00                      \_ /bin/sh /usr/lib/rabbitmq/bin/rabbitmq-server
42439     715351  6.7  4.9 1517580 395688 ?      Sl   09:38   1:56                          \_ /usr/lib64/erlang/erts-7.3.1.6/bin/beam.smp -W w -A 64 -P 1048576 -t 5000000 -stbt db -zdbbl 128000 -K true -sbwt none -B i -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib/rabbitmq/lib/rabbitmq_server-3.6.15/ebin -noshell -noinput -s rabbit boot -sname rabbit@messaging-0 -boot start_sasl -config /etc/rabbitmq/rabbitmq -kernel inet_default_connect_options [{nodelay,true}] -ssl_dist_opt server_certfile /etc/pki/tls/certs/rabbitmq.crt -ssl_dist_opt server_keyfile /etc/pki/tls/private/rabbitmq.key -ssl_dist_opt server_ciphers ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:AES256-GCM-SHA384:AES256-SHA256:AES128-GCM-SHA256:AES128-SHA256:DHE-DSS-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA256:DHE-DSS-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-SHA256:DHE-DSS-AES128-SHA256 -ssl_dist_opt server_secure_renegotiate true -ssl_dist_opt client_secure_renegotiate true -pa /usr/lib64/erlang/lib/ssl-7.3.3.2/ebin -proto_dist inet_tls -sasl errlog_type error -sasl sasl_error_logger false -rabbit error_logger {file,"/var/log/rabbitmq/rabbit"} -rabbit sasl_error_logger {file,"/var/log/rabbitmq/rabbit"} -rabbit enabled_plugins_file "/etc/rabbitmq/enabled_plugins" -rabbit plugins_dir "/usr/lib/rabbitmq/plugins:/usr/lib/rabbitmq/lib/rabbitmq_server-3.6.15/plugins" -rabbit plugins_expand_dir "/var/lib/rabbitmq/mnesia/rabbit@messaging-0-plugins-expand" -os_mon start_cpu_sup false -os_mon start_disksup false -os_mon start_memsup false -mnesia dir "/var/lib/rabbitmq/mnesia/rabbit@messaging-0"

Which looks fairly okay to me so far.
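
For repeat checks on other nodes, the "no RABBITMQ_SERVER_ERL_ARGS in the generated file" property could be tested with something like the sketch below. The helper name overrides_server_erl_args is hypothetical, and the pattern assumes the variable may appear with or without the RABBITMQ_ prefix and with an optional leading export, since the file above mixes those styles for other variables.

```shell
#!/bin/sh
# Sketch: succeed only if the env file content on stdin assigns
# (RABBITMQ_)SERVER_ERL_ARGS. overrides_server_erl_args is a hypothetical
# helper. RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS does not match the pattern.
overrides_server_erl_args() {
    grep -Eq '^(export[[:space:]]+)?(RABBITMQ_)?SERVER_ERL_ARGS='
}

# Example with inline content shaped like the generated file above.
if printf '%s\n' 'NODE_PORT=' 'RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none"' \
        | overrides_server_erl_args; then
    echo "defaults are being replaced"
else
    echo "defaults untouched"
fi
```

On a node, the function would read the real file instead, for example with stdin redirected from /var/lib/config-data/puppet-generated/rabbitmq/etc/rabbitmq/rabbitmq-env.conf.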

John, do you know how to verify that the second +sbwt none indeed "beats" the default -sbwt none in the above erlang process?

Comment 9 John Eckersberg 2020-06-24 13:28:59 UTC
I think the +sbwt vs -sbwt thing doesn't really matter and you're seeing the same argument.  Erlang used to use '+' (and still does) for a bunch of their CLI option flags.  But I think somewhere in the last 20 years they realized that literally every CLI app in the world uses '-'.  So the erl command will accept it either way.  I believe what you're seeing above is that when you pass +sbwt to erl, there are some intermediate steps (erl is a shell script, which then calls erlexec, which I believe finally calls beam.smp...) and the +sbwt flag is parsed by one piece and then re-emitted as -sbwt in the end to beam.smp.

And yes, those environments and the cmdline look more reasonable to me!
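
If the '+' and '-' spellings really are the same argument, a check on the running process has to accept both. A minimal sketch, where has_erl_flag is a hypothetical helper and the sample command line is abbreviated from the beam.smp listing in comment 8:

```shell
#!/bin/sh
# Sketch: check whether a command line carries an emulator flag with a given
# value, accepting either the "+flag" spelling used in the env file or the
# "-flag" spelling seen in the beam.smp process listing.
# has_erl_flag is a hypothetical helper.
has_erl_flag() {
    # $1: full command line, $2: flag name without prefix, $3: expected value
    case " $1 " in
        *" +$2 $3 "*|*" -$2 $3 "*) return 0 ;;
        *) return 1 ;;
    esac
}

cmdline="beam.smp -W w -A 64 -P 1048576 -t 5000000 -stbt db -zdbbl 128000 -K true -sbwt none"
has_erl_flag "$cmdline" zdbbl 128000 && echo "distribution buffer default preserved"
has_erl_flag "$cmdline" sbwt none && echo "sbwt none present"
```

On a live Linux node the command line could be taken from /proc/<pid>/cmdline with the NUL separators translated to spaces. That is an assumption about the environment, not something shown in this bug.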

RFE: at some point we need to switch to the "new-style" config and move all of that ssl stuff into the config files.  But that's a bigger project for another day.

Comment 10 Lon Hohberger 2020-10-29 10:51:33 UTC
According to our records, this should be resolved by puppet-tripleo-11.5.0-1.20200914161840.f716ef5.el8ost.  This build is available now.

Comment 11 Lon Hohberger 2020-10-29 10:51:37 UTC
According to our records, this should be resolved by openstack-tripleo-heat-templates-11.3.2-1.20200914170156.el8ost.  This build is available now.

Comment 18 Thierry Vignaud 2021-10-14 15:55:48 UTC
According to our records, this should be resolved by puppet-rabbitmq-10.1.2-2.20210528110135.8b9b006.el8ost.2.  This build is available now.

Comment 19 Thierry Vignaud 2021-10-14 15:55:52 UTC
According to our records, this should be resolved by openstack-tripleo-heat-templates-11.5.1-2.20210603174823.el8ost.9.  This build is available now.

Comment 20 Thierry Vignaud 2021-10-14 15:55:54 UTC
According to our records, this should be resolved by puppet-tripleo-11.6.2-2.20210603175725.el8ost.2.  This build is available now.