This bug has been migrated to another issue tracking site. It has been closed here and may no longer be being monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2290861 - rabbit-mq ssl config caused pacemaker rabbitmq resource to fail monitoring and controller node to get fenced
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 17.1 (Wallaby)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z5
Target Release: 17.1
Assignee: Luca Miccini
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-06-07 10:42 UTC by alisci
Modified: 2025-01-31 14:25 UTC (History)

Fixed In Version: puppet-tripleo-14.2.3-17.1.20241216110839.40278e1.el9osttrunk openstack-tripleo-heat-templates-14.3.1-17.1.20241216110839.e7c7ce3.el9osttrunk
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2025-01-31 14:22:20 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker   OSP-32229 0 None None None 2025-01-31 14:22:19 UTC
Red Hat Issue Tracker OSP-33564 0 None None None 2025-01-31 14:25:01 UTC

Description alisci 2024-06-07 10:42:34 UTC
Description of problem:
The customer (CU) experienced several fencing events on the Pacemaker controller nodes because monitoring of the rabbitmq cluster resource failed. The following was logged during the fault:

~~~
2024-05-16 00:14:37.945764+02:00 [error] <0.9405.0> ** Node 'rabbit.domain.com' not responding **
2024-05-16 00:14:37.945764+02:00 [error] <0.9405.0> ** Removing (timedout) connection **
2024-05-16 00:14:37.945764+02:00 [error] <0.9405.0>
2024-05-16 00:14:37.945976+02:00 [notice] <0.9404.0> TLS server: In state connection at tls_connection_1_3.erl:633 generated SERVER ALERT: Fatal - Internal Error
2024-05-16 00:14:37.945976+02:00 [notice] <0.9404.0>  - closed
~~~
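
To check whether other controllers show the same pattern, the two signatures in the excerpt above (a peer node reported as not responding, followed by a fatal TLS server alert) can be searched for in the RabbitMQ log. The following is a minimal Python sketch, not part of the original report: the function name `find_fault_signatures` is hypothetical, and the matched substrings are taken only from the lines quoted above, so other releases may word these messages differently.

```python
import re

# Patterns copied from the log excerpt in this report (assumption: the
# wording is the same on the nodes being checked).
NODE_DOWN = re.compile(r"\*\* Node '(?P<node>[^']+)' not responding \*\*")
TLS_ALERT = re.compile(r"generated SERVER ALERT: Fatal - (?P<reason>.+)$")


def find_fault_signatures(lines):
    """Return (kind, detail, line) tuples for lines matching either signature."""
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        match = NODE_DOWN.search(line)
        if match:
            hits.append(("node_not_responding", match.group("node"), line))
            continue
        match = TLS_ALERT.search(line)
        if match:
            hits.append(("tls_fatal_alert", match.group("reason"), line))
    return hits
```

The function accepts any iterable of strings, so an open log file can be passed directly.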

It turned out that disabling SSL for RabbitMQ by changing rabbitmq-env.conf as follows resolved the issue:

from:
RABBITMQ_CTL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none -ssl_dist_optfile /etc/rabbitmq/ssl-dist.conf -crypto fips_mode false -pa /usr/lib64/erlang/lib/ssl-10.7.3.2/ebin  -proto_dist inet_tls"
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none -ssl_dist_optfile /etc/rabbitmq/ssl-dist.conf -crypto fips_mode false -pa /usr/lib64/erlang/lib/ssl-10.7.3.2/ebin  -proto_dist inet_tls"

to:
RABBITMQ_CTL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none"
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none"
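
The difference between the two variants is that four flags and their values are dropped from both variables: `-ssl_dist_optfile`, `-crypto`, `-pa` and `-proto_dist`. As an illustration only, the Python sketch below reproduces that change on the argument string quoted above. The helper name `strip_flags` is hypothetical, and the parsing assumes that every flag starts with `-` or `+` and that no value does, which holds for this string but not for Erlang arguments in general.

```python
# Flags that the workaround in this report removed from both variables.
REMOVED_FLAGS = {"-ssl_dist_optfile", "-crypto", "-pa", "-proto_dist"}


def strip_flags(erl_args, removed=REMOVED_FLAGS):
    """Drop the given flags, together with their values, from an argument string."""
    groups = []
    for token in erl_args.split():
        # A token starting with '-' or '+' opens a new flag; any other token
        # is treated as a value of the most recent flag.
        if token[0] in "+-" or not groups:
            groups.append([token])
        else:
            groups[-1].append(token)
    kept = [group for group in groups if group[0] not in removed]
    return " ".join(" ".join(group) for group in kept)


before = (
    "+sbwt none +sbwtdcpu none +sbwtdio none "
    "-ssl_dist_optfile /etc/rabbitmq/ssl-dist.conf -crypto fips_mode false "
    "-pa /usr/lib64/erlang/lib/ssl-10.7.3.2/ebin  -proto_dist inet_tls"
)
print(strip_flags(before))  # prints: +sbwt none +sbwtdcpu none +sbwtdio none
```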


Version-Release number of selected component (if applicable):
OSP 17.1.2

Comment 36 Luca Miccini 2024-12-16 11:04:10 UTC
Since we haven't been able to reproduce this in our lab (and it is unlikely that we will be able to update or rebase RabbitMQ in OSP 17.1 in the future), we decided to take the safest path and enforce TLSv1.2 as the default for RabbitMQ.

The value is set to 'tlsv1.2' via the hieradata key 'rabbitmq::ssl_versions' and can be customized like this:

  ExtraConfig:
   rabbitmq::ssl_versions: ['XXX', 'YYY']
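
For illustration, the Python sketch below renders such an override as environment file text. It rests on assumptions that are not stated in this comment: that `ExtraConfig` is placed under the usual `parameter_defaults` key of a heat environment file, and that the version strings use the same form as the 'tlsv1.2' default mentioned above. The function name `render_ssl_versions_override` is hypothetical.

```python
def render_ssl_versions_override(versions):
    """Render a heat environment snippet overriding rabbitmq::ssl_versions."""
    if not versions:
        raise ValueError("at least one TLS version is required")
    quoted = ", ".join("'{}'".format(version) for version in versions)
    # Assumption: ExtraConfig is nested under parameter_defaults.
    return (
        "parameter_defaults:\n"
        "  ExtraConfig:\n"
        "    rabbitmq::ssl_versions: [{}]\n".format(quoted)
    )


print(render_ssl_versions_override(["tlsv1.2"]))
```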

