Bug 1956235 - rabbitmq-server-3.7.28 has broken chown commands for the erlang cookie
Summary: rabbitmq-server-3.7.28 has broken chown commands for the erlang cookie
Keywords:
Status: ON_QA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rabbitmq-server
Version: 13.0 (Queens)
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: z16
Target Release: 13.0 (Queens)
Assignee: Peter Lemenkov
QA Contact: pkomarov
URL:
Whiteboard:
Duplicates: 1956269
Depends On:
Blocks:
Reported: 2021-05-03 09:12 UTC by Luca Miccini
Modified: 2021-05-04 12:35 UTC (History)

Fixed In Version: rabbitmq-server-3.7.28-2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Description Luca Miccini 2021-05-03 09:12:43 UTC
Description of problem:

Undercloud setup fails because rabbitmq-server cannot start.

The error is:

2021-05-03 08:27:49,929 INFO: Error: /Stage[main]/Rabbitmq::Service/Service[rabbitmq-server]/ensure: change from stopped to running failed: Systemd start for rabbitmq-server failed!
2021-05-03 08:27:49,930 INFO: journalctl log for rabbitmq-server:
2021-05-03 08:27:49,930 INFO: -- Logs begin at Mon 2021-05-03 08:10:16 UTC, end at Mon 2021-05-03 08:27:49 UTC. --
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local systemd[1]: Starting RabbitMQ broker...
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local rabbitmq-server[18624]: 2021-05-03 08:27:49.232723
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local rabbitmq-server[18624]: args: []
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local rabbitmq-server[18624]: format: "Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces"
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local rabbitmq-server[18624]: label: {error_logger,error_msg}


It looks like rabbitmq cannot access the cookie because it was created with the wrong uid (root instead of rabbitmq):

[root@undercloud-0 ~]# ll /var/lib/rabbitmq/ -lA
total 4
-r--------. 1 root     root     20 May  3 00:00 .erlang.cookie

compose: 2021-04-30.1
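
The ownership check behind the listing above can be sketched as a small shell helper. This is only an illustration: the function name `cookie_owner_ok` is hypothetical, and it assumes GNU `stat` (for `-c '%U:%G'`), which is what RHEL ships.

```shell
# Hypothetical helper: succeed only if FILE is owned by the expected USER:GROUP.
# Assumes GNU coreutils stat.
cookie_owner_ok() {
    file=$1
    expected=$2
    # %U:%G prints the owning user and group names, e.g. "root:root".
    actual=$(stat -c '%U:%G' "$file") || return 2
    [ "$actual" = "$expected" ]
}
```

On an affected host, `cookie_owner_ok /var/lib/rabbitmq/.erlang.cookie rabbitmq:rabbitmq` would be expected to fail, since the listing shows the file owned by root:root.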

Comment 1 Michele Baldessari 2021-05-03 11:35:53 UTC
We found this issue in the latest OSP 13 compose, but it also affects 16.2 composes, where we have rabbitmq-server-3.7.28-1.el8ost.1.x86_64. The problem is that the @RABBITMQ_USER@ and @RABBITMQ_GROUP@ placeholders were not expanded in some commands:
grep -ir '@RABBITMQ_USER@:@RABBITMQ_GROUP@' /etc/ /usr
/usr/sbin/rabbitmqctl:        chown @RABBITMQ_USER@:@RABBITMQ_GROUP@ "$_erlang_cookie"
/usr/sbin/rabbitmq-plugins:        chown @RABBITMQ_USER@:@RABBITMQ_GROUP@ "$_erlang_cookie"
/usr/sbin/rabbitmq-server:        chown @RABBITMQ_USER@:@RABBITMQ_GROUP@ "$_erlang_cookie"


rabbitmq-server-3.7.23-2.el8ost.x86_64 is not affected; the commands are correct there:
chown rabbitmq:rabbitmq "$_erlang_cookie"
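
The grep above can be generalized into a check for any leftover `@NAME@` placeholder, not only the two seen here. A minimal sketch follows. The function name `find_unexpanded` and the placeholder pattern are assumptions for illustration, not part of the package.

```shell
# Hypothetical helper: list files under the given paths that still contain
# an unexpanded @UPPER_CASE@ build-time placeholder.
find_unexpanded() {
    # -r: recurse, -l: print file names only, -E: extended regex.
    # grep exits non-zero when nothing matches, i.e. when all files are clean.
    grep -rlE '@[A-Z][A-Z0-9_]*@' "$@"
}
```

For example, `find_unexpanded /usr/sbin/rabbitmq*` should print the three scripts listed above on an affected install and nothing on a fixed one.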

Comment 2 Michele Baldessari 2021-05-03 11:45:46 UTC
*** Bug 1956269 has been marked as a duplicate of this bug. ***

Comment 3 Michele Baldessari 2021-05-03 11:48:33 UTC
Specifically, the problem is that we seem to have dropped http://pkgs.devel.redhat.com/cgit/rpms/rabbitmq-server/commit/?h=rhos-16.1-rhel-8-trunk&id=260d939d9ed557bd6bef54aee289c99be2e2c9ee in 3.7.28, whereas we had it in 3.7.23.

Note also that:
This also affects 16.2, where we have the .el8 build of rabbitmq-server. It only fails on the OSP 13 undercloud install because of another (sigh) bug in the undercloud installation: the cookie set by tripleoclient ends up in a hiera variable (rabbit_cookie) that is never applied or used. Without that second bug, we would probably not have noticed this issue for a long time.
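
For readers unfamiliar with this kind of placeholder, the following sketch shows what expanding it amounts to. It is not a reproduction of the dropped commit, whose contents are not quoted in this bug. The function name `expand_placeholders` is hypothetical, and GNU `sed -i` is assumed.

```shell
# Hypothetical helper: replace the two placeholders in the given files.
# Usage: expand_placeholders USER GROUP FILE...
# Assumes GNU sed (in-place editing with -i) and that USER and GROUP
# contain no '/' characters.
expand_placeholders() {
    user=$1
    group=$2
    shift 2
    sed -i \
        -e "s/@RABBITMQ_USER@/$user/g" \
        -e "s/@RABBITMQ_GROUP@/$group/g" \
        "$@"
}
```

Run as `expand_placeholders rabbitmq rabbitmq FILE`, this turns the broken chown line from comment 1 into the `chown rabbitmq:rabbitmq "$_erlang_cookie"` form seen in 3.7.23.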

Comment 4 Peter Lemenkov 2021-05-03 15:59:54 UTC
Let me look. Should be easy to fix.

Comment 5 Michele Baldessari 2021-05-03 16:36:53 UTC
(In reply to Peter Lemenkov from comment #4)
> Let me look. Should be easy to fix.

We fixed it by re-including the patches mentioned in c#3. We tested it and we should be good now.
The scratch build we used is here: https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1055853

Comment 6 Michele Baldessari 2021-05-03 17:45:37 UTC
So the BZ has been fixed via rabbitmq-server-3.7.28-2.el7ost (we tested both UC and OC deployment on OSP 13). The bug referenced in the commit for this build, http://pkgs.devel.redhat.com/cgit/rpms/rabbitmq-server/commit/?h=rhos-13.0-rhel-7&id=7054b6f5331377f38c734d30969405cb261b596e, was the original rebase bug, since that was the real root cause and it already had all the acks.

