Bug 1956235

Summary: rabbitmq-server-3.7.28 has broken chown commands for the erlang cookie
Product: Red Hat OpenStack Reporter: Luca Miccini <lmiccini>
Component: rabbitmq-server Assignee: Peter Lemenkov <plemenko>
Status: CLOSED ERRATA QA Contact: pkomarov
Severity: urgent Docs Contact:
Priority: urgent    
Version: 13.0 (Queens) CC: ahyder, apevec, jeckersb, lhh, mburns, michele, plemenko, psedlak, sathlang
Target Milestone: z16 Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: rabbitmq-server-3.7.28-2.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-06-16 10:59:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Luca Miccini 2021-05-03 09:12:43 UTC
Description of problem:

Undercloud setup fails because rabbitmq-server cannot start.

Error is:

2021-05-03 08:27:49,929 INFO: Error: /Stage[main]/Rabbitmq::Service/Service[rabbitmq-server]/ensure: change from stopped to running failed: Systemd start for rabbitmq-server failed!
2021-05-03 08:27:49,930 INFO: journalctl log for rabbitmq-server:
2021-05-03 08:27:49,930 INFO: -- Logs begin at Mon 2021-05-03 08:10:16 UTC, end at Mon 2021-05-03 08:27:49 UTC. --
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local systemd[1]: Starting RabbitMQ broker...
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local rabbitmq-server[18624]: 2021-05-03 08:27:49.232723
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local rabbitmq-server[18624]: args: []
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local rabbitmq-server[18624]: format: "Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces"
2021-05-03 08:27:49,930 INFO: May 03 08:27:49 undercloud-0.redhat.local rabbitmq-server[18624]: label: {error_logger,error_msg}


It looks like rabbitmq cannot read the cookie because it was created with the wrong owner (root instead of rabbitmq)?

[root@undercloud-0 ~]# ll /var/lib/rabbitmq/ -lA
total 4
-r--------. 1 root     root     20 May  3 00:00 .erlang.cookie
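
The ownership problem visible in the listing above can be checked with a small helper. This is only a sketch: check_owner is a hypothetical function name, and it assumes GNU coreutils stat (the -c option), as shipped on RHEL. The expected owner rabbitmq:rabbitmq is taken from the correct chown command quoted later in this bug.

```shell
# Hypothetical helper (not part of rabbitmq-server): compare the owner of a
# file with an expected "user:group" pair. Assumes GNU coreutils `stat -c`.
check_owner() {
    file=$1
    expected=$2
    # %U:%G prints the owning user and group names
    actual=$(stat -c '%U:%G' "$file") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file is owned by $actual"
    else
        echo "MISMATCH: $file is owned by $actual, expected $expected"
        return 1
    fi
}

# Example invocation on an affected undercloud (path from the error above):
# check_owner /var/lib/rabbitmq/.erlang.cookie rabbitmq:rabbitmq
```

On the system shown above this would report a mismatch, since the cookie is owned by root:root.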

compose: 2021-04-30.1

Comment 1 Michele Baldessari 2021-05-03 11:35:53 UTC
We found this issue in the latest OSP 13 compose, but it also affects 16.2 composes, where we have rabbitmq-server-3.7.28-1.el8ost.1.x86_64. The problem is that the @RABBITMQ_USER@ and @RABBITMQ_GROUP@ placeholders were not expanded in the chown commands of the wrapper scripts:
grep -ir '@RABBITMQ_USER@:@RABBITMQ_GROUP@' /etc/ /usr
/usr/sbin/rabbitmqctl:        chown @RABBITMQ_USER@:@RABBITMQ_GROUP@ "$_erlang_cookie"
/usr/sbin/rabbitmq-plugins:        chown @RABBITMQ_USER@:@RABBITMQ_GROUP@ "$_erlang_cookie"
/usr/sbin/rabbitmq-server:        chown @RABBITMQ_USER@:@RABBITMQ_GROUP@ "$_erlang_cookie"


rabbitmq-server-3.7.23-2.el8ost.x86_64 is not affected and has the correct command:
chown rabbitmq:rabbitmq "$_erlang_cookie"
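
As an illustration of the difference between the two builds, the sketch below detects leftover @NAME@ placeholders and substitutes the two seen here. Both function names are hypothetical, GNU grep and GNU sed (-i) are assumed, and this is not the actual patch carried in the package; it only shows the kind of substitution the fixed scripts reflect.

```shell
# Hypothetical helper: print lines that still contain an unexpanded
# @NAME@ placeholder. Returns non-zero when nothing is found.
find_unexpanded() {
    grep -nE '@[A-Z_]+@' "$@"
}

# Hypothetical helper: replace the two placeholders in place.
# Assumes GNU sed (-i without a suffix argument).
expand_placeholders() {
    user=$1
    group=$2
    shift 2
    sed -i \
        -e "s/@RABBITMQ_USER@/$user/g" \
        -e "s/@RABBITMQ_GROUP@/$group/g" \
        "$@"
}

# Example invocation against a copy of a wrapper script (the path is an assumption):
# find_unexpanded ./rabbitmq-server && expand_placeholders rabbitmq rabbitmq ./rabbitmq-server
```

Running find_unexpanded against the installed scripts is a quick way to tell an affected build from a fixed one; the proper fix remains the updated package rather than editing files under /usr/sbin.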

Comment 2 Michele Baldessari 2021-05-03 11:45:46 UTC
*** Bug 1956269 has been marked as a duplicate of this bug. ***

Comment 3 Michele Baldessari 2021-05-03 11:48:33 UTC
Specifically, the problem is that we seem to have dropped http://pkgs.devel.redhat.com/cgit/rpms/rabbitmq-server/commit/?h=rhos-16.1-rhel-8-trunk&id=260d939d9ed557bd6bef54aee289c99be2e2c9ee in 3.7.28, whereas 3.7.23 still had it.

Note also that:
This also affects 16.2, where we have the .el8 build of rabbitmq-server. The reason it only fails on the OSP 13 UC install is another (sigh) bug in the undercloud installation, namely that the cookie set by tripleoclient ends up in a hiera variable (rabbit_cookie) that never gets applied or used. Without that bug, we would likely not have noticed this issue for much longer.

Comment 4 Peter Lemenkov 2021-05-03 15:59:54 UTC
Let me look. Should be easy to fix.

Comment 5 Michele Baldessari 2021-05-03 16:36:53 UTC
(In reply to Peter Lemenkov from comment #4)
> Let me look. Should be easy to fix.

We fixed it by re-including the patch mentioned in c#3. We tested it and we should be good now.
The scratch build we used is here: https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1055853

Comment 6 Michele Baldessari 2021-05-03 17:45:37 UTC
So the BZ has been fixed via rabbitmq-server-3.7.28-2.el7ost (we tested both UC and OC deployments on OSP 13). The bug referenced in the commit for this build, http://pkgs.devel.redhat.com/cgit/rpms/rabbitmq-server/commit/?h=rhos-13.0-rhel-7&id=7054b6f5331377f38c734d30969405cb261b596e, was the original rebase bug, since that was really the root cause and already had all the acks.

Comment 18 errata-xmlrpc 2021-06-16 10:59:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 13.0 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2385