Description of problem:
Seems like “slave” redis servers are not connected to the redis “master” that is being started by pacemaker when you deploy with TLS Everywhere.
On an unencrypted Redis cluster, we can see slaves connected to the master:
# /usr/bin/redis-cli -a UdEnQrH6Jfdy6A4W2Rnja7UuZ -s '/var/run/redis/redis.sock' info
verify if there are any slaves. Here is a working example:
We don’t see any slave connection when TLS everywhere is enabled.
On initial deployment, all replicas of the redis resource start correctly in pacemaker, and give 1 Master and 2 Slave (no error). But that is only because no replication has taken place at all.
However, when restarting a Slave, the start operation won’t succeed because the redis resource agent will try to connect to the redis master, and would fail to do:
* redis_start_0 on redis-bundle-2 'unknown error' (1): call=8, status=Timed Out, exitreason='none',
last-rc-change='Mon Nov 27 15:02:01 2017', queued=1ms, exec=200001ms
With following logs from the slave redis server:
96:S 27 Nov 14:24:15.116 # Error condition on socket for SYNC: Connection reset by peer
96:S 27 Nov 14:24:16.116 * Connecting to MASTER overcloud-controller-1:6379
96:S 27 Nov 14:24:16.117 * MASTER <-> SLAVE sync started
96:S 27 Nov 14:24:16.117 * Non blocking connect for SYNC fired the event.
96:S 27 Nov 14:24:16.117 # Error condition on socket for SYNC: Connection reset by peer
96:S 27 Nov 14:24:17.120 * Connecting to MASTER overcloud-controller-1:6379
This is because on the 6379 port of the remote host there is an stunnel process expecting SSL traffic, whereas redis sends unencrypted traffic to it.
Added a suggested Known Issue for the OSP12 release
*** Bug 1538301 has been marked as a duplicate of this bug. ***
$ sudo pcs status | grep redis
GuestOnline: [ galera-bundle-0@controller-0 galera-bundle-1@controller-1 galera-bundle-2@controller-2 rabbitmq-bundle-0@controller-0 rabbitmq-bundle-1@controller-1 rabbitmq-bundle-2@controller-2 redis-bundle-0@controller-0 redis-bundle-1@controller-1 redis-bundle-2@controller-2 ]
Docker container set: redis-bundle [192.168.24.1:8787/rhosp13/openstack-redis:pcmklatest]
redis-bundle-0 (ocf::heartbeat:redis): Slave controller-0
redis-bundle-1 (ocf::heartbeat:redis): Master controller-1
redis-bundle-2 (ocf::heartbeat:redis): Slave controller-2
$ sudo pcs resource restart redis-bundle controller-0
redis-bundle successfully restarted
$ sudo /usr/bin/redis-cli -a NFzhyzTZMUVvNv9BayXxMNKr6 -s '/var/run/redis/redis.sock' info
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.