Bug 1703448 - redis-bundle pcs resource doesn't start after controller reboot
Summary: redis-bundle pcs resource doesn't start after controller reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: beta
Target Release: 15.0 (Stein)
Assignee: Emilien Macchi
QA Contact: pkomarov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-26 12:54 UTC by Marius Cornea
Modified: 2019-09-26 10:49 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-10.5.1-0.20190507230544.ff461d5.el8ost puppet-tripleo-10.4.2-0.20190502220347.02cd12e.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-21 11:21:38 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1826554 | 0 | None | None | None | 2019-04-26 15:24:02 UTC
OpenStack gerrit | 656641 | 0 | 'None' | MERGED | redis HA: allow SELinux relabel for /var/run/redis | 2021-02-22 00:04:00 UTC
OpenStack gerrit | 656642 | 0 | 'None' | MERGED | ensure /var/run/redis is present upon reboot | 2021-02-22 00:03:59 UTC
Red Hat Product Errata | RHEA-2019:2811 | 0 | None | None | None | 2019-09-21 11:21:53 UTC

Description Marius Cornea 2019-04-26 12:54:22 UTC
Description of problem:

redis-bundle pcs resource doesn't start after controller reboot

Version-Release number of selected component (if applicable):
15  -p RHOS_TRUNK-15.0-RHEL-8-20190423.n.1

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP15 overcloud
2. Reboot one of the three controllers
3. Check pcs status

Actual results:

[root@controller-2 heat-admin]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-0 (version 2.0.1-4.el8-0eb7991564) - partition with quorum
Last updated: Fri Apr 26 12:49:42 2019
Last change: Fri Apr 26 02:47:53 2019 by root via cibadmin on controller-0

15 nodes configured
46 resources configured

Online: [ controller-0 controller-1 controller-2 ]
GuestOnline: [ galera-bundle-0@controller-0 galera-bundle-1@controller-1 galera-bundle-2@controller-2 ovn-dbs-bundle-0@controller-0 ovn-dbs-bundle-1@controller-1 ovn-dbs-bundle-2@controller-2 rabbitmq-bundle-0@controller-0 rabbitmq-bundle-1@controller-1 rabbitmq-bundle-2@controller-2 redis-bundle-0@controller-0 redis-bundle-1@controller-1 ]

Full list of resources:

 podman container set: galera-bundle [192.168.24.1:8787/rhosp15/openstack-mariadb:pcmklatest]
   galera-bundle-0	(ocf::heartbeat:galera):	Master controller-0
   galera-bundle-1	(ocf::heartbeat:galera):	Master controller-1
   galera-bundle-2	(ocf::heartbeat:galera):	Master controller-2
 podman container set: rabbitmq-bundle [192.168.24.1:8787/rhosp15/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0	(ocf::heartbeat:rabbitmq-cluster):	Started controller-0
   rabbitmq-bundle-1	(ocf::heartbeat:rabbitmq-cluster):	Started controller-1
   rabbitmq-bundle-2	(ocf::heartbeat:rabbitmq-cluster):	Started controller-2
 podman container set: redis-bundle [192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest]
   redis-bundle-0	(ocf::heartbeat:redis):	Master controller-0
   redis-bundle-1	(ocf::heartbeat:redis):	Slave controller-1
   redis-bundle-2	(ocf::heartbeat:redis):	Stopped
 ip-192.168.24.17	(ocf::heartbeat:IPaddr2):	Started controller-0
 ip-10.0.0.103	(ocf::heartbeat:IPaddr2):	Started controller-1
 ip-172.17.1.79	(ocf::heartbeat:IPaddr2):	Started controller-1
 ip-172.17.1.149	(ocf::heartbeat:IPaddr2):	Started controller-0
 ip-172.17.3.117	(ocf::heartbeat:IPaddr2):	Started controller-1
 ip-172.17.4.40	(ocf::heartbeat:IPaddr2):	Started controller-1
 podman container set: haproxy-bundle [192.168.24.1:8787/rhosp15/openstack-haproxy:pcmklatest]
   haproxy-bundle-podman-0	(ocf::heartbeat:podman):	Started controller-0
   haproxy-bundle-podman-1	(ocf::heartbeat:podman):	Started controller-1
   haproxy-bundle-podman-2	(ocf::heartbeat:podman):	Started controller-2
 podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0	(ocf::ovn:ovndb-servers):	Master controller-0
   ovn-dbs-bundle-1	(ocf::ovn:ovndb-servers):	Slave controller-1
   ovn-dbs-bundle-2	(ocf::ovn:ovndb-servers):	Slave controller-2
 podman container: openstack-cinder-volume [192.168.24.1:8787/rhosp15/openstack-cinder-volume:pcmklatest]
   openstack-cinder-volume-podman-0	(ocf::heartbeat:podman):	Started controller-0

Failed Resource Actions:
* redis-bundle-podman-2_start_0 on controller-2 'unknown error' (1): call=102, status=complete, exitreason='podman failed to launch container',
    last-rc-change='Fri Apr 26 03:10:17 2019', queued=0ms, exec=3103ms


Expected results:
All pcs resources are started.
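
A quick way to check the expected result is to filter the pcs status output for resources in the Stopped state. The function below is only a sketch: the name stopped_resources is made up for illustration, and it assumes the resource line layout shown in the output above ("<name> (<agent>): <state> [node]").

```shell
#!/usr/bin/env bash
# Hypothetical helper: list resources that `pcs status` reports as Stopped.
# Reads `pcs status` text on stdin and prints one resource name per line.
# Assumes each resource line looks like "<name> (ocf::...): <state> [node]",
# as in the output pasted in this report.
stopped_resources() {
    awk '$2 ~ /^\(ocf:/ && $3 == "Stopped" { print $1 }'
}

# Example (on a controller, as root):
#   pcs status | stopped_resources
```

With the output from this report the function should print only redis-bundle-2. Empty output means no resource is listed as Stopped.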

Additional info:

[root@controller-2 heat-admin]# grep redis-bundle-podman-2_start_0 /var/log/messages 
Apr 26 02:33:19 controller-2 pacemaker-execd[25213]: notice: redis-bundle-podman-2_start_0:59150:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 02:33:19 controller-2 pacemaker-execd[25213]: notice: redis-bundle-podman-2_start_0:59150:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 02:33:24 controller-2 pacemaker-execd[25213]: notice: redis-bundle-podman-2_start_0:60076:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 02:33:24 controller-2 pacemaker-execd[25213]: notice: redis-bundle-podman-2_start_0:60076:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ ocf-exit-reason:podman failed to launch container ]
Apr 26 03:10:20 controller-2 pacemaker-controld[3392]: notice: controller-2-redis-bundle-podman-2_start_0:102 [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image\nerror getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image\nocf-exit-reason:podman failed to launch container\n ]

Comment 1 Marius Cornea 2019-04-26 12:58:13 UTC
Apr 26 03:10:20 controller-2 podman(redis-bundle-podman-2)[8251]: ERROR: error checking path "/var/run/redis": stat /var/run/redis: no such file or directory
Apr 26 03:10:20 controller-2 podman(redis-bundle-podman-2)[8251]: ERROR: podman failed to launch container
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ ocf-exit-reason:podman failed to launch container ]
Apr 26 03:10:20 controller-2 pacemaker-controld[3392]: notice: Result of start operation for redis-bundle-podman-2 on controller-2: 1 (unknown error)
Apr 26 03:10:20 controller-2 pacemaker-controld[3392]: notice: controller-2-redis-bundle-podman-2_start_0:102 [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image\nerror getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image\nocf-exit-reason:podman failed to launch container\n ]
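
The repeated "no such image" messages make the log hard to read, while the lines logged by the podman resource agent itself name the missing path. A small filter like the one below can pull out just those agent lines. This is a sketch: the function name agent_errors is hypothetical, and the "podman(<resource>)[<pid>]: ERROR:" tag format is assumed from the excerpt above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the resource agent's own ERROR lines.
# Reads syslog text on stdin; $1 is the Pacemaker resource name.
# The tag format "podman(<resource>)[<pid>]: ERROR:" is an assumption
# based on the log lines pasted in this comment.
agent_errors() {
    grep -F "podman($1)[" | grep -F ': ERROR: '
}

# Example (on the affected controller, as root):
#   agent_errors redis-bundle-podman-2 < /var/log/messages
```

Run against the excerpt above, this should leave only the "error checking path" and "podman failed to launch container" lines.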

Comment 2 Marius Cornea 2019-04-26 12:59:41 UTC
Looks like /var/run/redis doesn't exist:

[root@controller-2 heat-admin]# ls /var/run/redis
ls: cannot access '/var/run/redis': No such file or directory
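
For context: on RHEL 8, /var/run is a symlink to /run, which is a tmpfs, so directories created there do not survive a reboot unless something recreates them at boot. As a manual workaround sketch (not the merged fix from the gerrit changes linked above), the directory can be recreated by hand before cleaning up the failed resource. The function name ensure_dir is made up for illustration, and the sketch does not set the ownership or SELinux label that the container may need.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a bind-mount source directory if it is missing.
# $1 is the directory path (for this bug, /var/run/redis).
# Prints "present" or "created" so the caller can log what happened.
# Ownership and SELinux labels are NOT handled here.
ensure_dir() {
    local dir="$1"
    if [ -d "$dir" ]; then
        echo "present"
    else
        mkdir -p "$dir" && echo "created"
    fi
}

# Example (on the affected controller, as root):
#   ensure_dir /var/run/redis && pcs resource cleanup redis-bundle
```

The pcs resource cleanup step clears the recorded start failure so Pacemaker retries the resource. Whether the container then starts cleanly depends on the permissions and labels mentioned above.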

Comment 9 pkomarov 2019-06-04 18:46:32 UTC
[stack@undercloud-0 ~]$ rhos-release -L;cat core_puddle_version 
Installed repositories (rhel-8.0):
  15
  ceph-4
  ceph-osd-4
  rhel-8.0
RHOS_TRUNK-15.0-RHEL-8-20190523.n.1

[stack@undercloud-0 ~]$ rpm -qa|grep openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-10.5.1-0.20190521220357.dd20049.el8ost.noarch
[stack@undercloud-0 ~]$ rpm -qa|grep uppet-tripleo
puppet-tripleo-10.4.2-0.20190523020355.13b3a43.el8ost.noarch

[stack@undercloud-0 ~]$ ansible controller-0 -b -mshell -a"reboot"
 [WARNING]: Found both group and host with same name: undercloud

controller-0 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: Warning: Permanently added '192.168.24.15' (ECDSA) to the list of known hosts.\r\nConnection to 192.168.24.15 closed by remote host.\r\nConnection to 192.168.24.15 closed.",
    "unreachable": true
}
[stack@undercloud-0 ~]$ ansible controller-0 -b -mshell -a"pcs status |grep redis;ls /var/run/redis"

controller-0 | CHANGED | rc=0 >>
Last change: Tue Jun  4 18:43:32 2019 by redis-bundle-1 via crm_attribute on controller-1
GuestOnline: [ galera-bundle-0@controller-0 galera-bundle-1@controller-1 galera-bundle-2@controller-2 ovn-dbs-bundle-0@controller-0 ovn-dbs-bundle-1@controller-1 ovn-dbs-bundle-2@controller-2 rabbitmq-bundle-0@controller-0 rabbitmq-bundle-1@controller-1 rabbitmq-bundle-2@controller-2 redis-bundle-0@controller-0 redis-bundle-1@controller-1 redis-bundle-2@controller-2 ]
 podman container set: redis-bundle [192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest]
   redis-bundle-0	(ocf::heartbeat:redis):	Slave controller-0
   redis-bundle-1	(ocf::heartbeat:redis):	Master controller-1
   redis-bundle-2	(ocf::heartbeat:redis):	Slave controller-2
redis-server.pid
redis.sock

Comment 12 errata-xmlrpc 2019-09-21 11:21:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811

