Description of problem:
redis-bundle pcs resource doesn't start after controller reboot

Version-Release number of selected component (if applicable):
15 -p RHOS_TRUNK-15.0-RHEL-8-20190423.n.1

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP15 overcloud
2. Reboot 1/3 controllers
3. Check pcs status

Actual results:
[root@controller-2 heat-admin]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-0 (version 2.0.1-4.el8-0eb7991564) - partition with quorum
Last updated: Fri Apr 26 12:49:42 2019
Last change: Fri Apr 26 02:47:53 2019 by root via cibadmin on controller-0

15 nodes configured
46 resources configured

Online: [ controller-0 controller-1 controller-2 ]
GuestOnline: [ galera-bundle-0@controller-0 galera-bundle-1@controller-1 galera-bundle-2@controller-2 ovn-dbs-bundle-0@controller-0 ovn-dbs-bundle-1@controller-1 ovn-dbs-bundle-2@controller-2 rabbitmq-bundle-0@controller-0 rabbitmq-bundle-1@controller-1 rabbitmq-bundle-2@controller-2 redis-bundle-0@controller-0 redis-bundle-1@controller-1 ]

Full list of resources:

 podman container set: galera-bundle [192.168.24.1:8787/rhosp15/openstack-mariadb:pcmklatest]
   galera-bundle-0 (ocf::heartbeat:galera): Master controller-0
   galera-bundle-1 (ocf::heartbeat:galera): Master controller-1
   galera-bundle-2 (ocf::heartbeat:galera): Master controller-2
 podman container set: rabbitmq-bundle [192.168.24.1:8787/rhosp15/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started controller-0
   rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Started controller-1
   rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Started controller-2
 podman container set: redis-bundle [192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest]
   redis-bundle-0 (ocf::heartbeat:redis): Master controller-0
   redis-bundle-1 (ocf::heartbeat:redis): Slave controller-1
   redis-bundle-2 (ocf::heartbeat:redis): Stopped
 ip-192.168.24.17 (ocf::heartbeat:IPaddr2): Started controller-0
 ip-10.0.0.103 (ocf::heartbeat:IPaddr2): Started controller-1
 ip-172.17.1.79 (ocf::heartbeat:IPaddr2): Started controller-1
 ip-172.17.1.149 (ocf::heartbeat:IPaddr2): Started controller-0
 ip-172.17.3.117 (ocf::heartbeat:IPaddr2): Started controller-1
 ip-172.17.4.40 (ocf::heartbeat:IPaddr2): Started controller-1
 podman container set: haproxy-bundle [192.168.24.1:8787/rhosp15/openstack-haproxy:pcmklatest]
   haproxy-bundle-podman-0 (ocf::heartbeat:podman): Started controller-0
   haproxy-bundle-podman-1 (ocf::heartbeat:podman): Started controller-1
   haproxy-bundle-podman-2 (ocf::heartbeat:podman): Started controller-2
 podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers): Master controller-0
   ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers): Slave controller-1
   ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers): Slave controller-2
 podman container: openstack-cinder-volume [192.168.24.1:8787/rhosp15/openstack-cinder-volume:pcmklatest]
   openstack-cinder-volume-podman-0 (ocf::heartbeat:podman): Started controller-0

Failed Resource Actions:
* redis-bundle-podman-2_start_0 on controller-2 'unknown error' (1): call=102, status=complete, exitreason='podman failed to launch container', last-rc-change='Fri Apr 26 03:10:17 2019', queued=0ms, exec=3103ms

Expected results:
All pcs resources are started.
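
Step 3 can be scripted when the check has to be repeated across several reboots. Below is a minimal Python sketch that scans saved `pcs status` text for resources reported as `Stopped`. The function name `stopped_resources` and the regular expression are illustrative assumptions based on the output in this report, not part of pcs or Pacemaker.

```python
import re

# One resource entry looks like:
#   "redis-bundle-2 (ocf::heartbeat:redis): Stopped"
# The pattern is an assumption derived from the output in this report.
STOPPED = re.compile(r"(\S+)\s+\((ocf::[^)]+)\):\s+Stopped\b")


def stopped_resources(status_text):
    """Return (resource, agent) pairs that pcs status reports as Stopped."""
    return STOPPED.findall(status_text)


sample = """\
   redis-bundle-0 (ocf::heartbeat:redis): Master controller-0
   redis-bundle-1 (ocf::heartbeat:redis): Slave controller-1
   redis-bundle-2 (ocf::heartbeat:redis): Stopped
 ip-192.168.24.17 (ocf::heartbeat:IPaddr2): Started controller-0
"""

print(stopped_resources(sample))
```

An empty list would correspond to the expected result, where every resource is started.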
Additional info:
[root@controller-2 heat-admin]# grep redis-bundle-podman-2_start_0 /var/log/messages
Apr 26 02:33:19 controller-2 pacemaker-execd[25213]: notice: redis-bundle-podman-2_start_0:59150:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 02:33:19 controller-2 pacemaker-execd[25213]: notice: redis-bundle-podman-2_start_0:59150:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 02:33:24 controller-2 pacemaker-execd[25213]: notice: redis-bundle-podman-2_start_0:60076:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 02:33:24 controller-2 pacemaker-execd[25213]: notice: redis-bundle-podman-2_start_0:60076:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ ocf-exit-reason:podman failed to launch container ]
Apr 26 03:10:20 controller-2 pacemaker-controld[3392]: notice: controller-2-redis-bundle-podman-2_start_0:102 [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image\nerror getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image\nocf-exit-reason:podman failed to launch container\n ]

Apr 26 03:10:20 controller-2 podman(redis-bundle-podman-2)[8251]: ERROR: error checking path "/var/run/redis": stat /var/run/redis: no such file or directory
Apr 26 03:10:20 controller-2 podman(redis-bundle-podman-2)[8251]: ERROR: podman failed to launch container
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image ]
Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: redis-bundle-podman-2_start_0:8251:stderr [ ocf-exit-reason:podman failed to launch container ]
Apr 26 03:10:20 controller-2 pacemaker-controld[3392]: notice: Result of start operation for redis-bundle-podman-2 on controller-2: 1 (unknown error)
Apr 26 03:10:20 controller-2 pacemaker-controld[3392]: notice: controller-2-redis-bundle-podman-2_start_0:102 [ error getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image\nerror getting image "redis-bundle-podman-2": unable to find a name and tag match for redis-bundle-podman-2 in repotags: no such image\nocf-exit-reason:podman failed to launch container\n ]
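
The repeated "no such image" stderr lines drown out the more specific failure, which is in the lines logged by the podman resource agent itself. Below is a minimal Python sketch of pulling only those agent lines out of a saved log. The function name `agent_errors` and the regular expression are illustrative assumptions based on the excerpt above, not part of any Pacemaker tooling.

```python
import re

# Matches syslog lines written by the podman resource agent, e.g.
#   "... controller-2 podman(redis-bundle-podman-2)[8251]: ERROR: <message>"
# The pattern is an assumption based on the log excerpt in this report.
AGENT_ERROR = re.compile(
    r"podman\((?P<resource>[^)]+)\)\[\d+\]: ERROR: (?P<message>.*)$"
)


def agent_errors(lines):
    """Return (resource, message) pairs for resource-agent ERROR lines."""
    found = []
    for line in lines:
        match = AGENT_ERROR.search(line.rstrip("\n"))
        if match:
            found.append((match.group("resource"), match.group("message")))
    return found


sample = [
    'Apr 26 03:10:20 controller-2 podman(redis-bundle-podman-2)[8251]: ERROR: '
    'error checking path "/var/run/redis": stat /var/run/redis: no such file or directory',
    'Apr 26 03:10:20 controller-2 podman(redis-bundle-podman-2)[8251]: ERROR: '
    'podman failed to launch container',
    'Apr 26 03:10:20 controller-2 pacemaker-execd[3389]: notice: '
    'redis-bundle-podman-2_start_0:8251:stderr [ ocf-exit-reason:podman failed to launch container ]',
]

for resource, message in agent_errors(sample):
    print(f"{resource}: {message}")
```

On the sample lines only the two agent entries are kept, and the first of them names the missing path.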
Looks like /var/run/redis doesn't exist:
[root@controller-2 heat-admin]# ls /var/run/redis
ls: cannot access '/var/run/redis': No such file or directory
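
The agent's "error checking path" message suggests that the container cannot start while a host directory it expects is missing. On RHEL 8, /var/run is a symlink to /run, which is a tmpfs, so anything created there is gone after a reboot unless something recreates it. Below is a rough Python sketch of a check that could be run on a controller before the bundle starts. The path list is an assumption taken from the error above and is not the bundle's actual mount list.

```python
import os

# Host directories the container is assumed to need. Only /var/run/redis
# comes from the log in this report; the list is hypothetical otherwise.
ASSUMED_HOST_PATHS = ["/var/run/redis"]


def missing_paths(paths):
    """Return the subset of paths that do not exist on this host."""
    return [path for path in paths if not os.path.exists(path)]


if __name__ == "__main__":
    for path in missing_paths(ASSUMED_HOST_PATHS):
        print(f"missing host path: {path}")
```

Run on controller-2 in the failing state, this would be expected to report /var/run/redis, matching the `ls` result above.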
[stack@undercloud-0 ~]$ rhos-release -L;cat core_puddle_version
Installed repositories (rhel-8.0):
  15
  ceph-4
  ceph-osd-4
  rhel-8.0
RHOS_TRUNK-15.0-RHEL-8-20190523.n.1
[stack@undercloud-0 ~]$ rpm -qa|grep openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-10.5.1-0.20190521220357.dd20049.el8ost.noarch
[stack@undercloud-0 ~]$ rpm -qa|grep uppet-tripleo
puppet-tripleo-10.4.2-0.20190523020355.13b3a43.el8ost.noarch
[stack@undercloud-0 ~]$ ansible controller-0 -b -mshell -a"reboot"
 [WARNING]: Found both group and host with same name: undercloud

controller-0 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: Warning: Permanently added '192.168.24.15' (ECDSA) to the list of known hosts.\r\nConnection to 192.168.24.15 closed by remote host.\r\nConnection to 192.168.24.15 closed.",
    "unreachable": true
}
[stack@undercloud-0 ~]$ ansible controller-0 -b -mshell -a"pcs status |grep redis;ls /var/run/redis"
controller-0 | CHANGED | rc=0 >>
Last change: Tue Jun 4 18:43:32 2019 by redis-bundle-1 via crm_attribute on controller-1
GuestOnline: [ galera-bundle-0@controller-0 galera-bundle-1@controller-1 galera-bundle-2@controller-2 ovn-dbs-bundle-0@controller-0 ovn-dbs-bundle-1@controller-1 ovn-dbs-bundle-2@controller-2 rabbitmq-bundle-0@controller-0 rabbitmq-bundle-1@controller-1 rabbitmq-bundle-2@controller-2 redis-bundle-0@controller-0 redis-bundle-1@controller-1 redis-bundle-2@controller-2 ]
 podman container set: redis-bundle [192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest]
   redis-bundle-0 (ocf::heartbeat:redis): Slave controller-0
   redis-bundle-1 (ocf::heartbeat:redis): Master controller-1
   redis-bundle-2 (ocf::heartbeat:redis): Slave controller-2
redis-server.pid
redis.sock
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811