Description of problem:
The improved container support in upstream Pacemaker is needed for OSP12 in order to manage galera and rabbitmq as containerized services.
Support for this feature has been merged upstream
QA: No special testing will be needed for this BZ, as it will be implied by testing the new pcs syntax in Bug 1433016.
QA: For completeness, I will outline a test procedure without pcs here, though testing the pcs syntax will be sufficient to test this bz. A test build is not available yet.
Given the current focus on containers everywhere, I'm sure QA has already started discussions about what host OS + container OS combinations to test generally. For this bz, latest RHEL 7.4 nightly in both is probably a good choice. In my testing, I've been using RHEL 7.3 host and CentOS 7.3 containers (with 7.4 pacemaker build in both).
1. Configure a Pacemaker cluster of at least two cluster nodes (and no Pacemaker Remote nodes). You'll need about 450MB free disk space on each node.
2. On every node:
2a. Install docker. QA should use whatever RH ships. In my personal testing, I've been using the upstream repo:
# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# yum install docker-ce
# systemctl enable --now docker
2b. Pull a base image to work with. In my personal testing, I've been using CentOS 7:
# docker pull centos:centos7
2c. Create some infrastructure for the tests:
# mkdir -p /root/bz1432722 /var/local/containers/httpd-bundle-{0,1,2}
# for i in 0 1 2; do cat >/var/local/containers/httpd-bundle-$i/index.html <<EOF
<h1>httpd-bundle-$i @ $(hostname)</h1>
EOF
done
2d. Put a copy of the pacemaker-cli, pacemaker-libs, pacemaker, pacemaker-cluster-libs, and pacemaker-remote RPMs for this BZ into /root/bz1432722. (This will more or less simulate what users who pull a 7.4 GA base image will get. However this is not strictly necessary, and any base image with a version of pacemaker that supports Pacemaker Remote should work.)
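A small staging sketch for this step. The RPM filenames below are placeholders (use the actual RPMs from this BZ's build), and a temp directory stands in for /root/bz1432722 so the sketch runs anywhere:

```shell
# Stage the five pacemaker RPMs into the Docker build directory.
# On a real node DEST would be /root/bz1432722; the NVR used below is
# illustrative, not the actual build for this BZ.
DEST=$(mktemp -d)
SRC=$(mktemp -d)
for pkg in pacemaker pacemaker-cli pacemaker-libs pacemaker-cluster-libs pacemaker-remote; do
  touch "$SRC/$pkg-1.1.16-12.el7.x86_64.rpm"   # placeholder for the real RPM
done
cp "$SRC"/pacemaker*.rpm "$DEST"/
ls "$DEST"   # all five RPMs should now be in the build directory
```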
2e. Create a Dockerfile for testing (replace centos:centos7 with your base image). Here I'm using apache as an example of a service to be containerized:
# cat >/root/bz1432722/Dockerfile <<EOF
FROM centos:centos7
COPY pacemaker*.rpm ./
RUN yum update -y
RUN yum install -y httpd bind-utils curl lsof wget which
RUN yum install -y ./pacemaker*.rpm resource-agents
RUN rm -f ./pacemaker*.rpm
EOF
3. On every node, build a custom image. This step should be repeated if during testing you need to switch out the pacemaker packages or change the Dockerfile:
3a. Build the image:
# cd /root/bz1432722
# docker rmi pcmktest:http # ignore the error if the image doesn't exist yet
# docker build -t pcmktest:http .
3b. If desired, verify that the image was created:
# docker images # output should look something like:
REPOSITORY   TAG       IMAGE ID       CREATED              SIZE
pcmktest     http      aab04ad64ab0   About a minute ago   412 MB
centos       centos7   98d35105a391   12 days ago          192 MB
3c. At least in my testing, building triggers a docker "waiting for lo to become free" bug. Reboot the node to avoid this possibility.
4. From any one node, start the cluster, and configure a bundle using the test image. Replace the IP address with something appropriate (three sequential IPs need to be available):
# pcs cluster start --all --wait
# cibadmin --modify --allow-create --scope resources -X '<bundle id="httpd-bundle">
    <docker image="pcmktest:http" replicas="3" options="--log-driver=journald"/>
    <network ip-range-start="192.168.122.131" host-interface="eth0" host-netmask="24">
      <port-mapping id="httpd-port" port="80"/>
    </network>
    <primitive class="ocf" id="httpd" provider="heartbeat" type="apache"/>
  </bundle>'
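Pacemaker allocates one IP per replica, counting up from ip-range-start. A small sketch of that arithmetic for the example values above (simple last-octet math only; real allocation handles octet rollover, which this sketch does not):

```shell
# Derive the three sequential replica IPs implied by replicas="3" and
# ip-range-start="192.168.122.131" in the bundle configuration above.
range_start=192.168.122.131
prefix=${range_start%.*}   # 192.168.122
last=${range_start##*.}    # 131
for i in 0 1 2; do
  echo "replica $i -> $prefix.$((last + i))"
done
```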
5. Test away. Three containers should come up, and apache should be reachable at the specified IPs. This feature is tech preview, and not all things you'd expect to do with a regular resource are implemented for bundles yet. But it will be worthwhile to test as much as possible and list what works and what doesn't. You can also modify the bundle configuration to try different values. Follow Bug 1435481 for the documentation; upstream documentation will be available soon as well. The docker instances will be named like httpd-bundle-docker-0, so you can use standard docker commands with that (e.g. docker inspect or docker exec).
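Since the instance names follow the pattern mentioned above, iterating over the replicas is straightforward. A sketch (the docker commands are left as comments because they need the live cluster):

```shell
# Each replica's docker instance is named <bundle-id>-docker-<replica-number>.
bundle=httpd-bundle
for i in 0 1 2; do
  name="${bundle}-docker-${i}"
  echo "$name"
  # e.g. on the node hosting the replica:
  #   docker inspect "$name"
  #   docker exec "$name" ps -ef
done
```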
Known issues to be addressed separately:
- Bug 1447903
- Bug 1447916
- Bug 1447918
- Bug 1447951
Additionally, Michele Baldessari and I have been testing the new features from this build for a month now, and I can say they are working as expected for us.
We followed different instructions to deploy a cluster with containerized OCF resources, and this is the result:
[root@rhelz ~]# crm_mon -1
Current DC: rhelz (version 1.1.16-11.el7-94ff4df) - partition with quorum
Last updated: Wed Jun 21 09:48:16 2017
Last change: Wed Jun 21 09:29:42 2017 by root via cibadmin on rhelz
4 nodes configured
16 resources configured
Online: [ rhelz ]
GuestOnline: [ galera-bundle-0@rhelz rabbitmq-bundle-0@rhelz redis-bundle-0@rhelz ]
Docker container: rabbitmq-bundle [192.168.24.1:8787/rhosp12/openstack-rabbitmq-docker:2017-06-19.1]
  rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started rhelz
Docker container: galera-bundle [192.168.24.1:8787/rhosp12/openstack-mariadb-docker:2017-06-19.1]
  galera-bundle-0 (ocf::heartbeat:galera): Master rhelz
Docker container: redis-bundle [192.168.24.1:8787/rhosp12/openstack-redis-docker:2017-06-19.1]
  redis-bundle-0 (ocf::heartbeat:redis): Master rhelz
ip-192.168.122.254 (ocf::heartbeat:IPaddr2): Started rhelz
ip-192.168.122.250 (ocf::heartbeat:IPaddr2): Started rhelz
ip-192.168.122.249 (ocf::heartbeat:IPaddr2): Started rhelz
ip-192.168.122.253 (ocf::heartbeat:IPaddr2): Started rhelz
ip-192.168.122.247 (ocf::heartbeat:IPaddr2): Started rhelz
ip-192.168.122.248 (ocf::heartbeat:IPaddr2): Started rhelz
Docker container: haproxy-bundle [192.168.24.1:8787/rhosp12/openstack-haproxy-docker:2017-06-19.1]
  haproxy-bundle-docker-0 (ocf::heartbeat:docker): Started rhelz
The resources marked as "Docker container" are containerized OCF resources managed by pacemaker.
We also verified on multi-node OpenStack overclouds that the feature works as expected on multi-node clusters.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.