Bug 1433016 - Improved container support required for OSP
Summary: Improved container support required for OSP
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pcs
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 7.4
Assignee: Tomas Jelinek
QA Contact: cluster-qe@redhat.com
Docs Contact: Steven J. Levine
URL:
Whiteboard:
Depends On: 1432722
Blocks: 1435481
Reported: 2017-03-16 15:02 UTC by Ken Gaillot
Modified: 2020-07-14 09:25 UTC (History)

Fixed In Version: pcs-0.9.158-6.el7
Doc Type: Technology Preview
Doc Text:
.The pcs tool now manages bundle resources in Pacemaker

As a Technology Preview starting with Red Hat Enterprise Linux 7.4, Pacemaker supports a special syntax for launching a Docker container with any infrastructure it requires: the bundle. After you have created a Pacemaker bundle, you can create a Pacemaker resource that the bundle encapsulates. For information on Pacemaker support for containers, see the link:https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/high_availability_add-on_reference/[High Availability Add-On Reference].

There is one exception to this feature being Technology Preview: As of RHEL 7.4, Red Hat fully supports the usage of Pacemaker bundles for Red Hat OpenStack Platform (RHOSP) deployments.
Clone Of: 1432722
Clones: 1435481
Environment:
Last Closed: 2017-08-01 18:26:07 UTC
Target Upstream Version:


Attachments
proposed fix + tests (405.46 KB, patch), 2017-05-04 14:57 UTC, Tomas Jelinek
proposed fix enable/disable + tests (24.54 KB, patch), 2017-06-06 10:02 UTC, Ivan Devat
proposed fix container type + tests (21.07 KB, patch), 2017-06-12 10:46 UTC, Tomas Jelinek


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2017:1958 | 0 | normal | SHIPPED_LIVE | pcs bug fix and enhancement update | 2017-08-01 18:09:47 UTC

Description Ken Gaillot 2017-03-16 15:02:34 UTC
Description of problem:

The improved container support in upstream Pacemaker is needed for OSP12 in order to manage Galera and RabbitMQ as containerized services.

Comment 2 Ken Gaillot 2017-03-16 15:32:46 UTC
Pacemaker is adding a new syntax to better support Docker containers. There will be a new resource type (in addition to primitive, group, etc.) to define a container grouping, consisting of Docker settings, network settings, storage settings, and the service being containerized.

The exact syntax is not yet finalized, but the components of a container grouping will be:

* as with any resource type, a resource name and optional resource description (for the container grouping as a whole)

* Docker settings:
** "image" (a Docker image identifier such as "beekhof:http")
** optional "replicas" (the desired number of container instances of this image, defaulting to 1)
** optional "replicas-per-host" (default 1)
** optional "masters" (integer)
** optional "pcmk-remote-bin" (file path to pacemaker_remote inside the container, defaulting to usual location)
** optional "options" (arbitrary Docker command-line options, e.g. "--log-driver=journald")

* Optional network settings:
** "ip-range-start" (the first IPv4 address available to assign to containers; by default, no IPs will be assigned)
** "host-network"
** "host-netmask"
** "docker-network"
** zero or more "port-mapping" entries: each has an ID plus either an integer "port" or a "range" (starting port - ending port e.g. 1100-1199)

* Optional storage settings:
** zero or more "storage-mapping" entries: each has an ID plus:
*** either "source-dir" or "source-dir-root" (directory path on the host)
*** "target-dir" (directory path inside the container to use as mount point)
*** "options" (arbitrary mount options, e.g. "rw")

* A single primitive resource for the containerized service
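
The components listed above can be pictured as a small data model. The following is only an illustrative sketch of the proposed grouping, not Pacemaker's or pcs's actual schema: the class names, the `port_range` attribute name, and the validation messages are assumptions made for the example, and only a subset of the settings is modelled. It encodes the two "either/or" rules from the list (port or range; source-dir or source-dir-root) and the stated defaults for replicas and replicas-per-host.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PortMapping:
    """One "port-mapping" entry: an ID plus either a port or a range."""
    id: str
    port: Optional[int] = None
    port_range: Optional[str] = None  # "start-end", e.g. "1100-1199"

    def validate(self) -> None:
        # Exactly one of port / range may be set.
        if (self.port is None) == (self.port_range is None):
            raise ValueError(f"{self.id}: set either 'port' or 'range'")
        if self.port_range is not None:
            try:
                start, end = (int(p) for p in self.port_range.split("-"))
            except ValueError:
                raise ValueError(f"{self.id}: bad range {self.port_range!r}") from None
            if start > end:
                raise ValueError(f"{self.id}: range start exceeds end")


@dataclass
class StorageMapping:
    """One "storage-mapping" entry."""
    id: str
    target_dir: str                         # mount point inside the container
    source_dir: Optional[str] = None        # directory path on the host
    source_dir_root: Optional[str] = None   # alternative to source_dir
    options: Optional[str] = None           # mount options, e.g. "rw"

    def validate(self) -> None:
        # Exactly one of source-dir / source-dir-root may be set.
        if (self.source_dir is None) == (self.source_dir_root is None):
            raise ValueError(
                f"{self.id}: set either 'source-dir' or 'source-dir-root'"
            )


@dataclass
class ContainerGrouping:
    """Hypothetical model of the grouping described in this comment."""
    name: str
    image: str                      # Docker image identifier
    replicas: int = 1               # default stated in the comment
    replicas_per_host: int = 1      # default stated in the comment
    masters: Optional[int] = None
    ip_range_start: Optional[str] = None  # no IPs assigned when unset
    port_mappings: List[PortMapping] = field(default_factory=list)
    storage_mappings: List[StorageMapping] = field(default_factory=list)
    primitive: Optional[str] = None  # ID of the single containerized service

    def validate(self) -> None:
        for port_mapping in self.port_mappings:
            port_mapping.validate()
        for storage_mapping in self.storage_mappings:
            storage_mapping.validate()
```

Keeping the Docker-specific fields (`image`, `replicas`, and so on) apart from the network and storage mappings mirrors the stated goal that a different container technology could later replace only the Docker part.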

We are keeping the option open for the (distant) future to support container technologies other than Docker. The idea is that the Docker settings part would be swapped out with something else (Rocket, OCI, LXC, etc.) but the network, storage, and primitive would hopefully stay the same.

The name of the container grouping type is not yet decided -- possibly container-grouping, bundle, or launcher. The names of the individual settings will likely mostly stay as above, but a few may change. Not all of the settings may be implemented in time for 7.4.

We will need pcs commands to add, remove, and edit container groupings.

Comment 3 Ken Gaillot 2017-03-28 15:22:39 UTC
QA: A full test procedure will be documented here once the new syntax is finalized, but the basic process will be:

* install docker on all nodes
* configure a docker image with apache and pacemaker remote on all nodes
* configure pacemaker using new syntax here
* test a variety of actions such as failing a container, migrating a container, disabling a container, putting a node running a container into standby, etc.

Comment 8 Tomas Jelinek 2017-04-25 13:38:09 UTC
upstream tutorial for pacemaker:
http://wiki.clusterlabs.org/wiki/Bundle_Walk-Through

Comment 9 Jan Pokorný [poki] 2017-04-26 16:27:28 UTC
Note that implementation for "docker" engine of a bundle has a hardcoded
reference to ocf:heartbeat:docker resource agent, so it might make sense
to check it's available prior to inner processing of

  pcs bundle create CONTAINER container docker ...

Comment 10 Tomas Jelinek 2017-05-04 12:08:12 UTC
(In reply to Jan Pokorný from comment #9)
> Note that implementation for "docker" engine of a bundle has a hardcoded
> reference to ocf:heartbeat:docker resource agent, so it might make sense
> to check it's available prior to inner processing of
> 
>   pcs bundle create CONTAINER container docker ...

Should this be checked on all nodes or just the node where the command is run? When adding a node to a cluster, should pcs check that the docker agent is installed on the node? Should pcs check if docker is installed? If it is running? What if the user wants to edit the CIB outside a cluster node? Should pcs check on cib push?

These checks are not done for any resource / stonith agents and it works - pacemaker reports an error when a resource cannot run and the user reacts to that.

For now we are not going to implement these checks. If this becomes a problem, we will deal with it.
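
For reference, the narrowest form of the check discussed here (local node only) could look roughly like the sketch below. This is not something pcs does, per the decision above. It assumes the conventional OCF layout, where an agent named `ocf:<provider>:<agent>` lives at `<OCF root>/resource.d/<provider>/<agent>` and the root defaults to `/usr/lib/ocf`; the function name is hypothetical.

```python
import os

# Conventional default OCF root; an assumption, real systems may override it.
DEFAULT_OCF_ROOT = "/usr/lib/ocf"


def agent_installed(provider, agent, ocf_root=DEFAULT_OCF_ROOT):
    """Return True if the OCF agent exists locally and is executable.

    Only inspects the local filesystem; it says nothing about other
    cluster nodes or about whether docker itself is installed or running.
    """
    path = os.path.join(ocf_root, "resource.d", provider, agent)
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # ocf:heartbeat:docker is the agent referenced in comment 9.
    print(agent_installed("heartbeat", "docker"))
```

The sketch also shows why the questions above matter: a local filesystem test answers only one of them.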

Comment 11 Tomas Jelinek 2017-05-04 14:57:15 UTC
Created attachment 1276372 [details]
proposed fix + tests

summary of changes:
* new commands:
  * bundle create
    * create a new bundle with no encapsulated resources
  * bundle update
    * change an existing bundle
* updated commands - working with bundles:
  * resource create
    * new keyword "bundle" allows creating a resource inside a specified bundle
  * resource delete
    * allows deleting bundles and resources encapsulated in bundles
  * config show, resource show
    * displays bundles
  * resource clear, resource ban
  * resource restart
    * bz1447910
  * resource cleanup
    * bz1447916
  * resource failcount
    * works with inner bundle resources
  * bundles may be referenced in constraints
* updated commands - not working with bundles:
  * these exit with an error when requested to operate on bundles
  * resource enable, resource disable
  * resource manage, resource unmanage
  * resource op add, resource op remove
  * resource meta
  * resource group add, resource group remove, resource ungroup
    * also moving a resource from a bundle to a group is not allowed
  * resource clone, resource master, resource unclone
    * also cloning a resource in a bundle is not allowed
  * resource utilization
  * resource debug-start, resource debug-stop, resource debug-promote, resource debug-demote, resource debug-monitor
    * bz1447918
  * resource move
    * does not work for master and clone resources either
* web ui
  * works when bundles are present in a cluster
  * no support for bundles in web ui
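
The "not working with bundles" group above behaves like a guard in front of each command. A minimal sketch of that idea follows, assuming a hypothetical `check_target` helper and exception; the command names are taken from the list in this comment as it stood at this point in time, while the error text is made up and is not pcs output.

```python
# Subcommands of "pcs resource" that this comment lists as rejecting bundles.
REJECTS_BUNDLES = frozenset({
    "enable", "disable", "manage", "unmanage",
    "op add", "op remove", "meta",
    "group add", "group remove", "ungroup",
    "clone", "master", "unclone",
    "utilization",
    "debug-start", "debug-stop", "debug-promote",
    "debug-demote", "debug-monitor",
    "move",
})


class BundleNotSupported(Exception):
    """Raised when a command is asked to operate on a bundle (hypothetical)."""


def check_target(command, resource_type):
    """Refuse to run `command` on a bundle if the command does not support it."""
    if resource_type == "bundle" and command in REJECTS_BUNDLES:
        # Illustrative message only.
        raise BundleNotSupported(
            "'resource {0}' cannot operate on a bundle".format(command)
        )
```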

Comment 13 Jan Pokorný [poki] 2017-06-05 10:10:37 UTC
Re [comment 10]:
Main problem is that docker agent is not present prior to resource-agents
3.9.6.  So the suggested check would prevent issues with new pcs vs. old
resource-agents (ditto IPaddr2, but that was added substantially
earlier).


Btw. I think it's dearly unfortunate to allow the container type not
being a required input while at the same time, the container options
themselves are relative to this particular and now possibly omitted type.
So anytime the preference for the implicit/default container type
changes, there's an imminent breakage in the scripts relying on default
being stable in time.  And who can predict the future and whether
the current default will be as relevant at some distant point in time?
(For that reason, I will always emit the container type in clufter
for the purpose of the reverse extraction of CIB into respective
reinstating pcs commands.)

Comment 14 Ivan Devat 2017-06-06 10:02:43 UTC
Created attachment 1285326 [details]
proposed fix enable/disable + tests

Comment 15 Tomas Jelinek 2017-06-06 10:04:42 UTC
After fix - enable and disable work:

[root@rh73-node1:~]# rpm -q pcs
pcs-0.9.158-4.el7.x86_64
[root@rh73-node1:~]# pcs resource
 Docker container set: http-bundle [pcmktest:http]
   http-bundle-0 (192.168.122.250)      (ocf::pacemaker:Stateful):      Started rh73-node2
   http-bundle-1 (192.168.122.251)      (ocf::pacemaker:Stateful):      Started rh73-node1
[root@rh73-node1:~]# pcs resource disable http-bundle --wait
resource 'http-bundle' is not running on any node
[root@rh73-node1:~]# pcs resource
 Docker container set: http-bundle [pcmktest:http]
   http-bundle-0 (192.168.122.250)      (ocf::pacemaker:Stateful):      Stopped (disabled)
   http-bundle-1 (192.168.122.251)      (ocf::pacemaker:Stateful):      Stopped (disabled)
[root@rh73-node1:~]# pcs resource enable http-bundle --wait
resource 'http-bundle' is running on nodes 'http-bundle-0', 'http-bundle-1', 'rh73-node1', 'rh73-node2'
[root@rh73-node1:~]# pcs resource
 Docker container set: http-bundle [pcmktest:http]
   http-bundle-0 (192.168.122.250)      (ocf::pacemaker:Stateful):      Started rh73-node2
   http-bundle-1 (192.168.122.251)      (ocf::pacemaker:Stateful):      Started rh73-node1
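
When repeating this check in a script, the replica lines of the `pcs resource` output above can be picked apart with a regular expression. The sketch below is written against the exact line layout shown in this transcript and nothing more; the output format is not a stable interface, and the function and pattern names are hypothetical.

```python
import re

# Matches replica lines such as:
#   http-bundle-0 (192.168.122.250)  (ocf::pacemaker:Stateful):  Started rh73-node2
REPLICA_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+\((?P<ip>[\d.]+)\)\s+"
    r"\((?P<agent>ocf::[^)]+)\):\s+(?P<state>\S+)(?:\s+(?P<detail>.*))?$"
)


def parse_replicas(output):
    """Map each replica name to (state, detail) from `pcs resource` text.

    `detail` is whatever follows the state: a node name for started
    replicas, or a note such as "(disabled)" for stopped ones.
    Lines that do not look like replica lines are ignored.
    """
    replicas = {}
    for line in output.splitlines():
        match = REPLICA_LINE.match(line)
        if match:
            replicas[match.group("name")] = (
                match.group("state"),
                match.group("detail") or "",
            )
    return replicas
```

A test could then assert, for example, that every replica reports `Stopped` after `pcs resource disable http-bundle --wait` and `Started` again after the enable.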

Comment 18 Tomas Jelinek 2017-06-12 10:46:59 UTC
Created attachment 1287024 [details]
proposed fix container type + tests

container type is now mandatory (see comment 13)

Comment 19 Tomas Jelinek 2017-06-15 13:03:44 UTC
After fix:

[root@rh73-node1:~]# pcs resource bundle create http-bundle container image=pcmktest:http
Error: '' is not a valid container type value, use docker
[root@rh73-node1:~]# echo $?
1
[root@rh73-node1:~]# pcs resource bundle create http-bundle container docker image=pcmktest:http
[root@rh73-node1:~]# echo $?
0
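
The behaviour shown above, where a missing container type is rejected rather than defaulted, can be sketched as follows. This is not the pcs implementation: the function name and the way the arguments are tokenized are assumptions for illustration. Only the error wording is copied from the transcript.

```python
# Only "docker" is accepted, per the error message in the transcript.
SUPPORTED_CONTAINER_TYPES = ("docker",)


def parse_container_args(args):
    """Split the tokens following the 'container' keyword.

    `args` is e.g. ["docker", "image=pcmktest:http"]. The first token must
    be an explicit container type; a leading name=value pair is treated as
    a missing type instead of silently falling back to a default.
    """
    container_type = args[0] if args and "=" not in args[0] else ""
    if container_type not in SUPPORTED_CONTAINER_TYPES:
        raise ValueError(
            "'{0}' is not a valid container type value, use {1}".format(
                container_type, ", ".join(SUPPORTED_CONTAINER_TYPES)
            )
        )
    options = {}
    for token in args[1:]:
        name, separator, value = token.partition("=")
        if not separator:
            raise ValueError("missing value for '{0}' option".format(name))
        options[name] = value
    return container_type, options
```

Requiring the type up front addresses the concern from comment 13: scripts cannot come to depend on a default that might change later.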

Comment 22 errata-xmlrpc 2017-08-01 18:26:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1958

