Bug 1489735

Summary:	pcs resource restart <bundle> <node> is not working
Product:	Red Hat Enterprise Linux 7	Reporter:	Michele Baldessari <michele>
Component:	pacemaker	Assignee:	Ken Gaillot <kgaillot>
Status:	CLOSED ERRATA	QA Contact:	Ofer Blaut <oblaut>
Severity:	high	Docs Contact:
Priority:	high
Version:	7.4	CC:	abeekhof, cfeist, cluster-maint, idevat, kgaillot, mkrcmari, omular, tojeline
Target Milestone:	rc
Target Release:	7.5
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	pacemaker-1.1.18-1.el7	Doc Type:	No Doc Update
Doc Text:	Bundles were tech preview in 7.4, so bug fixes do not need to be documented.	Story Points:	---
Clone Of:
Clones:	1501274		Environment:
Last Closed:	2018-04-10 15:32:51 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1489728
Bug Blocks:	1501274

Description Michele Baldessari 2017-09-08 08:54:45 UTC

Description of problem:
This will not work until https://bugzilla.redhat.com/show_bug.cgi?id=1489728 is fixed. Once that is fixed (i.e. once crm_resource --restart <bundle> --node=<node> works), pcs will need a small fix to allow restarting a bundle on a single node.

Comment 2 Tomas Jelinek 2017-09-11 13:34:13 UTC

There is a chance this will actually work without any changes in pcs, as we do not care whether the specified resource is a bundle. We will see once the pacemaker part is ready.

Comment 3 Tomas Jelinek 2017-10-04 15:44:51 UTC

I tried restarting bundles with pacemaker-1.1.16-12.el7_4.4 installed both on the cluster machines and in the containers. "crm_resource --restart <bundle>" seems to work fine. However, "crm_resource --restart <bundle> --node <node>" seems to restart the bundle on all nodes.

# crm_resource --restart --resource httpd-bundle

Set 'httpd-bundle' option: id=httpd-bundle-meta_attributes-target-role set=httpd-bundle-meta_attributes name=target-role=stopped
Waiting for 1 resources to stop:
 * httpd-bundle
Deleted 'httpd-bundle' option: id=httpd-bundle-meta_attributes-target-role name=target-role
Waiting for 1 resources to start again:
 * httpd-bundle

# crm_resource --restart --resource httpd-bundle --node rh74-node1

Set 'httpd-bundle' option: id=httpd-bundle-meta_attributes-target-role set=httpd-bundle-meta_attributes name=target-role=stopped
Waiting for 1 resources to stop:
 * httpd-bundle
Deleted 'httpd-bundle' option: id=httpd-bundle-meta_attributes-target-role name=target-role
Waiting for 1 resources to start again:
 * httpd-bundle

While this is happening, I can see the bundle resource stopping and starting on all nodes in both cases.


Ken, is the bundle restart supposed to work with the --node option? If the option is not going to be supported, nothing has to be done in pcs, as 'pcs resource restart httpd-bundle' already works. Otherwise, a small fix in pcs is needed to allow specifying a node when restarting a bundle.

Comment 4 Ken Gaillot 2017-10-04 16:09:22 UTC

Grabbing this as a pacemaker bug because nothing is needed from pcs.

I think crm_resource --restart should treat bundles the same way as clones, i.e. ban the resource from the specified node rather than set target-role to Stopped. This should be a fairly simple change we can get into 7.5.
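
For illustration only, the proposed behaviour can be sketched as a small decision function. This is not pacemaker code: the function name, the kind strings and the two strategy labels are hypothetical, and the assumption that a node is ignored for other resource kinds is inferred from the comment 3 output, where the pre-fix bundle restart fell back to target-role.

```python
from typing import Optional

# Resource kinds that can have instances on several nodes at once.
# Hypothetical labels for this sketch, not pacemaker identifiers.
MULTI_INSTANCE_KINDS = {"clone", "master", "bundle"}


def restart_strategy(kind: str, node: Optional[str]) -> str:
    """Pick how a restart request would be carried out.

    "ban-node":    temporarily ban the resource from the given node.
    "target-role": set target-role to Stopped, then remove it again.
    """
    if node is not None and kind in MULTI_INSTANCE_KINDS:
        return "ban-node"
    # Assumption: without a node, or for a single-instance resource,
    # the whole resource is stopped and started again.
    return "target-role"


if __name__ == "__main__":
    for kind, node in [("bundle", "rh74-node1"), ("bundle", None)]:
        print(kind, node, "->", restart_strategy(kind, node))
```

The point of the sketch is only that adding "bundle" to the multi-instance set is what makes a single-node restart stop affecting the other nodes.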

Comment 5 Tomas Jelinek 2017-10-05 07:03:59 UTC

(In reply to Ken Gaillot from comment #4)
> Grabbing this as a pacemaker bug because nothing is needed from pcs.

That depends. Pcs currently does not allow specifying a node when restarting bundles:

# pcs resource restart httpd-bundle rh74-node1
Error: can only restart on a specific node for a clone or master/slave resource

As I said, a small fix in pcs is needed to allow specifying a node when restarting a bundle.


If a node is not specified, everything works with current pcs:

# pcs resource restart httpd-bundle
httpd-bundle successfully restarted
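
As a rough sketch of the kind of pcs-side change meant here (not the actual pcs code): the check that produces the error above would also have to accept bundles. The function name, the kind strings and the reworded message below are assumptions made for illustration.

```python
from typing import Optional

# Kinds for which a node may be given on restart. "bundle" is the
# hypothetical addition; the other two match the current error message.
NODE_RESTARTABLE_KINDS = {"clone", "master", "bundle"}


def validate_restart(kind: str, node: Optional[str]) -> Optional[str]:
    """Return an error message, or None if the request is acceptable."""
    if node is not None and kind not in NODE_RESTARTABLE_KINDS:
        # Hypothetical wording, extended from the message quoted above.
        return ("Error: can only restart on a specific node for a "
                "clone, master/slave or bundle resource")
    return None


if __name__ == "__main__":
    print(validate_restart("bundle", "rh74-node1"))
    print(validate_restart("primitive", "rh74-node1"))
```

With a check like this, "pcs resource restart httpd-bundle rh74-node1" would be passed through to crm_resource instead of being rejected up front.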

Comment 6 Ken Gaillot 2017-10-05 13:59:21 UTC

Ah, I missed that. We can clone this bz then, but let me make sure my idea works first.

Comment 7 Ken Gaillot 2017-10-06 23:17:02 UTC

Fixed upstream as of commit d6eb1cb4

When a bundle with multiple replicas is restarted on a single node, crm_resource now does so by banning the bundle from that node (as it would with a clone). If any free nodes are available to run the instance, the instance may begin to start there. Once crm_resource notices that the instance has stopped on the original node, it removes the ban, and the instance will likely move back to the original node (as would happen with a clone whose clone-max is less than the number of available nodes).

I noticed "crm_resource --wait" does not return after restarting a bundle, but that appears to be a separate issue, not affecting this bz.
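
A manual approximation of the ban-based restart described in this comment can be sketched with the existing crm_resource --ban and --clear options. The helper below only builds the command lines and runs nothing; the helper name is hypothetical, and this is not a claim about how crm_resource --restart is implemented internally.

```python
from typing import List


def single_node_restart_commands(resource: str, node: str) -> List[List[str]]:
    """Build the two commands that bracket a single-node restart."""
    ban = ["crm_resource", "--ban", "--resource", resource, "--node", node]
    clear = ["crm_resource", "--clear", "--resource", resource, "--node", node]
    # Between the two commands, a real run has to wait until the
    # instance has actually stopped on the node (for example by
    # watching crm_mon); this sketch leaves that step out.
    return [ban, clear]


if __name__ == "__main__":
    for cmd in single_node_restart_commands("httpd-bundle", "rh74-node1"):
        print(" ".join(cmd))
```

Building the commands as lists rather than strings keeps them safe to hand to subprocess.run() without shell quoting, should someone want to execute them on a test cluster.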

Comment 8 Ken Gaillot 2017-10-06 23:18:36 UTC

Also, for completeness, it is likely there are other places where tools should treat bundles like clones. That will also be investigated separately from this bz.

Comment 13 errata-xmlrpc 2018-04-10 15:32:51 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0860