Bug 1489735

Summary: pcs resource restart <bundle> <node> is not working
Product: Red Hat Enterprise Linux 7 Reporter: Michele Baldessari <michele>
Component: pacemaker Assignee: Ken Gaillot <kgaillot>
Status: CLOSED ERRATA QA Contact: Ofer Blaut <oblaut>
Severity: high Docs Contact:
Priority: high    
Version: 7.4 CC: abeekhof, cfeist, cluster-maint, idevat, kgaillot, mkrcmari, omular, tojeline
Target Milestone: rc   
Target Release: 7.5   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: pacemaker-1.1.18-1.el7 Doc Type: No Doc Update
Doc Text:
Bundles were tech preview in 7.4, so bug fixes do not need to be documented.
Story Points: ---
Clone Of:
: 1501274 Environment:
Last Closed: 2018-04-10 15:32:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1489728    
Bug Blocks: 1501274    

Description Michele Baldessari 2017-09-08 08:54:45 UTC
Description of problem:
This will not work until https://bugzilla.redhat.com/show_bug.cgi?id=1489728 is fixed. Once that is fixed (i.e. once crm_resource --restart <bundle> --node=<node> works), we will need a small fix in pcs to allow restarting a bundle on a single node.

Comment 2 Tomas Jelinek 2017-09-11 13:34:13 UTC
There is a chance this will actually work without any changes in pcs, as we do not care whether the specified resource is a bundle. We will see once the pacemaker part is ready.

Comment 3 Tomas Jelinek 2017-10-04 15:44:51 UTC
I tried restarting bundles with pacemaker-1.1.16-12.el7_4.4 installed both on the cluster machines and in the containers. "crm_resource --restart <bundle>" seems to work fine. However, "crm_resource --restart <bundle> --node <node>" seems to restart the bundle on all nodes.

# crm_resource --restart --resource httpd-bundle

Set 'httpd-bundle' option: id=httpd-bundle-meta_attributes-target-role set=httpd-bundle-meta_attributes name=target-role=stopped
Waiting for 1 resources to stop:
 * httpd-bundle
Deleted 'httpd-bundle' option: id=httpd-bundle-meta_attributes-target-role name=target-role
Waiting for 1 resources to start again:
 * httpd-bundle

# crm_resource --restart --resource httpd-bundle --node rh74-node1

Set 'httpd-bundle' option: id=httpd-bundle-meta_attributes-target-role set=httpd-bundle-meta_attributes name=target-role=stopped
Waiting for 1 resources to stop:
 * httpd-bundle
Deleted 'httpd-bundle' option: id=httpd-bundle-meta_attributes-target-role name=target-role
Waiting for 1 resources to start again:
 * httpd-bundle

While this is happening, I can see the bundle resource stopping and starting on all nodes in both cases.


Ken, is the bundle restart supposed to work with the --node option? If the option is not going to be supported, nothing has to be done in pcs, as 'pcs resource restart httpd-bundle' already works. Otherwise, a small fix in pcs is needed to allow specifying a node when restarting a bundle.

Comment 4 Ken Gaillot 2017-10-04 16:09:22 UTC
Grabbing this as a pacemaker bug because nothing is needed from pcs.

I think crm_resource --restart should treat bundles the same way as clones, i.e. ban the resource from the specified node rather than set target-role to Stopped. This should be a fairly simple change we can get into 7.5.
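
The proposed rule can be sketched in a few lines of Python. This is only an illustrative model, not pacemaker's actual C code: the function name, the kind labels, and the returned strings are all hypothetical.

```python
from typing import Optional

# Resource kinds that can run several instances across nodes
# (hypothetical labels chosen for this sketch).
MULTI_INSTANCE_KINDS = {"clone", "master", "bundle"}


def restart_strategy(kind: str, node: Optional[str]) -> str:
    """Return "ban" or "target-role" for a restart request (illustrative only)."""
    if node is not None and kind in MULTI_INSTANCE_KINDS:
        # Ban the resource from the named node only, so that instances
        # on other nodes keep running.
        return "ban"
    # Otherwise stop the whole resource by setting target-role to Stopped,
    # then delete that option again to let it start.
    return "target-role"
```

Under this model, a bundle restarted with a node would take the "ban" path, while a bundle restarted without a node would still go through target-role, as in the output in comment 3.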

Comment 5 Tomas Jelinek 2017-10-05 07:03:59 UTC
(In reply to Ken Gaillot from comment #4)
> Grabbing this as a pacemaker bug because nothing is needed from pcs.

That depends. Pcs currently does not allow specifying a node when restarting bundles:

# pcs resource restart httpd-bundle rh74-node1
Error: can only restart on a specific node for a clone or master/slave resource

As I said, a small fix in pcs is needed to allow specifying a node when restarting a bundle.


If a node is not specified, everything works with current pcs:

# pcs resource restart httpd-bundle
httpd-bundle successfully restarted
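
A rough sketch of what relaxing the pcs check could look like, assuming the check is a simple test on the resource kind. The function name, the kind labels, and the reworded error message are hypothetical and are not taken from the pcs source.

```python
from typing import Optional

# Kinds for which a per-node restart would be accepted once bundles are
# allowed (assumed set, for illustration only).
NODE_RESTARTABLE_KINDS = {"clone", "master", "bundle"}


def validate_restart(kind: str, node: Optional[str]) -> None:
    """Raise ValueError if a node is given for a kind that cannot be restarted per node."""
    if node is not None and kind not in NODE_RESTARTABLE_KINDS:
        # Hypothetical wording, modelled on the error shown above.
        raise ValueError(
            "can only restart on a specific node for a clone, "
            "master/slave or bundle resource"
        )
```

With such a check, `validate_restart("bundle", "rh74-node1")` would pass, and a restart without a node would be accepted for any kind, matching the behaviour that already works.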

Comment 6 Ken Gaillot 2017-10-05 13:59:21 UTC
Ah, I missed that. We can clone this bz then, but let me make sure my idea works first.

Comment 7 Ken Gaillot 2017-10-06 23:17:02 UTC
Fixed upstream as of commit d6eb1cb4

When restarting a bundle with multiple replicas on a single node, crm_resource will now do so by banning the bundle from that node (same as it would with a clone). Note that if there are any free nodes available to run the instance, the instance may begin to start there, but once crm_resource notices it stopped on the original node, it will remove the ban and the instance will likely move back to the original node (same as would happen with a clone with clone-max less than the number of available nodes).
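
The ban, wait, and unban sequence described above can be approximated from the outside as follows. This is a sketch under stated assumptions, not the upstream implementation: crm_resource does this internally, whereas here the commands are passed to a caller-supplied `run` function, and `is_running_on`, `poll_seconds`, and `max_polls` are hypothetical helpers and parameters. Waiting for the instance to start again is left out.

```python
import time
from typing import Callable, List


def restart_on_node(
    resource: str,
    node: str,
    run: Callable[[List[str]], None],
    is_running_on: Callable[[str, str], bool],
    poll_seconds: float = 1.0,
    max_polls: int = 60,
) -> bool:
    """Ban the resource from the node, wait for it to stop there, then lift the ban.

    Returns True if the resource was seen stopped on the node before the
    polling limit was reached, False otherwise.
    """
    run(["crm_resource", "--ban", "--resource", resource, "--node", node])
    stopped = False
    try:
        for _ in range(max_polls):
            if not is_running_on(resource, node):
                stopped = True
                break
            time.sleep(poll_seconds)
    finally:
        # Always remove the ban, even on timeout, so the node can host
        # the resource again.
        run(["crm_resource", "--clear", "--resource", resource, "--node", node])
    return stopped
```

Injecting `run` and `is_running_on` keeps the sketch testable without a cluster; in real use `run` could wrap `subprocess.run`, and `is_running_on` would have to query the cluster state.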

I noticed "crm_resource --wait" does not return after restarting a bundle, but that appears to be a separate issue, not affecting this bz.

Comment 8 Ken Gaillot 2017-10-06 23:18:36 UTC
Also, for completeness, it is likely there are other places where tools should treat bundles like clones. That will also be investigated separately from this bz.

Comment 13 errata-xmlrpc 2018-04-10 15:32:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0860