Bug 1459503
| Summary: | OpenStack is not compatible with pcs management of remote and guest nodes | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Michele Baldessari <michele> |
| Component: | pcs | Assignee: | Tomas Jelinek <tojeline> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 7.4 | CC: | cfeist, chjones, cluster-maint, dciabrin, fdinitto, idevat, mkrcmari, omular, royoung, tojeline |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | pcs-0.9.158-5.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-08-01 18:26:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description (Michele Baldessari, 2017-06-07 10:11:39 UTC)
Tomas Jelinek (comment #2):

Pcs does not break pcmk remote setup. Pcs implements management of pcmk remote nodes. This is a feature requested, among others, by OpenStack: bz1176018.

There are new commands "pcs cluster node add-remote" and "pcs cluster node add-guest". These not only edit the CIB but also distribute the pcmk authkey to new nodes and start and enable the pcmk remote daemon as requested. For the commands to work, pcsd must run on the remote / guest nodes.

Also, "pcs cluster setup" creates a pcmk authkey and sends it to all nodes, so when a remote node is added later, the key only needs to be sent to the new node. This way there is no need for all the nodes to be online when adding a remote node.

We can do a downstream patch for 7.4 which will:
* automatically force "pcs resource create ocf:pacemaker:remote"
* not generate a new pcmk authkey in "pcs cluster setup" if one already exists

---

Michele Baldessari (comment #4):

(In reply to Tomas Jelinek from comment #2)
> Pcs does not break pcmk remote setup. Pcs implements management of pcmk
> remote nodes. This is a feature requested among others by OpenStack:
> bz1176018

Thanks Tomas, I realize the new feature is what prompted this change. While it does not break the new way of creating remote nodes, it does break the older, documented way of setting up pacemaker remote nodes (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html-single/High_Availability_Add-On_Reference/index.html#pacemaker_remote). These are not really workarounds; it's what we have documented for quite some time. Don't get me wrong, I like the change and the overall direction, but we are just not able to change everything under the hood in one pcs release (OSP cycle). This would break anyone that uses puppet or ansible with pcs to set up a cluster with remotes; it's not really OSP specific as such.

> There are new commands "pcs cluster node add-remote" and "pcs cluster node
> add-guest". These not only edit the CIB but also distribute the pcmk authkey
> to new nodes and start and enable the pcmk remote daemon as requested. For
> the commands to work, pcsd must run on the remote / guest nodes.
>
> Also, "pcs cluster setup" creates a pcmk authkey and sends it to all nodes.
> So later, when a remote node is added, the key is only sent to the new node.
> This way there is no need for all the nodes to be online when adding a
> remote node.

Ack, yes, the feature is very nice in itself. I think if we can just not rewrite /etc/pacemaker/authkey when it already exists, that should do it (at least for us) for the OSP case. If you're super swamped I can give it a shot as well, just ping me.

Thanks for all your help as usual,
Michele

---

Tomas Jelinek (comment #5):

(In reply to Michele Baldessari from comment #4)
> These are not really workarounds, it's what we have documented for quite
> some time.

By "workaround" I meant a workaround for a state in which pcs did not provide full support for remote nodes.

> This would break anyone that uses puppet or ansible with pcs to set up a
> cluster with remotes, it's not really OSP specific as such.

Not necessarily. If the authkey is distributed by puppet or ansible after cluster setup is done, everything should work as before.

Regarding --force in resource create: if it only emitted a warning, the user would have to delete the new node just to create it with the new command.

---

Michele Baldessari (comment #6):

(In reply to Tomas Jelinek from comment #5)
> By "workaround" I meant a workaround for a state in which pcs did not
> provide full support for remote nodes.

Right, the subject implies that we're doing something hacky, which is not the case (this time ;).

> Not necessarily. If the authkey is distributed by puppet or ansible after
> cluster setup is done, everything should work as before.

Right, but it's definitely a change in requirements / behaviour that does break existing automation.

> Regarding --force in resource create: if it only emitted a warning, the
> user would have to delete the new node just to create it with the new
> command.

I am just saying that if you fail the remote creation without --force, we're fine with that.

---

Michele Baldessari:

I have tested the patch Ivan gave me and it works as expected; I am able to create a pacemaker remote resource. Thanks again for your quick help!

---

Created attachment 1286082 [details]
proposed fix
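The behaviour the fix is described to implement, reusing an existing pacemaker authkey rather than overwriting it during "pcs cluster setup", can be sketched roughly as follows. This is an illustration only, not the actual pcs implementation (pcs is written in Python); the demo path and the 384-byte key size are arbitrary choices here so the sketch runs unprivileged, whereas the real path discussed in this bug is /etc/pacemaker/authkey.

```shell
#!/bin/sh
# Sketch of the fixed authkey handling: generate a new random key only
# when none exists, otherwise reuse the key already placed there (e.g.
# by puppet or ansible, as in the OSP scenario above).
AUTHKEY="${AUTHKEY:-./pacemaker-demo/authkey}"

if [ -s "$AUTHKEY" ]; then
    echo "reusing existing authkey at $AUTHKEY"
else
    mkdir -p "$(dirname "$AUTHKEY")"
    # 384 random bytes; the size is arbitrary for this sketch.
    dd if=/dev/urandom of="$AUTHKEY" bs=384 count=1 2>/dev/null
    # The key is a shared secret, so keep permissions restrictive.
    chmod 600 "$AUTHKEY"
    echo "generated new authkey at $AUTHKEY"
fi
# Either way, the key that now exists is what gets distributed to the
# cluster nodes, and later to any remote / guest node that is added.
```

Running it a second time takes the first branch and leaves the key untouched, which is exactly why automation that pre-creates the key keeps working.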
Tests after fix:

1) setup reuses existing pacemaker authkey

```
[vm-rhel72-1 ~] $ cat /etc/pacemaker/authkey
existing authkey content
[vm-rhel72-1 ~] $ pcs cluster setup --name=devcluster vm-rhel72-1 vm-rhel72-3
Destroying cluster on nodes: vm-rhel72-1, vm-rhel72-3...
vm-rhel72-1: Stopping Cluster (pacemaker)...
vm-rhel72-3: Stopping Cluster (pacemaker)...
vm-rhel72-1: Successfully destroyed cluster
vm-rhel72-3: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'vm-rhel72-1', 'vm-rhel72-3'
vm-rhel72-1: successful distribution of the file 'pacemaker_remote authkey'
vm-rhel72-3: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
vm-rhel72-1: Succeeded
vm-rhel72-3: Succeeded
Synchronizing pcsd certificates on nodes vm-rhel72-1, vm-rhel72-3...
vm-rhel72-3: Success
vm-rhel72-1: Success
Restarting pcsd on the nodes in order to reload the certificates...
vm-rhel72-1: Success
vm-rhel72-3: Success
[vm-rhel72-1 ~] $ cat /etc/pacemaker/authkey
existing authkey content
[vm-rhel72-1 ~] $ ssh vm-rhel72-3 'cat /etc/pacemaker/authkey'
existing authkey content
```

2) allow creating a remote / guest resource without --force

```
[vm-rhel72-1 ~] $ pcs resource create RN ocf:pacemaker:remote
Warning: this command is not sufficient for creating a remote connection, use 'pcs cluster node add-remote'
[vm-rhel72-1 ~] $ echo $?
0
[vm-rhel72-1 ~] $ pcs resource create R ocf:heartbeat:Dummy meta remote-node="vm-rhel72-2"
Warning: this command is not sufficient for creating a guest node, use 'pcs cluster node add-guest'
[vm-rhel72-1 ~] $ echo $?
0
[vm-rhel72-1 ~] $ pcs resource update R meta remote-node=
Warning: this command is not sufficient for removing a guest node, use 'pcs cluster node remove-guest'
[vm-rhel72-1 ~] $ echo $?
0
[vm-rhel72-1 ~] $ pcs resource meta R remote-node="vm-rhel72-2"
Warning: this command is not sufficient for creating a guest node, use 'pcs cluster node add-guest'
[vm-rhel72-1 ~] $ echo $?
0
```

---

Additionally, Michele Baldessari and myself are using the features from this build for OpenStack upstream, so I can say that it's working as expected for us. We have a puppet-based scenario that relies on puppet-pacemaker [1] to deploy an HA OpenStack overcloud on pacemaker remote nodes. After the fix, the deploy passes as expected, and we can validate that the existing key generated by puppet in /etc/pacemaker/authkey is the one used to initialize the pacemaker remote nodes in the cluster. We also validate that we don't need the --force flag to successfully create a remote resource.

[1] https://github.com/openstack/puppet-pacemaker/blob/master/manifests/resource/remote.pp

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1958
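As a recap of the automation scenario discussed in this thread: tooling such as puppet-pacemaker writes the authkey itself before creating the remote resource, and with the fixed pcs that pre-seeded key is reused and the resource can be created without --force. A hedged sketch of that sequence follows; the demo directory stands in for /etc/pacemaker so it runs unprivileged, dd from /dev/urandom is one common way to generate the key (not necessarily what puppet-pacemaker does), and the steps that need a live cluster are shown only as comments.

```shell
#!/bin/sh
# Illustrative pre-seeding step run by automation before "pcs cluster setup".
set -e

AUTHKEY_DIR="${AUTHKEY_DIR:-./pacemaker-demo/etc/pacemaker}"
mkdir -p "$AUTHKEY_DIR"

if [ ! -s "$AUTHKEY_DIR/authkey" ]; then
    # Generate a key only if the automation has not placed one already.
    dd if=/dev/urandom of="$AUTHKEY_DIR/authkey" bs=384 count=1 2>/dev/null
    chmod 600 "$AUTHKEY_DIR/authkey"
fi

# On a real deployment the same key is then copied to the remote node, e.g.:
#   scp /etc/pacemaker/authkey vm-rhel72-2:/etc/pacemaker/authkey
# and the remote resource is created without --force (per this fix):
#   pcs resource create RN ocf:pacemaker:remote
```

With pcs-0.9.158-5.el7, "pcs cluster setup" distributes this existing key instead of generating a new one, so the key placed by automation and the key used by the cluster stay identical.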