Bug 1465152

Summary: Update remote node add and remove commands in Pacemaker reference
Product: Red Hat Enterprise Linux 7
Reporter: Steven J. Levine <slevine>
Component: doc-High_Availability_Add-On_Reference
Assignee: Steven J. Levine <slevine>
Status: CLOSED CURRENTRELEASE
QA Contact: ecs-bugs
Severity: unspecified
Docs Contact:
Priority: high
Version: 7.4
CC: cfeist, cluster-maint, cluster-qe, idevat, kgaillot, omular, rhel-docs, rsteiger, tlavigne, tojeline
Target Milestone: rc
Keywords: Documentation
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1386512
Environment:
Last Closed: 2017-08-01 16:47:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1386512
Bug Blocks:

Comment 2 Steven J. Levine 2017-06-26 20:07:44 UTC
Tomas:

I figured I should update the Pacemaker reference to account for the updated "pcs cluster node add-remote" and "pcs cluster node remove-remote" commands.

The issue is that we don't publish separate manuals for each point release, so we note which release supports what.  In this case I didn't think I could eliminate the older versions of the commands in case somebody was running RHEL 7.3 or earlier.

Could you look at section 9.4.8 and review the updated section?  Thanks.

http://jenkinscat.gsslab.pnq.redhat.com:8080/job/doc-Red_Hat_Enterprise_Linux-7-High_Availability_Add-On_Reference%20(html-single)/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#remotenode_addremove

Steven

Comment 3 Tomas Jelinek 2017-06-27 12:49:01 UTC
Steven,

I don't quite get what this sentence is supposed to mean:
"You do not need to run this command if the resource was originally created as a guest node."
as there is no other pcs command which does what the "pcs cluster node add-guest" command does.

Also, when it's written like this, it is not clear why the command has changed. Maybe we should mention that the new command does more than the old one.

Basically, the old add command only edits the CIB, making a VirtualDomain resource a guest node. The new add command does the same, plus it sends the pacemaker authkey to the new node and starts and enables the pacemaker remote daemon there. With the old command in 7.3, users were required to distribute the authkey and start the service manually. Also, for the new command to work, pcsd must be running on the guest node.
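
The difference between the two releases can be sketched as a small dry-run helper. It only prints the command that would be run on a cluster node. The function name guest_add_cmd, the host name guest1, and the resource ID vm-guest1 are illustrative assumptions and do not come from the documentation under review.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) the command that turns an existing
# VirtualDomain resource into a guest node on a given RHEL 7 minor release.
# guest_add_cmd, guest1 and vm-guest1 are hypothetical names.
guest_add_cmd() {
    rhel_minor="$1"   # e.g. 3 for RHEL 7.3, 4 for RHEL 7.4
    host="$2"         # host name of the guest node
    resource_id="$3"  # ID of the existing VirtualDomain resource

    if [ "$rhel_minor" -ge 4 ]; then
        # 7.4 and later: also sends the authkey and starts and enables
        # pacemaker_remote on the guest; pcsd must be running there.
        echo "pcs cluster node add-guest $host $resource_id"
    else
        # 7.3 and earlier: edits the CIB only; the authkey and the
        # pacemaker_remote service have to be handled manually.
        echo "pcs cluster remote-node add $host $resource_id"
    fi
}

guest_add_cmd 4 guest1 vm-guest1
guest_add_cmd 3 guest1 vm-guest1
```

Printing rather than executing keeps the sketch safe to try on a machine that is not part of a cluster.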

Comment 4 Steven J. Levine 2017-06-27 14:24:32 UTC
Ken:

I'm putting this in needinfo from you because you originally worked with me on the documentation in question, and before I eliminate the sentence that Tomas notes in Comment 3 I wanted to run this by you just to be sure there's nothing here that we'd be losing if I eliminate that sentence.

I'm also not quite sure about the issue here of distributing the authkey and starting the service manually when using "pcs cluster remote-node add" -- that is, we didn't mention that in the older documentation so I don't know if we should add that information now (for users of 7.3 and earlier).

Steven

Comment 5 Ken Gaillot 2017-06-27 16:02:48 UTC
(In reply to Tomas Jelinek from comment #3)
> Steven,
> 
> I don't quite get what this sentence is supposed to mean:
> "You do not need to run this command if the resource was originally created
> as a guest node."
> as there is no other pcs command which does what the "pcs cluster node
> add-guest" command does.

The user could have created it when doing "pcs resource create", by specifying the remote-node meta-attribute (which we need to rename to guest-node on the pacemaker side at some point).

However, if someone's familiar enough with the meta-attributes to do that, they probably don't need this sentence, and it might confuse people who aren't as familiar, so I'm fine with dropping it.
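
For readers less familiar with the meta-attribute route, a rough dry-run sketch of that pre-7.4 style follows. It prints a "pcs resource create" command that sets remote-node at creation time. The helper name, the resource ID, the XML path, and the node name are all hypothetical, and the hypervisor URI is only an example value.

```shell
#!/bin/sh
# Dry-run sketch: print the pre-7.4 style command that creates a
# VirtualDomain resource and marks it as a guest node in one step by
# setting the remote-node meta-attribute. Nothing is executed.
# All names and the XML path below are hypothetical.
create_guest_resource_cmd() {
    resource_id="$1"  # ID for the new VirtualDomain resource
    config="$2"       # path to the libvirt domain XML
    node_name="$3"    # name the guest node will have in the cluster
    echo "pcs resource create $resource_id VirtualDomain" \
         "hypervisor=qemu:///system config=$config" \
         "meta remote-node=$node_name"
}

create_guest_resource_cmd vm-guest1 /etc/libvirt/qemu/vm-guest1.xml guest1
```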

Comment 6 Ken Gaillot 2017-06-27 16:05:56 UTC
(In reply to Steven J. Levine from comment #4)
> I'm also not quite sure about the issue here of distributing the authkey and
> starting the service manually when using "pcs cluster remote-node add" --
> that is, we didn't mention that in the older documentation so I don't know
> if we should add that information now (for users of 7.3 and earlier).

I'm pretty sure we did mention that in the old documentation somewhere. It's the part that starts with something like: Generate a key: dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
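
A minimal sketch of that manual key generation, as needed on 7.3 and earlier, is shown here. It assumes a scratch directory (the KEY_DIR variable is an illustrative addition) so that it can be tried without root; on a real node the key is written to /etc/pacemaker/authkey as in the dd command above.

```shell
#!/bin/sh
# Sketch of the manual authkey generation for RHEL 7.3 and earlier.
# KEY_DIR is a hypothetical variable that defaults to a scratch directory
# so the snippet can be tried without root; on a real node the key lives
# at /etc/pacemaker/authkey.
KEY_DIR="${KEY_DIR:-$(mktemp -d)}"
mkdir -p "$KEY_DIR"

# Same dd invocation as in the documentation, pointed at KEY_DIR.
dd if=/dev/urandom of="$KEY_DIR/authkey" bs=4096 count=1 2>/dev/null

# The same file must then be copied by hand to every cluster node and to
# every remote or guest node.
ls -l "$KEY_DIR/authkey"
```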

Comment 7 Tomas Jelinek 2017-06-27 16:51:25 UTC
(In reply to Ken Gaillot from comment #5)
> (In reply to Tomas Jelinek from comment #3)
> > Steven,
> > 
> > I don't quite get what this sentence is supposed to mean:
> > "You do not need to run this command if the resource was originally created
> > as a guest node."
> > as there is no other pcs command which does what the "pcs cluster node
> > add-guest" command does.
> 
> The user could have created it when doing "pcs resource create", by
> specifying the remote-node meta-attribute (which we need to rename to
> guest-node on the pacemaker side at some point).
> 
> However, if someone's familiar enough with the meta-attributes to do that,
> they probably don't need this sentence, and it might confuse people who
> aren't as familiar, so I'm fine with dropping it.

Ah, got it. That's what I thought it meant.
I'm also voting for removing the sentence. My reason, however, is a bit different: the new add command works with the authkey and the pacemaker remote service, whereas the resource create command does not. The resource create command even emits a warning saying that the node add-guest command should be used when the remote-node meta attribute is set on the new resource.

Comment 8 Steven J. Levine 2017-06-27 19:01:34 UTC
Tomas:

Could you look over my rewrite of the section that documents the updated remote node commands, section 9.4.8:

http://jenkinscat.gsslab.pnq.redhat.com:8080/job/doc-Red_Hat_Enterprise_Linux-7-High_Availability_Add-On_Reference%20(html-single)/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#remotenode_addremove

Steven

Comment 9 Tomas Jelinek 2017-06-28 08:05:49 UTC
There is a typo in 7.3 part:
you must startS and enable the pacemaker_remote

Should we mention the guest node must be running pcsd and be authenticated for the new add command to work?

Otherwise it looks fine to me.



And just to be sure, are you going to update also the remote nodes part of the doc with respect to new commands for managing remote nodes "pcs cluster node add-remote" and "pcs cluster node remove-remote"?

Comment 10 Steven J. Levine 2017-06-28 14:06:23 UTC
Tomas:

"And just to be sure, are you going to update also the remote nodes part of the doc with respect to new commands for managing remote nodes "pcs cluster node add-remote" and "pcs cluster node remove-remote"?"

I hadn't been planning that, no.  When the Release Note update came my way about the new commands, I figured I should update the doc as well to replace the instances of the old commands (or at least call them out as 7.3 and earlier). The only place the documentation referenced the old commands was in the section on converting a VirtualDomain resource to a guest node (the section we've been looking at here).  The procedure for configuring a remote node creates the resource independently (section 9.4.6):


http://jenkinscat.gsslab.pnq.redhat.com:8080/job/doc-Red_Hat_Enterprise_Linux-7-High_Availability_Add-On_Reference%20(html-single)/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#remotenode_config

Do you think we should reconsider that procedure?

Putting this in needinfo from Ken since he put together the procedure we document.

Steven

Comment 11 Tomas Jelinek 2017-06-28 15:18:39 UTC
Steven:

Well, the new "pcs cluster node add-remote" command does half of what is covered in that section (steps 3, 4, 5 and 6):
* sends pacemaker authkey to the remote node
* starts and enables pacemaker remote daemon on the remote node
* creates an ocf:pacemaker:remote resource
That makes creating remote nodes quite a bit easier. Because of that, I think it's worth documenting.

Also:
[root@rh73-node1:~]# pcs resource create remote1 ocf:pacemaker:remote
Warning: this command is not sufficient for creating a remote connection, use 'pcs cluster node add-remote'

i.e. the documented way still works but users may be getting some deprecation warnings
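
As a rough illustration of the newer commands, the following dry-run helper prints the add and remove forms for a remote node. The helper name remote_node_cmd is hypothetical, and remote1 is reused from the example above as the host name. The printed commands are meant to be run on a cluster node, with pcsd already running on the remote node.

```shell
#!/bin/sh
# Dry-run sketch for RHEL 7.4: print the command that adds or removes a
# remote node. Nothing is executed; remote_node_cmd is a hypothetical helper.
remote_node_cmd() {
    action="$1"  # add or remove
    host="$2"    # host name of the remote node
    case "$action" in
        add|remove)
            echo "pcs cluster node ${action}-remote $host"
            ;;
        *)
            echo "usage: remote_node_cmd add|remove <host>" >&2
            return 1
            ;;
    esac
}

remote_node_cmd add remote1
remote_node_cmd remove remote1
```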

Comment 12 Steven J. Levine 2017-06-28 16:23:19 UTC
Ken:

(Leaving this in NEEDINFO from you as per Comment 10)

I think I would need your help or at least advice in rewriting the procedure here, as per Tomas's comments. Can I really just replace all of steps 3, 4, 5, and 6 with a single command?  (It would have to refer back to Step 6 of section 9.4.5 for the resource_id parameter.)

And if we do rewrite that section, is section 9.4.8 necessary -- at least the 7.4 parts of it?

If we rewrite that section, I'd have to split it into two: one for 7.3 and earlier (with the current procedure) and one for 7.4.

So I'm looking for your advice here. This seems a major procedural change.

Steven

Comment 13 Ken Gaillot 2017-06-28 18:55:15 UTC
Yes, steps 3-6 of section 9.4.6 can now be replaced with the "pcs cluster node add-remote" command. There's always a "but" ... but (as you mentioned) the old steps will still be needed for < 7.4, and the new command needs pcsd enabled on the planned remote node first.

In the existing documentation, "remote1" is the resource ID.

Section 9.4.6 and 9.4.8 are alternative procedures -- 9.4.6 creates a remote node, 9.4.8 creates a guest node. Both use Pacemaker Remote, but a remote node is standalone, whereas the cluster starts the VM that becomes the guest node.

Comment 15 Ken Gaillot 2017-06-28 23:09:11 UTC
Hmm, looking at it again, I see some more areas affected by the new commands:

* 9.4.1 should mention that the new 7.4 commands set up the authkey for you.

* 9.4.2 and 9.4.3 -- The text describes how to create the nodes and set the options before 7.4. In 7.4, the nodes should be created and the options set with the add-guest and add-remote commands. Maybe avoid mentioning creation/configuring here, and just list the options, saying they are configured as described later.

* 9.4.5 -- The text here describes how to set up a guest node before 7.4. Before 7.4, we said users could create a guest node either at the same time as creating the VirtualDomain resource (this section), or afterward ("9.4.9 Converting a VM Resource to a Guest Node"). With 7.4, we're recommending that users always use the second approach. I'm thinking we should combine these two sections along these lines:

-- Steps 4-5 here are actually relevant to any VirtualDomain resource, and should be in "9.3. Configuring a Virtual Domain as a Resource" instead.

-- The initial part of Step 1 "After installing the virtualization software and enabling the libvirtd service on the cluster nodes" should then just become something like "Configure the VirtualDomain resource as described in section 9.3."

-- The rest of steps 1-2 would be divided into pre-7.4 and 7.4. Pre-7.4 would need everything currently here; 7.4 just needs to install pacemaker-remote resource-agents pcs, enable and start pcsd, and update the firewall.

-- Step 6 would be replaced by the contents of the "Converting a VM ..." section. Note the current "Converting a VM ..." text should say "guest node" instead of "remote node" in the first paragraph.

* 9.4.6 step 3 should also "systemctl start pcsd.service". (FYI, "systemctl enable --now <whatever>" is equivalent to separate enable and start commands, if you wanted to simplify a bit.)

* 9.4.6 step 4 is missing "pcs". "starts the node, and configures it to start the cluster on boot" should be "starts pacemaker_remote on the node, and configures the node to start pacemaker_remote on boot".

* Once the planned article on firewall configuration for HA is available, we should link to it rather than give the example firewall-cmd commands and port numbers. That will give users the opportunity to understand the nuances/variations.
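
The 7.4 preparation steps listed above for a planned guest or remote node could be sketched as the dry-run below. It only prints the commands. The helper name prep_steps is an assumption, and the firewall step is deliberately left out because its details are still under discussion in this bug.

```shell
#!/bin/sh
# Dry-run sketch: print the preparation commands for a planned guest or
# remote node on RHEL 7.4. Nothing is executed; prep_steps is hypothetical.
prep_steps() {
    # Packages named in the review comment.
    echo "yum install -y pacemaker-remote resource-agents pcs"
    # 'enable --now' stands in for separate enable and start commands.
    echo "systemctl enable --now pcsd.service"
    # Firewall configuration is intentionally omitted here.
}

prep_steps
```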

Comment 16 Steven J. Levine 2017-07-03 21:15:49 UTC
Next round of review, incorporating Comment 15 as best I understood.

http://jenkinscat.gsslab.pnq.redhat.com:8080/job/doc-Red_Hat_Enterprise_Linux-7-High_Availability_Add-On_Reference%20(html-single)/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#pacemaker_remote

I also added a note to BZ#1440627 (update firewall info) to update the doc with this reference once that new section is done.

Comment 17 Ken Gaillot 2017-07-05 14:59:11 UTC
Comments on latest:

* 9.4.1 -- As of 7.4, "pcs cluster node add-guest" sets up the authkey for guest nodes, and "pcs cluster node add-remote" sets up the authkey for remote nodes

* 9.4.2 and 9.4.3 -- I'd make it clearer that these options should be set in 7.4 using the add-guest and add-remote commands, and only in <=7.3 should they be set when creating the resource. I.e. in 7.4 users should not be using "pcs resource create" to create a remote node, and they should not be setting the meta-data options when creating a VirtualDomain for a guest node. Maybe something like:

9.4.2 -- In addition to the VirtualDomain resource options, metadata options define the resource as a guest node and define the connection parameters. In 7.4, these should be set using the "pcs cluster node add-guest" command; before 7.4, you can set these when creating the resource. Table 9.4, “Metadata Options for Configuring KVM/LXC Resources as Remote Nodes” describes these metadata options. 

9.4.3 -- A remote node is defined as a cluster resource with ocf:pacemaker:remote as the resource agent. In 7.4, this resource should be created using the "pcs cluster node add-remote" command; before 7.4, you can create this resource with the "pcs resource create" command. Table 9.5, “Resource Options for Remote Nodes” describes the resource options you can configure for a remote resource. 

* 9.4.5 -- "the the"; step 3: the systemctl commands here are only needed <7.4, however, in 7.4 equivalent commands are needed for pcsd; step 5: nix the two sentences starting with "After converting" b/c that is step 2

Comment 18 Steven J. Levine 2017-07-05 21:46:17 UTC
I have made what I think are the updates as per comment 17:


http://jenkinscat.gsslab.pnq.redhat.com:8080/job/doc-Red_Hat_Enterprise_Linux-7-High_Availability_Add-On_Reference%20(html-single)/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#pacemaker_remote

AND ALSO:

The very beginning of section 9.4 defines a guest node and notes the pre-7.4 method of creating one:

"guest node — A virtual guest node running the pacemaker_remote service. A guest node is configured using the remote-node metadata option of a resource agent such as ocf:pacemaker:VirtualDomain. The virtual guest resource is managed by the cluster; it is both started by the cluster and integrated into the cluster as a remote node."

I think I can just remove that second sentence and the definition is still valid. Do you think I need to mention how it is configured here in the definition?

Steven

Comment 19 Ken Gaillot 2017-07-05 22:32:09 UTC
I think it looks good now. The guest node definition should be fine with or without the second sentence; it's still true, it's just a matter of what command configures it, which is described later.

Comment 20 Steven J. Levine 2017-07-05 23:09:31 UTC
Updates made to 7.4 draft.

Comment 21 Tomas Jelinek 2017-07-10 12:17:59 UTC
I went through the whole 9.4 chapter. Steven and Ken, you did an excellent job on this, thanks!

I only found a few minor issues:
* 9.4. The pacemaker_remote Service
  * typo: they are not eligible to be BE the cluster's Designated Controller
  * last paragraph, maybe add "and guest" to "the remote AND GUEST nodes behave just like cluster nodes" so it's clear it applies to guest nodes also
* 9.4.4. Changing Default pacemaker_remote Options
  * Maybe it should be noted that pcs/pcsd doesn't support changing location of the authkey file. In other words pcs always reads and writes the authkey to /etc/pacemaker/authkey. This may get fixed in a future release of pcs.
* 9.4.5. Configuration Overview: KVM Guest Node
  * In step 3, TCP port 2224 for pcsd needs to be enabled as well.
  * In step 5 I think it's worth mentioning the command must be run on a cluster node and not on the guest node which is being added.
* 9.4.6. Configuration Overview: Remote Node (Red Hat Enterprise Linux 7.4)
  * In step 4 I think it's worth mentioning the command must be run on a cluster node and not on the remote node which is being added.

Comment 22 Steven J. Levine 2017-07-10 17:32:48 UTC
> * 9.4.4. Changing Default pacemaker_remote Options
>  * Maybe it should be noted that pcs/pcsd doesn't support
> changing location of the authkey file. In other words pcs
>  always reads and writes the authkey to
> /etc/pacemaker/authkey. This may get fixed in a
> future release of pcs.

If this is currently not supported, it seems as if we shouldn't document it at all and that this section should only be about changing the default port for Pacemaker Remote connections.  Is there any reason for providing the instructions for changing the location of the authkey file before we support this?

Comment 23 Steven J. Levine 2017-07-10 17:46:40 UTC
Another question:

* 9.4.5. Configuration Overview: KVM Guest Node
  * In step 3, TCP port 2224 for pcsd needs to be enabled as well.

Does that involve running two separate commands?

# firewall-cmd --add-port 3121/tcp --permanent
# firewall-cmd --add-port 2224/tcp --permanent

Or is there a way to combine these in one command?  I checked the man page, and it looks as if you might be able to use this, but I can't find any actual examples that would prove it:

# firewall-cmd --add-port 3121/tcp --add-port 2224/tcp --permanent

Comment 24 Tomas Jelinek 2017-07-11 08:32:02 UTC
Does it matter? Anyway I tried it with the one command and it works for me.
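
For illustration, the combined form can be generated for any number of TCP ports with a small helper like the one below. It prints the command rather than running it. The helper name build_firewall_cmd is hypothetical, and the sketch uses the --add-port=PORT/tcp spelling.

```shell
#!/bin/sh
# Dry-run sketch: build a single firewall-cmd invocation that opens
# several TCP ports permanently. The command is printed, not executed.
# build_firewall_cmd is a hypothetical helper name.
build_firewall_cmd() {
    cmd="firewall-cmd --permanent"
    for port in "$@"; do
        cmd="$cmd --add-port=$port/tcp"
    done
    echo "$cmd"
}

# 3121 is the Pacemaker Remote port and 2224 is the pcsd port.
build_firewall_cmd 3121 2224
```

Rules added with --permanent take effect in the running firewall only after a reload, for example with "firewall-cmd --reload".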

Comment 26 Steven J. Levine 2017-07-11 16:13:59 UTC
Ken:

I have removed the information about changing the authkey from the section on changing default Pacemaker Remote options, so that the section is now only about changing the default port location.  For a sanity check, can you look at the pared down section 9.4.4?

http://jenkinscat.gsslab.pnq.redhat.com:8080/job/doc-Red_Hat_Enterprise_Linux-7-High_Availability_Add-On_Reference%20(html-single)/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#pacemakerremote_defaults

Steven

Comment 27 Ken Gaillot 2017-07-11 17:04:28 UTC
It looks good to me.