1415197 – pcsd ports shall be configurable to solve pacemaker/pcs inside of containers (+ occupying IANA-assigned port by default)

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	pcs
Sub Component:
Version:	7.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Ondrej Mular
QA Contact:	Ofer Blaut
Docs Contact:	Steven J. Levine
URL:
Whiteboard:
Duplicates (1):	1428172
Depends On:
Blocks:

Reported:	2017-01-20 14:29 UTC by Jan Pokorný [poki]
Modified:	2018-04-10 15:38 UTC
CC List:	11 users
Fixed In Version:	pcs-0.9.162-5.el7
Doc Type:	Enhancement
Doc Text:	The "pcsd" port is now configurable. The port on which "pcsd" is listening can now be changed in the "pcsd" configuration file, and "pcs" can now communicate with "pcsd" using a custom port. This feature is primarily for the use of "pcsd" inside containers.
Clone Of:
Environment:
Last Closed:	2018-04-10 15:37:49 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	885257	0	unspecified	CLOSED	Need port 2224 opened up for cluster suite and port tcp port 11111 removed	2022-10-18 07:38:56 UTC
Red Hat Product Errata	RHBA-2018:0866	0	None	None	None	2018-04-10 15:38:37 UTC

Internal Links: 885257

Description Jan Pokorný [poki] 2017-01-20 14:29:11 UTC

$ curl -s https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.csv | grep 2224
efi-mg,2224,tcp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,
efi-mg,2224,udp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,

This means that the port assignment even predates the existence of pcs
by several years :-/

To rectify the current state, there should at least be a possibility
for pcs to switch to use another port.  That would resolve conflicts
arising from such an unexpected port occupation.
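
The registry check above can also be scripted. This is a minimal sketch, assuming the column order visible in the two rows quoted above (service name, port, protocol first). The helper name `services_on_port` is hypothetical, and the rows are pasted inline rather than downloaded:

```python
import csv
import io

# The two registry rows quoted in the description, pasted verbatim.
IANA_ROWS = """\
efi-mg,2224,tcp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,
efi-mg,2224,udp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,
"""


def services_on_port(csv_text, port):
    """Return (service, protocol) pairs registered for exactly this port."""
    matches = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Column order assumed from the rows above: name, port, protocol, ...
        if len(row) >= 3 and row[1] == str(port):
            matches.append((row[0], row[2]))
    return matches


print(services_on_port(IANA_ROWS, 2224))
```

Comparing the port column exactly avoids the false positives a plain `grep 2224` can produce, for example a row for port 22240.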

Comment 2 Jan Pokorný [poki] 2017-03-02 09:04:46 UTC

*** Bug 1428172 has been marked as a duplicate of this bug. ***

Comment 3 Jan Pokorný [poki] 2017-03-02 09:15:40 UTC

From [bug 1428172 comment 0]:

> To run pcsd in a container we can have issues where the ports inside
> the container don't necessarily match ports that are coming in from
> the outside of the container.
> 
> So we need to support these two things:
> 
> 1.  Be able to configure the port that pcsd listens on
> 
> 2.  When setting up the cluster we will need to configure the ports that
>     each individual pcsd is listening on.  And all pcs commands will
>     either need to accept a port parameter or be able to look up the
>     port and use it when running commands where you don't specify a host
>     (like 'pcs cluster start --all').
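
One way to picture the "port parameter" part of the second requirement is a `host:port` node argument, which is the form used in the test comments later in this bug. The following is only a sketch of such parsing, not the pcs implementation; the function name `parse_node_spec` is hypothetical and IPv6 literals are deliberately not handled:

```python
def parse_node_spec(spec):
    """Split 'host' or 'host:port' into (host, port), port is None if absent."""
    host, sep, port_text = spec.rpartition(":")
    if not sep:
        # No colon at all: the whole string is the host name.
        return spec, None
    if not host or not port_text.isdecimal():
        raise ValueError("invalid node specification: %r" % spec)
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range in: %r" % spec)
    return host, port


for spec in ("rhel74-node1:2225", "rhel74-node2"):
    print(parse_node_spec(spec))
```

Returning `None` for a missing port leaves the choice of fallback (a looked-up port or a default) to the caller, which keeps the two halves of the requirement separate.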

Comment 6 Andrew Beekhof 2017-03-02 23:12:57 UTC

Relatively speaking, it is less important that pcs listens on a configurable port than that it can be told its peer is already listening on a different one (because while the port on the inside might be the normal one, the external port will almost never match).

Comment 7 Ondrej Mular 2017-09-20 11:55:18 UTC

Proposed fix:
https://github.com/ClusterLabs/pcs/commit/3de33c2ebe74f5238376b86a00db234ce6

Test:
nodes: rhel74-node1, rhel74-node2
Change port on which pcsd is listening using option PCSD_PORT in file /etc/sysconfig/pcsd and restart pcsd.
> [root@rhel74-node1 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
> PCSD_PORT=2225
> [root@rhel74-node1 ~]# systemctl restart pcsd

> [root@rhel74-node2 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
> PCSD_PORT=2226
> [root@rhel74-node2 ~]# systemctl restart pcsd
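
For scripted checks, the effective PCSD_PORT value can be extracted from the sysconfig text instead of grepping by eye. This is a rough sketch under two assumptions that are not confirmed by this report: that the file consists of simple KEY=VALUE lines, and that the last uncommented assignment wins. The function name is hypothetical; 2224 is the default port:

```python
DEFAULT_PCSD_PORT = 2224  # default pcsd port


def pcsd_port_from_sysconfig(text, default=DEFAULT_PCSD_PORT):
    """Return the last uncommented PCSD_PORT value found in the text."""
    port = default
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out settings
        key, sep, value = line.partition("=")
        if sep and key.strip() == "PCSD_PORT":
            # Assumption: later assignments override earlier ones.
            port = int(value.strip().strip("'\""))
    return port


print(pcsd_port_from_sysconfig("PCSD_PORT=2225\n"))
print(pcsd_port_from_sysconfig("#PCSD_PORT=2224\n"))
```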

Authenticate nodes
> [root@rhel74-node1 ~]# pcs cluster auth rhel74-node1:2225 rhel74-node2:2226 -u hacluster
> Password: 
> rhel74-node1: Authorized
> rhel74-node2: Authorized

Create and start cluster:
> [root@rhel74-node1 ~]# pcs cluster setup --name rh74-cluster rhel74-node{1,2}
> Destroying cluster on nodes: rhel74-node1, rhel74-node2...
> rhel74-node2: Stopping Cluster (pacemaker)...
> rhel74-node1: Stopping Cluster (pacemaker)...
> rhel74-node2: Successfully destroyed cluster
> rhel74-node1: Successfully destroyed cluster
>
> Sending 'pacemaker_remote authkey' to 'rhel74-node1', 'rhel74-node2'
> rhel74-node1: successful distribution of the file 'pacemaker_remote authkey'
> rhel74-node2: successful distribution of the file 'pacemaker_remote authkey'
> Sending cluster config files to the nodes...
> rhel74-node1: Succeeded
> rhel74-node2: Succeeded
>
> Synchronizing pcsd certificates on nodes rhel74-node1, rhel74-node2...
> rhel74-node1: Success
> rhel74-node2: Success
> Restarting pcsd on the nodes in order to reload the certificates...
> rhel74-node2: Success
> rhel74-node1: Success
> [root@rhel74-node1 ~]# pcs cluster start --all
> rhel74-node1: Starting Cluster...
> rhel74-node2: Starting Cluster...
> [root@rhel74-node1 ~]# pcs status
> Cluster name: rh74-cluster
> WARNING: no stonith devices and stonith-enabled is not false
> Stack: corosync
> Current DC: rhel74-node2 (version 1.1.16-11.el7-94ff4df) - partition with quorum
> Last updated: Wed Sep 20 13:26:13 2017
> Last change: Wed Sep 20 13:21:48 2017 by hacluster via crmd on rhel74-node2
>
> 2 nodes configured
> 0 resources configured
>
> Online: [ rhel74-node1 rhel74-node2 ]
>
> No resources
>
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

It is also possible to specify the pcsd port in the web UI when creating a cluster, when adding an existing cluster to the web UI, and when adding a node to a cluster.

Comment 8 Ondrej Mular 2017-10-11 11:53:48 UTC

Additional fix:
https://github.com/ClusterLabs/pcs/commit/34b781e1da8b36c2f8a2a46f81ee7a9968fa

Comment 9 Ivan Devat 2017-11-02 16:11:48 UTC

After Fix:
[ant ~] $ rpm -q pcs
pcs-0.9.161-1.el7.x86_64

> setup ports

[ant ~] $ echo PCSD_PORT=2225 >> /etc/sysconfig/pcsd
[ant ~] $ ssh bee "echo PCSD_PORT=2226 >> /etc/sysconfig/pcsd"

[ant ~] $ service pcsd restart
Redirecting to /bin/systemctl restart pcsd.service
[ant ~] $ ssh bee "service pcsd restart"
Redirecting to /bin/systemctl restart pcsd.service

[ant ~] $ LANG=C netstat -nal | grep '\(2224\|2225\).*LISTEN'
tcp6       0      0 :::2225                 :::*                    LISTEN
[ant ~] $ ssh bee "LANG=C netstat -nal | grep '\(2224\|2226\).*LISTEN'"
root@bee's password:
tcp6       0      0 :::2226                 :::*                    LISTEN
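
The same listening check can be done without netstat by attempting a TCP connection. This is a minimal sketch; `is_listening` is a hypothetical helper, and the demonstration uses a throwaway local listener so that it runs without pcsd installed:

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Demonstration with a throwaway listener on an OS-chosen port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
demo_port = server.getsockname()[1]
print(is_listening("127.0.0.1", demo_port))  # listener is open
server.close()
print(is_listening("127.0.0.1", demo_port))  # listener is closed
```

On the nodes above, the equivalent calls would be `is_listening("ant", 2225)` and `is_listening("bee", 2226)`.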

> authenticate nodes

[ant ~] $ pcs cluster auth ant:2225 bee:2226 -u hacluster
Password:
ant: Authorized
bee: Authorized

> create and run cluster

[ant ~] $ pcs cluster setup --name=zoo ant bee
Destroying cluster on nodes: ant, bee...
bee: Stopping Cluster (pacemaker)...
ant: Stopping Cluster (pacemaker)...
ant: Successfully destroyed cluster
bee: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'ant', 'bee'
ant: successful distribution of the file 'pacemaker_remote authkey'
bee: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
ant: Succeeded
bee: Succeeded

Synchronizing pcsd certificates on nodes ant, bee...
ant: Success
bee: Success
Restarting pcsd on the nodes in order to reload the certificates...
bee: Success
ant: Success

[ant ~] $ pcs cluster start --all
ant: Starting Cluster...
bee: Starting Cluster...

[ant ~] $ pcs status
Cluster name: zoo
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: bee (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Thu Nov  2 15:46:14 2017
Last change: Thu Nov  2 15:46:03 2017 by hacluster via crmd on bee

2 nodes configured
0 resources configured

Online: [ ant bee ]

No resources


Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

Comment 11 Ondrej Mular 2017-11-09 13:31:49 UTC

additional fix:
https://github.com/ClusterLabs/pcs/commit/cc21497bb47ca8813cb61a2965e9882e378

Comment 12 Ondrej Mular 2017-11-16 07:10:28 UTC

After fix:
[root@rhel75-node1 ~]# rpm -q pcs
pcs-0.9.162-1.el7.x86_64

> works the same way as described in comment #9

Comment 15 Tomas Jelinek 2018-02-01 10:33:27 UTC

additional fix:
https://github.com/ClusterLabs/pcs/commit/ed3e020f477f1e597cb6a86c5305b26a5a892096

'pcs cluster auth' does not work when all of these conditions are met:
* the command is run on a cluster node
* nodes run pcsd on a non-default port
* the node, where the command is run, is not authorized against cluster nodes
More specifically, the auth tokens are not written to disk.

Reproducer:
# The node is in a cluster.
[root@rh74-node1:~]# pcs status nodes corosync
Corosync Nodes:
 Online:
 Offline: rh74-node1 rh74-node2
# The node is not authorized.
[root@rh74-node1:~]# pcs pcsd clear-auth
# Cluster nodes run pcsd on a non-default port.
[root@rh74-node1:~]# pcs cluster auth rh74-node1:2225 rh74-node2:2225 -u hacluster
Password:
rh74-node2: Authorized
rh74-node1: Authorized
Error: Unable to synchronize and save tokens on nodes: rh74-node1, rh74-node2. Are they authorized?

Comment 16 Tomas Jelinek 2018-02-01 12:32:11 UTC

additional fix:
https://github.com/ClusterLabs/pcs/commit/7b5994ddc79fc3e5a829e870834f496b1b90c133

Reproducer:
* set up a cluster
* set pcsd to run on a non-default port
* authenticate cluster nodes
* set pcsd to run on a default port
* authenticate cluster nodes not specifying a port

Actual results:
A previously used port is used to send tokens to nodes. Since pcsd no longer listens on it, auth tokens are not saved on the nodes.

Expected results:
Since no port is specified in the command, the default port should be used regardless of which port was used before.
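
The expected behaviour can be stated as a small rule. This is a sketch only, not the code from the linked commit; the function name `port_for_auth` and its parameters are hypothetical:

```python
DEFAULT_PCSD_PORT = 2224  # default pcsd port


def port_for_auth(requested_port, stored_port=None):
    """Pick the port to use for an authentication request.

    stored_port (a port remembered from an earlier authentication) is
    accepted only to make the point explicit: it is deliberately ignored.
    """
    if requested_port is not None:
        return requested_port
    return DEFAULT_PCSD_PORT


# No port given on the command line, 2225 remembered from before:
print(port_for_auth(None, stored_port=2225))
# Port given explicitly:
print(port_for_auth(2226, stored_port=2225))
```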

[root@rh74-node1:~]# pcs status nodes corosync
Corosync Nodes:
 Online:
 Offline: rh74-node1 rh74-node2
# pcsd running on a non-default port
[root@rh74-node1:~]# pcs cluster auth rh74-node1:2225 rh74-node2:2225 -u hacluster
Password: 
rh74-node2: Authorized
rh74-node1: Authorized
# pcsd running on the default port, auth not working
[root@rh74-node1:~]# pcs cluster auth rh74-node1 rh74-node2 -u hacluster
Password: 
rh74-node2: Authorized
rh74-node1: Authorized
Error: Unable to synchronize and save tokens on nodes: rh74-node1, rh74-node2. Are they authorized?
# pcsd running on the default port, auth working when a port is specified even though there should be no difference
[root@rh74-node1:~]# pcs cluster auth rh74-node1:2224 rh74-node2:2224 -u hacluster
Password: 
rh74-node2: Authorized
rh74-node1: Authorized

Comment 17 Ondrej Mular 2018-02-06 08:38:42 UTC

After fix:
[root@rhel75-node1 ~]# rpm -q pcs
pcs-0.9.162-5.el7.x86_64

> nodes: rhel75-node1, rhel75-node2
> Change port on which pcsd is listening using option PCSD_PORT in file /etc/sysconfig/pcsd and restart pcsd.

[root@rhel75-node1 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
PCSD_PORT=2225
[root@rhel75-node1 ~]# systemctl restart pcsd

[root@rhel75-node2 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
PCSD_PORT=2226
[root@rhel75-node2 ~]# systemctl restart pcsd

> Authenticate nodes
[root@rhel75-node1 ~]# pcs cluster auth rhel75-node1:2225 rhel75-node2:2226 -u hacluster
Password: 
rhel75-node1: Authorized
rhel75-node2: Authorized

Create and start cluster:
[root@rhel75-node1 ~]# pcs cluster setup --name rh75-cluster rhel75-node{1,2}
Destroying cluster on nodes: rhel75-node1, rhel75-node2...
rhel75-node1: Stopping Cluster (pacemaker)...
rhel75-node2: Stopping Cluster (pacemaker)...
rhel75-node1: Successfully destroyed cluster
rhel75-node2: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'rhel75-node1', 'rhel75-node2'
rhel75-node1: successful distribution of the file 'pacemaker_remote authkey'
rhel75-node2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
rhel75-node1: Succeeded
rhel75-node2: Succeeded

Synchronizing pcsd certificates on nodes rhel75-node1, rhel75-node2...
rhel75-node1: Success
rhel75-node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
rhel75-node1: Success
rhel75-node2: Success
[root@rhel75-node1 ~]# pcs cluster start --all
rhel75-node1: Starting Cluster...
rhel75-node2: Starting Cluster...

[root@rhel75-node1 ~]# pcs status
Cluster name: rh75-cluster
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: rhel75-node2 (version 1.1.18-5.el7-1a4ef7d180) - partition with quorum
Last updated: Tue Feb  6 08:32:57 2018
Last change: Tue Feb  6 08:32:50 2018 by hacluster via crmd on rhel75-node2

2 nodes configured
0 resources configured

Online: [ rhel75-node1 rhel75-node2 ]

No resources


Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

> Test for issue described in comment #15
[root@rhel75-node1 ~]# pcs cluster status 
Cluster Status:
 Stack: corosync
 Current DC: rhel75-node2 (version 1.1.18-5.el7-1a4ef7d180) - partition with quorum
 Last updated: Tue Feb  6 08:36:03 2018
 Last change: Tue Feb  6 08:32:50 2018 by hacluster via crmd on rhel75-node2
 2 nodes configured
 0 resources configured

PCSD Status:
  rhel75-node2: Online
  rhel75-node1: Online
> Remove tokens on node 
[root@rhel75-node1 ~]# rm /var/lib/pcsd/tokens
rm: remove regular file ‘/var/lib/pcsd/tokens’? y
> Then try to re-authenticate
[root@rhel75-node1 ~]# pcs cluster auth rhel75-node1:2225 rhel75-node2:2226 -u hacluster
Password: 
rhel75-node1: Authorized
rhel75-node2: Authorized

> Test for issue described in comment #16
[root@rhel75-node1 ~]# pcs cluster status 
Cluster Status:
 Stack: corosync
 Current DC: rhel75-node2 (version 1.1.18-5.el7-1a4ef7d180) - partition with quorum
 Last updated: Tue Feb  6 08:37:18 2018
 Last change: Tue Feb  6 08:32:50 2018 by hacluster via crmd on rhel75-node2
 2 nodes configured
 0 resources configured

PCSD Status:
  rhel75-node1: Online
  rhel75-node2: Online
> Make sure that nodes are authenticated on non-default ports (default port: 2224)
[root@rhel75-node1 ~]# cat /var/lib/pcsd/tokens
{
  "format_version": 3,
  "data_version": 2,
  "tokens": {
    "rhel75-node1": "cc86e1fd-e90d-4270-9eb6-09584a5a853f",
    "rhel75-node2": "c624575a-52a0-40f3-8767-a24336d340b1"
  },
  "ports": {
    "rhel75-node1": 2225,
    "rhel75-node2": 2226
  }
}
> Change pcsd ports to default on all nodes and restart pcsd
[root@rhel75-node1 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
#PCSD_PORT=2224
[root@rhel75-node1 ~]# systemctl restart pcsd

[root@rhel75-node2 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
#PCSD_PORT=2224
[root@rhel75-node2 ~]# systemctl restart pcsd

> Authenticate nodes not specifying ports
[root@rhel75-node1 ~]# pcs cluster auth rhel75-node1 rhel75-node2 -u hacluster
Password: 
rhel75-node1: Authorized
rhel75-node2: Authorized
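
The ports recorded in the tokens file printed in the comment #16 test above can be read programmatically. This is a sketch that assumes the layout shown there ("format_version": 3 with "tokens" and "ports" maps); the file is internal to pcs and its format may differ in other versions. The function name is hypothetical and the token values are placeholders:

```python
import json

# Same structure as the file shown above; token values are placeholders.
TOKENS_JSON = """
{
  "format_version": 3,
  "data_version": 2,
  "tokens": {
    "rhel75-node1": "placeholder-token-1",
    "rhel75-node2": "placeholder-token-2"
  },
  "ports": {
    "rhel75-node1": 2225,
    "rhel75-node2": 2226
  }
}
"""


def stored_ports(tokens_text, default=2224):
    """Map each authenticated node to its recorded port, or the default."""
    data = json.loads(tokens_text)
    ports = data.get("ports", {})
    return {node: ports.get(node) or default for node in data.get("tokens", {})}


print(stored_ports(TOKENS_JSON))
```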

Comment 23 Marian Krcmarik 2018-02-28 23:00:46 UTC

Verified on pcs-0.9.162-4.el7.x86_64

Comment 25 errata-xmlrpc 2018-04-10 15:37:49 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0866
