Bug 1415197 - pcsd ports shall be configurable to solve pacemaker/pcs inside of containers (+ occupying IANA-assigned port by default)
Summary: pcsd ports shall be configurable to solve pacemaker/pcs inside of containers ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pcs
Version: 7.4
Hardware: Unspecified
OS: Unspecified
high
unspecified
Target Milestone: rc
: ---
Assignee: Ondrej Mular
QA Contact: Ofer Blaut
Steven J. Levine
URL:
Whiteboard:
: 1428172 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-01-20 14:29 UTC by Jan Pokorný [poki]
Modified: 2018-04-10 15:38 UTC (History)
11 users (show)

Fixed In Version: pcs-0.9.162-5.el7
Doc Type: Enhancement
Doc Text:
The "pcsd" port is now configurable The port on which "pcsd" is listening can now be changed in the "pcsd" configuration file, and "pcs" can now communicate with "pcsd" using a custom port. This feature is primarily for the use of "pcsd" inside containers.
Clone Of:
Environment:
Last Closed: 2018-04-10 15:37:49 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 885257 0 unspecified CLOSED Need port 2224 opened up for cluster suite and port tcp port 11111 removed 2022-10-18 07:38:56 UTC
Red Hat Product Errata RHBA-2018:0866 0 None None None 2018-04-10 15:38:37 UTC

Internal Links: 885257

Description Jan Pokorný [poki] 2017-01-20 14:29:11 UTC
$ curl -s https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.csv | grep 2224
efi-mg,2224,tcp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,
efi-mg,2224,udp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,

This means that the port assignment even predates pcs existence by
several years :-/

To rectify the current state, there should at least be a possibility
for pcs to switch to use another port.  That would resolve conflicts
arising from such an unexpected port occupation.

Comment 2 Jan Pokorný [poki] 2017-03-02 09:04:46 UTC
*** Bug 1428172 has been marked as a duplicate of this bug. ***

Comment 3 Jan Pokorný [poki] 2017-03-02 09:15:40 UTC
From [bug 1428172 comment 0]:

> To run pcsd in a container we can have issues where the ports inside
> the container don't necessarily match ports that are coming in from
> the outside of the container.
> 
> So we need to support these two things:
> 
> 1.  Be able to configure the port that pcsd listens on
> 
> 2.  When setting up the cluster we will need to configure the ports that
>     each individual pcsd is listening on.  And all pcs commands will
>     either need to accept a port parameter or be able to lookup up the
>     port and use it when running commands where you don't specify a host
>     (like 'pcs cluster start --all').

Comment 6 Andrew Beekhof 2017-03-02 23:12:57 UTC
Relatively speaking, it is less important that pcs listens on a configurable port than being able to be told the its peer is already listening on a different one (because while the port on the inside might be the normal one, the external port will almost never match).

Comment 7 Ondrej Mular 2017-09-20 11:55:18 UTC
Proposed fix:
https://github.com/ClusterLabs/pcs/commit/3de33c2ebe74f5238376b86a00db234ce6

Test:
nodes: rhel74-node1, rhel74-node2
Change port on which pcsd is listening using option PCSD_PORT in file /etc/sysconfig/pcsd and restart pcsd.
> [root@rhel74-node1 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
> PCSD_PORT=2225
> [root@rhel74-node1 ~]# systemctl restart pcsd

> [root@rhel74-node2 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
> PCSD_PORT=2226
> [root@rhel74-node2 ~]# systemctl restart pcsd

Authenticate nodes
> [root@rhel74-node1 ~]# pcs cluster auth rhel74-node1:2225 rhel74-node2:2226 -u hacluster
> Password: 
> rhel74-node1: Authorized
> rhel74-node2: Authorized

Create and start cluster:
> [root@rhel74-node1 ~]# pcs cluster setup --name rh74-cluster rhel74-node{1,2}
> Destroying cluster on nodes: rhel74-node1, rhel74-node2...
> rhel74-node2: Stopping Cluster (pacemaker)...
> rhel74-node1: Stopping Cluster (pacemaker)...
> rhel74-node2: Successfully destroyed cluster
> rhel74-node1: Successfully destroyed cluster
>
> Sending 'pacemaker_remote authkey' to 'rhel74-node1', 'rhel74-node2'
> rhel74-node1: successful distribution of the file 'pacemaker_remote authkey'
> rhel74-node2: successful distribution of the file 'pacemaker_remote authkey'
> Sending cluster config files to the nodes...
> rhel74-node1: Succeeded
> rhel74-node2: Succeeded
>
> Synchronizing pcsd certificates on nodes rhel74-node1, rhel74-node2...
> rhel74-node1: Success
> rhel74-node2: Success
> Restarting pcsd on the nodes in order to reload the certificates...
> rhel74-node2: Success
> rhel74-node1: Success
> [root@rhel74-node1 ~]# pcs cluster start --all
> rhel74-node1: Starting Cluster...
> rhel74-node2: Starting Cluster...
> [root@rhel74-node1 ~]# pcs status
> Cluster name: rh74-cluster
> WARNING: no stonith devices and stonith-enabled is not false
> Stack: corosync
> Current DC: rhel74-node2 (version 1.1.16-11.el7-94ff4df) - partition with quorum
> Last updated: Wed Sep 20 13:26:13 2017
> Last change: Wed Sep 20 13:21:48 2017 by hacluster via crmd on rhel74-node2
>
> 2 nodes configured
> 0 resources configured
>
> Online: [ rhel74-node1 rhel74-node2 ]
>
> No resources
>
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

It is also possible to specify pcsd port in the web UI when creating a cluster, adding existing cluster into the web UI and when adding a node into the cluster.

Comment 9 Ivan Devat 2017-11-02 16:11:48 UTC
After Fix:
[ant ~] $ rpm -q pcs
pcs-0.9.161-1.el7.x86_64

> setup ports

[ant ~] $ echo PCSD_PORT=2225 >> /etc/sysconfig/pcsd
[ant ~] $ ssh bee "echo PCSD_PORT=2226 >> /etc/sysconfig/pcsd"

[ant ~] $ service pcsd restart
Redirecting to /bin/systemctl restart pcsd.service
[ant ~] $ ssh bee "service pcsd restart"
Redirecting to /bin/systemctl restart pcsd.service

[ant ~] $ LANG=C netstat -nal | grep '\(2224\|2225\).*LISTEN'
tcp6       0      0 :::2225                 :::*                    LISTEN
[ant ~] $ ssh bee "LANG=C netstat -nal | grep '\(2224\|2226\).*LISTEN'"
root@bee's password:
tcp6       0      0 :::2226                 :::*                    LISTEN

> authenticate nodes

[ant ~] $ pcs cluster auth ant:2225 bee:2226 -u hacluster
Password:
ant: Authorized
bee: Authorized

> create and run cluster

[ant ~] $ pcs cluster setup --name=zoo ant bee
Destroying cluster on nodes: ant, bee...
bee: Stopping Cluster (pacemaker)...
ant: Stopping Cluster (pacemaker)...
ant: Successfully destroyed cluster
bee: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'ant', 'bee'
ant: successful distribution of the file 'pacemaker_remote authkey'
bee: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
ant: Succeeded
bee: Succeeded

Synchronizing pcsd certificates on nodes ant, bee...
ant: Success
bee: Success
Restarting pcsd on the nodes in order to reload the certificates...
bee: Success
ant: Success

[ant ~] $ pcs cluster start --all
ant: Starting Cluster...
bee: Starting Cluster...

[ant ~] $ pcs status
Cluster name: zoo
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: bee (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Thu Nov  2 15:46:14 2017
Last change: Thu Nov  2 15:46:03 2017 by hacluster via crmd on bee

2 nodes configured
0 resources configured

Online: [ ant bee ]

No resources


Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

Comment 12 Ondrej Mular 2017-11-16 07:10:28 UTC
After fix:
[root@rhel75-node1 ~]# rpm -q pcs
pcs-0.9.162-1.el7.x86_64

> works the same way as described in comment #9

Comment 15 Tomas Jelinek 2018-02-01 10:33:27 UTC
additional fix:
https://github.com/ClusterLabs/pcs/commit/ed3e020f477f1e597cb6a86c5305b26a5a892096

'pcs cluster auth' does not work when all of these conditions are met:
* the command is run on a cluster node
* nodes run pcsd on a non-default port
* the node, where the command is run, is not authorized against cluster nodes
More specifically, the auth tokens are not written to disk.

Reproducer:
# The node is in a cluster.
[root@rh74-node1:~]# pcs status nodes corosync
Corosync Nodes:
 Online:
 Offline: rh74-node1 rh74-node2
# The node is not authorized.
[root@rh74-node1:~]# pcs pcsd clear-auth
# Cluster nodes run pcsd on a non-default port.
[root@rh74-node1:~]# pcs cluster auth rh74-node1:2225 rh74-node2:2225 -u hacluster
Password:
rh74-node2: Authorized
rh74-node1: Authorized
Error: Unable to synchronize and save tokens on nodes: rh74-node1, rh74-node2. Are they authorized?

Comment 16 Tomas Jelinek 2018-02-01 12:32:11 UTC
additional fix:
https://github.com/ClusterLabs/pcs/commit/7b5994ddc79fc3e5a829e870834f496b1b90c133

Reproducer:
* set up a cluster
* set pcsd to run on a non-default port
* authenticate cluster nodes
* set pcsd to run on a default port
* authenticate cluster nodes not specifying a port

Actual results:
A previously used port is used to send tokens to nodes. Since pcsd no longer listens on it, auth tokens are not saved on the nodes.

Expected results:
Since no port is specified in the command, the default port should be used regardless to what port was used before.

[root@rh74-node1:~]# pcs status nodes corosync
Corosync Nodes:
 Online:
 Offline: rh74-node1 rh74-node2
# pcsd running o a non-default port
[root@rh74-node1:~]# pcs cluster auth rh74-node1:2225 rh74-node2:2225 -u hacluster
Password: 
rh74-node2: Authorized
rh74-node1: Authorized
# pcsd running on a non-default port, auth not working
[root@rh74-node1:~]# pcs cluster auth rh74-node1 rh74-node2 -u hacluster
Password: 
rh74-node2: Authorized
rh74-node1: Authorized
Error: Unable to synchronize and save tokens on nodes: rh74-node1, rh74-node2. Are they authorized?
# pcsd running on a non-default port, auth working when a port is specified even though there should be no difference
[root@rh74-node1:~]# pcs cluster auth rh74-node1:2224 rh74-node2:2224 -u hacluster
Password: 
rh74-node2: Authorized
rh74-node1: Authorized

Comment 17 Ondrej Mular 2018-02-06 08:38:42 UTC
After fix:
[root@rhel75-node1 ~]# rpm -q pcs
pcs-0.9.162-5.el7.x86_64

> nodes: rhel75-node1, rhel75-node2
> Change port on which pcsd is listening using option PCSD_PORT in file /etc/sysconfig/pcsd and restart pcsd.

[root@rhel75-node1 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
PCSD_PORT=2225
[root@rhel75-node1 ~]# systemctl restart pcsd

[root@rhel75-node2 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
PCSD_PORT=2226
[root@rhel75-node2 ~]# systemctl restart pcsd

> Authenticate nodes
[root@rhel75-node1 ~]# pcs cluster auth rhel75-node1:2225 rhel75-node2:2226 -u hacluster
Password: 
rhel75-node1: Authorized
rhel75-node2: Authorized

Create and start cluster:
[root@rhel75-node1 ~]# pcs cluster setup --name rh75-cluster rhel75-node{1,2}
Destroying cluster on nodes: rhel75-node1, rhel75-node2...
rhel75-node1: Stopping Cluster (pacemaker)...
rhel75-node2: Stopping Cluster (pacemaker)...
rhel75-node1: Successfully destroyed cluster
rhel75-node2: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'rhel75-node1', 'rhel75-node2'
rhel75-node1: successful distribution of the file 'pacemaker_remote authkey'
rhel75-node2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
rhel75-node1: Succeeded
rhel75-node2: Succeeded

Synchronizing pcsd certificates on nodes rhel75-node1, rhel75-node2...
rhel75-node1: Success
rhel75-node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
rhel75-node1: Success
rhel75-node2: Success
[root@rhel75-node1 ~]# pcs cluster start --all
rhel75-node1: Starting Cluster...
rhel75-node2: Starting Cluster...

[root@rhel75-node1 ~]# pcs status
Cluster name: rh75-cluster
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: rhel75-node2 (version 1.1.18-5.el7-1a4ef7d180) - partition with quorum
Last updated: Tue Feb  6 08:32:57 2018
Last change: Tue Feb  6 08:32:50 2018 by hacluster via crmd on rhel75-node2

2 nodes configured
0 resources configured

Online: [ rhel75-node1 rhel75-node2 ]

No resources


Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

> Test for issue described in comment #15
[root@rhel75-node1 ~]# pcs cluster status 
Cluster Status:
 Stack: corosync
 Current DC: rhel75-node2 (version 1.1.18-5.el7-1a4ef7d180) - partition with quorum
 Last updated: Tue Feb  6 08:36:03 2018
 Last change: Tue Feb  6 08:32:50 2018 by hacluster via crmd on rhel75-node2
 2 nodes configured
 0 resources configured

PCSD Status:
  rhel75-node2: Online
  rhel75-node1: Online
> Remove tokens on node 
[root@rhel75-node1 ~]# rm /var/lib/pcsd/tokens
rm: remove regular file ‘/var/lib/pcsd/tokens’? y
> Then try to re-authenticate
[root@rhel75-node1 ~]# pcs cluster auth rhel75-node1:2225 rhel75-node2:2226 -u hacluster
Password: 
rhel75-node1: Authorized
rhel75-node2: Authorized

> Test for issue described in comment #16
[root@rhel75-node1 ~]# pcs cluster status 
Cluster Status:
 Stack: corosync
 Current DC: rhel75-node2 (version 1.1.18-5.el7-1a4ef7d180) - partition with quorum
 Last updated: Tue Feb  6 08:37:18 2018
 Last change: Tue Feb  6 08:32:50 2018 by hacluster via crmd on rhel75-node2
 2 nodes configured
 0 resources configured

PCSD Status:
  rhel75-node1: Online
  rhel75-node2: Online
> Make sure that nodes are authenticated on non default ports (default port: 2224)
[root@rhel75-node1 ~]# cat /var/lib/pcsd/tokens
{
  "format_version": 3,
  "data_version": 2,
  "tokens": {
    "rhel75-node1": "cc86e1fd-e90d-4270-9eb6-09584a5a853f",
    "rhel75-node2": "c624575a-52a0-40f3-8767-a24336d340b1"
  },
  "ports": {
    "rhel75-node1": 2225,
    "rhel75-node2": 2226
  }
> Change pcsd ports to default on all nodes and restart pcsd
[root@rhel75-node1 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
#PCSD_PORT=2224
[root@rhel75-node1 ~]# systemctl restart pcsd

[root@rhel75-node2 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
#PCSD_PORT=2224
[root@rhel75-node2 ~]# systemctl restart pcsd

> Authenticate nodes not specifying ports
[root@rhel75-node1 ~]# pcs cluster auth rhel75-node1 rhel75-node2 -u hacluster
Password: 
rhel75-node1: Authorized
rhel75-node2: Authorized

Comment 23 Marian Krcmarik 2018-02-28 23:00:46 UTC
Verified on pcs-0.9.162-4.el7.x86_64

Comment 25 errata-xmlrpc 2018-04-10 15:37:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0866


Note You need to log in before you can comment on or make changes to this bug.