$ curl -s https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.csv | grep 2224
efi-mg,2224,tcp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,
efi-mg,2224,udp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,

This means that the port assignment predates the existence of pcs by several years :-/

To rectify the current state, pcs should at least be able to switch to another port. That would resolve conflicts arising from such unexpected port occupation.
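Note that a plain "grep 2224" would also match any row that merely contains that string (for example a port such as 22240). A minimal Python sketch of an exact match on the port column follows. The function name is hypothetical, the sample rows are the two copied from the output above, and the column order (service, port, protocol) is assumed from those rows only.

```python
import csv
import io

# The two rows shown in the grep output above, without a header line.
SAMPLE_ROWS = """\
efi-mg,2224,tcp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,
efi-mg,2224,udp,Easy Flexible Internet/Multiplayer Games,[Thomas_Efer],[Thomas_Efer],2006-03,,,,,
"""


def assignments_for_port(csv_text, port):
    """Return (service, protocol) pairs whose port column equals `port` exactly.

    Assumes column 0 is the service name, column 1 the port and column 2
    the transport protocol, as in the sample rows.
    """
    matches = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 3 and row[1] == str(port):
            matches.append((row[0], row[2]))
    return matches


print(assignments_for_port(SAMPLE_ROWS, 2224))
```

To check the live registry, the same function could be fed the text of the CSV downloaded with the curl command above.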
*** Bug 1428172 has been marked as a duplicate of this bug. ***
From [bug 1428172 comment 0]:
> To run pcsd in a container we can have issues where the ports inside
> the container don't necessarily match ports that are coming in from
> the outside of the container.
>
> So we need to support these two things:
>
> 1. Be able to configure the port that pcsd listens on
>
> 2. When setting up the cluster we will need to configure the ports that
>    each individual pcsd is listening on. And all pcs commands will
>    either need to accept a port parameter or be able to look up the
>    port and use it when running commands where you don't specify a host
>    (like 'pcs cluster start --all').
Relatively speaking, it is less important that pcs listens on a configurable port than that it can be told that its peer is already listening on a different one (because while the port on the inside might be the normal one, the external port will almost never match).
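To illustrate what "being told the peer's port" could look like on the command line, here is a minimal Python sketch that splits a "host:port" argument and falls back to 2224 when no port is given. The function name is hypothetical and this is not the pcs implementation; IPv6 literals are deliberately not handled.

```python
DEFAULT_PCSD_PORT = 2224  # the port pcsd uses unless configured otherwise


def parse_node_spec(spec, default_port=DEFAULT_PCSD_PORT):
    """Split 'host' or 'host:port' into a (host, port) tuple.

    Hypothetical helper for illustration only. It does not handle
    IPv6 literals, which contain colons themselves.
    """
    host, sep, port_text = spec.rpartition(":")
    if not sep:
        # No colon at all: the whole argument is the host name.
        return spec, default_port
    if not host:
        raise ValueError("missing host name in %r" % spec)
    port = int(port_text)  # raises ValueError on non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError("port out of range in %r" % spec)
    return host, port


print(parse_node_spec("node1:2225"))
print(parse_node_spec("node2"))
```

With this shape, each peer carries its own port, so the externally visible port of a container can differ from the one pcsd listens on inside it.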
Proposed fix: https://github.com/ClusterLabs/pcs/commit/3de33c2ebe74f5238376b86a00db234ce6

Test:
nodes: rhel74-node1, rhel74-node2

Change port on which pcsd is listening using option PCSD_PORT in file /etc/sysconfig/pcsd and restart pcsd.
> [root@rhel74-node1 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
> PCSD_PORT=2225
> [root@rhel74-node1 ~]# systemctl restart pcsd
> [root@rhel74-node2 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
> PCSD_PORT=2226
> [root@rhel74-node2 ~]# systemctl restart pcsd

Authenticate nodes
> [root@rhel74-node1 ~]# pcs cluster auth rhel74-node1:2225 rhel74-node2:2226 -u hacluster
> Password:
> rhel74-node1: Authorized
> rhel74-node2: Authorized

Create and start cluster:
> [root@rhel74-node1 ~]# pcs cluster setup --name rh74-cluster rhel74-node{1,2}
> Destroying cluster on nodes: rhel74-node1, rhel74-node2...
> rhel74-node2: Stopping Cluster (pacemaker)...
> rhel74-node1: Stopping Cluster (pacemaker)...
> rhel74-node2: Successfully destroyed cluster
> rhel74-node1: Successfully destroyed cluster
>
> Sending 'pacemaker_remote authkey' to 'rhel74-node1', 'rhel74-node2'
> rhel74-node1: successful distribution of the file 'pacemaker_remote authkey'
> rhel74-node2: successful distribution of the file 'pacemaker_remote authkey'
> Sending cluster config files to the nodes...
> rhel74-node1: Succeeded
> rhel74-node2: Succeeded
>
> Synchronizing pcsd certificates on nodes rhel74-node1, rhel74-node2...
> rhel74-node1: Success
> rhel74-node2: Success
> Restarting pcsd on the nodes in order to reload the certificates...
> rhel74-node2: Success
> rhel74-node1: Success
> [root@rhel74-node1 ~]# pcs cluster start --all
> rhel74-node1: Starting Cluster...
> rhel74-node2: Starting Cluster...
> [root@rhel74-node1 ~]# pcs status
> Cluster name: rh74-cluster
> WARNING: no stonith devices and stonith-enabled is not false
> Stack: corosync
> Current DC: rhel74-node2 (version 1.1.16-11.el7-94ff4df) - partition with quorum
> Last updated: Wed Sep 20 13:26:13 2017
> Last change: Wed Sep 20 13:21:48 2017 by hacluster via crmd on rhel74-node2
>
> 2 nodes configured
> 0 resources configured
>
> Online: [ rhel74-node1 rhel74-node2 ]
>
> No resources
>
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

It is also possible to specify the pcsd port in the web UI when creating a cluster, when adding an existing cluster to the web UI, and when adding a node to the cluster.
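For anyone scripting around this setting, a minimal Python sketch of reading PCSD_PORT from sysconfig-style text follows. The function name is hypothetical. The sketch assumes plain KEY=VALUE lines, treats lines starting with "#" as comments, falls back to 2224 when the option is absent, and lets the last assignment win, which is an assumption rather than documented pcsd behavior.

```python
DEFAULT_PCSD_PORT = 2224


def pcsd_port_from_sysconfig(text, default=DEFAULT_PCSD_PORT):
    """Return the port set by PCSD_PORT in sysconfig-style text.

    Hypothetical helper: commented-out or missing PCSD_PORT lines yield
    the default; if the option is set more than once, the last value wins.
    """
    port = default
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "PCSD_PORT":
            # Tolerate optional quoting around the value.
            port = int(value.strip().strip("'\""))
    return port


print(pcsd_port_from_sysconfig("PCSD_PORT=2225\n"))
print(pcsd_port_from_sysconfig("#PCSD_PORT=2224\n"))
```

On a real node the text would come from /etc/sysconfig/pcsd, and pcsd has to be restarted before a changed value takes effect, as in the test above.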
Additional fix: https://github.com/ClusterLabs/pcs/commit/34b781e1da8b36c2f8a2a46f81ee7a9968fa
After Fix:

[ant ~] $ rpm -q pcs
pcs-0.9.161-1.el7.x86_64

> setup ports

[ant ~] $ echo PCSD_PORT=2225 >> /etc/sysconfig/pcsd
[ant ~] $ ssh bee "echo PCSD_PORT=2226 >> /etc/sysconfig/pcsd"
[ant ~] $ service pcsd restart
Redirecting to /bin/systemctl restart pcsd.service
[ant ~] $ ssh bee "service pcsd restart"
Redirecting to /bin/systemctl restart pcsd.service
[ant ~] $ LANG=C netstat -nal | grep '\(2224\|2225\).*LISTEN'
tcp6 0 0 :::2225 :::* LISTEN
[ant ~] $ ssh bee "LANG=C netstat -nal | grep '\(2224\|2226\).*LISTEN'"
root@bee's password:
tcp6 0 0 :::2226 :::* LISTEN

> authenticate nodes

[ant ~] $ pcs cluster auth ant:2225 bee:2226 -u hacluster
Password:
ant: Authorized
bee: Authorized

> create and run cluster

[ant ~] $ pcs cluster setup --name=zoo ant bee
Destroying cluster on nodes: ant, bee...
bee: Stopping Cluster (pacemaker)...
ant: Stopping Cluster (pacemaker)...
ant: Successfully destroyed cluster
bee: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'ant', 'bee'
ant: successful distribution of the file 'pacemaker_remote authkey'
bee: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
ant: Succeeded
bee: Succeeded

Synchronizing pcsd certificates on nodes ant, bee...
ant: Success
bee: Success
Restarting pcsd on the nodes in order to reload the certificates...
bee: Success
ant: Success
[ant ~] $ pcs cluster start --all
ant: Starting Cluster...
bee: Starting Cluster...
[ant ~] $ pcs status
Cluster name: zoo
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: bee (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Thu Nov 2 15:46:14 2017
Last change: Thu Nov 2 15:46:03 2017 by hacluster via crmd on bee

2 nodes configured
0 resources configured

Online: [ ant bee ]

No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled
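The netstat check above relies on a grep pattern that matches the port number anywhere before LISTEN. As a stricter alternative, here is a minimal Python sketch that extracts the listening TCP ports from "netstat -nal" style output. The function name is hypothetical and the column layout (protocol, two queue counters, local address, foreign address, state) is assumed from the lines shown above.

```python
def listening_tcp_ports(netstat_output):
    """Return the set of local TCP ports that are in the LISTEN state.

    Hypothetical helper: assumes the six-column layout seen in the
    netstat lines above, with the local address in the fourth column.
    """
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "LISTEN":
            # The port is whatever follows the last colon, which also
            # works for IPv6 wildcard addresses such as ":::2225".
            ports.add(int(fields[3].rsplit(":", 1)[1]))
    return ports


SAMPLE = "tcp6 0 0 :::2225 :::* LISTEN\n"
ports = listening_tcp_ports(SAMPLE)
print(2225 in ports, 2224 in ports)
```

This makes it explicit that pcsd listens on the new port and no longer on 2224, which is what the grep output above shows implicitly.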
additional fix: https://github.com/ClusterLabs/pcs/commit/cc21497bb47ca8813cb61a2965e9882e378
After fix:

[root@rhel75-node1 ~]# rpm -q pcs
pcs-0.9.162-1.el7.x86_64

> works the same way as described in comment #9
additional fix: https://github.com/ClusterLabs/pcs/commit/ed3e020f477f1e597cb6a86c5305b26a5a892096

'pcs cluster auth' does not work when all of these conditions are met:
* the command is run on a cluster node
* nodes run pcsd on a non-default port
* the node, where the command is run, is not authorized against cluster nodes

More specifically, the auth tokens are not written to disk.

Reproducer:

# The node is in a cluster.
[root@rh74-node1:~]# pcs status nodes corosync
Corosync Nodes:
 Online:
 Offline: rh74-node1 rh74-node2

# The node is not authorized.
[root@rh74-node1:~]# pcs pcsd clear-auth

# Cluster nodes run pcsd on a non-default port.
[root@rh74-node1:~]# pcs cluster auth rh74-node1:2225 rh74-node2:2225 -u hacluster
Password:
rh74-node2: Authorized
rh74-node1: Authorized
Error: Unable to synchronize and save tokens on nodes: rh74-node1, rh74-node2. Are they authorized?
additional fix: https://github.com/ClusterLabs/pcs/commit/7b5994ddc79fc3e5a829e870834f496b1b90c133

Reproducer:
* set up a cluster
* set pcsd to run on a non-default port
* authenticate cluster nodes
* set pcsd to run on the default port
* authenticate cluster nodes without specifying a port

Actual results:
A previously used port is used to send tokens to nodes. Since pcsd no longer listens on it, auth tokens are not saved on the nodes.

Expected results:
Since no port is specified in the command, the default port should be used regardless of what port was used before.

[root@rh74-node1:~]# pcs status nodes corosync
Corosync Nodes:
 Online:
 Offline: rh74-node1 rh74-node2

# pcsd running on a non-default port
[root@rh74-node1:~]# pcs cluster auth rh74-node1:2225 rh74-node2:2225 -u hacluster
Password:
rh74-node2: Authorized
rh74-node1: Authorized

# pcsd running on the default port, auth not working
[root@rh74-node1:~]# pcs cluster auth rh74-node1 rh74-node2 -u hacluster
Password:
rh74-node2: Authorized
rh74-node1: Authorized
Error: Unable to synchronize and save tokens on nodes: rh74-node1, rh74-node2. Are they authorized?

# pcsd running on the default port, auth working when a port is specified even though there should be no difference
[root@rh74-node1:~]# pcs cluster auth rh74-node1:2224 rh74-node2:2224 -u hacluster
Password:
rh74-node2: Authorized
rh74-node1: Authorized
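The expected behavior can be stated as a small rule: a port given on the command line wins, and a missing port means the default, never the previously stored value. A minimal Python sketch of that rule follows. The function and data shapes are hypothetical and are not taken from the pcs source.

```python
DEFAULT_PCSD_PORT = 2224


def ports_after_auth(stored_ports, requested):
    """Return the node -> port map that should be stored after an auth run.

    stored_ports: ports remembered from earlier auth runs (node -> int).
    requested: nodes named in the current command, mapped to the port
    given on the command line, or None when no port was given.
    Hypothetical illustration of the expected results above.
    """
    updated = dict(stored_ports)
    for node, port in requested.items():
        # No explicit port means the default port, regardless of what
        # was stored for this node before.
        updated[node] = DEFAULT_PCSD_PORT if port is None else port
    return updated


stored = {"rh74-node1": 2225, "rh74-node2": 2225}
print(ports_after_auth(stored, {"rh74-node1": None, "rh74-node2": None}))
```

Under this rule, "pcs cluster auth rh74-node1 rh74-node2" and "pcs cluster auth rh74-node1:2224 rh74-node2:2224" end up with the same stored ports, which is the "no difference" expected in the reproducer.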
After fix:

[root@rhel75-node1 ~]# rpm -q pcs
pcs-0.9.162-5.el7.x86_64

> nodes: rhel75-node1, rhel75-node2
> Change port on which pcsd is listening using option PCSD_PORT in file /etc/sysconfig/pcsd and restart pcsd.

[root@rhel75-node1 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
PCSD_PORT=2225
[root@rhel75-node1 ~]# systemctl restart pcsd
[root@rhel75-node2 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
PCSD_PORT=2226
[root@rhel75-node2 ~]# systemctl restart pcsd

> Authenticate nodes

[root@rhel75-node1 ~]# pcs cluster auth rhel75-node1:2225 rhel75-node2:2226 -u hacluster
Password:
rhel75-node1: Authorized
rhel75-node2: Authorized

> Create and start cluster:

[root@rhel75-node1 ~]# pcs cluster setup --name rh75-cluster rhel75-node{1,2}
Destroying cluster on nodes: rhel75-node1, rhel75-node2...
rhel75-node1: Stopping Cluster (pacemaker)...
rhel75-node2: Stopping Cluster (pacemaker)...
rhel75-node1: Successfully destroyed cluster
rhel75-node2: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'rhel75-node1', 'rhel75-node2'
rhel75-node1: successful distribution of the file 'pacemaker_remote authkey'
rhel75-node2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
rhel75-node1: Succeeded
rhel75-node2: Succeeded

Synchronizing pcsd certificates on nodes rhel75-node1, rhel75-node2...
rhel75-node1: Success
rhel75-node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
rhel75-node1: Success
rhel75-node2: Success
[root@rhel75-node1 ~]# pcs cluster start --all
rhel75-node1: Starting Cluster...
rhel75-node2: Starting Cluster...
[root@rhel75-node1 ~]# pcs status
Cluster name: rh75-cluster
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: rhel75-node2 (version 1.1.18-5.el7-1a4ef7d180) - partition with quorum
Last updated: Tue Feb 6 08:32:57 2018
Last change: Tue Feb 6 08:32:50 2018 by hacluster via crmd on rhel75-node2

2 nodes configured
0 resources configured

Online: [ rhel75-node1 rhel75-node2 ]

No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

> Test for issue described in comment #15

[root@rhel75-node1 ~]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: rhel75-node2 (version 1.1.18-5.el7-1a4ef7d180) - partition with quorum
 Last updated: Tue Feb 6 08:36:03 2018
 Last change: Tue Feb 6 08:32:50 2018 by hacluster via crmd on rhel75-node2
 2 nodes configured
 0 resources configured

PCSD Status:
  rhel75-node2: Online
  rhel75-node1: Online

> Remove tokens on node

[root@rhel75-node1 ~]# rm /var/lib/pcsd/tokens
rm: remove regular file ‘/var/lib/pcsd/tokens’? y

> Then try to re-authenticate

[root@rhel75-node1 ~]# pcs cluster auth rhel75-node1:2225 rhel75-node2:2226 -u hacluster
Password:
rhel75-node1: Authorized
rhel75-node2: Authorized

> Test for issue described in comment #16

[root@rhel75-node1 ~]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: rhel75-node2 (version 1.1.18-5.el7-1a4ef7d180) - partition with quorum
 Last updated: Tue Feb 6 08:37:18 2018
 Last change: Tue Feb 6 08:32:50 2018 by hacluster via crmd on rhel75-node2
 2 nodes configured
 0 resources configured

PCSD Status:
  rhel75-node1: Online
  rhel75-node2: Online

> Make sure that nodes are authenticated on non default ports (default port: 2224)

[root@rhel75-node1 ~]# cat /var/lib/pcsd/tokens
{
  "format_version": 3,
  "data_version": 2,
  "tokens": {
    "rhel75-node1": "cc86e1fd-e90d-4270-9eb6-09584a5a853f",
    "rhel75-node2": "c624575a-52a0-40f3-8767-a24336d340b1"
  },
  "ports": {
    "rhel75-node1": 2225,
    "rhel75-node2": 2226
  }
}

> Change pcsd ports to default on all nodes and restart pcsd

[root@rhel75-node1 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
#PCSD_PORT=2224
[root@rhel75-node1 ~]# systemctl restart pcsd
[root@rhel75-node2 ~]# cat /etc/sysconfig/pcsd | grep PCSD_PORT
#PCSD_PORT=2224
[root@rhel75-node2 ~]# systemctl restart pcsd

> Authenticate nodes not specifying ports

[root@rhel75-node1 ~]# pcs cluster auth rhel75-node1 rhel75-node2 -u hacluster
Password:
rhel75-node1: Authorized
rhel75-node2: Authorized
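The "ports" check on the tokens file can also be scripted. Below is a minimal Python sketch that lists the nodes stored with a non-default port. The function name is hypothetical, the token values in the sample are placeholders, and the file layout is assumed to be a complete JSON object with a "ports" map as in the output above.

```python
import json

# Same structure as the tokens file shown above; token values are
# placeholders, not real tokens.
TOKENS_SAMPLE = """
{
  "format_version": 3,
  "data_version": 2,
  "tokens": {
    "rhel75-node1": "placeholder-token-1",
    "rhel75-node2": "placeholder-token-2"
  },
  "ports": {
    "rhel75-node1": 2225,
    "rhel75-node2": 2226
  }
}
"""


def non_default_ports(tokens_text, default_port=2224):
    """Return node -> port for every node stored with a non-default port.

    Hypothetical helper: a missing "ports" map is treated as
    "all nodes use the default port".
    """
    data = json.loads(tokens_text)
    return {
        node: port
        for node, port in data.get("ports", {}).items()
        if port != default_port
    }


print(non_default_ports(TOKENS_SAMPLE))
```

After the final re-authentication without ports, the same check on the real file would be expected to report no nodes, although the thread does not show the file contents at that point.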
Verified on pcs-0.9.162-4.el7.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0866