Red Hat Bugzilla – Bug 1165821
pcs CLI/GUI should be capable of setting up a hardened cluster (for not entirely trusted environment)
Last modified: 2017-08-02 03:00:11 EDT
The most notable sign of a hardened cluster is likely an encrypted messaging (i.e., corosync) layer, which would probably mean setting up and distributing /etc/corosync/authkey and aligning corosync.conf with the intention to use encryption. Also, if there were a persisted cluster-wide flag denoting that the cluster should be purposefully hardened, various properties could be checked at run time (e.g., is the channel towards fence devices that support encryption really configured to that effect?).
BTW, the "persisted cluster-wide flag" I mentioned could actually be implied by the absence of "secauth: off" in the corosync configuration (and/or encryption not being disabled by other means like "crypto_{hash,type}: none").
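As a rough illustration of the "implied flag" idea above, here is a minimal Python sketch. It is only a simplified reading of the comment, not pcs code and not a mirror of corosync's own precedence rules: the function name looks_hardened is hypothetical, the check is line-based rather than a real corosync.conf parser, and it uses the directive names secauth, crypto_cipher and crypto_hash as they appear later in this thread.

```python
import re

# Matches lines such as "    secauth: off" or "crypto_hash: none".
_SETTING_RE = re.compile(r"^\s*(secauth|crypto_cipher|crypto_hash)\s*:\s*(\S+)")


def looks_hardened(conf_text):
    """Return True unless the config text explicitly disables encryption.

    Assumption (from the comment above): the absence of "secauth: off"
    and of "crypto_*: none" implies the cluster is meant to be hardened.
    """
    settings = {}
    for line in conf_text.splitlines():
        match = _SETTING_RE.match(line)
        if match:
            settings[match.group(1)] = match.group(2).lower()
    if settings.get("secauth") == "off":
        return False
    if "none" in (settings.get("crypto_cipher"), settings.get("crypto_hash")):
        return False
    return True


if __name__ == "__main__":
    print(looks_hardened("totem {\n    version: 2\n}\n"))
    print(looks_hardened("totem {\n    secauth: off\n}\n"))
```

A run-time check like this could then gate further validations (for example, of fence device channels) only when the cluster is considered hardened.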
Note wrt. backup/restore logic (at least global, nonlocal): it should be perfectly fine not to backup /etc/corosync/authkey, but rather, on restore, check if authentication is needed per corosync configuration (and that authkey is not part of the tarball) and generate and distribute that file anew.
See also the github issue https://github.com/ClusterLabs/pcs/issues/98
We are going to implement enabling corosync encryption and distributing the corosync authkey. This will be enabled by default and pcs will not offer an option to disable the encryption. For other hardening-related issues please file separate bzs.

(In reply to Jan Pokorný from comment #4)
> Note wrt. backup/restore logic (at least global, nonlocal):
> it should be perfectly fine not to backup /etc/corosync/authkey,
> but rather, on restore, check if authentication is needed per
> corosync configuration (and that authkey is not part of the tarball)
> and generate and distribute that file anew.

When one of the nodes is unreachable during the restore procedure, the user can restore the cluster configuration on that node later using the same tarball. For the node to be able to connect to the cluster, it must have the same key as the rest of the cluster. Therefore we are going to include the authkeys (both corosync and pacemaker) in the tarball.
Created attachment 1281458 [details] proposed fix
Created attachment 1281494 [details] proposed fix - backup and restore keys
re [comment 13]
> We are going to implement enabling corosync encryption and
> distributing corosync authkey. This will be enabled by default
> and pcs will not offer an option to disable the encryption.

This is quite a substantial change (well, a complete twist) in behavior that needs to be spelled out prominently! Note I really wasn't asking for that without the ability to stick with the old behavior.

> For other hardening related issues please file separate bzs.

Note that some of the configuration bits I had in mind will be solved with a resolution of [bug 1173346], for which I'd like to kindly ask for a priority elevation.
After fix:

[root@rh73-node1:~]# rpm -q pcs
pcs-0.9.158-2.el7.x86_64

> the authkey is created and distributed automatically

[root@rh73-node1:~]# ls -l /etc/corosync/authkey
ls: cannot access /etc/corosync/authkey: No such file or directory
[root@rh73-node1:~]# pcs cluster setup --name rhel73 rh73-node1 rh73-node2
Destroying cluster on nodes: rh73-node1, rh73-node2...
rh73-node2: Stopping Cluster (pacemaker)...
rh73-node1: Stopping Cluster (pacemaker)...
rh73-node2: Successfully destroyed cluster
rh73-node1: Successfully destroyed cluster
Sending 'corosync authkey', 'pacemaker_remote authkey' to 'rh73-node1', 'rh73-node2'
rh73-node1: successful distribution of the file 'corosync authkey'
rh73-node1: successful distribution of the file 'pacemaker_remote authkey'
rh73-node2: successful distribution of the file 'corosync authkey'
rh73-node2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
rh73-node1: Succeeded
rh73-node2: Succeeded
Synchronizing pcsd certificates on nodes rh73-node1, rh73-node2...
rh73-node1: Success
rh73-node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
rh73-node1: Success
rh73-node2: Success
[root@rh73-node1:~]# ls -l /etc/corosync/authkey
-r--------. 1 root root 256 May 26 14:46 /etc/corosync/authkey

> after cluster start, corosync reports in its log that traffic encryption has been enabled

[26801] rh73-node1 corosyncnotice [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1

> the authkey is part of backup / restore procedure

[root@rh73-node1:~]# cat /etc/corosync/authkey
c92c34c422808663f15bfc811faf12f84984bc270fc846f31601431d1c47ae3f9658b4a9ac55194503a039f145f6293e90f414d7ff78e54956f3a140c12fe00c8e5441d9ab73d0aeaf88f5d822098f93c8f591f94e27c16aa8626efa5017461d39e4f9cfddf16648636a465110813760044e4de3ee33f33dfdbfa464ce02cc22[root@rh73-node1:~]#
[root@rh73-node1:~]# cat /etc/pacemaker/authkey
eca4d1834c7619b535a19432e90c3107fa3e9a82d06a513ff33ae32d701d00d4[root@rh73-node1:~]#
[root@rh73-node1:~]# pcs config backup cluster.tar.bz2
[root@rh73-node1:~]# tar -tf cluster.tar.bz2 | grep authkey
corosync_authkey
pacemaker_authkey
[root@rh73-node1:~]# pcs cluster destroy --all
rh73-node1: Stopping Cluster (pacemaker)...
rh73-node2: Stopping Cluster (pacemaker)...
rh73-node2: Successfully destroyed cluster
rh73-node1: Successfully destroyed cluster
[root@rh73-node1:~]# cat /etc/corosync/authkey
cat: /etc/corosync/authkey: No such file or directory
[root@rh73-node1:~]# cat /etc/pacemaker/authkey
cat: /etc/pacemaker/authkey: No such file or directory
[root@rh73-node1:~]# pcs config restore cluster.tar.bz2
rh73-node1: Succeeded
rh73-node2: Succeeded
[root@rh73-node1:~]# cat /etc/corosync/authkey
c92c34c422808663f15bfc811faf12f84984bc270fc846f31601431d1c47ae3f9658b4a9ac55194503a039f145f6293e90f414d7ff78e54956f3a140c12fe00c8e5441d9ab73d0aeaf88f5d822098f93c8f591f94e27c16aa8626efa5017461d39e4f9cfddf16648636a465110813760044e4de3ee33f33dfdbfa464ce02cc22[root@rh73-node1:~]#
[root@rh73-node1:~]# cat /etc/pacemaker/authkey
eca4d1834c7619b535a19432e90c3107fa3e9a82d06a513ff33ae32d701d00d4[root@rh73-node1:~]#
Is there any reason to reduce key length/complexity by using just hex digits instead of full bytes? Corosync reads just the first 128 bytes, so the pcs-generated authkey is not 1024 bits strong (as expected).
I did not know that corosync reads just the first 128 bytes of the key. I'm going to fix it in the next build.
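The key-strength concern raised here can be made concrete with a small Python sketch. It assumes, as stated in this thread, that corosync consumes only the first 128 bytes of the authkey; the function names are hypothetical and this is not the code pcs actually uses to generate keys.

```python
import binascii
import os

# Per the thread: corosync reads only the first 128 bytes of the authkey.
COROSYNC_KEY_BYTES = 128


def hex_key(num_chars=256):
    """Hex-encoded key, like the 256-character ASCII key in the first test."""
    return binascii.hexlify(os.urandom(num_chars // 2))


def raw_key(num_bytes=COROSYNC_KEY_BYTES):
    """Key made of full random bytes (8 bits of entropy per byte)."""
    return os.urandom(num_bytes)


def effective_bits(key, bits_per_byte):
    """Entropy contained in the part of the key corosync would read."""
    return min(len(key), COROSYNC_KEY_BYTES) * bits_per_byte


if __name__ == "__main__":
    # Each hex character carries 4 bits; each raw byte carries 8 bits.
    print(effective_bits(hex_key(), 4))
    print(effective_bits(raw_key(), 8))
```

Under that assumption, a 256-character hex key contributes only 512 bits to what corosync reads, while 128 raw random bytes give the expected 1024 bits.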
@Ivan: Ok, sounds good, thanks. Also what exactly is the reason to force encryption without ability to disable it? I mean, to have encryption enabled by default is for sure the good thing, but encryption uses some cpu. And because corosync is single threaded, encryption can negatively influence resulting throughput. This is not a problem for "pacemaker only" clusters, but may become problem for gfs/clvm. Also what if somebody wants to turn on the encryption on "old" (pre 7.4) cluster? They will be forced to "rebuild" the cluster? What exactly is the problem? It's really about adding/removing one variable in corosync.conf (secauth). Authkey can be distributed even when encryption is disabled.
(In reply to Jan Friesse from comment #21)
> @Ivan: Ok, sounds good, thanks.
>
> Also what exactly is the reason to force encryption without ability to
> disable it? I mean, to have encryption enabled by default is for sure the
> good thing, but encryption uses some cpu. And because corosync is single
> threaded, encryption can negatively influence resulting throughput. This is
> not a problem for "pacemaker only" clusters, but may become problem for
> gfs/clvm.

That is because of me. I had an impression, apparently wrong, that we want to have corosync encryption enabled always. We will add an option to disable the encryption to the cluster setup command.

>
> Also what if somebody wants to turn on the encryption on "old" (pre 7.4)
> cluster? They will be forced to "rebuild" the cluster?
>
> What exactly is the problem? It's really about adding/removing one variable
> in corosync.conf (secauth). Authkey can be distributed even when encryption
> is disabled.

We are running out of time for 7.4. So we have been planning to add commands for enabling and disabling the encryption in an existing cluster in 7.5 timeframe.
*** Bug 1455919 has been marked as a duplicate of this bug. ***
Summary of comment 24: When enabling encryption in an existing cluster, we want to remove all secauth, crypto_cipher and crypto_hash directives from corosync.conf. When these are not set, encryption is enabled by default in corosync. Of course, the authkey needs to be distributed as well.
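The directive removal described in this summary could look roughly like the following Python sketch. It is a hedged illustration only: strip_encryption_directives is a hypothetical name, the filter is line-based rather than a real corosync.conf parser (so it drops matching lines in any section, not only in totem), and it does not cover authkey distribution.

```python
import re

# Directives named in the summary above.
ENCRYPTION_DIRECTIVES = ("secauth", "crypto_cipher", "crypto_hash")

_DIRECTIVE_RE = re.compile(r"^\s*(%s)\s*:" % "|".join(ENCRYPTION_DIRECTIVES))


def strip_encryption_directives(conf_text):
    """Return conf_text without secauth/crypto_cipher/crypto_hash lines."""
    kept = [
        line
        for line in conf_text.splitlines(keepends=True)
        if not _DIRECTIVE_RE.match(line)
    ]
    return "".join(kept)


if __name__ == "__main__":
    sample = (
        "totem {\n"
        "    version: 2\n"
        "    secauth: off\n"
        "    cluster_name: devcluster\n"
        "    transport: udpu\n"
        "}\n"
    )
    print(strip_encryption_directives(sample), end="")
```

Removing the lines rather than rewriting their values relies on the statement above that corosync enables encryption when none of these directives is set.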
Created attachment 1283766 [details] proposed fix (part3)
After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.158-3.el7.x86_64
[vm-rhel72-1 ~] $ pcs cluster setup --name=devcluster vm-rhel72-1 vm-rhel72-3 --start --no-hardened
Destroying cluster on nodes: vm-rhel72-1, vm-rhel72-3...
vm-rhel72-1: Stopping Cluster (pacemaker)...
vm-rhel72-3: Stopping Cluster (pacemaker)...
vm-rhel72-3: Successfully destroyed cluster
vm-rhel72-1: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'vm-rhel72-1', 'vm-rhel72-3'
vm-rhel72-3: successful distribution of the file 'pacemaker_remote authkey'
vm-rhel72-1: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
vm-rhel72-1: Succeeded
vm-rhel72-3: Succeeded
Starting cluster on nodes: vm-rhel72-1, vm-rhel72-3...
vm-rhel72-1: Starting Cluster...
vm-rhel72-3: Starting Cluster...
Synchronizing pcsd certificates on nodes vm-rhel72-1, vm-rhel72-3...
vm-rhel72-3: Success
vm-rhel72-1: Success
Restarting pcsd on the nodes in order to reload the certificates...
vm-rhel72-3: Success
vm-rhel72-1: Success
[vm-rhel72-1 ~] $ ls -l /etc/corosync/a*
zsh: no matches found: /etc/corosync/a*
[vm-rhel72-1 ~] $ cat /var/log/cluster/corosync.log|grep security
[14010] vm-rhel72-1 corosyncnotice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
We want the encryption to be disabled by default in RHEL7.4.
The option --no-hardened sounds a little awkward. I'd propose --with-encryption and --no-encryption (or something similar). Thanks, Chris
One question/proposal. From testcase (line ls -l /etc/corosync/a*) it looks like when no-hardened is enabled, authkey is not distributed. I would recommend to always distribute authkey, because it may make future enable/disable operation easier (just changing corosync.conf).
Created attachment 1284135 [details] proposed fix 4
(In reply to Jan Friesse from comment #30)
> One question/proposal. From testcase (line ls -l /etc/corosync/a*) it looks
> like when no-hardened is enabled, authkey is not distributed. I would
> recommend to always distribute authkey, because it may make future
> enable/disable operation easier (just changing corosync.conf).

Unfortunately in the future enable/disable operation we cannot make an assumption that authkey exists. So we will need to distribute it anyways.
Created attachment 1285324 [details] proposed fix (part5)
After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.158-4.el7.x86_64

> 1) implicitly without encryption

[vm-rhel72-1 ~] $ pcs cluster setup --name=devcluster vm-rhel72-1 vm-rhel72-3 --start
Destroying cluster on nodes: vm-rhel72-1, vm-rhel72-3...
vm-rhel72-1: Stopping Cluster (pacemaker)...
vm-rhel72-3: Stopping Cluster (pacemaker)...
vm-rhel72-1: Successfully destroyed cluster
vm-rhel72-3: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'vm-rhel72-1', 'vm-rhel72-3'
vm-rhel72-3: successful distribution of the file 'pacemaker_remote authkey'
vm-rhel72-1: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
vm-rhel72-1: Succeeded
vm-rhel72-3: Succeeded
Starting cluster on nodes: vm-rhel72-1, vm-rhel72-3...
vm-rhel72-1: Starting Cluster...
vm-rhel72-3: Starting Cluster...
Synchronizing pcsd certificates on nodes vm-rhel72-1, vm-rhel72-3...
vm-rhel72-3: Success
vm-rhel72-1: Success
Restarting pcsd on the nodes in order to reload the certificates...
vm-rhel72-1: Success
vm-rhel72-3: Success
[vm-rhel72-1 ~] $ cat /var/log/cluster/corosync.log|grep security
[8322] vm-rhel72-1 corosyncnotice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
[vm-rhel72-1 ~] $ ls -l /etc/corosync/a*
zsh: no matches found: /etc/corosync/a*

> 2) explicitly with encryption

[vm-rhel72-1 ~] $ pcs cluster setup --name=devcluster vm-rhel72-1 vm-rhel72-3 --start --encryption=1
Destroying cluster on nodes: vm-rhel72-1, vm-rhel72-3...
vm-rhel72-1: Stopping Cluster (pacemaker)...
vm-rhel72-3: Stopping Cluster (pacemaker)...
vm-rhel72-1: Successfully destroyed cluster
vm-rhel72-3: Successfully destroyed cluster
Sending 'corosync authkey', 'pacemaker_remote authkey' to 'vm-rhel72-1', 'vm-rhel72-3'
vm-rhel72-3: successful distribution of the file 'corosync authkey'
vm-rhel72-3: successful distribution of the file 'pacemaker_remote authkey'
vm-rhel72-1: successful distribution of the file 'corosync authkey'
vm-rhel72-1: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
vm-rhel72-1: Succeeded
vm-rhel72-3: Succeeded
Starting cluster on nodes: vm-rhel72-1, vm-rhel72-3...
vm-rhel72-1: Starting Cluster...
vm-rhel72-3: Starting Cluster...
Synchronizing pcsd certificates on nodes vm-rhel72-1, vm-rhel72-3...
vm-rhel72-3: Success
vm-rhel72-1: Success
Restarting pcsd on the nodes in order to reload the certificates...
vm-rhel72-1: Success
vm-rhel72-3: Success
[vm-rhel72-1 ~] $ cat /var/log/cluster/corosync.log|grep security
[15309] vm-rhel72-1 corosyncnotice [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
[vm-rhel72-1 ~] $ ls -l /etc/corosync/a*
-r--------. 1 root root 128 6. čen 12.47 /etc/corosync/authkey

> authkey is not an ASCII text

[vm-rhel72-1 ~] $ file /etc/corosync/authkey
/etc/corosync/authkey: data
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1958