1740218 – [fips] knet transport fails to initialize with OS in fips mode


Bug 1740218 - [fips] knet transport fails to initialize with OS in fips mode


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 8
Classification:	Red Hat
Component:	pcs
Sub Component:
Version:	8.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	8.1
Assignee:	Tomas Jelinek
QA Contact:	cluster-qe@redhat.com
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1746565

Reported:	2019-08-12 13:38 UTC by Patrik Hagara
Modified:	2020-11-14 07:05 UTC (History)
CC List:	10 users (show)
Fixed In Version:	pcs-0.10.2-4.el8
Doc Type:	Bug Fix
Doc Text:	Cause: pcs creates a 384-byte corosync authkey. Consequence: Corosync does not start when FIPS mode is enabled. Fix: pcs now creates a 256-byte corosync authkey. Result: Corosync starts even when FIPS mode is enabled.
Clone Of:
Clones:	1746565 (view as bug list)
Environment:
Last Closed:	2019-11-05 20:40:02 UTC
Type:	Bug
Target Upstream Version:
Embargoed:
Dependent Products:
Flags:	pm-rhel: mirror+

Attachments
proposed fix (4.52 KB, patch) 2019-08-13 12:12 UTC, Tomas Jelinek

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2019:3311	0	None	None	None	2019-11-05 20:40:05 UTC

Description Patrik Hagara 2019-08-12 13:38:03 UTC

Description of problem:

> Aug 12 14:01:00 virt-063 systemd[1]: Starting Corosync Cluster Engine...
> Aug 12 14:01:00 virt-063 corosync[24924]:  [MAIN  ] Corosync Cluster Engine 3.0.2 starting up
> Aug 12 14:01:00 virt-063 corosync[24924]:  [MAIN  ] Corosync built-in features: dbus systemd xmlconf vqsim nozzle snmp pie relro bindnow
> Aug 12 14:01:01 virt-063 corosync[24924]:  [TOTEM ] Initializing transport (Kronosnet).
> Aug 12 14:01:01 virt-063 corosync[24924]:  [TOTEM ] knet_handle_crypto failed: -2
> Aug 12 14:01:01 virt-063 corosync[24924]:  [KNET  ] common: crypto_nss.so has been loaded from /usr/lib64/kronosnet/crypto_nss.so
> Aug 12 14:01:01 virt-063 corosync[24924]:  [KNET  ] nsscrypto: Failure to import key into NSS (-8190): security library: received bad data.
> Aug 12 14:01:01 virt-063 corosync[24924]:  [KNET  ] nsscrypto: Secret key is probably too long. Try reduce it to 256 bytes
> Aug 12 14:01:01 virt-063 corosync[24924]:  [MAIN  ] Can't initialize TOTEM layer
> Aug 12 14:01:01 virt-063 corosync[24924]:  [MAIN  ] Corosync Cluster Engine exiting with status 15 at main.c:1531.
> Aug 12 14:01:01 virt-063 systemd[1]: corosync.service: Main process exited, code=exited, status=15/n/a
> Aug 12 14:01:01 virt-063 systemd[1]: corosync.service: Failed with result 'exit-code'.
> Aug 12 14:01:01 virt-063 systemd[1]: Failed to start Corosync Cluster Engine.

Version-Release number of selected component (if applicable):
libknet1-1.10-1.el8.x86_64
corosync-3.0.2-3.el8.x86_64
pacemaker-2.0.2-2.el8.x86_64
pcs-0.10.2-3.el8.x86_64

How reproducible:
always

Steps to Reproduce:
1. enable fips mode on to-be cluster nodes
2. setup a cluster on the nodes using pcs
3. try to start the cluster

Actual results:
corosync fails to start due to knet crypto errors

Expected results:
cluster successfully starts in fips mode

Additional info:

Comment 1 Jan Friesse 2019-08-12 13:56:47 UTC

@Phagara:
this is neither a kronosnet nor a corosync bug. The error (and the solution) is logged:

> Secret key is probably too long. Try reduce it to 256 bytes

and there is really nothing knet can do about it, because in FIPS mode it is simply impossible to import a longer key. corosync-keygen generates a 256-byte key by default (exactly for this FIPS mode reason), but it looks like pcs is (for whatever reason) generating a 384-byte key :(

Anyway, reassigning to pcs.
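
For illustration, the key-length constraint described above can be sketched in Python. This is not the actual pcs patch: the function name `generate_authkey` and the constant `FIPS_SAFE_KEY_BYTES` are hypothetical, and the 256-byte ceiling is taken only from the knet log message quoted in this report.

```python
import os

# Assumption: 256 bytes is the largest key that can be imported in FIPS
# mode, as suggested by the knet log message ("Try reduce it to 256 bytes").
FIPS_SAFE_KEY_BYTES = 256


def generate_authkey(length: int = FIPS_SAFE_KEY_BYTES) -> bytes:
    """Return `length` random bytes for use as a cluster authkey.

    Hypothetical helper: rejects any length above the FIPS-safe ceiling
    so that an oversized key (such as 384 bytes) is never produced.
    """
    if not 0 < length <= FIPS_SAFE_KEY_BYTES:
        raise ValueError(
            f"key length must be between 1 and {FIPS_SAFE_KEY_BYTES} bytes, "
            f"got {length}"
        )
    # os.urandom draws from the operating system's CSPRNG.
    return os.urandom(length)
```

With this guard, a request for a 384-byte key fails at generation time instead of at corosync startup.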

Comment 3 Tomas Jelinek 2019-08-13 12:12:45 UTC

Created attachment 1603296
proposed fix

Comment 4 Tomas Jelinek 2019-08-13 13:09:06 UTC

After fix:
[root@rh80-node1:~]# rpm -q pcs
pcs-0.10.2-4.el8.x86_64

[root@rh80-node1:~]# pcs cluster setup rhel80-knet rh80-node1 rh80-node2
No addresses specified for host 'rh80-node1', using 'rh80-node1'
No addresses specified for host 'rh80-node2', using 'rh80-node2'
Destroying cluster on hosts: 'rh80-node1', 'rh80-node2'...
rh80-node1: Successfully destroyed cluster
rh80-node2: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'rh80-node1', 'rh80-node2'
rh80-node1: successful removal of the file 'pcsd settings'
rh80-node2: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'rh80-node1', 'rh80-node2'
rh80-node1: successful distribution of the file 'corosync authkey'
rh80-node1: successful distribution of the file 'pacemaker authkey'
rh80-node2: successful distribution of the file 'corosync authkey'
rh80-node2: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'rh80-node1', 'rh80-node2'
rh80-node1: successful distribution of the file 'corosync.conf'
rh80-node2: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.

[root@rh80-node1:~]# ls -l /etc/corosync/authkey
-r--------. 1 root root 256 Aug 13 14:07 /etc/corosync/authkey

[root@rh80-node1:~]# fips-mode-setup --check
FIPS mode is enabled.

[root@rh80-node1:~]# pcs cluster start --all --wait
rh80-node2: Starting Cluster...
rh80-node1: Starting Cluster...
Waiting for node(s) to start...
rh80-node1: Started
rh80-node2: Started

[root@rh80-node1:~]# systemctl status corosync
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-08-13 14:08:09 CEST; 30s ago
{...snip...}
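
The `ls -l` size check in the transcript above could also be scripted, for example to audit existing clusters before enabling FIPS mode. The following is a minimal sketch under stated assumptions: the helper name `authkey_fits_fips` is hypothetical, the default path is the one shown in the transcript, and the 256-byte limit comes from the knet log message in the description.

```python
import os

# Assumption: limit taken from the knet log message in this report.
MAX_FIPS_KEY_BYTES = 256


def authkey_fits_fips(
    path: str = "/etc/corosync/authkey",
    limit: int = MAX_FIPS_KEY_BYTES,
) -> bool:
    """Return True if the key file at `path` is no larger than `limit` bytes.

    Hypothetical helper: it inspects only the file size, not the key
    contents, which matches what `ls -l` shows in the transcript.
    """
    return os.stat(path).st_size <= limit
```

Reading `/etc/corosync/authkey` requires root, since the file is mode 0400 and owned by root in the transcript. A key created by an affected pcs build (384 bytes) would make this return `False`.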

Comment 10 errata-xmlrpc 2019-11-05 20:40:02 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3311
