Bug 1869232 - firewalld and snmpd services fail to start if RHEL-8.2 (FIPS enabled) VM runs on non-FIPS compliant RHEL-6 KVM hypervisor
Summary: firewalld and snmpd services fail to start if RHEL-8.2 (FIPS enabled) VM runs...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: rng-tools
Version: 8.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 8.0
Assignee: Vladis Dronov
QA Contact: Vilém Maršík
URL:
Whiteboard:
Depends On:
Blocks: 1680409
 
Reported: 2020-08-17 11:03 UTC by Abhishekh Patil
Modified: 2023-08-08 02:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-26 18:08:08 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description Abhishekh Patil 2020-08-17 11:03:37 UTC
Description of problem:

firewalld and snmpd services fail to start if RHEL-8.2 (FIPS enabled) VM runs on non-FIPS compliant RHEL-6 KVM hypervisor 


Version-Release number of selected component (if applicable):
fipscheck-1.5.0-4.el8.x86_64

How reproducible:
sometimes

Steps to Reproduce:
1. Deploy a RHEL-8.2 VM on a RHEL-6 KVM host
2. Enable FIPS on the VM and reboot it
3. Try to start firewalld and snmpd services

Actual results:
firewalld and snmpd services fail to start

Expected results:
firewalld and snmpd services should continue to run

Additional info:

Comment 2 Tomas Mraz 2020-08-17 12:05:15 UTC
This needs further investigation into why the failure happens. fipscheck is definitely not the right component, as its single purpose is to check .hmac checksums.

What is the error message in the journal for the failing firewalld and snmpd services?

Is there anything else failing?

Comment 19 Laurent Vivier 2020-08-31 09:59:08 UTC
I can see in the guest's messages file:

Aug 21 19:32:34 derya rngd[1119]: Initializing available sources
Aug 21 19:32:34 derya rngd[1119]: Failed to init entropy source hwrng
Aug 21 19:32:34 derya rngd[1119]: Failed to init entropy source rdrand

According to lspci from attachment 1712303, there is no virtio-rng device attached to the guest.

firewalld failed to start because of timeout:

Aug 21 19:34:03 derya systemd[1]: firewalld.service: Start operation timed out. Terminating.
Aug 21 19:34:03 derya systemd[1]: firewalld.service: Failed with result 'timeout'.
Aug 21 19:34:03 derya systemd[1]: Failed to start firewalld - dynamic firewall daemon.

Perhaps the firewalld daemon is waiting for entropy that never comes?

As the system has neither the RDRAND and RDSEED instructions nor a virtio-rng device, I think there is no hardware entropy source.

The FIPS developers should be asked whether FIPS mode can work without a hardware entropy source.
[trying Tomáš Mráz as he has some fips-mode-setup BZ assigned to him...]
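
The absence of RDRAND and RDSEED can be confirmed from inside the guest by looking at the CPU flags. A minimal sketch, assuming an x86 guest where /proc/cpuinfo has a "flags" line; has_cpu_flag is a hypothetical helper name, not part of any tool mentioned in this bug:

```shell
#!/bin/sh
# has_cpu_flag (hypothetical helper): succeed if FLAG appears as a whole
# word on the first "flags" line of a cpuinfo-style file.
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # -w matches whole words only, so "rdrand" will not match a longer flag name
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

# Report the two instructions discussed above for the running system.
for f in rdrand rdseed; do
    if has_cpu_flag "$f"; then
        echo "$f: present"
    else
        echo "$f: missing"
    fi
done
```

The second argument lets the helper be pointed at a saved copy of /proc/cpuinfo (for example from an sosreport) instead of the live system.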

Comment 20 Tomas Mraz 2020-08-31 10:45:30 UTC
There must be some kind of entropy source; otherwise, yes, the initialization of various system components can take a very long time (until the kernel entropy pool is initialized for the first time).

So this is unfortunately the expected outcome. The kernel in RHEL-8.3 should have the jitter entropy generator built in, so that should eventually help to resolve the issue.

However, rngd should be able to provide a jitter entropy generator, and that should help in this situation even on RHEL-8.2. What does rngd -l print?
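
Whether the kernel pool has been initialized can be checked from the guest. A minimal sketch, assuming the kernel logs the "random: crng init done" message (as RHEL-8 era kernels do) and that the message is still in the ring buffer; crng_initialized is a hypothetical helper name:

```shell
#!/bin/sh
# crng_initialized (hypothetical helper): succeed if the kernel log text
# on stdin contains the "crng init done" message.
crng_initialized() {
    grep -q 'random: crng init done'
}

# dmesg may need root; if it fails, the check simply reports "not seen".
if dmesg 2>/dev/null | crng_initialized; then
    echo "kernel RNG: initialized"
else
    echo "kernel RNG: init message not seen"
fi

# Current estimate of available entropy, in bits.
pool=/proc/sys/kernel/random/entropy_avail
if [ -r "$pool" ]; then
    echo "entropy_avail: $(cat "$pool") bits"
fi
```

A missing message is only a hint, since the ring buffer may have wrapped on a long-running system.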

Comment 21 Laurent Vivier 2020-09-04 13:10:36 UTC
Abhishekh,

Could you provide the result of "rngd -l; echo $?" in the VM?

Thanks

Comment 22 Siddhant Rao 2020-09-09 09:21:21 UTC
Laurent, here is the output requested,

~~~
# rngd -l; echo $?
Entropy sources that are available but disabled

1: TPM RNG Device (tpm)

4: NIST Network Entropy Beacon (nist)

Available and enabled entropy sources:

5: JITTER Entropy generator (jitter)

1
~~~

Let me know if you need anything else

Comment 23 Laurent Vivier 2020-09-09 09:49:54 UTC
(In reply to Siddhant Rao from comment #22)
> Laurent, here is the output requested,
> 
> ~~~
> # rngd -l; echo $?
> Entropy sources that are available but disabled
> 
> 1: TPM RNG Device (tpm)
> 
> 4: NIST Network Entropy Beacon (nist)
> 
> Available and enabled entropy sources:
> 
> 5: JITTER Entropy generator (jitter)
> 
> 1

Tomas, what is your conclusion?

I think the problem is not related to qemu-kvm, I'd like to see this BZ assigned back to firewalld...

Comment 24 Tomas Mraz 2020-09-09 10:04:44 UTC
Yep, either there is a problem with firewalld or the jitter entropy generator in rngd does not work as expected. I would expect that the rngd jitter entropy generator should provide enough entropy to seed the kernel rng in less than 90 seconds (that is seen in the logs above).
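
The interval can be estimated from the two log timestamps quoted in comment 19. A minimal sketch, assuming both lines are from the same day; to_seconds is a hypothetical helper name, and the rngd line only approximates when firewalld started:

```shell
#!/bin/sh
# to_seconds (hypothetical helper): convert HH:MM:SS to seconds since midnight.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

start=$(to_seconds 19:32:34)   # rngd: "Initializing available sources"
end=$(to_seconds 19:34:03)     # systemd: "firewalld.service: Start operation timed out"
echo "elapsed: $((end - start)) s"   # prints "elapsed: 89 s"
```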

Comment 27 Vladis Dronov 2020-09-29 20:57:13 UTC
could you please run the following test on a problem machine (where there is not enough entropy):

# ps -ef | grep rngd <== so we can check arguments rngd is running with on this system

# systemctl stop rngd

or:

# kill <rngd pid>

# /sbin/rngd -l

# /sbin/rngd -O jitter

# /sbin/rngd -x 0 -x 1 -x 2 -x 3 -x 4 -x 6 -t <== let this run for, say, 10 seconds, then stop it with Ctrl-C

please provide the output of the above. please ensure this is a problem machine showing issues with firewalld and sshd.

Comment 28 Vladis Dronov 2020-09-29 20:59:26 UTC
before running "/sbin/rngd -l" above, please double-check that the "rngd" process is not running with:

# ps -ef | grep rngd

Comment 32 Vladis Dronov 2020-10-06 18:00:46 UTC
ok, so:

1) in FIPS mode, the FIPS tests must run at boot time, and they drain a lot of entropy from the kernel pool. that's why the FIPS case at boot is worse than the non-FIPS case.

2) jitter is not a magic infinite entropy source. moreover, jitter is a slow entropy source (see bz1715899#c4):

> jitterentropy entropy source is meant to provide entropy early during boot, when other entropy sources aren't available, but is, by design slow, and cpu bound.

so the answer to why "the jitter entropy RNG in rngd is not collecting the entropy quickly enough" is "it is by design". this cannot be fixed.

3) the advice for fixing the case in this bz is to look into providing a virtio-rng device to the guest. that will work as a good entropy source at VM boot.

4) (this is not related to this exact case, but still) set random.trust_cpu=on for the VM kernel. this makes the kernel use the RDRAND instruction early at boot (before userspace, incl. rngd). a -8.3 kernel will make this setting the default, so no kernel parameter will be needed.

5) there is a kernel jitter patch (commit 50ee7529ec45 "random: try to actively add entropy rather than passively wait for it") developed recently upstream. it allows for a lot of entropy at boot. this patch was backported to -8.3 (commit edc3002c84e7, bz1778762) and is being backported to -8.2.z (bz1884682). the advice is to use these kernels.

6) i've built a recent -8.2.z kernel with the above patch and it solves the issue in my test environment. a customer could use this test kernel as a band-aid:

Task info: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=31725829
http://brew-task-repos.usersys.redhat.com/repos/scratch/vdronov/kernel/4.18.0/193.25.1.el8_2.muchrng/

that's all i have to state for this case. please update the bug if there are any further concerns.
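
Points 3) and 4) can be checked from inside the guest. A minimal sketch, assuming the standard /proc/cmdline and /sys/class/misc/hw_random paths; has_kernel_arg is a hypothetical helper name:

```shell
#!/bin/sh
# has_kernel_arg (hypothetical helper): succeed if ARG appears as an exact
# argument in a kernel command line file.
has_kernel_arg() {
    arg=$1
    cmdline=${2:-/proc/cmdline}
    [ -r "$cmdline" ] || return 1
    # One argument per line, then an exact fixed-string match on a whole line.
    tr ' ' '\n' < "$cmdline" | grep -qxF -- "$arg"
}

if has_kernel_arg random.trust_cpu=on; then
    echo "random.trust_cpu=on: set on the kernel command line"
else
    echo "random.trust_cpu=on: not set on the kernel command line"
fi

# The hardware RNG currently selected by the kernel, if any (for example
# a virtio-rng device once one has been added to the guest).
hw=/sys/class/misc/hw_random/rng_current
if [ -r "$hw" ]; then
    echo "current hwrng: $(cat "$hw")"
fi
```

On RHEL-8 the parameter could be added with grubby --update-kernel=ALL --args="random.trust_cpu=on" followed by a reboot. As noted in 4), this only helps on CPUs that actually expose RDRAND.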

Comment 35 Vladis Dronov 2020-11-26 18:08:08 UTC
no update for a month+, closing.

