Bug 1464200

Summary: rngd: "read error" warning when attempting to access nonexistent device
Product: Red Hat Enterprise Linux 7 Reporter: Vilém Maršík <vmarsik>
Component: rng-tools Assignee: Neil Horman <nhorman>
Status: CLOSED ERRATA QA Contact: Vilém Maršík <vmarsik>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.4 CC: cuilj2, emcnabb, hannsj_uhl, iav, jbastian, liu.junbj, mdshaikh, nhorman, steved, vagrawal, vmarsik, yozone
Target Milestone: rc   
Target Release: 7.5   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 10:09:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1420851, 1442258, 1473033    
Attachments:
Description Flags
test patch none

Description Vilém Maršík 2017-06-22 15:52:21 UTC
This was shown during testing of Bug #1421234. 

Description of problem:

# rngd -f
read error
...
^C

# strace -f rngd -f
...
open("/dev/hwrng", O_RDONLY)            = 3
read(3, 0x7fff6adc3880, 16)             = -1 ENODEV (No such device)
write(2, "read error\n", 11read error
)            = 11
...

Not sure which part is to blame: either the kernel (/dev/hwrng can be opened, but returns -ENODEV on a read attempt) or rng-tools (rngd prints "read error" on access to a nonexistent device).
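The open-succeeds-but-read-fails behavior seen in the strace output can be checked without rngd. The following is a minimal sketch, not rngd's own code: the function name probe_rng_device and the status strings are made up for illustration, and the 16-byte read size is simply copied from the strace output above.

```python
import errno
import os


def probe_rng_device(path="/dev/hwrng", nbytes=16):
    """Open the device and attempt one small read; return a short status string.

    Hypothetical helper for illustration only; rngd does not expose this.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return "open failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        data = os.read(fd, nbytes)
    except OSError as exc:
        if exc.errno == errno.ENODEV:
            # open() worked, but no hardware RNG backs the device node
            return "no backing device"
        return "read error: " + errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
    return "ok" if len(data) == nbytes else "short read"
```

On a machine like the one in this report, calling probe_rng_device() would be expected to return "no backing device", since the read fails with ENODEV after a successful open.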

Version-Release number of selected component (if applicable):
rng-tools-5-11.el7.x86_64
kernel is 3.10.0-685.el7.x86_64

How reproducible:
100% (may require no HW entropy source to be present)

Steps to Reproduce: see above

Actual results:
read error


Expected results:
no read error message unless a real read error happens

Additional info:
In https://bugzilla.redhat.com/show_bug.cgi?id=1421234#c52 the developer suggests that rngd should report an error and exit when there is no HW entropy source. However, the machine used here has the RDSEED CPU extension (Xeon E3-1285L v4 @ 3.4GHz), so this does not apply (and the daemon does not auto-exit either). The error message is still shown.

Comment 2 Neil Horman 2017-06-22 17:17:54 UTC
this isn't going to make 7.4 at this point, moving to 7.5

Comment 3 Neil Horman 2017-06-22 17:31:06 UTC
Your initial comment is conflating several items:

1) The "read error" message will now appear any time the hwrng is opened, if a backing hardware rng is not present (which seems to be the case on this system).  Note that the presence of the RDSEED extensions does not constitute the availability of a hwrng (at least not by default; it may be possible to tell the kernel to use RDSEED as a hwrng, but that doesn't happen by default).

2) The lack of exit is expected here, because the use of RDSEED is configured independently by rngd.  That is to say, it looks for RDSEED availability, /dev/hwrng, and a TPM as rng sources.  Only if none of those is available will rngd exit for lack of an entropy source.  Since you have RDSEED available, it should run fine, which it sounds like it is.  If you want to observe it exiting, run rngd with the -d option.
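The "exit only if no source at all is available" rule in item 2 can be sketched as follows. This is an illustration under assumptions, not rngd's implementation: init_sources, the source names, and the lambda probes are all hypothetical stand-ins for the real hwrng, TPM, and RDSEED checks.

```python
def init_sources(sources):
    """Try every (name, probe) pair and return the names whose probe succeeded.

    Raises RuntimeError only when no source at all is usable. The names and
    probes are hypothetical; they are not taken from the rngd source code.
    """
    available = []
    for name, probe in sources:
        try:
            ok = probe()
        except OSError:
            # A failing probe disables that one source, nothing more
            ok = False
        if ok:
            available.append(name)
    if not available:
        raise RuntimeError("can't open any entropy source")
    return available


# Stand-in probes for the machine in this report: no hwrng, no TPM, RDSEED present
example = [
    ("hwrng", lambda: False),
    ("tpm", lambda: False),
    ("rdseed", lambda: True),
]
print(init_sources(example))  # ['rdseed']
```

With all three probes returning False, the same call raises RuntimeError, which corresponds to the daemon exiting for lack of an entropy source.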

All in all this sounds like it's working as designed, with the possible exception of the read error on attempting to set up a nonexistent /dev/hwrng.  Do you just want that error removed?

Comment 4 Evan McNabb 2017-06-23 16:52:02 UTC
Hi Neil,

I'll throw in my unsolicited 2 cents... I think perhaps a more detailed error message (or messages? if we want different scenarios to report different errors) might be good here.

Comment 5 Vilém Maršík 2017-07-10 12:17:54 UTC
This is how I understand it:
Is there a good reason that open() on /dev/hwrng succeeds, but read() attempts return -ENODEV?
Yes -> please remove the rngd error message in that case, as this is expected behavior
No -> please fix the device, so that -ENODEV is returned on open(); then let's re-test how rngd handles that

Comment 6 Neil Horman 2017-07-10 14:49:29 UTC
In answer to your first question, yes, because the underlying hwrng can be altered at run time, and so the device may change at any time, up until a user space process opens it and takes a reference to it.  However, removing the error isn't valid as a lack of hwrng implementation isn't something that can be recovered from dynamically (i.e. we should inform the user and exit).
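Because the backing hwrng can change at run time, one way to see which backend (if any) is currently selected is the hw_random sysfs directory, which exposes rng_current and rng_available. The sketch below is an assumption-laden illustration: the helper name current_hwrng is made up, and the placeholder strings treated as "no backend" are a guess, since the exact text differs between kernel versions.

```python
import os

HW_RANDOM_SYSFS = "/sys/class/misc/hw_random"

# Assumed placeholders for "no backend"; the exact text varies by kernel version
NO_BACKEND = {"", "none", "(null)"}


def current_hwrng(sysfs_dir=HW_RANDOM_SYSFS):
    """Return the name of the active hwrng backend, or None if there is none."""
    try:
        with open(os.path.join(sysfs_dir, "rng_current")) as f:
            name = f.read().strip()
    except OSError:
        # Missing file, or the kernel refused the read: treat as "no backend"
        return None
    return None if name in NO_BACKEND else name
```

A result of None from current_hwrng() would be consistent with read() on /dev/hwrng returning -ENODEV even though open() succeeded.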

If we remove the error, we remove the error in all cases, even when it's legitimate (i.e. we later find we also don't have rdseed and exit), implying silent failure.  It also eliminates the possibility of detecting errors at run time, if the hwrng fails at some later point in its use.

About the only thing we can really do here is eliminate the error message at init time and print it only when we have initialized the entropy source successfully and then encounter an error during normal operation. That involves changing some of our FIPS initialization code, though, and I'm hesitant to do that as I don't know what the impact will be on FIPS certification.

We can try it for the purpose of testing, but I'm not sure it will be a legitimate solution
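The idea of reporting "read error" only after a source has initialized successfully can be sketched like this. It is not the attached test patch and does not touch any FIPS code; the class EntropySource, its method names, and the init-time message strings are hypothetical.

```python
import errno
import sys


class EntropySource:
    """Tracks whether a source finished init so failures can be reported differently."""

    def __init__(self, name, read_fn):
        self.name = name
        self.read_fn = read_fn  # callable(n) returning bytes or raising OSError
        self.initialized = False

    def init(self):
        """Do one trial read; return a message to log, or None on success."""
        try:
            self.read_fn(16)
        except OSError as exc:
            if exc.errno == errno.ENODEV:
                # No backing device: an expected condition, not a read error
                return f"{self.name}: no available rng"
            return f"{self.name}: init failed"
        self.initialized = True
        return None

    def read(self, n):
        """Read from an initialized source; only here is 'read error' printed."""
        if not self.initialized:
            raise RuntimeError("source was never initialized")
        try:
            return self.read_fn(n)
        except OSError:
            print("read error", file=sys.stderr)
            raise
```

The design choice is that ENODEV during init() is downgraded to an informational message, while any failure in read() after a successful init is still surfaced, so run-time hardware failures remain visible.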

Comment 7 Neil Horman 2017-07-10 15:34:25 UTC
Created attachment 1295857 [details]
test patch

Comment 8 Vilém Maršík 2017-07-10 16:36:50 UTC
Yes, it would be better if rngd matched the kernel's logic and checked for appearing and disappearing entropy sources on an open hwrng device. But if we cannot implement this, then your patch is much better than nothing, as a nonexistent HW entropy source won't be mixed up with the other "read error"s anymore. Thanks, will test the patch.

Comment 9 Vilém Maršík 2017-07-10 19:17:56 UTC
What more info are you requesting?

Comment 10 Neil Horman 2017-07-10 19:23:29 UTC
The results of the test, please. I've not had time to run it yet.

Comment 11 Vilém Maršík 2017-07-11 14:11:19 UTC
Looks good:

# rngd -f
hwrng: no available rng
Unable to open file: /dev/tpm0
^C

Comment 12 Neil Horman 2017-07-11 16:53:16 UTC
OK, so at this point I guess the thing to do is wait for PM ack so they can decide what the FIPS impact is.

Comment 16 Vilém Maršík 2017-07-12 00:08:06 UTC
I did not do any VMware testing. I'm not sure whether you have the same issue or a different read error. Perhaps you could find something interesting in "strace -f rngd -f"?
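When looking through strace output for this kind of problem, the failing system calls are the lines ending in "= -1 ERRNO". A small filter such as the following can pull them out; it is only a sketch, the function name failed_syscalls is made up, and the sample text is the strace excerpt from the description of this bug.

```python
import re

# Matches e.g. 'read(3, 0x7fff6adc3880, 16) = -1 ENODEV (No such device)',
# optionally prefixed with '[pid 1234] ' as printed by strace -f
FAILED_CALL = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(.*\)\s+= -1 (E[A-Z0-9]+)")


def failed_syscalls(strace_text):
    """Return (syscall, errno_name) pairs for every failed call in strace output."""
    failures = []
    for line in strace_text.splitlines():
        match = FAILED_CALL.match(line)
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


sample = (
    'open("/dev/hwrng", O_RDONLY)            = 3\n'
    "read(3, 0x7fff6adc3880, 16)             = -1 ENODEV (No such device)\n"
)
print(failed_syscalls(sample))  # [('read', 'ENODEV')]
```

A read on /dev/hwrng failing with ENODEV would point to the issue in this bug; a different errno would suggest a separate problem.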

Comment 17 Neil Horman 2017-07-12 10:51:37 UTC
The "read error" message is hard-coded within rngd itself.  The patch here will fix the VMware issue as well.  The kernel's only part in this is returning the error that makes rngd print "read error", and it is doing so validly.

Comment 18 Vishal Agrawal 2017-07-12 17:34:24 UTC
(In reply to Neil Horman from comment #17)
> The "read error" message is hard-coded within rngd itself.  The patch here
> will fix the VMware issue as well.  The kernel's only part in this is
> returning the error that makes rngd print "read error", and it is doing so
> validly.

Thanks Neil, I'll be waiting for the patch.

- Vishal Agrawal.

Comment 22 Steve Dickson 2017-11-28 19:08:13 UTC
*** Bug 1518376 has been marked as a duplicate of this bug. ***

Comment 23 Vilém Maršík 2017-12-15 10:21:18 UTC
I had to install 7.3 to reproduce the original behavior: "read error" printed in a loop multiple times, with no other output. On the same machine, 7.4 prints the error just once, together with extra information about the entropy sources it tries to access. 7.5 does not print the error at all, only even more extra information:

RHEL 7.3:
[root@dell-per210-01 ~]# rpm -q rng-tools
rng-tools-5-8.el7.x86_64
[root@dell-per210-01 ~]# rngd -f
read error

read error
...

RHEL 7.4:
[root@dell-per210-01 ~]# rpm -q rng-tools
rng-tools-5-11.el7.x86_64
[root@dell-per210-01 ~]# rngd -f
read error

hwrng: no available rng
Unable to open file: /dev/tpm0
can't open any entropy source
Maybe RNG device modules are not loaded


RHEL 7.5 Beta:
[root@dell-per210-01 ~]# rpm -q rng-tools
rng-tools-5-13.el7.x86_64
[root@dell-per210-01 ~]# rngd -f
Failed to init entropy source 0: Hardware RNG Device

Failed to init entropy source 1: TPM RNG Device

Failed to init entropy source 2: Intel RDRAND Instruction RNG

can't open any entropy source
Maybe RNG device modules are not loaded


----

Not sure why I could not reproduce the original behavior on 7.4, only on 7.3. I might have been using a 7.4 beta version of rng-tools that was patched before the final 7.4 release, but I did not find the corresponding patch in git. Anyway, this is not that important, as the problem is clearly fixed in 7.5, and the error message is no longer shown.

Setting verified.

Comment 27 errata-xmlrpc 2018-04-10 10:09:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0698

Comment 29 Red Hat Bugzilla 2023-09-15 00:02:41 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days