Bug 117218 - Entropy pool not updated (/dev/random blocks)
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Hardware: i686 Linux
Priority: medium, Severity: medium
Assigned To: Ernie Petrides
Reported: 2004-03-01 13:44 EST by Aaron Straus
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2005-07-22 16:42:47 EDT

Attachments
/dev/random driver fixes committed in RHEL3 U3 (15.44 KB, patch)
2004-05-11 14:40 EDT, Ernie Petrides

Description Aaron Straus 2004-03-01 13:44:24 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5)
Gecko/20040130 Firebird/0.7

Description of problem:
We have a number of Dell PowerEdge 1750s.   All are configured

On one machine we've gotten it to a state where the entropy pool is
never updated so /dev/random always blocks on reads.  No process that
I can see is reading /dev/random.  /dev/random has worked on this
machine in the past.  We have an identically configured machine where
/dev/random will return random bytes.   It does not help to do disk
accesses or use the keyboard & mouse.

I have not tried rebooting yet; my guess is that it will fix it.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.  cat /dev/random | od
2.  wiggle mouse, type on keyboard, do disk accesses

Actual Results:  no output from cat

Expected Results:  on an identically configured machine random bytes
are printed

Additional info:
All the machines have SCSI drives.
Comment 2 Aaron Straus 2004-03-06 13:04:03 EST
We had to reboot the machine today.  /dev/random is now fine.
Comment 3 Stefan Hudson 2004-03-11 11:55:09 EST
Just had the same problem on a Dell PowerEdge 2550 with an LSI RAID
card.  Kernel kernel-smp-2.4.21-9.EL.  Rebooting corrected it as well.

Additional information:
"service random stop; service random start" did not help.
/dev/urandom does not block.
Comment 4 Stefan Hudson 2004-03-16 18:45:27 EST
It happened again.  Reboot fixed it again.  Kernel 9.0.1-smp this
time.  This is causing downtime on a production server.  What should I
look for if (when) it happens again to provide additional information?
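
One way to capture useful state when the hang recurs is to log the kernel's entropy counters. The sketch below is only an illustration: the function name entropy_snapshot is hypothetical, and it assumes the kernel exposes entropy_avail, poolsize and read_wakeup_threshold under /proc/sys/kernel/random, as described in random(4).

```shell
#!/bin/sh
# Hypothetical helper (not part of any Red Hat tool): print one
# timestamped line with the entropy-pool counters found in a directory.
# $1 = directory holding the counters; the default is an assumption
#      based on random(4).
entropy_snapshot() {
    dir="${1:-/proc/sys/kernel/random}"
    line=$(date '+%Y-%m-%d %H:%M:%S')
    for name in entropy_avail poolsize read_wakeup_threshold; do
        # A missing or unreadable file is reported as "unknown".
        value=$(cat "$dir/$name" 2>/dev/null || echo unknown)
        line="$line $name=$value"
    done
    echo "$line"
}
```

Calling entropy_snapshot every few seconds while typing or generating disk traffic would show whether entropy_avail ever rises. Where psmisc is installed, "fuser -v /dev/random" lists any process that has the device open.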
Comment 5 Mark DeWandel 2004-03-17 09:47:50 EST
*** Bug 101266 has been marked as a duplicate of this bug. ***
Comment 6 yuval yeret 2004-03-22 04:48:21 EST
Reproduced on 2.4.21-9 (9.0.1-smp) on two different machines 
(supermicro p4 dual-xeon with HT with qlogic2300 HBAs, 
 supermicro p4 dual-xeon with HT with 3ware IDE RAID)

Comment 7 Stefan Neufeind 2004-03-22 15:36:18 EST
also have a look here. seems like a similar problem, here on a 
Comment 8 Leonard den Ottolander 2004-03-25 14:53:25 EST
AFAICT /dev/random running out of entropy is expected behaviour. It
needs input for new entropy. This is why /dev/urandom is there.

I would say NOTABUG, but won't close as I am not 100% sure.
Comment 9 Leonard den Ottolander 2004-03-25 15:04:03 EST
man 4 random states:

When the entropy pool is empty, reads from /dev/random will block
until additional environmental noise is gathered.

I.e. known behaviour. Closing NOTABUG.
Comment 10 Aaron Straus 2004-03-25 15:29:24 EST
I disagree.  It is true that /dev/random __should__ block until there
is sufficient entropy.  However, the problem is that the entropy pool
is __never__ refilled.  Moving the mouse, typing on the keyboard and
disk activity should all refill the pool and you should get bytes out
of /dev/random at that point.  On my system this never happened.

Also nothing was reading /dev/random, so nothing was draining the
entropy pool.  

I believe this is a bug?
Comment 11 Elliot Lee 2004-03-25 15:42:24 EST
What Aaron said. The problem is that it blocks forever. It does not
gather additional environmental noise and pass it to the app trying to
do the read.
Comment 12 Mark DeWandel 2004-03-25 15:58:46 EST
This is undoubtedly a bug and will be fixed in RHEL3 U3.  I have
back-ported changes from 2.6 which will be committed after U3 opens.
The problem is that critical data structures are completely unguarded
by locks and consequently end up in a state in which no entropy is
generated on SMP systems.  Has anyone reproduced this problem on a
Comment 13 Leonard den Ottolander 2004-03-25 17:24:01 EST
Sorry for this. I got confused by the fact that bug 118921, which
references this bug, only states that /dev/random blocks when it runs
out of entropy. That is expected behaviour.

Instead of closing this bug I should have looked more closely before
doing so. Different reporters have different issues. Again, sorry. It
hopefully won't happen again.
Comment 14 Stefan Neufeind 2004-03-25 18:02:07 EST
The problem of not having the entropy pool fill again occurred for me 
on a uniprocessor machine, so this is not (exclusively) SMP-related. 
Maybe there are two issues with the same effect?
Comment 15 Mark DeWandel 2004-03-30 13:41:49 EST
Yes, I believe that missed wakeups are also a possibility.  However,
I have had no success reproducing this on any in-house machine other
than production servers.  Consequently, I am making a test kernel
available for testing on my Red Hat "people" page.  The URL is
http://people.redhat.com/~mdewand/.dev_random/.  Here you will
find the following two choices for download:


These kernels contain changes to the /dev/random driver back-ported
from 2.6.  I would appreciate any feedback that anyone can provide
regarding their experiences with either of these kernels.
Comment 16 Ernie Petrides 2004-03-31 17:35:43 EST
*** Bug 119526 has been marked as a duplicate of this bug. ***
Comment 17 Jim Richard 2004-04-04 21:02:12 EDT

Thanks for including my bug in this one; sorry I missed it when I searched
for it. Just a reminder: I see this on RH 8 systems as well, though I'm
aware that they are out of support. So this problem probably exists in
other kernels as well.

Comment 18 Mark DeWandel 2004-04-06 08:26:47 EDT
Any feedback regarding the RPMs I posted a week ago?
Comment 19 Stefan Neufeind 2004-04-06 09:45:03 EDT
Sorry, I currently have no chance to reproduce this on a server, since
they are in production and I don't have adequate test equipment here at
the moment.
Comment 20 Stefan Hudson 2004-04-06 11:33:47 EDT
They were installed on the server that had the problem last week
(wednesday, I think).  So far so good, but the problem was rare enough
that it will be a few weeks before I can comfortably say it's fixed.

If it does turn out to be a fix, can you provide patched kernels for
any kernel updates until it's included in U3?
Comment 21 Stefan Neufeind 2004-04-06 11:44:10 EDT
Hi Stefan H.,

could you maybe do/try some stresstesting? I'm thinking about reading 
from /dev/random to /dev/null until it's empty or so. It should imho 
be possible to read faster from /dev/random than the entropy-pool can 
fill up again. And then we could see if at the point where the pool 
is exhausted new entropy is still generated.

by the way: You're also running it on a server without kbd/mouse? And 
does the disk have few/high hdd-activity?
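
The stress test suggested above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested procedure for these kernels: the helper names are hypothetical, the counter is assumed to live at /proc/sys/kernel/random/entropy_avail, and the check is only meaningful on kernels where /dev/random blocks when the pool is empty, as it does on the ones discussed here.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Print "refilling" if any sample is larger than the one before it,
# otherwise "stalled". Arguments are successive entropy_avail readings.
pool_trend() {
    prev=""
    for cur in "$@"; do
        if [ -n "$prev" ] && [ "$cur" -gt "$prev" ]; then
            echo refilling
            return 0
        fi
        prev=$cur
    done
    echo stalled
}

# Read COUNT samples from FILE, INTERVAL seconds apart, and print them
# on one line separated by spaces.
sample_entropy() {
    file=$1; count=$2; interval=$3
    samples=""
    i=0
    while [ "$i" -lt "$count" ]; do
        samples="$samples $(cat "$file")"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    echo $samples
}

# Drain the pool in the background, then watch whether it recovers.
# Not called automatically: the dd read blocks while the pool is empty.
drain_and_watch() {
    dd if=/dev/random of=/dev/null bs=1 count=4096 2>/dev/null &
    drain_pid=$!
    pool_trend $(sample_entropy /proc/sys/kernel/random/entropy_avail 30 2)
    kill "$drain_pid" 2>/dev/null
}
```

On an affected machine the expectation would be that drain_and_watch prints "stalled" even while disk or keyboard activity is generated, whereas a healthy machine should show at least one increase in the counter.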
Comment 23 Stefan Hudson 2004-05-10 18:21:40 EDT
Sorry for not following up on this sooner.  We have tested /dev/random
a number of times over the last few weeks as you describe, and the
entropy pool always fills back up correctly now after being exhausted.

The server has a keyboard and mouse attached through a KVM, but it is
not selected most of the time - someone logs into it for a few minutes
every couple days on average.

Did this patch make it into 9.0.3?  Or do we need to wait for RHEL3-U3
to get it in the mainline kernel?
Comment 24 Ernie Petrides 2004-05-11 00:01:21 EDT
The fixes for this problem that Mark DeWandel back-ported from 2.6
have just been committed to the RHEL3 U3 patch pool this evening
(in kernel version 2.4.21-15.3.EL).
Comment 25 Ernie Petrides 2004-05-11 00:16:19 EDT
Stefan, just to clarify, the fix did *not* make it into -9.0.3.EL
nor into -15.EL (the U2 kernel).  Thus, the first officially
supported RHEL3 kernel with the fix will be the U3 kernel.
Comment 26 Stefan Neufeind 2004-05-11 01:21:44 EDT
Will these fixes also soon be ported over to Fedora? Anything known 
about their next kernel-release that might include this?

Thank you guys for taking this bug seriously!
Comment 27 Ernie Petrides 2004-05-11 14:38:10 EDT
Stefan, my understanding is that the fixes came from 2.6, which
is what Fedora (as of FC2) is based on.  So I'd guess the fixes
are there already.  If you need me to check out a specific FC
kernel version to verify that the fixes are contained there,
please let me know.  I'll attach the RHEL3 U3 patch that I
committed last night in the next comment for reference.
Comment 28 Ernie Petrides 2004-05-11 14:40:05 EDT
Created attachment 100157 [details]
/dev/random driver fixes committed in RHEL3 U3
Comment 29 John Flanagan 2004-09-02 00:31:06 EDT
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

Comment 30 Stefan Neufeind 2004-09-02 01:40:20 EDT
Thank you very much. Does somebody know when ports of these fixes 
will occur in Fedora Core 2?
Comment 31 Leonard den Ottolander 2004-09-02 06:24:21 EDT
Stefan, your question in comment #30 is already answered in comment #27.
Comment 32 Jos Martin 2004-11-25 13:39:27 EST
Still seeing this on 2.4.21-20.ELsmp - the errata suggests it's been
fixed and yet on a server with no keyboard and mouse the /dev/random
device can produce no output for minutes at a time. This is causing
all java JINI services to hang on startup and some java SSL services.
The only safe workaround for JINI is 

rm /dev/random
mknod -m 0444 /dev/random c 1 9

as suggested in http://linux.about.com/od/commands/l/blcmdl4_random.htm
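
Note that the mknod call above creates a node with major 1, minor 9, which is the device number of /dev/urandom; /dev/random itself is major 1, minor 8. After such a change it can be unclear which driver a given path really refers to. The following sketch shows one way to check; the function names are hypothetical, and it assumes GNU stat, whose %t and %T formats print the major and minor numbers in hexadecimal.

```shell
#!/bin/sh
# Map a character device's major/minor pair (decimal) to the name it
# denotes. Only the two pairs relevant to this bug are recognised.
random_node_kind() {
    case "$1:$2" in
        1:8) echo "random (blocking)" ;;
        1:9) echo "urandom (non-blocking)" ;;
        *)   echo "other" ;;
    esac
}

# Report what a path such as /dev/random actually is.
# Assumption: GNU stat is available (-c with %t and %T).
check_node() {
    ids=$(stat -c '%t %T' "$1" 2>/dev/null) || {
        echo "$1: cannot stat"
        return 1
    }
    # %t and %T are hexadecimal; convert to decimal before matching.
    major=$((0x${ids% *}))
    minor=$((0x${ids#* }))
    echo "$1: $(random_node_kind "$major" "$minor")"
}
```

For example, "check_node /dev/random" would be expected to report "urandom (non-blocking)" once the workaround has been applied, and "random (blocking)" on an unmodified system.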
Comment 33 Leonard den Ottolander 2004-11-25 16:45:43 EST
Comment 32: Jos, please see comment 8, 9 and 10.
Comment 34 Guillaume Berche 2005-07-22 07:58:30 EDT
(In reply to comment #33)
> Comment 32: Jos, please see comment 8, 9 and 10.

I did read comments #8 through #10 but could not find the answer to Jos's
question: does the fix included in RHEL3 U3 and attached in comment #28 make
use of /dev/random possible on a headless server such as the ones in most data
centers (i.e. without mouse and keyboard)?

Was the fix able to include other environmental data such as interrupts or
hardware specific data such as hard disk statistics (BTW more details about how
the fix works would certainly help in understanding how the bug was fixed)?

Is the workaround of using /dev/urandom instead still necessary on headless
computers running RHEL3 U3?
Comment 35 Guillaume Berche 2005-07-22 09:54:56 EDT
Sorry, it seems that while adding myself to the CC list I have by mistake
changed the status of this bug, which was not my intention. I therefore tried to
put it back to the previous state left by "John Flanagan on 2004-09-02 00:31
EST", i.e. "CLOSED ERRATA", but was refused permission to do so.

But I would still appreciate details about how this bug was fixed.
Comment 36 Ernie Petrides 2005-07-22 16:42:47 EDT
Hello, Guillaume.  From reading the patch in comment #28, it looks like
the fixes were oriented around sleep/wakeup synchronization (as opposed
to incorporating new sources of randomness).  Unfortunately, the person
who did this work is no longer here.  Sorry I'm not able to get better
answers for you.

Reclosing bug.
