Red Hat Bugzilla – Bug 117218
Entropy pool not updated (/dev/random blocks)
Last modified: 2007-11-30 17:07:00 EST
Description of problem:
We have a number of Dell PowerEdge 1750s. All are configured
On one machine we've gotten it to a state where the entropy pool is
never updated so /dev/random always blocks on reads. No process that
I can see is reading /dev/random. /dev/random has worked on this
machine in the past. We have an identically configured machine where
/dev/random will return random bytes. It does not help to do disk
accesses or use the keyboard & mouse.
I have not tried rebooting yet; my guess is that a reboot will fix it.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. cat /dev/random | od
2. wiggle mouse, type on keyboard, do disk accesses
Actual Results: no output from cat
Expected Results: on an identically configured machine random bytes
All the machines have SCSI drives.
We had to reboot the machine today. /dev/random is now fine.
Just had the same problem on a Dell PowerEdge 2550 with an LSI RAID
card. Kernel kernel-smp-2.4.21-9.EL. Rebooting corrected it as well.
"service random stop; service random start" did not help.
/dev/urandom does not block.
It happened again. Reboot fixed it again. Kernel 9.0.1-smp this
time. This is causing downtime on a production server. What should I
look for if (when) it happens again to provide additional information?
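One thing that could be captured the next time this happens is the kernel's own entropy estimate together with the result of a non-blocking read. The sketch below is only a suggestion, not something from the reporters or from Red Hat. It assumes the `/proc/sys/kernel/random/entropy_avail` file described in random(4) is present on the affected kernel and that a Python 3 interpreter is available; the function names are made up for illustration.

```python
import os

# Assumption: this proc file exists on the affected kernel (see random(4)).
ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def entropy_avail(path=ENTROPY_AVAIL):
    """Return the kernel's current entropy estimate, in bits."""
    with open(path) as f:
        return int(f.read().strip())


def try_read_random(nbytes=16, device="/dev/random"):
    """Attempt a read that cannot hang; return the bytes obtained (maybe empty)."""
    fd = os.open(device, os.O_RDONLY | os.O_NONBLOCK)
    try:
        return os.read(fd, nbytes)
    except BlockingIOError:
        # EAGAIN: the driver has nothing to hand out right now.
        return b""
    finally:
        os.close(fd)


def snapshot(avail_path=ENTROPY_AVAIL, device="/dev/random"):
    """Collect both observations in one dict, suitable for pasting into a report."""
    return {
        "entropy_avail": entropy_avail(avail_path),
        "bytes_read": len(try_read_random(device=device)),
    }
```

Running `snapshot()` a few times while generating keyboard, mouse and disk activity would show whether the estimate ever moves. A value stuck at or near zero alongside `bytes_read == 0` would match the symptom described here.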
*** Bug 101266 has been marked as a duplicate of this bug. ***
Reproduced on 2.4.21-9 (9.0.1-smp) on two different machines
(supermicro p4 dual-xeon with HT with qlogic2300 HBAs,
supermicro p4 dual-xeon with HT with 3ware IDE RAID)
Also have a look here. Seems like a similar problem, here on a
AFAICT /dev/random running out of entropy is expected behaviour. It
needs input for new entropy. This is why /dev/urandom is there.
I would say NOTABUG, but won't close as I am not 100% sure.
man 4 random states:
When the entropy pool is empty, reads from /dev/random will block
until additional environmental noise is gathered.
I.e. known behaviour. Closing NOTABUG.
I disagree. It is true that /dev/random __should__ block until there
is sufficient entropy. However, the problem is that the entropy pool
is __never__ refilled. Moving the mouse, typing on the keyboard and
disk activity should all refill the pool and you should get bytes out
of /dev/random at that point. On my system this never happened.
Also nothing was reading /dev/random, so nothing was draining the
I believe this is a bug?
What aaron said. The problem is that it blocks forever. It does not
gather additional environmental noise and pass it to the app trying to
do the read.
This is undoubtedly a bug and will be fixed in RHEL3 U3. I have
back-ported changes from 2.6 which will be committed after U3 opens.
The problem is that critical data structures are completely unguarded
by locks and consequently end up in a state in which no entropy is
generated on SMP systems. Has anyone reproduced this problem on a
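For readers unfamiliar with this class of bug, here is a toy illustration of what "guarded by a lock" means for shared entropy accounting. It is not the RHEL3 driver and not the back-ported patch; the class and its fields are hypothetical, and Python threads only stand in for CPUs.

```python
import threading


class ToyEntropyCounter:
    """Toy stand-in for a shared entropy estimate (hypothetical, not kernel code)."""

    def __init__(self, pool_bits=4096):
        self.pool_bits = pool_bits
        self.bits = 0
        self._lock = threading.Lock()

    def credit(self, n):
        """Add n bits, capped at the pool size, as one atomic step."""
        with self._lock:
            self.bits = min(self.pool_bits, self.bits + n)

    def debit(self, n):
        """Remove up to n bits and report how many were actually taken."""
        with self._lock:
            taken = min(n, self.bits)
            self.bits -= taken
            return taken
```

Without the lock, two concurrent updates can each read the old value of `bits` and write back results that ignore each other, leaving the count inconsistent with what was really added or removed.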
Sorry for this. I got confused by the fact that bug 118921, which
references this bug, only states that /dev/random blocks when it runs
out of entropy. That is expected behaviour.
Instead of closing this bug I should have looked more closely before
doing so. Different reporters have different issues. Again, sorry. It
hopefully won't happen again.
The problem of not having the entropy pool fill again occurred for me
on a uniprocessor machine, so this is not (exclusively) SMP-related.
Maybe there are two issues with the same effect?
Yes, I believe that missed wakeups are also a possibility. However,
I have had no success reproducing this on any in-house machine other
than production servers. Consequently, I am making a test kernel
available for testing on my Red Hat "people" page. The URL is
http://people.redhat.com/~mdewand/.dev_random/. Here you will
find the following two choices for download:
These kernels contain changes to the /dev/random driver back-ported
from 2.6. I would appreciate any feedback that anyone can provide
regarding their experiences with either of these kernels.
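As background on the "missed wakeup" possibility mentioned above: a sleeper can block forever if the producer's wakeup fires between the sleeper's check and its going to sleep. The following is a generic sketch of the usual remedy, checking the condition and sleeping under the same lock that the producer holds while updating. It does not reproduce the test kernels' changes; the class name and methods are hypothetical.

```python
import threading


class ToyBlockingPool:
    """Hypothetical reader/producer pair; illustrates the pattern only."""

    def __init__(self):
        self.bits = 0
        self._cond = threading.Condition()

    def add_entropy(self, n):
        """Producer: update the count and wake sleepers while holding the lock."""
        with self._cond:
            self.bits += n
            self._cond.notify_all()

    def read_bits(self, n, timeout=None):
        """Reader: block until n bits are available; False if the wait timed out."""
        with self._cond:
            # wait_for re-checks the predicate under the lock before every sleep,
            # so a wakeup sent just before the reader sleeps cannot be lost.
            if not self._cond.wait_for(lambda: self.bits >= n, timeout=timeout):
                return False
            self.bits -= n
            return True
```

If the reader instead checked `bits` without the lock and then slept, an `add_entropy` call landing in between would leave it asleep with data available, which from user space looks like a read that never returns.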
*** Bug 119526 has been marked as a duplicate of this bug. ***
Thanks for including my bug in this; sorry I missed it when I searched
for it. Just a reminder: I see this on RH 8 systems as well, though I'm
aware that they are out of support. So this problem probably exists in
other kernels as well.
Any feedback regarding the RPMs I posted a week ago?
Sorry, currently have no chance to reproduce this on a server, since
they are production and I don't have adequate test equipment here at
They were installed on the server that had the problem last week
(wednesday, I think). So far so good, but the problem was rare enough
that it will be a few weeks before I can comfortably say it's fixed.
If it does turn out to be a fix, can you provide patched kernels for
any kernel updates until it's included in U3?
Hi Stefan H.,
could you maybe try some stress testing? I'm thinking about reading
from /dev/random to /dev/null until it's empty or so. It should IMHO
be possible to read from /dev/random faster than the entropy pool can
fill up again. Then we could see whether new entropy is still generated
once the pool is exhausted.
By the way, are you also running it on a server without kbd/mouse? And
does the disk see low or high activity?
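The stress test suggested above could be scripted roughly as follows. This is a sketch under assumptions, not a tool anyone in this thread used: it assumes Python 3, a readable `/dev/random`, and some way to read the current entropy level (for example the `entropy_avail` proc file from random(4)), which the caller passes in as `read_level`. All names are illustrative.

```python
import os
import time


def drain(device="/dev/random", chunk=64, max_bytes=1 << 20):
    """Read without blocking until the device has nothing left; return bytes read."""
    fd = os.open(device, os.O_RDONLY | os.O_NONBLOCK)
    total = 0
    try:
        while total < max_bytes:
            try:
                data = os.read(fd, chunk)
            except BlockingIOError:
                break  # pool exhausted: a blocking read would stall here
            if not data:
                break
            total += len(data)
    finally:
        os.close(fd)
    return total


def pool_refills(read_level, samples=30, interval=1.0, sleep=time.sleep):
    """Return True if the level reported by read_level() rises within the window."""
    start = read_level()
    for _ in range(samples):
        sleep(interval)
        if read_level() > start:
            return True
    return False
```

The idea is to call `drain()` first, then `pool_refills(...)` while there is disk or console activity. On a healthy machine the level should rise again; on a machine showing this bug it would be expected to stay flat.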
Sorry for not following up on this sooner. We have tested /dev/random
a number of times over the last few weeks as you describe, and the
entropy pool always fills back up correctly now after being exhausted.
The server has a keyboard and mouse attached through a KVM, but it is
not selected most of the time - someone logs into it for a few minutes
every couple days on average.
Did this patch make it into 9.0.3? Or do we need to wait for RHEL3-U3
to get it in the mainline kernel?
The fixes for this problem that Mark DeWandel back-ported from 2.6
have just been committed to the RHEL3 U3 patch pool this evening
(in kernel version 2.4.21-15.3.EL).
Stefan, just to clarify, the fix did *not* make it into -9.0.3.EL
nor into -15.EL (the U2 kernel). Thus, the first officially
supported RHEL3 kernel with the fix will be the U3 kernel.
Will these fixes also soon be ported over to Fedora? Anything known
about their next kernel-release that might include this?
Thank you guys for taking this bug seriously!
Stefan, my understanding is that the fixes came from 2.6, which
is what Fedora (as of FC2) is based on. So I'd guess the fixes
are there already. If you need me to check out a specific FC
kernel version to verify that the fixes are contained there,
please let me know. I'll attach the RHEL3 U3 patch that I
committed last night in the next comment for reference.
Created attachment 100157 [details]
/dev/random driver fixes committed in RHEL3 U3
An errata has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
Thank you very much. Does somebody know when ports of these fixes
will occur in Fedora Core 2?
Stefan, your question in comment #30 is already answered in comment #27.
Still seeing this on 2.4.21-20.ELsmp - the errata suggests it's been
fixed and yet on a server with no keyboard and mouse the /dev/random
device can produce no output for minutes at a time. This is causing
all Java JINI services, and some Java SSL services, to hang on startup.
The only safe workaround for JINI is
mknod -m 0444 /dev/random c 1 9
as suggested in http://linux.about.com/od/commands/l/blcmdl4_random.htm
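For context on the workaround above: on Linux, character device 1,8 is the blocking random device and 1,9 is urandom, so a `/dev/random` node created as `c 1 9` hands out urandom output and never blocks. A small check like the following can confirm which device a given node actually refers to. It is a sketch assuming Python 3 on Linux; the helper names are made up.

```python
import os
import stat

RANDOM_DEV = (1, 8)    # major, minor of the blocking random device on Linux
URANDOM_DEV = (1, 9)   # major, minor of the non-blocking urandom device


def device_numbers(path):
    """Return (major, minor) for a character device node, else None."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        return None
    return (os.major(st.st_rdev), os.minor(st.st_rdev))


def classify(numbers):
    """Name the device a (major, minor) pair refers to."""
    if numbers == RANDOM_DEV:
        return "random"
    if numbers == URANDOM_DEV:
        return "urandom"
    return "other"
```

After applying the workaround, `classify(device_numbers("/dev/random"))` would be expected to report `"urandom"`. Note that this means applications asking for /dev/random no longer get its blocking guarantee.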
Comment 32: Jos, please see comment 8, 9 and 10.
(In reply to comment #33)
> Comment 32: Jos, please see comment 8, 9 and 10.
Jos I indeed read comments #8 to #10 but could not find the answer to
Jos: does the fix included in RHEL3 U3 and attached in comment #28 make use of
/dev/random possible on a headless server such as the ones in most data centers
(i.e. without mouse and keyboard)?
Was the fix able to include other environmental data such as interrupts or
hardware specific data such as hard disk statistics (BTW more details about how
the fix works would certainly help in understanding how the bug was fixed)?
Is the workaround of using /dev/urandom instead still necessary on headless
computers running RHEL3 U3?
Sorry, it seems that while adding myself to the CC list I have by mistake
changed the status of this bug, which was not my intention. I therefore tried to
put it back to the previous state left by "John Flanagan on 2004-09-02 00:31
EST", i.e. "CLOSED ERRATA", but was refused permission to do so.
But I would still appreciate details about how this bug was fixed.
Hello, Guillaume. From reading the patch in comment #28, it looks like
the fixes were oriented around sleep/wakeup synchronization (as opposed
to incorporating new sources of randomness). Unfortunately, the person
who did this work is no longer here. Sorry I'm not able to get better
answers for you.