Bug 115284 - (openssl or kernel) dovecot crashes with ssl error message
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: openssl
Version: 1
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Tomas Mraz
Blocks: FC2Update
Reported: 2004-02-10 02:53 EST by Daniel Hammer
Modified: 2007-11-30 17:10 EST (History)
Last Closed: 2005-02-09 04:28:33 EST

Attachments
Dovecot.conf for machine which crashed. (17.41 KB, text/plain)
2004-02-26 01:20 EST, Michael Koziarski
doveconf file (17.53 KB, text/plain)
2004-04-01 07:20 EST, Anders Nielsen
my dovecot.conf (619 bytes, text/plain)
2004-05-10 13:36 EDT, Steve Meyers
OpenSSL random-patch (1.03 KB, patch)
2004-05-19 08:09 EDT, Timo Sirainen

Description Daniel Hammer 2004-02-10 02:53:34 EST
Description of problem:
Dovecot crashes with the following error message:

imap-login: RAND_bytes() failed: error:24064064:random number
generator:SSLEAY_RAND_BYTES:PRNG not seeded

and with the next login attempt from the same machine 

dovecot: Login process died too early - shutting down

The user is using "KMail" and checks every minute for mail.

Version-Release number of selected component (if applicable):
dovecot-0.99.10-6

How reproducible: Always
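
The numeric code in the message above can be split into its parts. This is a minimal sketch, assuming the packed error-code layout used by OpenSSL releases before 3.0 (8-bit library, 12-bit function, 12-bit reason); the struct and function names are hypothetical and not part of OpenSSL.

```c
/* Fields of a packed OpenSSL error code.
 * Assumption: pre-3.0 layout, i.e. lib in bits 24-31,
 * function in bits 12-23, reason in bits 0-11. */
struct ssl_err_fields {
    unsigned lib;     /* library that raised the error */
    unsigned func;    /* function code within that library */
    unsigned reason;  /* reason code */
};

/* Hypothetical helper: unpack a code such as 0x24064064. */
struct ssl_err_fields decode_ssl_error(unsigned long code)
{
    struct ssl_err_fields f;

    f.lib    = (unsigned)((code >> 24) & 0xffUL);
    f.func   = (unsigned)((code >> 12) & 0xfffUL);
    f.reason = (unsigned)(code & 0xfffUL);
    return f;
}
```

For the code in this report, `decode_ssl_error(0x24064064UL)` gives library 0x24, function 0x064 and reason 0x064, which the message text renders as "random number generator", "SSLEAY_RAND_BYTES" and "PRNG not seeded".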
Comment 1 Jeremy Katz 2004-02-17 19:31:18 EST
I'm not seeing this at all.  Do you have anything away from the
defaults in your dovecot.conf?  
Comment 2 Michael Koziarski 2004-02-26 01:18:50 EST
I'm seeing this same error.  I've switched to ~/Maildir and postfix. 
But apart from that I have a standard setup.  I'll attach my dovecot.conf
Comment 3 Michael Koziarski 2004-02-26 01:20:03 EST
Created attachment 98061 [details]
Dovecot.conf for machine which crashed.

This crash happens very rarely.  There's some mention of it here:

http://dovecot.procontrol.fi/list/dovecot/2004-January/002838.html
Comment 4 Anders Nielsen 2004-03-30 07:50:42 EST
This happened to our dovecot server as well. See log below. As far as I
know we don't have any users using KMail.

imap-login: Mar 30 08:08:00 Fatal: RAND_bytes() failed:
error:24064064:random number generator:SSLEAY_RAND_BYTES:PRNG not seeded
dovecot: Mar 30 08:08:00 Error: Login process died too early -
shutting down
dovecot: Mar 30 08:08:00 Error: child 21354 (login) returned error 89
Comment 5 Anders Nielsen 2004-04-01 07:19:06 EST
Today this crash happened 2 times and counting :-/

I am attaching my dovecot.conf as well.

I see that the status is NEEDINFO - is there anything else I could do
to help get this fixed?

Comment 6 Anders Nielsen 2004-04-01 07:20:18 EST
Created attachment 99035 [details]
doveconf file
Comment 7 Trevor Cordes 2004-04-16 23:42:46 EDT
I too have had this problem occur recently on 2 different servers that
I just switched from stock imap to dovecot.  They happened 1 day apart
all within 3 days of switching to dovecot.  Another 3rd machine I
manage has been running dovecot for 1 week with the exact same
configuration with zero errors so far.

My dovecot.conf:
protocols = imap imaps pop3 pop3s
imap_listen = *
pop3_listen = *
ssl_cert_file = /usr/share/ssl/certs/dovecot.pem
ssl_key_file = /usr/share/ssl/private/dovecot.pem
login_dir = /var/run/dovecot-login
login = imap
login_process_per_connection = yes
login = pop3
first_valid_uid = 300
maildir_copy_with_hardlinks = yes
mbox_locks = fcntl
umask = 0027
auth = default
auth_mechanisms = plain
auth_userdb = passwd
auth_passdb = pam
auth_user = root
auth_username_chars = abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890.-_@
Comment 8 Michael Koziarski 2004-04-21 16:38:20 EDT
It's happened again for me.   So that's a good 2 months since the last
time.

Apr 21 14:58:53 mithrandir imap-login: Login: michael [127.0.0.1]
Apr 21 14:58:53 mithrandir imap-login: RAND_bytes() failed:
error:24064064:random number generator:SSLEAY_RAND_BYTES:PRNG not seeded
Apr 21 14:58:53 mithrandir dovecot: Login process died too early -
shutting down
Apr 21 14:58:53 mithrandir dovecot: child 14926 (login) returned error 89
Comment 9 Robert Roselius 2004-04-27 19:18:55 EDT
Happening to me, on two servers, both Fedora Core 1.0.  About 20
users.  Works fine most of the time, but dovecot crashes about once a
day, sometimes several times, sometimes not at all.  I'm running a
cron job every minute to check for dovecot and restart it as needed.
The sshd and mod_ssl seem to work fine.

I don't know SSL stuff very well, but it seems to be related to
/dev/urandom - entropy starvation maybe?

# cat /proc/version
Linux version 2.4.22-1.2115.nptlsmp (bhcompile@daffy.perf.redhat.com)
(gcc version 3.2.3 20030422 (Red Hat Linux 3.2.3-6)) #1 SMP Wed Oct 29
15:30:09 EST 2003
# rpm -q openssl
openssl-0.9.7a-23
# rpm -q dovecot
dovecot-0.99.10-6
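
One way to test the entropy-starvation guess is to log the kernel's entropy estimate around the time of a crash. This is a minimal sketch, assuming a Linux kernel that exposes the estimate at /proc/sys/kernel/random/entropy_avail; the function name and its path parameter are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: return the entropy estimate (in bits) stored
 * as a decimal number in the file at `path`, or -1 if the file cannot
 * be opened or parsed. On Linux the usual path is assumed to be
 * /proc/sys/kernel/random/entropy_avail. */
long read_entropy_avail(const char *path)
{
    FILE *fp = fopen(path, "r");
    long bits = -1;

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &bits) != 1)
        bits = -1;              /* file did not start with a number */
    fclose(fp);
    return bits;
}
```

Calling `read_entropy_avail("/proc/sys/kernel/random/entropy_avail")` from the same cron job that restarts dovecot would show whether the pool is low when the login process dies.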
Comment 10 Steve Meyers 2004-05-10 13:36:39 EDT
Created attachment 100137 [details]
my dovecot.conf

I've had the same problem using Fedora Core 1. My dovecot.conf is attached.

[root@mail log]# rpm -q dovecot openssl openssl096 glibc
dovecot-0.99.10-6
openssl-0.9.7a-33.10
openssl096-0.9.6-26
glibc-2.3.2-101.4
Comment 11 Rick Johnson 2004-05-10 14:37:49 EDT
I do not experience this using a Rawhide version dovecot-0.99.10.4-3
under either Red Hat 9 or Fedora Core 1, but did experience it using
the stock Fedora Core 1 package dovecot-0.99.10-6. I am also using SSL
connectivity.
Comment 12 Jonas Smedegaard 2004-05-10 15:04:18 EDT
Those of you experiencing problems: Do you have more than one version
of libssl installed? Strange crashes can occur if some
applications/libraries use 0.9.6 and others use 0.9.7 concurrently -
even if they are not linked together.
Comment 13 John Dennis 2004-05-10 15:47:52 EDT
Adding Nalin to CC as he is the openssl maintainer and I want his input. 

In dovecot this error seems to be generated here:
src/login-common/ssl-proxy-openssl.c line 447

	/* PRNG initialization might want to use /dev/urandom, make sure it
	   does it before chrooting. */
	if (RAND_bytes(&buf, 1) != 1)
		i_fatal("RAND_bytes() failed: %s\n", ssl_last_error());

RAND_bytes is in the openssl library and if this fails dovecot fails,
so I don't think this is a dovecot issue (or at least not yet).

I briefly looked at the RAND_bytes implementation and all the places
where openssl generates the "PRNG not seeded" error message, but it was a
bit opaque to me. Nalin, I believe you own openssl, are you familiar
with this problem? Is there any reason to believe this is dovecot
related, or is this a general failure of openssl?
Comment 14 Daniel Hammer 2004-05-10 15:52:04 EDT
# rpm -qa | grep -i ssl | sort
openssl-0.9.7a-33.10
pyOpenSSL-0.5.1-11

There are just 
/lib/libssl.so.4
/lib/libssl.so.0.9.7a
with /lib/libssl.so.4 -> libssl.so.0.9.7a.

The RAND_bytes man page of the openssl-devel package says:

"RAND_bytes() puts num cryptographically strong pseudo-random bytes
into buf. An error occurs if the PRNG has not been seeded with enough
randomness to ensure an unpredictable byte sequence."

which seems to be the case here. Need some openssl expert on this
matter. I tried to strace the whole thing, but after more than 3 weeks
without the error I gave up. The above comment (written at the same
time as mine) seems to point in the right direction (IMHO).
Comment 15 Warren Togami 2004-05-12 05:06:04 EDT
http://dovecot.fi/list/dovecot/2004-May/003314.html
http://dovecot.fi/list/dovecot/2004-May/003316.html
(You may want to read the entire thread though...)
Comment 16 Anders Nielsen 2004-05-12 05:25:04 EDT
I tried using the rawhide version dovecot-0.99.10.4-3. It crashed 2
times within the first hour, which isn't unusual.

Then I applied Timo's second patch (the one in 003316 above). After 24
hours no crashes :-)

Since I would expect 5-10 crashes on a working day this looks promising!



Comment 17 Warren Togami 2004-05-12 05:38:25 EDT
http://www.redhat.com/archives/fedora-test-list/2004-May/msg01364.html

dovecot-0.99.10.4-4 is what will be shipping in FC2, which contains
three patches from Timo during last week.  Unfortunately it seems at
least one maildir user had new problems with -4 while -3 was fine. 
Due to the maildir crash issue, and the SSL crash issue, it appears that
we need to sort this stuff out and prepare a very well tested update
for FC2 later this month.
Comment 18 Warren Togami 2004-05-19 06:37:52 EDT
I personally have been using this on FC1 with SSL with perfect
stability through RH9 and FC1's lifetime.  But now I realized that I
am using my own custom vanilla upstream 2.4 kernel.  This supports
Timo's finding that this may be a problem with FC1's 2.4 kernel
/dev/urandom.  Nalin said something about running out of entropy.  Any
status update on this?
Comment 19 Anders Nielsen 2004-05-19 07:34:05 EDT
Well, the SSL related crash seems to be fixed with Timo's patch as I
noted in comment #16.

Now I have 8 days of uptime with it - it used to crash several times
a day.

I don't think the patch made it into the rpm yet though.
Comment 20 Warren Togami 2004-05-19 07:56:17 EDT
Timo, is that patch a proper general fix, or rather an ugly hack to
work around Fedora's openssl or kernel problem?  Will future releases
of dovecot contain that patch?

Do you recommend FC's dovecot to be patched in that way?
Comment 21 Timo Sirainen 2004-05-19 08:09:03 EDT
It doesn't crash anymore, but I think it instead just fails SSL
connections since it doesn't have enough entropy.

I'm currently assuming this is all because the Red Hat kernel has some
/dev/urandom change that makes it possible that read()ing it returns
fewer bytes than requested. The OpenSSL library then doesn't try reading
more and fails instead. So the problem could be fixed in either of them.

Maybe an OpenSSL library fix would be better, as its logic currently is a
bit broken. There's this workaround: if (t.tv_usec == 10*1000)
t.tv_usec=0; which is triggered with Linux every time as select()
doesn't spend any time waiting for data and so tv_usec isn't updated.

How about this attached patch.
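
The short-read behaviour described above is usually handled by looping until the requested number of bytes has arrived. The following is only a sketch of that general technique using POSIX read(); it is not the attached patch, and the function name is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `len` bytes from `fd` into `buf`.
 * Returns 0 on success, -1 on a read error or if end-of-file is
 * reached before `len` bytes were delivered. */
int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);

        if (n > 0)
            got += (size_t)n;   /* short read: ask for the rest */
        else if (n == 0)
            return -1;          /* EOF before len bytes arrived */
        else if (errno != EINTR)
            return -1;          /* real error; EINTR is retried */
    }
    return 0;
}
```

A caller seeding a PRNG would open /dev/urandom and call `read_full(fd, seed, sizeof seed)`, treating a single short read() as a reason to retry rather than as a failure.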
Comment 22 Timo Sirainen 2004-05-19 08:09:40 EDT
Created attachment 100326 [details]
OpenSSL random-patch
Comment 23 Tomas Mraz 2005-02-09 04:28:33 EST
I think the /dev/urandom change isn't in the current FC2/3 2.6.x
kernels so the patch is unnecessary.
