Red Hat Bugzilla – Bug 449247
SIGSEGV in dovecot when downloading messages due to corrupt index
Last modified: 2008-06-27 11:54:02 EDT
Description of problem:
The dovecot message indexes become corrupt and cause the imap executable that is
part of the dovecot package to SEGV, as can be seen in the following syslog output:
May 31 15:57:53 waterloo dovecot: child 9438 (imap) killed with signal 11
May 31 15:57:53 waterloo kernel: imap: segfault at 8 ip 080a32fd sp
bfaa5a80 error 4 in imap[8048000+95000]
During this failure, I did not see this message, but there are several entries
in syslog like
May 29 23:40:47 waterloo dovecot: IMAP(crow): Corrupted index cache file
/home/crow/.imapIndexes/.imap/INBOX/dovecot.index.cache: invalid record size
Whatever is corrupt is only aggravated when I connect via my Samsung Blackjack
running Windows Mobile 6.0. Using Thunderbird on Windows Vista, Windows XP, or
Fedora 9 does not cause the problem.
To recover, I can remove the dovecot indexes and things will work for a few days.
My mail_location definition in /etc/dovecot.conf is
mail_location = mbox:~/.imap/:INBOX=/var/spool/mail/%u:INDEX=~/.imapIndexes
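The recovery step mentioned above (removing the indexes so dovecot rebuilds them) can be sketched as follows. The paths are derived from the INDEX=~/.imapIndexes setting and the cache filename in the syslog output; removing only the INBOX files is an assumption on my part, and wiping the whole ~/.imapIndexes tree would also work:

```shell
# Remove the per-mailbox index files so dovecot rebuilds them on the next
# login. The directory comes from INDEX=~/.imapIndexes in mail_location;
# dovecot recreates missing index files automatically.
rm -f ~/.imapIndexes/.imap/INBOX/dovecot.index \
      ~/.imapIndexes/.imap/INBOX/dovecot.index.cache \
      ~/.imapIndexes/.imap/INBOX/dovecot.index.log
```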
Version-Release number of selected component (if applicable):
When it happens is unpredictable, but it will always eventually fail. Typically
it takes 2-3 days. I cannot make the indexes become corrupt on demand, though.
I've been running this same dovecot configuration in Fedora 7 for quite a while
without issue. This just started happening when I upgraded to Fedora 9.
Steps to Reproduce:
1. Use various imap clients to connect to dovecot.
2. Continue this for 2-3 days.
3. Notice that synchronization using the Windows Mobile imap client stops
working due to a SEGV in the dovecot imap executable.
I used rawlog to see the IMAP protocol being used by the Windows Mobile client.
The client performs a NAMESPACE command and several LIST commands to fetch the
folder names, then for each folder selected for synchronization, it does a
A67 SELECT "INBOX"
A68 FETCH 1:166 (INTERNALDATE UID FLAGS RFC822.SIZE BODY.PEEK[HEADER.FIELDS
(DATE FROM SUBJECT MESSAGE-ID CONTENT-TYPE X-MS-TNEF-Correlator CONTENT-CLASS
IMPORTANCE PRIORITY X-PRIORITY)] BODYSTRUCTURE)
This succeeds for one mailbox and then fails for my INBOX. It downloaded 151 of
the 166 messages, per the rawlog output.
I'm a software developer in my day job, so I'd be happy to debug, but I wasn't
sure the best way to insert a debugger into the executable chain since the imap
executable is launched internally.
You should be able to attach the debugger to the imap process that is
processing your requests. For the first attempt you should let the system
generate a core dump that can be analyzed after the crash.
So you need to:
- enable core dump creation: edit /etc/init.d/dovecot, add a line with
"DAEMON_COREFILE_LIMIT=unlimited" near the beginning, then restart the service
- install the debuginfo with "yum --enablerepo=fedora-debuginfo install
dovecot-debuginfo"
After the crash you should be able to run "gdb /usr/libexec/dovecot/imap
/path/to/coredump/file". You should see the filename of the coredump in the logs.
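Attaching gdb directly to the running child, as suggested above, can be sketched like this. The pgrep pattern and the assumption that the child runs under your own Unix user are mine, not from the thread:

```shell
# Locate the imap child serving your session and build the gdb attach
# command. Assumptions: the child runs under your Unix user and its process
# name is exactly "imap" (-x); if no child is running, a placeholder pid
# is shown instead.
pid=$(pgrep -u "$(id -un)" -x imap | head -n 1)
cmd="gdb -p ${pid:-<imap-pid>} /usr/libexec/dovecot/imap"
echo "$cmd"
```

Once attached, "continue" lets the process run, and "bt full" after the SIGSEGV prints the backtrace.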
I've been unable to get a core dump of the imap process. I set the core_pattern
to make sure that the process would be able to write the core and validated that
the ulimit was set as triggered by the DAEMON_COREFILE_LIMIT setting, but still
no core file is produced. A 'kill -ABRT <dovecotPID>' did generate a core file
as expected for the dovecot process, so I think the configuration is correct,
but no such luck for the child imap process.
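One way to check whether the child really inherited the raised limit, rather than trusting the init-script setting, is to read the limits the kernel actually applied from /proc. Using /proc/&lt;pid&gt;/limits this way is my suggestion, not something from the thread:

```shell
# Read the core-file limit the kernel actually applied to a process, to
# confirm the DAEMON_COREFILE_LIMIT setting reached the imap child.
# Pass the child's pid; "self" inspects the current shell as a sanity check.
# (/proc/<pid>/limits needs kernel >= 2.6.24, which Fedora 9 has.)
check_core_limit() {
    grep 'Max core file size' "/proc/${1:-self}/limits"
}
check_core_limit self
```

A "Soft Limit" of 0 on the imap child would explain why no core file ever appears despite the init-script change.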
I did find the gdbhelper wrapper in the dovecot distribution and was able to get
a stack trace. I will attach the gdbhelper output file to this bug.
Created attachment 308874 [details]
Output from dovecot gdbhelper wrapper when used to wrap the imap executable.
Could you try to update dovecot to current stable version 1.0.14 with "yum
--enablerepo=updates-testing,updates-testing-debuginfo update dovecot"? Just to
be sure that this issue wasn't fixed there :-)
Moving to 1.0.14 caused the problem to go away, but when I went back to 1.0.13,
it didn't happen any more either, so I'm not sure I've proved anything.
Looking at the dovecot changelog for 1.0.14 (specifically the changeset
documented at http://hg.dovecot.org/dovecot-1.0/rev/538f8892a2f1), it is likely
that the problem is fixed.
I'll run with 1.0.14 for a while (it typically takes 3-7 days for the index
cache to get corrupted again) and see what happens. You can move this bug back
to the "waiting on me" state.
Thanks for your help.
Ok, let me know both the good and bad news :-)
So what are the results?
I'm still getting corrupted index cache files, but I have not received a
segfault due to it since installing 1.0.14-8.fc9.
Feel free to close this bug. Thanks!
OK, closing the bug. And you can raise the corrupted index cache file issue on
dovecot's mailing list to get into direct contact with the author.