Bug 475848 - Frequent "Next message unexpectedly lost" errors in maillog
Summary: Frequent "Next message unexpectedly lost" errors in maillog
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: dovecot
Version: 10
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Michal Hlavinka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2008-12-10 19:17 UTC by Ben Webb
Modified: 2009-03-11 12:39 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-03-11 12:39:34 UTC
Type: ---


Attachments (Terms of Use)

Description Ben Webb 2008-12-10 19:17:29 UTC
Description of problem:
Since upgrading to F10, I'm seeing a few "Next message unexpectedly lost" errors in maillog. This generally results in an IMAP panic and users complaining about various problems reported by their mail clients (e.g. message size shrinking when saving to a mailbox). Looking back through maillogs for F9, I don't see any of these messages.

Version-Release number of selected component (if applicable):
dovecot-1.1.7-1.fc10.i386

How reproducible:
Occurs a few times a day.

Steps to Reproduce:
1. Run dovecot as an IMAPS server.
2. Have users check their mail...
  
Actual results:
Errors such as the following in /var/log/maillog for server foo and user bar:
Dec 10 10:52:13 foo dovecot: IMAP(bar): Next message unexpectedly lost from 1023923
Dec 10 10:52:13 foo dovecot: IMAP(bar): Next message unexpectedly lost from 1023923
Dec 10 10:52:13 foo dovecot: Panic: IMAP(bar): file istream.c: line 81 (i_stream_read): assertion failed: (_stream->skip != _stream->pos)
Dec 10 10:52:13 foo dovecot: IMAP(bar): Raw backtrace: imap [0x80d51b1] -> imap [0x80d5232] -> imap [0x80d4b69] -> imap(i_stream_read+0xca) [0x80d9b3a] -> imap [0x80dbbbe] -> imap(i_stream_read+0x31) [0x80d9aa1] -> imap [0x80cadbe] -> imap(i_stream_read+0x31) [0x80d9aa1] -> imap(i_stream_read_data+0x42) [0x80d9be2] -> imap(message_get_body_size+0x5c) [0x80cfe1c] -> imap(index_mail_init_stream+0x114) [0x8098784] -> imap [0x807f8f6] -> imap(index_mail_get_virtual_size+0x70) [0x8097430] -> imap [0x80639e8] -> imap(imap_fetch+0x140) [0x8062770] -> imap(cmd_fetch+0x241) [0x805bee1] -> imap [0x80602ec] -> imap [0x8060383] -> imap [0x806093d] -> imap(client_input+0x5e) [0x8060dbe] -> imap(io_loop_handler_run+0xe0) [0x80dd0d0] -> imap(io_loop_run+0x20) [0x80dc550] -> imap(main+0x46a) [0x8068f1a] -> /lib/libc.so.6(__libc_start_main+0xe5) [0x7f36d5] -> imap [0x805a371]
Dec 10 10:52:13 foo dovecot: child 2786 (imap) killed with signal 6


Expected results:
No IMP panics or messages lost.

Additional info:

Comment 1 Michal Hlavinka 2008-12-11 07:19:43 UTC
can you provide me output of this:
ls -l /etc/dovecot.conf

and

ls -l /usr/libexec/dovecot/deliver

thanks

If dovecot.conf doesn't have 0644 permissions, setting dovecot.conf to 0644 will most probably fix your problem.

Comment 2 Ben Webb 2008-12-11 07:26:05 UTC
# ls -l /etc/dovecot.conf
-rw-r--r-- 1 root root 46585 2008-12-08 22:38 /etc/dovecot.conf
# ls -l /usr/libexec/dovecot/deliver 
-rwxr-xr-x 1 root root 828788 2008-12-02 09:06 /usr/libexec/dovecot/deliver

Surely if dovecot did not have 0644 permissions, nothing would work at all? 99% of our mail can be accessed without any problems - the IMAP panics happen only from time to time.

I have tried deleting the dovecot indexes (in ~/mail/.imap) to see if that clears up the problem - these IMAP panics only appear to happen thus far for a small number of our users (not all of them) - and will update if this appears to fix things.

Comment 3 Michal Hlavinka 2008-12-11 07:30:12 UTC
without 0644 permissions 'dovecot' will work, but 'deliver' (local mail delivering agent) will not

Comment 4 Michal Hlavinka 2008-12-11 08:43:16 UTC
Try to get coredump and upload it here, please.

1) install dovecot-debuginfo package:
yum --enablerepo=updates-debuginfo --enablerepo=fedora-debuginfo install dovecot-debuginfo

2) set mail_drop_priv_before_exec = yes in config file.

3) add next lines to /etc/init.d/dovecot just after #!/bin/bash

DAEMON_COREFILE_LIMIT='unlimited'
echo 1 >/proc/sys/kernel/suid_dumpable
echo 1 >/proc/sys/kernel/core_uses_pid
echo "/tmp/core.%e.%s" >/proc/sys/kernel/core_pattern

4) restart dovecot

after this problem occurs next time, new core file should be stored in /tmp directory

Comment 5 Dan Horák 2008-12-16 17:15:11 UTC
AFAICT this looks like an upstream issue - see http://www.dovecot.org/list/dovecot/2008-December/035720.html

Comment 6 Ben Webb 2008-12-16 18:45:48 UTC
Yes, that upstream bug certainly looks like it's causing our problem. Any idea when a new dovecot release will make it into Fedora? Or will these patches be backported to the current F10 dovecot package?

FWIW, deleting dovecot.index* seemed to only make the problem go away for a little while - now I'm getting the "next message unexpectedly lost" errors again. But I can look into getting a coredump for you if you think that'll be useful.

Comment 7 Michal Hlavinka 2008-12-17 16:11:24 UTC
From mailing list:

> I'll make a new v1.1 release after seeing what bug fixes are coming from
Apple, so probably next week.

(In reply to comment #6)
> Any idea when a new dovecot release will make it into Fedora?

Since it was posted last week, I hope new version will be released soon.

> But I can look into getting a coredump for you if you think that'll be
> useful.

I think it's not necessary since problem is known and fixed already.

Comment 8 Ben Webb 2008-12-18 00:06:12 UTC
Right, I saw that upstream were releasing a new version - I just wanted to know whether the Fedora package would follow that release.

Comment 9 Michal Hlavinka 2008-12-18 06:42:21 UTC
Fedora 10 have the most recent version of dovecot, so as soon as new version is released, you can expect it in updates or updates-testing (that wasn't decided yet).

Comment 10 Fedora Update System 2009-01-08 06:54:27 UTC
dovecot-1.1.8-1.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/dovecot-1.1.8-1.fc10

Comment 11 Ben Webb 2009-01-14 00:13:27 UTC
Perhaps showing my ignorance here, but following the link in comment #10 all I can find are unsigned packages in Koji. I am reluctant to install unsigned packages on my server, although I am happy to test *signed* packages prior to release. Am I missing something somewhere, or is there just a time delay before signed packages show up on the FTP sites (in updates or updates-testing)?

Comment 12 Michal Hlavinka 2009-01-14 07:15:37 UTC
most important from that web page is "status = pending" it means package was submitted as an update, but it is still on the way now (pending).

Also, this package would be not in updates now but updates-testing first. When package is ready in updates-testing bodhi system should change status of this bug to "ON_QA" (this don't have to be true for all updates of all packages). After 3 karmas (successful testing), package will be pushed to updates. If it doesn't get 3 karmas, it is staying in updates-testing for 3 weeks before I manually push it into the updates.

Comment 13 Fedora Update System 2009-01-15 03:10:33 UTC
dovecot-1.1.8-1.fc10 has been pushed to the Fedora 10 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update dovecot'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F10/FEDORA-2009-0570

Comment 14 Fedora Update System 2009-01-29 23:11:44 UTC
dovecot-1.1.10-1.fc10 has been pushed to the Fedora 10 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update dovecot'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F10/FEDORA-2009-1184

Comment 15 Michal Hlavinka 2009-03-11 12:39:34 UTC
dovecot-1.1.10-1.fc10 is in updates for some time... I wonder why bodhi didn't close this...


Note You need to log in before you can comment on or make changes to this bug.