Bug 248092

Summary: mbox corruption
Product: Red Hat Enterprise Linux 4 Reporter: Enzo <e.tano>
Component: dovecot Assignee: Dan Horák <dhorak>
Status: CLOSED DEFERRED QA Contact:
Severity: high Docs Contact:
Priority: low    
Version: 4.5 CC: tao
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-08-11 08:42:21 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Enzo 2007-07-13 06:28:40 UTC
Description of problem:
I have read bugs #162781, #178683 and #206376 on mbox corruption, but the problem
persists even with dovecot-0.99.11-8.EL4, the latest version in Red Hat 4.5.
Postfix and dovecot both use fcntl as the lock method.
The mboxes are located on external storage, but it is a local filesystem, not NFS.
Only Postfix and dovecot can access the mboxes.
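
For illustration, fcntl locking of an mbox can be sketched as below. This is a minimal sketch, not the code Postfix or dovecot actually use; the function name append_locked is hypothetical. The lock is advisory, so it only protects the file if every writer takes it.

```python
import fcntl


def append_locked(path, message):
    """Append one message (bytes) to an mbox file under an exclusive fcntl lock.

    Hypothetical helper for illustration only. The lock is advisory:
    a process that writes without taking it is not stopped.
    """
    with open(path, "ab") as mbox:
        # Blocks until no other process holds a conflicting lock.
        fcntl.lockf(mbox, fcntl.LOCK_EX)
        try:
            mbox.write(message)
            mbox.flush()
        finally:
            fcntl.lockf(mbox, fcntl.LOCK_UN)
```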

Version-Release number of selected component (if applicable):
dovecot-0.99.11-8.EL4

How reproducible:
The problem is very difficult to reproduce; it occurs at random. I also tried
deleting the first message as described in the three bugs above, but the
corruption does not happen in my test account, only in real accounts.

Comment 1 Tomas Janousek 2007-07-13 12:43:50 UTC
What sort of corruption is this? May it be related to bug 235750?

Comment 2 Enzo 2007-07-14 18:13:06 UTC
Sorry, I didn't specify it.

The problem is this:
pop3(user1): File isn't in mbox format: /var/mail/user1
but also:
imap(user2): File isn't in mbox format: /home/user2/mail/mail/sent-mail

At the head of the corrupted mbox, only a small part of the last line of a message remains.
For example:

.36C077EA--

and originally it was 

------_=_NextPart_001_1156E71A.36C077EA--
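
A quick way to spot this kind of damage is to check whether the file still begins with an mbox "From " separator line. The following is a minimal sketch of such a check, not dovecot's actual validation; the function name head_looks_like_mbox is hypothetical, and treating an empty file as valid is an assumption.

```python
def head_looks_like_mbox(path):
    """Return True if the file starts with an mbox 'From ' separator line.

    Hypothetical helper for illustration only; it looks at the first
    line and nothing else.
    """
    with open(path, "rb") as f:
        first_line = f.readline()
    if not first_line:
        # Assumption: an empty mailbox is considered valid.
        return True
    return first_line.startswith(b"From ")
```

A file whose first line is a fragment such as ".36C077EA--" fails this check.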


Comment 3 Tomas Janousek 2007-07-16 12:32:51 UTC
Ah. Then it looks like a duplicate of bug 206376.

But, well.. if we can get a reproducer...
there's some info about it in bug 235750, notably comments 5, 6, 7 and 8.

Also, there's a workaround mentioned in bug 206376 -- put the indexes to a local
place. Do you have them on NFS or something like that?

Comment 4 Enzo 2007-07-16 13:43:01 UTC
My filesystem is ext3, not NFS, so I don't think my problem is the same as bug
206376. My system is x86, not x86_64. The author of bug 206376 says it works on
x86 but fails on x86_64.

I have read the comments in bug 235750.
pop3_lock_session does not exist in version 0.99.x; I'm testing the
mail_extra_groups parameter.

I have also read bug 178683, and I thought the problem was resolved in
RHEL 4.5.


Comment 5 Tomas Janousek 2007-07-17 10:23:45 UTC
The thing with ext3/NFS is interesting, but the platform.. there's a comment in
bug 206376 saying it is in fact reproducible on x86 as well.

Then, wrt bug 235750, I meant the comments about catching the error, not about
the workarounds. He says it only happens with outlook clients -- can you check
whether this is the case with your problem? He says he might be able to collect
some data to help us locate and reproduce the problem and there are a few
comments saying how to do it -- do you think you might be able to do anything
similar? And in today's comment he says the pop3 process actually dies (is
killed with signal 11) -- can you check whether you observe anything similar?

And regarding your last paragraph, bug 178683 is most likely the same as bug
235750, therefore I don't think it's resolved in EL4.5.

Comment 6 Enzo 2007-07-17 15:03:42 UTC
I'm testing with a telnet client. I have sometimes corrupted the mbox with a
sequence of operations, but after restoring the mbox and repeating the test
with the same sequence, the error did not recur.

1. telnet to the POP3 server
2. user and pass
3. dele 1
4. open a second telnet session to the POP3 server
5. user and pass
6. dele 1 in session 2
7. quit   in session 2
8. quit   in session 1 (mbox corrupted)

I repeated this sequence 20 times after restoring the mbox, but the error
did not recur.
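
The two-session sequence above could be scripted roughly as follows, so that it can be run many times in a loop. This is a minimal sketch using Python's standard poplib module; the function names and the host, user and password values are hypothetical, and it assumes the server lets the second session log in while the first is still open.

```python
import poplib


def open_session(host, user, password):
    """Log in to a POP3 server and return the open connection."""
    conn = poplib.POP3(host)
    conn.user(user)
    conn.pass_(password)
    return conn


def interleaved_delete(connect, msg=1):
    """Replay the eight steps: two sessions delete the same message.

    `connect` is any zero-argument callable returning a logged-in
    session object with dele() and quit() methods.
    """
    first = connect()      # steps 1-2
    first.dele(msg)        # step 3
    second = connect()     # steps 4-5
    second.dele(msg)       # step 6
    second.quit()          # step 7
    first.quit()           # step 8


# Example call (placeholder values, not taken from the report):
# interleaved_delete(lambda: open_session("mail.example.com", "user1", "secret"))
```

The mbox has to be restored between runs, since each run deletes message 1.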

I have also corrupted the mbox with this sequence:
1. telnet to the POP3 server
2. user and pass
3. access the same account through Horde/IMP webmail (IMAP)
4. log in, read a message and log out
5. dele 1 in the telnet session
6. quit (mbox corrupted)

This corruption also happened only once.

I hope this test is useful.