Bug 728148

Summary: Cyrus-imapd: IOERROR: zero index/expunge record
Product: Red Hat Enterprise Linux 6 Reporter: Marc Muehlfeld <marc.muehlfeld>
Component: cyrus-imapd Assignee: Pavel Šimerda (pavlix) <psimerda>
Status: CLOSED WONTFIX QA Contact:
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.1 CC: muehlfeld, thozza
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-07-19 14:10:48 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
imapd.conf none
cyrus.conf none

Description Marc Muehlfeld 2011-08-04 08:15:02 UTC
Description of problem:
Cyrus mailboxes become corrupted on their own here. The mail clients show an "IOERROR" to the users, and the following is logged:

imap[21918]: IOERROR: user.xxx zero index/expunge record 41/47
imap[16600]: IOERROR: user.yyy zero index/expunge record 16/17

The affected mailbox (sub)folder also contains *.NEW files when this happens:
-rw-------   1 cyrus mail  13660 22. Jul 09:22 cyrus.cache.NEW
-rw-------   1 cyrus mail    800 22. Jul 09:22 cyrus.index.NEW

Reconstructing the mailbox removes the *.NEW files, and the mailbox then works for some hours or a day. But the problem and the *.NEW files always come back. Nothing has changed on the client side.
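
For anyone triaging the same symptom, the leftover files described above can be located with a short script. This is only a minimal sketch: the function name `find_new_files` is hypothetical, and the spool root is an assumption that has to be taken from the partition setting in your own imapd.conf. It does nothing more than list files named like the `cyrus.cache.NEW` and `cyrus.index.NEW` entries shown above.

```python
import os


def find_new_files(spool_root):
    """Return sorted paths of leftover cyrus.*.NEW files below spool_root.

    spool_root is an assumption: pass the mail spool directory that your
    imapd.conf partition setting points to.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(spool_root):
        for name in filenames:
            # Matches names such as cyrus.cache.NEW and cyrus.index.NEW.
            if name.startswith("cyrus.") and name.endswith(".NEW"):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Calling `find_new_files()` with the spool directory returns the affected paths, and the containing directories are the folders that would need reconstructing. A directory that does not exist simply yields an empty list.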



Version-Release number of selected component (if applicable):
cyrus-imapd-2.3.16-6



How reproducible:
Sporadic. We have almost 150 mailboxes containing 2300 subfolders. Every day between 5 and 20 folders get corrupted and the errors appear. For a few folders, reconstructing the mailbox fixes the problem and the error doesn't come back. But on other folders the error comes back after hours or days.
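
To see how many folders are hit per day, the log can be tallied by mailbox. The following is a minimal sketch that assumes the log lines look exactly like the two samples quoted in this report. The names `IOERROR_RE` and `affected_mailboxes` are hypothetical, and the two numbers at the end of each line are skipped without interpretation, since the report does not say what they mean.

```python
import re
from collections import Counter

# Pattern derived only from the sample lines in this report, e.g.
# "imap[21918]: IOERROR: user.xxx zero index/expunge record 41/47".
IOERROR_RE = re.compile(
    r"IOERROR: (?P<mailbox>.+?) zero index/expunge record \d+/\d+"
)


def affected_mailboxes(lines):
    """Count 'zero index/expunge record' errors per mailbox name."""
    counts = Counter()
    for line in lines:
        match = IOERROR_RE.search(line)
        if match:
            counts[match.group("mailbox")] += 1
    return counts


sample = [
    "imap[21918]: IOERROR: user.xxx zero index/expunge record 41/47",
    "imap[16600]: IOERROR: user.yyy zero index/expunge record 16/17",
    "imap[16600]: some unrelated log line",
]
counts = affected_mailboxes(sample)
```

In practice the `lines` argument would be an open log file. Which file receives the imap messages depends on the local syslog configuration, so no path is assumed here.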



Steps to Reproduce:
I haven't found any way to trigger the problem on demand. It just happens on various mailboxes and subfolders. It seems to be caused by expunge.


  
Actual results:
imap[16600]: IOERROR: user.yyy zero index/expunge record 16/17
messages in the log file, and "IOERROR" messages reported to the users.



Expected results:
No errors.



Additional info:
In the meantime I discussed the problem on the Cyrus mailing list and learned that it is caused by blank records in cyrus.index. This is the mailing list post in which Bron Gondwana explains what goes wrong:
http://www.mail-archive.com/info-cyrus@lists.andrew.cmu.edu/msg41677.html

Comment 1 Marc Muehlfeld 2011-08-04 08:15:30 UTC
Created attachment 516654 [details]
imapd.conf

Comment 2 Marc Muehlfeld 2011-08-04 08:15:45 UTC
Created attachment 516655 [details]
cyrus.conf

Comment 4 RHEL Program Management 2011-08-04 08:37:45 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 5 Marc Muehlfeld 2013-01-10 18:27:20 UTC
In the meantime we figured something out:

The partition holding the mailboxes was formatted with XFS. After we reformatted it with ext4, the problem was gone.

So it is a problem with Cyrus and XFS on RHEL6.

Comment 6 Marc Muehlfeld 2015-04-01 12:54:38 UTC
Just to give an update:

We have meanwhile migrated the server to CentOS 7 with the mailboxes stored on an XFS volume. There we haven't seen any problems like the ones we had on RHEL6+XFS.

Comment 9 Pavel Šimerda (pavlix) 2016-07-19 13:34:09 UTC
(In reply to Marc Muehlfeld from comment #6)
> Just to give an update:
> 
> We have meanwhile migrated the server to CentOS 7 with the mailboxes stored
> on an XFS volume. There we haven't seen any problems like the ones we had on
> RHEL6+XFS.

Thanks for the updates.

Cheers,

Pavel

Comment 11 Pavel Šimerda (pavlix) 2016-07-19 14:10:48 UTC
Red Hat Enterprise Linux version 6 is entering the Production 2 phase of its lifetime and this bug doesn't meet the criteria for it, i.e. only high severity issues will be fixed. Please see https://access.redhat.com/support/policy/updates/errata/ for further information.

This issue is fixed in Red Hat Enterprise Linux version 7 according to information from the reporter.