Bug 1369499
Summary: | readpst -r incorrectly names a file according to its type: Recoverable Items/Calendar Logging/mbox
---|---
Product: | [Fedora] Fedora
Component: | libpst
Version: | 27
Status: | CLOSED CURRENTRELEASE
Severity: | medium
Priority: | unspecified
Hardware: | All
OS: | All
Type: | Bug
Reporter: | Ivan Zakharyaschev <imz>
Assignee: | Carl Byington <carl>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | carl, ppisar
Flags: | carl: needinfo-, carl: rhel-rawhide-
Fixed In Version: | 0.6.69
Last Closed: | 2017-08-16 15:50:07 UTC
Description
Ivan Zakharyaschev
2016-08-23 15:00:05 UTC
I must not have explained the problem clearly enough: the types reported by file are correct. (I have looked inside the file, and it does contain calendar cards, not mail.)

/tmp/readpst/root/mailtst/Recoverable Items/Calendar Logging/mbox: vCalendar calendar file

But the name used by readpst -r is wrong. The file is named "mbox", which does not correspond to its actual type. It should be named "calendar".

Created attachment 1193404 [details]
fgrep -1 'I have a' ../tmp/readpst/root/mailtst.readpst.log

> Can you re-run that readpst command with
>
> -d some.log.file.txt
>
> and then
>
> grep "I have" some.log.file.txt

I've done that, although the log file became really huge (27G). I grepped for "I have a" so as not to pick up extra content from the message bodies (I've just checked the readpst.c source code, and this narrower pattern does not seem to miss any debugging message). I've also included one line of context.

I now believe that the problem is a more general one: some folders contain items of several different types. I've run readpst -e and see that, in this specific case, the first item parsed is probably not an email and the folder type is probably undefined, so the output file is created as mbox; then the type gets overridden, and calendar items are saved there:

$ ls readpst/root.extensions/mailtst/Recoverable\ Items/Calendar\ Logging/ | head
1.ics
10.ics
100.eml
101.ics
102.eml
103.ics
104.eml
105.ics
106.ics
107.eml
$

This is not nice: something is named "mbox", but what is saved there is not well-formed email. I believe this specific issue requires a fix.

More generally, I'm thinking about a solution where readpst -r would create several files in the same folder (mbox, calendar) in such cases. This would make it possible to save everything without mixing things up.
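The per-type output idea above can be illustrated with a minimal sketch. This is not readpst code: the item kinds, the `OUTPUT_NAME` table, and the names "contacts" and "other" are assumptions made for illustration; only "mbox" (the name readpst -r uses now) and "calendar" (the name proposed in this report) come from the discussion.

```python
from collections import defaultdict

# Hypothetical mapping from item kind to the per-folder output file name.
# "mbox" is what readpst -r writes today; "calendar" is the name proposed
# in this report; "contacts" and "other" are assumptions for illustration.
OUTPUT_NAME = {
    "email": "mbox",
    "appointment": "calendar",
    "contact": "contacts",
}


def output_name(kind):
    """Return the file name an item of this kind would be appended to."""
    return OUTPUT_NAME.get(kind, "other")


def group_by_output(items):
    """Group (kind, payload) pairs by destination file, keeping item order."""
    grouped = defaultdict(list)
    for kind, payload in items:
        grouped[output_name(kind)].append(payload)
    return dict(grouped)


if __name__ == "__main__":
    # A mixed folder similar to "Recoverable Items/Calendar Logging".
    folder = [
        ("appointment", "1.ics"),
        ("appointment", "10.ics"),
        ("email", "100.eml"),
        ("appointment", "101.ics"),
    ]
    for name, payloads in group_by_output(folder).items():
        print(name, payloads)
```

With this kind of routing, a folder holding both emails and appointments would end up with two files side by side, so nothing is dropped and the file named "mbox" contains only mail.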
(Modes other than readpst -r are not very attractive to me, because they do not create mbox, the only output format fully understood by dovecot and hence by doveadm sync (for reading and importing whole user accounts). MH is not supported by dovecot.)

Here one can see all the folders, together with the types of items found in each:

$ find readpst/root.extensions/ -type d -print -exec sh -c 'ls "{}" | fgrep . | cut -d. --fields=2 | sort -u' ';'
readpst/root.extensions/
readpst/root.extensions/mailtst
readpst/root.extensions/mailtst/Заметки
readpst/root.extensions/mailtst/МСЭД
eml
readpst/root.extensions/mailtst/Входящие
eml
readpst/root.extensions/mailtst/Входящие/103
eml
readpst/root.extensions/mailtst/Входящие/Миграция ADEX
eml
readpst/root.extensions/mailtst/Входящие/Оперативка
eml
readpst/root.extensions/mailtst/Входящие/СПО - ФСТЭК
eml
readpst/root.extensions/mailtst/Входящие/СУТП
eml
readpst/root.extensions/mailtst/Входящие/Схемы ЛВС
eml
readpst/root.extensions/mailtst/Нежелательная почта
readpst/root.extensions/mailtst/ТТ на согласование
eml
readpst/root.extensions/mailtst/Junk
eml
readpst/root.extensions/mailtst/Задачи
readpst/root.extensions/mailtst/Прочитать и дать ответ!
eml
readpst/root.extensions/mailtst/Прочитать и дать ответ!/Что-то важное
eml
readpst/root.extensions/mailtst/Отправленные
eml
readpst/root.extensions/mailtst/aрхив
eml
readpst/root.extensions/mailtst/Контакты
vcf
readpst/root.extensions/mailtst/Контакты/Recipient Cache
vcf
readpst/root.extensions/mailtst/Черновики
readpst/root.extensions/mailtst/Календарь
ics
readpst/root.extensions/mailtst/Sent
eml
readpst/root.extensions/mailtst/Удаленные
eml
ics
readpst/root.extensions/mailtst/ЕКП
eml
readpst/root.extensions/mailtst/Recoverable Items
readpst/root.extensions/mailtst/Recoverable Items/Deletions
eml
ics
readpst/root.extensions/mailtst/Recoverable Items/Calendar Logging
eml
ics
readpst/root.extensions/mailtst/Предлагаемые контакты
vcf
readpst/root.extensions/mailtst/Ошибки синхронизации
readpst/root.extensions/mailtst/Ошибки синхронизации/Конфликты
eml
$

So, the problematic folders are not normal ones:

* Удаленные (means "Deleted" in Russian)
* Recoverable Items/Deletions
* Recoverable Items/Calendar Logging

Created attachment 1193451 [details]
libpst-no-bad-mboxes.patch
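The folder survey done earlier with the find one-liner (listing which file extensions occur in each directory of the readpst -e output) can be approximated in Python. This is only a sketch: the function names are hypothetical, and unlike the shell pipeline it looks at the last extension of regular files only, so results may differ for names containing several dots.

```python
import os


def extensions_by_dir(root):
    """Map each directory under root to the sorted file extensions found in it."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        exts = set()
        for name in filenames:
            ext = os.path.splitext(name)[1]
            if ext:
                exts.add(ext.lstrip("."))
        result[dirpath] = sorted(exts)
    return result


def mixed_dirs(root):
    """Return the directories that hold files with more than one extension."""
    survey = extensions_by_dir(root)
    return sorted(d for d, exts in survey.items() if len(exts) > 1)


if __name__ == "__main__":
    # The path is the one used in this report; adjust as needed.
    for directory in mixed_dirs("readpst/root.extensions"):
        print(directory)
```

Run against a tree like the one listed in the report, such a scan would be expected to single out exactly the folders that mix eml and ics items.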
The attached patch fixes the problem with bad mboxes.
A related question that occurred to me: in Thunderbird mode, is it OK that zero is written to .type for such folders? (I have not changed this.)
I checked the result by comparing it with the previous output (a big account, around 6G):
-bash-4.3$ diff -rq --exclude='*.log' -Iboundary -Iname -IDTSTAMP root root.nobad
Files root/mailtst/Recoverable Items/Calendar Logging/mbox and root.nobad/mailtst/Recoverable Items/Calendar Logging/mbox differ
-bash-4.3$ find root.nobad/ -name mbox -print0 | xargs -0 file
root.nobad/mailtst/МСЭД/mbox: ISO-8859 mail text, with very long lines
root.nobad/mailtst/Входящие/Оперативка/mbox: UTF-8 Unicode mail text
root.nobad/mailtst/Входящие/Схемы ЛВС/mbox: Non-ISO extended-ASCII mail text, with very long lines
root.nobad/mailtst/Входящие/103/mbox: UTF-8 Unicode mail text, with very long lines
root.nobad/mailtst/Входящие/mbox: ISO-8859 mail text, with very long lines
root.nobad/mailtst/Входящие/СПО - ФСТЭК/mbox: Non-ISO extended-ASCII mail text, with very long lines, with LF, NEL line terminators
root.nobad/mailtst/Входящие/Миграция ADEX/mbox: UTF-8 Unicode mail text, with very long lines
root.nobad/mailtst/Входящие/СУТП/mbox: UTF-8 Unicode mail text, with very long lines
root.nobad/mailtst/ТТ на согласование/mbox: UTF-8 Unicode mail text, with very long lines
root.nobad/mailtst/Junk/mbox: UTF-8 Unicode mail text, with very long lines
root.nobad/mailtst/Прочитать и дать ответ!/Что-то важное/mbox: Non-ISO extended-ASCII mail text, with very long lines, with LF, NEL line terminators
root.nobad/mailtst/Прочитать и дать ответ!/mbox: Non-ISO extended-ASCII mail text, with very long lines, with LF, NEL line terminators
root.nobad/mailtst/Отправленные/mbox: UTF-8 Unicode HTML document text, with very long lines
root.nobad/mailtst/aрхив/mbox: Non-ISO extended-ASCII mail text, with very long lines
root.nobad/mailtst/Sent/mbox: UTF-8 Unicode text
root.nobad/mailtst/Удаленные/mbox: Non-ISO extended-ASCII mail text, with very long lines, with LF, NEL line terminators
root.nobad/mailtst/ЕКП/mbox: UTF-8 Unicode mail text
root.nobad/mailtst/Recoverable Items/Deletions/mbox: UTF-8 Unicode mail text, with very long lines
root.nobad/mailtst/Recoverable Items/Calendar Logging/mbox: UTF-8 Unicode mail text
root.nobad/mailtst/Ошибки синхронизации/Конфликты/mbox: UTF-8 Unicode mail text
-bash-4.3$
Everything is fine.
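A check like the one above can also be scripted without relying on file(1). The sketch below assumes that a well-formed mbox starts with a conventional "From " separator line, and it inspects only the first line of each file named "mbox"; the function names are hypothetical, and its verdicts need not agree with what file reports.

```python
import os


def looks_like_mbox(path):
    """True if the file is empty or its first line is an mbox "From " separator."""
    with open(path, "rb") as fh:
        first = fh.readline()
    return first == b"" or first.startswith(b"From ")


def suspicious_mboxes(root):
    """List files named "mbox" under root that do not start like an mbox."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "mbox" in filenames:
            path = os.path.join(dirpath, "mbox")
            if not looks_like_mbox(path):
                bad.append(path)
    return sorted(bad)


if __name__ == "__main__":
    # "root.nobad" is the output directory used in the session above.
    for path in suspicious_mboxes("root.nobad"):
        print("not an mbox:", path)
```

A file that begins with calendar data, such as a line reading BEGIN:VCALENDAR, would be reported by this check.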
BTW, the previous output seems to have saved the calendars themselves to "root/mailtst/Recoverable Items/Calendar Logging/mbox"; the same calendars are also present as attachments in the new, correct mbox.
fixed.
This bug appears to have been reported against 'rawhide' during the Fedora 27 development cycle. Changing version to '27'.
fixed in 0.6.69