Bug 1994178 - readpst does not detect headers correctly in many cases
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: libpst
Version: 36
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Milan Crha
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-17 01:48 UTC by Paul Wise (Debian)
Modified: 2023-05-25 17:51 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2023-05-25 17:51:51 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
sample pst exhibiting the problem and patches to fix the problem (30.84 KB, application/x-xz)
2021-08-17 01:48 UTC, Paul Wise (Debian)


Links
System ID Private Priority Status Summary Last Updated
Debian BTS 984581 0 None None None 2021-08-17 01:48:38 UTC

Description Paul Wise (Debian) 2021-08-17 01:48:38 UTC
Created attachment 1814582
sample pst exhibiting the problem and patches to fix the problem

Forwarding part of https://bugs.debian.org/984581

Description of problem:

readpst only detects the message headers in a PST file when the first header line is one of a limited set of known headers. With some sites now using ARC, the first header is often Arc-*, which isn't detected. Given the wide variety of MTAs on the Internet, the first header is also often a custom header, which isn't detected either. The header detection was added because of problems with corrupted headers in PST files, so it is probably still needed in some cases.

Instead of detecting a specific set of headers, readpst could use the RFC specification of email headers to detect all valid headers, but then corrupted data might be accepted as a valid header, at least on the first line. An alternative is to detect "reasonable" headers, that is, first header lines that look like most email header lines. That is the solution I have gone with in the attached patches. I also added some more debugging output and detection of space-wrapped headers in addition to tab-wrapped headers, since both wrapping styles are common.
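The "reasonable header" idea described above could be sketched roughly as follows. This is an illustrative Python sketch only, not the code from the attached patches (libpst itself is written in C). The regular expression and the function names `looks_like_header_start`, `is_continuation` and `unfold_headers` are assumptions made for this example.

```python
import re

# Assumption: a "reasonable" header name starts with a letter, contains only
# letters, digits and hyphens, and is followed by a colon and then a space,
# a tab, or the end of the line. This is stricter than the RFC grammar on purpose,
# so that corrupted data is less likely to be mistaken for a header.
REASONABLE_HEADER = re.compile(r"^[A-Za-z][A-Za-z0-9-]*:(?:[ \t]|$)")


def looks_like_header_start(line):
    """Return True if the line looks like the start of a typical header."""
    return REASONABLE_HEADER.match(line) is not None


def is_continuation(line):
    """Return True for wrapped header lines (space- or tab-indented)."""
    return line[:1] in (" ", "\t")


def unfold_headers(block):
    """Collect headers from the start of a block, joining wrapped lines.

    Stops at the first empty line or at the first line that is neither a
    reasonable header start nor a continuation line.
    """
    headers = []
    for line in block.splitlines():
        if not line:
            break
        if is_continuation(line) and headers:
            headers[-1] += " " + line.strip()
        elif looks_like_header_start(line):
            headers.append(line)
        else:
            break
    return headers
```

With this sketch, a first line such as `ARC-Seal: i=1; a=rsa-sha256` or a custom `X-...:` header is accepted, while a line of binary garbage is rejected.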

Version-Release number of selected component (if applicable):

0.6.75, 0.6.76

How reproducible:

Fully reproducible

Steps to Reproduce:

1. Download attached files: sample-pst-and-patches.tar.gz
2. Unpack the attached files: tar zxf sample-pst-and-patches.tar.gz
3. Convert the file to mbox: readpst -d debug forpst.pst
4. View the resulting mbox: less 'for pst.mbox'
5. Note the missing info: grep To: 'for pst.mbox'
6. Apply the attached patches: hg import *.patch
7. Compile the patched libpst
8. Repeat steps 3-5 with the fixed libpst
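As a rough alternative to the `grep` check in step 5, the resulting mbox could be inspected with Python's standard `mailbox` module. This is only a sketch. The helper name `count_missing_to` is hypothetical, and the file name `for pst.mbox` is taken from the steps above.

```python
import mailbox


def count_missing_to(mbox_path):
    """Return (total messages, messages with a missing or empty To header)."""
    # create=False makes a missing file an error instead of creating it.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        total = 0
        missing = 0
        for msg in box:
            total += 1
            # str() guards against non-string header objects.
            if not str(msg.get("To") or "").strip():
                missing += 1
        return total, missing
    finally:
        box.close()
```

For example, `count_missing_to("for pst.mbox")` can be run before and after applying the patches to compare how many messages lack recipient information.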

Actual results:

The headers available will be minimal and the To header will be missing recipient information.

Expected results:

The full headers will be available and the To header will include the recipient information.

Additional info:

Please use `hg import` when applying the patches, so that commit and authorship information is preserved.

Comment 1 Milan Crha 2021-08-30 10:31:39 UTC
Thanks for the bug report. Unfortunately, the upstream maintainer seems to be inactive on the Fedora bugzilla [1]. I can add the patches to the Fedora package, but I'm afraid that's not your intention here. The main purpose of this report is to get the patches included in the upstream sources, not in each distribution providing libpst, right?

I'm setting need-info to Carl.

[1] You wrote this near the end of the Debian bug tracker:
>  I have forwarded the patches to the Fedora bug tracker, hopefully that
>  will mean that the upstream maintainer will accept them now.

Comment 2 Paul Wise (Debian) 2021-08-30 10:35:35 UTC
Yeah, I mainly posted here to get the attention of Carl, who seemed to be both the upstream and the Fedora libpst maintainer. He has replied to my emails and accepted patches via private mail before, but hasn't responded to this patch yet.

I've included the patch in Debian and I think it is reasonable to include it in Fedora too until it reaches the upstream repository.

Comment 3 Carl Byington 2021-10-28 18:28:28 UTC
I am sorry, I had hopes of being able to get at least one more round of patches integrated. But it is clear even to me that that won't happen. Someone else will need to take over libpst.

Comment 4 Paul Wise (Debian) 2021-10-29 04:46:25 UTC
No problem, thanks for the notice. I'm willing to be the primary upstream maintainer. I won't be able to contribute to the Fedora package though.

Comment 5 Milan Crha 2021-10-29 08:47:58 UTC
(In reply to Paul Wise (Debian) from comment #4)
> I won't be able to contribute to the Fedora package though.

That's okay, I can update the package with the upstream releases.

Comment 6 Paul Wise (Debian) 2021-12-12 06:17:21 UTC
Carl suggested via email that it would be best to migrate the repository hosting elsewhere, so I've migrated the repository here:

https://github.com/pst-format/libpst

If either of you want to join the GitHub org I would be happy to add you.

Over the next few weeks I'll migrate the remaining released tarballs there too, merge the available patches (including the one in this bug), make a new release and notify downstream redistributors.

I'll of course accept patches via GitHub PRs, but if you prefer not to use GitHub then patches via email will be fine too.

Comment 7 Paul Wise (Debian) 2022-01-21 01:11:55 UTC
The patches attached to this issue have been merged into the new upstream project:

https://github.com/pst-format/libpst/compare/234ac9131fa805cd43ce0b70246304fa011be45f...2766c09463dd0645bb569bced9ee217de4f66dce

Comment 8 Ben Cotton 2022-02-08 21:09:15 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 36 development cycle.
Changing version to 36.

Comment 9 Ben Cotton 2023-04-25 16:43:42 UTC
This message is a reminder that Fedora Linux 36 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '36'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 36 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 10 Ludek Smid 2023-05-25 17:51:51 UTC
Fedora Linux 36 entered end-of-life (EOL) status on 2023-05-16.

Fedora Linux 36 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

