Bug 1368640 - Non-UTF-8 unencoded headers sent incorrectly from Gmail
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: evolution
Version: 24
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Milan Crha
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-20 09:40 UTC by Mikhail
Modified: 2016-08-26 07:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-25 16:20:01 UTC
Type: Bug


Attachments
Screenshot (263.68 KB, image/png)
2016-08-20 09:40 UTC, Mikhail
Yet another interesting example (429.21 KB, image/png)
2016-08-22 21:50 UTC, Mikhail
message with subject problem (2.09 KB, application/mbox)
2016-08-22 21:51 UTC, Mikhail

Description Mikhail 2016-08-20 09:40:09 UTC
Description of problem:
Some messages in an IMAPx account are displayed with an incorrect sender.
If such a message is exported and then imported again, the sender of the newly imported message is displayed correctly.

$ rpm -qa | grep evolution | sort
evolution-3.20.5-1.fc24.x86_64
evolution-data-server-3.20.5-2.fc24.x86_64
evolution-data-server-debuginfo-3.20.5-2.fc24.x86_64
evolution-debuginfo-3.20.5-1.fc24.x86_64
evolution-ews-3.20.5-1.fc24.x86_64
evolution-ews-debuginfo-3.20.5-1.fc24.x86_64
evolution-help-3.20.5-1.fc24.noarch

Comment 1 Mikhail 2016-08-20 09:40:42 UTC
Created attachment 1192440
Screenshot

Comment 2 Milan Crha 2016-08-22 08:56:38 UTC
Thanks for the bug report. It looks like the code failed to decode the email address for some reason. Those extra quotes look suspicious to me, but that is only a first impression.

Could you view the message source of the top two messages, one working and one not, and paste the whole content of the From header here, please? Though maybe the issue lies somewhere else [1].

[1] https://mail.gnome.org/archives/evolution-list/2016-August/msg00139.html

Comment 3 Mikhail 2016-08-22 20:15:30 UTC
Messages with a correct sender have headers like these:

MIME-Version: 1.0
Received: by 10.221.12.79 with HTTP; Tue, 28 Oct 2014 03:02:48 -0700 (PDT)
Received: by 10.221.12.79 with HTTP; Tue, 28 Oct 2014 03:02:48 -0700 (PDT)
In-Reply-To: <CAN8p6c+WiSOO6KMUpd8iH3RzgE+jSh3Z=68ZeNPr49C=b5mPtw.com>
References: <CAN8p6cKgogBFN1hBD0TZxXoWx1p7LGfkwQYb3MwxMa0++hDD-w.com>
        <CAN8p6c+WiSOO6KMUpd8iH3RzgE+jSh3Z=68ZeNPr49C=b5mPtw.com>
Date: Tue, 28 Oct 2014 16:02:48 +0600
Delivered-To: mikhail.v.gavrilov
Message-ID: <CABXGCsMHT1S3xayQ8ObuJ7L6iTnOLXgKP+=RJXUWUnbhrEaWAg.com>
Subject:
 =?UTF-8?B?RndkOiBSZTog0KTQvtGA0LzRiyDQtNC70Y8g0YbQtdC90L3Ri9GFINC/0LjRgdC10Lw=?=
From: =?UTF-8?B?0JzQuNGF0LDQuNC7INCT0LDQstGA0LjQu9C+0LI=?= <mikhail.v.gavrilov>
To:
 =?UTF-8?B?0JrQsNGA0LjQvdCwINCY0LvRjNC00LDRgNC+0LLQvdCwINCk0LDRhdGA0LXRgtC00LjQvdC+0LI=?=
 =?UTF-8?B?0LA=?= <fakhretdinova.karina>
Content-Type: multipart/related; boundary=bcaec51a7d30dedcf1050678bf02
X-Evolution-Source: 1462687265.2005.1


Messages with an incorrect sender have headers like these:

MIME-Version: 1.0
Received: by 10.220.138.5 with HTTP; Tue, 5 Aug 2014 02:00:38 -0700 (PDT)
In-Reply-To: <04a301cfad61$a36211e0$ea2635a0$@gmail.com>
References: <WC20140801033712.9512FF>
        <04a301cfad61$a36211e0$ea2635a0$@gmail.com>
Date: Tue, 5 Aug 2014 15:00:38 +0600
Delivered-To: mikhail.v.gavrilov
Message-ID: <CABXGCsOg8UNTX9gfkXPzciFOnre_31sERuSJYfeREKnLHSWT4Q.com>
Subject: Fwd: FW: Синерджи
From: "Михаил Гаврилов" <mikhail.v.gavrilov>
To: "Алия Радиковна Сагитова" <newpandorrra>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: base64
X-Evolution-Source: 1462687265.2005.1

Comment 4 Mikhail 2016-08-22 20:19:00 UTC
Very interesting: after export/import the message has the same headers, but the sender is displayed correctly.

Comment 5 Mikhail 2016-08-22 21:50:08 UTC
Created attachment 1193094
Yet another interesting example

Comment 6 Mikhail 2016-08-22 21:51:03 UTC
Created attachment 1193095
message with subject problem

Comment 7 Milan Crha 2016-08-24 21:01:08 UTC
I think I see the pattern. Whenever a header contains unencoded text, it is shown broken. In comment #3, the broken message is the one whose From header contains raw 8-bit characters, and the same holds for the message in comment #6. When the header content is encoded, the text is shown fine.
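
As a rough illustration of this pattern (not Evolution's or Camel's actual code), the Python sketch below checks whether a header value is 7-bit clean and decodes RFC 2047 encoded-words with the standard `email.header` module. The helper names `is_seven_bit` and `decode_mime_header` are made up for this example; the sample values are the two From headers quoted in comment #3.

```python
from email.header import decode_header


def is_seven_bit(value: str) -> bool:
    """Return True if every character of the header value is plain ASCII."""
    return all(ord(ch) < 128 for ch in value)


def decode_mime_header(value: str) -> str:
    """Decode RFC 2047 encoded-words (=?charset?B?...?=) into readable text."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            parts.append(chunk)  # the header contained no encoded-words
    return "".join(parts)


# From header of the message that is displayed correctly (comment #3).
encoded_from = (
    "=?UTF-8?B?0JzQuNGF0LDQuNC7INCT0LDQstGA0LjQu9C+0LI=?= <mikhail.v.gavrilov>"
)
# From header of the message that is displayed incorrectly (comment #3).
raw_from = '"Михаил Гаврилов" <mikhail.v.gavrilov>'

print(is_seven_bit(encoded_from))   # the encoded form is pure ASCII
print(is_seven_bit(raw_from))       # the raw form carries 8-bit characters
print(decode_mime_header(encoded_from))
```

The encoded form survives any transport that only guarantees 7-bit headers, while the raw form depends on every hop guessing the character set correctly.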

The empty Subject in the message source you captured in comment #5 is due to WebKit not showing those "unknown" 8-bit bytes (or the pre-filtering removed them, which is also possible). I think that is correct when the text is unprintable.

I copied your test message into my Gmail account and I can reproduce the broken Subject in the message list with it when I let the IMAP account download it (for example, when I changed labels in the Gmail web UI and then refreshed Evolution). The problem is that IMAPx downloads only the top headers, which are supposed to be in UTF-8, but your whole message is in the Windows-1251 character set, which causes the breakage. Just try picking a different character set in the View->Character Encoding menu; notably, when you choose UTF-8, most of the message content is gone.
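
The effect of reading Windows-1251 bytes as UTF-8 can be reproduced in a few lines of Python. This is only a sketch: the sample string is an arbitrary Cyrillic word chosen for illustration, not the content of the attached message.

```python
sample = "Синерджи"  # arbitrary Cyrillic sample text (an assumption for the demo)

# Windows-1251 stores each Cyrillic letter as a single byte above 0x7F.
raw_bytes = sample.encode("cp1251")

# Interpreting those bytes as UTF-8 fails; with errors="replace" every
# invalid sequence becomes U+FFFD, the replacement character.
as_utf8 = raw_bytes.decode("utf-8", errors="replace")

# Interpreting them with the right character set restores the text.
as_cp1251 = raw_bytes.decode("cp1251")

print(repr(as_utf8))
print(as_cp1251)
```

None of the original letters survive the UTF-8 interpretation, which matches the "content is gone" effect described above.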

Evolution (Camel) guesses the character encoding from the Content-Type header, if available. I see it being returned from the Gmail server, but it is not used for some reason. I tried the same with a Courier server and it works properly there, so it is something about the way Gmail returns the message headers. I see one small difference there, an empty line after the headers, so maybe that is the cause.

I'll investigate this further and let you know my findings. Currently it looks like Gmail changed something in how it sends message headers through the IMAP interface.
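
The idea of using the Content-Type charset as a hint for raw 8-bit headers can be sketched in Python as follows. This is not Camel's implementation; the function name `decode_header_with_body_charset`, the UTF-8 fallback, and the sample header block are all assumptions made for the example.

```python
from email import message_from_string


def decode_header_with_body_charset(raw_headers: bytes, name: str,
                                    fallback: str = "utf-8") -> str:
    """Decode a raw 8-bit header using the charset named in Content-Type."""
    # Latin-1 maps every byte to exactly one character, so no byte is lost
    # while the standard parser splits the block into header fields.
    msg = message_from_string(raw_headers.decode("latin-1"))
    charset = msg.get_content_charset() or fallback
    value = msg.get(name, "")
    # Recover the original bytes and reinterpret them in the declared charset.
    return value.encode("latin-1").decode(charset, errors="replace")


# Synthetic header block: an unencoded Windows-1251 Subject plus a
# Content-Type header that declares the character set.
raw = (
    b"Subject: " + "Синерджи".encode("cp1251") + b"\r\n"
    b"Content-Type: text/plain; charset=windows-1251\r\n"
    b"\r\n"
)

print(decode_header_with_body_charset(raw, "Subject"))
```

The hint only helps if the server hands over the original bytes; once they have been replaced on the server side, no client-side guess can recover them.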

Comment 8 Milan Crha 2016-08-25 16:20:01 UTC
I debugged this further and, as far as I can tell, this is not an issue on the Evolution side. The problem is that the Gmail server returns garbled data when the Evolution code asks for headers only (RFC822.HEADER), but intact headers when Evolution asks for the entire message (BODY.PEEK[]). They seem to convert the 8-bit characters in the Windows-1251 encoding into UTF-8 encoded question marks, which are then shown in the message list.

I tried the same message with a Courier instance, and that works properly and doesn't mangle the 8-bit letters in the headers. I guess it is caused by some recent change in Gmail's IMAP interface; they probably decided to validate header responses as UTF-8, which resulted in those question marks.

To be fair, the real issue is on the sender's side of those messages, because according to the IMAP RFC 3501 [1]:

         8-bit textual data is permitted if a [CHARSET] identifier is
         part of the body parameter parenthesized list for this section.
         Note that headers (part specifiers HEADER or MIME, or the
         header portion of a MESSAGE/RFC822 part), MUST be 7-bit; 8-bit
         characters are not permitted in headers.

The sender's software should encode the headers properly.

[1] https://tools.ietf.org/html/rfc3501.html#page-74
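
For illustration, this is roughly what proper encoding looks like on the sending side, sketched with Python's standard `email.utils.formataddr`. The address is a placeholder, because the real one is truncated in this report; this is not how any particular mail client builds its headers.

```python
from email.header import decode_header
from email.utils import formataddr

display_name = "Михаил Гаврилов"
address = "sender@example.com"  # placeholder address, not from the report

# formataddr() wraps a non-ASCII display name in an RFC 2047 encoded-word,
# so the resulting header value contains only 7-bit characters.
from_value = formataddr((display_name, address), charset="utf-8")
print(from_value)

# Decoding the first chunk recovers the original display name.
first_chunk, charset = decode_header(from_value)[0]
print(first_chunk.decode(charset))
```

A header built this way passes through header-only fetches unchanged, because there are no 8-bit bytes for a server to reinterpret.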

Comment 9 Mikhail 2016-08-26 04:57:34 UTC
(In reply to Milan Crha from comment #8)
> The sender's software should encode the headers properly.
But the sender in comment #1 is me, which means this mistake was created by Evolution itself.

As an engineer, I also want all developers to adhere to standards, but in the real world that is impossible.

For users it is more important that all messages are readable. Maybe it makes sense to reconsider the categorical stance on supporting non-standard situations.

If we want to popularize Open Source, it is necessary to support any situation.

Comment 10 Milan Crha 2016-08-26 07:40:44 UTC
(In reply to Mikhail from comment #9)
> (In reply to Milan Crha from comment #8)
> > The sender's software should encode the headers properly.
> But the sender in comment #1 is me, which means this mistake was created by
> Evolution itself.

Would it be possible to upload this message too, please?

If Evolution generated it and is not able to decode the sender, then the bug is in Evolution and should be fixed there. Are you able to reproduce this with evolution-data-server 3.20.5 and evolution 3.20.5? If yes, how exactly?

> For users it is more important that all messages are readable. Maybe it
> makes sense to reconsider the categorical stance on supporting non-standard
> situations.

The code already contains several tweaks here and there to support the "odd behaviour" of certain servers. The problem with the message from comment #6 is that the Gmail server sends those question marks when the IMAPx code asks for headers only, which means that any client that reads only the headers is affected, and nothing can be done about it by anyone but the Gmail developers (or the sender, by following the standard). That's why I closed this as NOTABUG: it is out of Evolution's hands.

