Bug 209105 - Bad decoding of multiple Subjects
Summary: Bad decoding of multiple Subjects
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: squirrelmail
Version: 3.0
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Martin Bacovsky
QA Contact:
Depends On:
Blocks: 241860 241861
Reported: 2006-10-03 09:33 UTC by Bastien Nocera
Modified: 2007-11-30 22:07 UTC

Clone Of:
Last Closed: 2007-05-14 15:04:41 UTC

Attachments
multiline-i18n-header-decode.patch (557 bytes, patch)
2006-10-03 09:33 UTC, Bastien Nocera
functions/i18n.php sm-1.4.10a patch (2.90 KB, patch)
2007-05-14 17:15 UTC, Tomas

Description Bastien Nocera 2006-10-03 09:33:00 UTC

SquirrelMail doesn't properly decode multi-line subjects.
For example, the ASCII version of an e-mail sent to Squirrelmail:

Subject: =?ISO-2022-JP?B?GyRCJCIbKEI=?=a=?ISO-2022-JP?B?GyRCJCQbKEI=?=i=

The decodeheader i18n option handles tabs at the beginning of continuation
lines, but not newlines or spaces:
-            $ret = str_replace("\t", "", $ret);
+            $ret = str_replace(array("\t", "\n"," "), "", $ret);

Patch from Hideshi Fuchi <hfuchi@redhat.com>
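
For reference, the two encoded-words in the Subject line quoted above can be
decoded by hand. This is a minimal Python sketch, not SquirrelMail code. It
assumes the payloads are base64 ('B' encoding) in ISO-2022-JP, as the header
itself declares; the helper name decode_payload is made up for illustration.

```python
import base64

# Payloads of the two encoded-words in the Subject line quoted in the report.
ENCODED_PAYLOADS = ["GyRCJCIbKEI=", "GyRCJCQbKEI="]


def decode_payload(b64_text, charset="iso2022_jp"):
    """Base64-decode an RFC 2047 'B' payload and convert it to text.

    Hypothetical helper for illustration; not a SquirrelMail function.
    """
    return base64.b64decode(b64_text).decode(charset)


decoded = [decode_payload(payload) for payload in ENCODED_PAYLOADS]
print(decoded)
```

Under those assumptions the payloads decode to あ and い, so the readable part
of the quoted header is あaいi.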

Comment 1 Bastien Nocera 2006-10-03 09:33:01 UTC
Created attachment 137633 [details]

Comment 4 Warren Togami 2006-10-23 18:34:43 UTC
Is this broken in RHEL3 but not RHEL4?

Either way, *PLEASE* submit this patch and have upstream accept it.

Comment 8 Tomas 2007-02-04 16:36:51 UTC
Could you attach the encoded header instead of pasting it into the bug report?
I suspect that some data is lost.

If the header looks the same as in the bug report, could you explain why the
encoded word is split across two lines and why there is no separator between
two atoms? Please include appropriate references to the RFCs.

RFC 2047 states that "'encoded-word's are designed to be recognized as 'atom's
by an RFC 822 parser". According to RFC 822, atoms are separated by white
space. They don't look separated in your bug report.
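
The separation rule can be checked mechanically. RFC 2047 section 5 requires
an encoded-word in an unstructured field such as Subject to be separated from
any adjacent text or encoded-word by linear white space. The Python sketch
below is illustrative only: the regex is looser than the RFC grammar and the
function name is an assumption. It lists the encoded-words in a header value
that lack that separation.

```python
import re

# Loose approximation of an RFC 2047 encoded-word: =?charset?encoding?text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def undelimited_encoded_words(value):
    """Return encoded-words that touch neighbouring text without white space.

    Hypothetical helper for illustration; not part of SquirrelMail.
    """
    offenders = []
    for match in ENCODED_WORD.finditer(value):
        before_ok = match.start() == 0 or value[match.start() - 1].isspace()
        after_ok = match.end() == len(value) or value[match.end()].isspace()
        if not (before_ok and after_ok):
            offenders.append(match.group(0))
    return offenders


# Header value as pasted in the report (it may be incomplete).
reported = "=?ISO-2022-JP?B?GyRCJCIbKEI=?=a=?ISO-2022-JP?B?GyRCJCQbKEI=?=i="
print(undelimited_encoded_words(reported))
```

With the value as pasted in the report, both encoded-words are flagged,
because each one is directly adjacent to an ASCII character.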

Comment 10 Martin Bacovsky 2007-03-26 10:27:03 UTC
This issue should be fixed in squirrelmail-1.4.8-5.el3

Comment 11 Tomas 2007-03-26 10:45:16 UTC
You can't fix it in SquirrelMail, because the bug is not in SquirrelMail.

Comment 13 Jan Hutař 2007-03-27 07:58:18 UTC
The provided Subject (multi)line seems wrong. It is not decoded correctly even
in Evolution/Sylpheed, so I used this slightly modified one:

Subject: =?ISO-2022-JP?B?GyRCJCIbKEI=?=

But this one is OK everywhere, and SquirrelMail works correctly with it. Could
you please attach the Subject (multi)line which causes problems, along with a
description of those problems?

Thanks in advance,

Comment 14 Tomas 2007-03-27 09:13:09 UTC

The header in the original report contains incorrectly encoded mixed Japanese
and ASCII characters, probably due to an incorrect decision by some "smart"
MIME header encoding function: there are no spaces between atoms and the
multiline wrapping is incorrect.

When you modified the header, you removed the ASCII chars.

I suspect that the header contains the encoded text "あ a い i う u え e お o" or
"あaいiうuえeおo", depending on how broken the MIME encoder is.

Comment 16 Tomas 2007-04-16 08:30:45 UTC
The header is generated by SquirrelMail with the Japanese translation. Create a
message with the subject "あaいiうuえeおo" and save it as a draft. SquirrelMail
has a similar bug report (#1693322). The fix proposed in this bugzilla is not
correct. The Japanese XTRA_CODE functions must be fixed.


Comment 17 Martin Bacovsky 2007-04-24 15:23:26 UTC
The problem seems to be in the foldLine() function in
class/deliver/Deliver.class.php. It tries to split long headers according to
RFC 2047, but it has trouble with more than one Japanese char separated by
ASCII chars.

I rewrote foldLine() so that it splits the subject at the right places, but
since the ASCII chars are left as they are, we get some unwanted extra spaces
in the header (like this: あaいi うuえe おo). I hope that converting the ASCII
chars to ISO-2022-JP will help to avoid those extra spaces.

I tried to create a mail with our subject in Thunderbird and it produces this:
which is also fine according to RFC 2047.

I'm not sure what the preferred way to fix this is for upstream, so I have
to consult them first.

Comment 18 Tomas 2007-04-24 15:59:05 UTC
I am not sure that there is a bug in foldLine(). The invalid code is in the
japanese_charset_xtra() function, in the encodeheader switch.

The header posted by Bastien Nocera can be generated with SquirrelMail in the
Japanese translation. Before you change foldLine() make sure that header is encoded

Comment 20 Martin Bacovsky 2007-05-10 07:04:48 UTC
(In reply to comment #18)
As I understand it, the header is encoded properly (but not very nicely).
If it is not folded later, most mail clients display the header properly.
The fold method as it is now folds the header inside the multibyte atoms (which
is wrong). The "fixed" fold method folds the header on the borders of atoms, but
this adds some extra spaces when interpreted by a mail client.

I think the ideal way to fix this is to convert all chars in headers to either
ISO-2022-JP or UTF-8 or UNICODE. Unfortunately I'm not very familiar with
encoding conversion, so it will need some more investigation.
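
The extra spaces follow from how RFC 2047 (section 6.2) treats white space on
display: white space between two adjacent encoded-words is ignored, while white
space between an encoded-word and plain text is kept. Below is a rough Python
sketch of that rule. It is not the SquirrelMail code; the helper names are
hypothetical, only 'B' encoding is handled, and the regexes are looser than the
RFC grammar.

```python
import base64
import re

ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[bB]\?([^?\s]*)\?=")
# White space sitting between two encoded-words is ignored on display.
BETWEEN_WORDS = re.compile(r"(\?=)[ \t]+(?==\?)")
# A line break followed by white space is a folded header line.
FOLD = re.compile(r"\r?\n(?=[ \t])")


def encode_word(text, charset="ISO-2022-JP"):
    """Wrap text in a single base64 encoded-word (hypothetical helper)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?%s?B?%s?=" % (charset, payload)


def decode_subject(value):
    """Unfold, drop white space between encoded-words, then decode them."""
    unfolded = FOLD.sub("", value)
    joined = BETWEEN_WORDS.sub(r"\1", unfolded)
    return ENCODED_WORD.sub(
        lambda m: base64.b64decode(m.group(2)).decode(m.group(1)), joined)


# ASCII left as plain text: the separating spaces survive decoding.
spaced = encode_word("あ") + " a " + encode_word("い") + " i"
# ASCII converted together with the Japanese chars, folded between words.
folded = encode_word("あa") + "\r\n " + encode_word("いi")

print(decode_subject(spaced))
print(decode_subject(folded))
```

In this sketch the first value decodes with spaces around the ASCII chars,
while the second decodes to the original text without extra spaces, which is
why converting the ASCII chars to the same charset looks promising.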

Comment 23 Suzanne Yeghiayan 2007-05-14 15:04:41 UTC
Patch provided was insufficient.  Devel is working with the upstream community;
however, they do not expect to have a new patch by the 3.9 deadline.
Marco (GSS) checked with customer and they are happy with a php workaround.
The errata has been dropped from RHEL3-QU9.
Changed bugzilla status to CLOSED/WONTFIX.
If this continues to be an issue post 3.9 GA, feel free to reopen and raise for
an async errata.

Comment 24 Tomas 2007-05-14 17:15:53 UTC
Created attachment 154673 [details]
functions/i18n.php sm-1.4.10a patch

Replaces the 'encodeheader' code with simple non-smart encoding.

Comment 25 Tomas 2007-05-14 17:41:29 UTC
"Bad decoding of multiple Subjects" issue can't be fixed. rfc2047 - "6.3. Mail
reader handling of incorrectly formed 'encoded-word's" chapter.

A mail reader need not attempt to display the text associated with an
'encoded-word' that is incorrectly formed.  However, a mail reader
MUST NOT prevent the display or handling of a message because an
'encoded-word' is incorrectly formed.

"Bad encoding issue" should be fixed in attached patch. Please note that I am
the one who wrote encodeHeaderBase64(). SquirrelMail has three header encoding
functions and I trust only encodeHeaderBase64(). The patch requires
SquirrelMail 1.4.6 and the PHP mbstring extension with Japanese support. If the
patch is applied to an older SquirrelMail version, you must port the related
functions. If the code is executed on a PHP install without mbstring support,
the Japanese translation should fail on the set_up_language() call.
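
To illustrate what a simple, non-smart encoding could look like, here is a
Python sketch. It is not the attached patch and not encodeHeaderBase64(). It
only assumes that the whole subject is converted to ISO-2022-JP and emitted as
base64 encoded-words, split on character boundaries so that no encoded-word
exceeds the 75-character limit from RFC 2047. The function names are
hypothetical, and the sketch ignores the length of the "Subject: " field name
on the first line.

```python
import base64

PREFIX, SUFFIX = "=?ISO-2022-JP?B?", "?="
MAX_ENCODED_WORD = 75  # RFC 2047 limit on the length of one encoded-word


def encode_word(text):
    """Encode text as one base64 ISO-2022-JP encoded-word (hypothetical)."""
    payload = base64.b64encode(text.encode("iso2022_jp")).decode("ascii")
    return PREFIX + payload + SUFFIX


def encode_subject(subject):
    """Encode every character, ASCII included, never splitting inside one."""
    words, chunk = [], ""
    for ch in subject:
        # Start a new encoded-word when adding ch would exceed the limit.
        if chunk and len(encode_word(chunk + ch)) > MAX_ENCODED_WORD:
            words.append(encode_word(chunk))
            chunk = ch
        else:
            chunk += ch
    if chunk:
        words.append(encode_word(chunk))
    # Fold between encoded-words; that white space is ignored on display.
    return "\r\n ".join(words)


print(encode_subject("あaいiうuえeおo"))
```

Because each chunk is encoded on its own, every encoded-word carries its own
escape sequences and can be decoded independently of the others.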

The fix is simple. I have added blocks of comments to explain what I am
doing. If you need an explanation of why the original code is broken, I can tag
the invalid decisions made by the original code.
