Description of problem:
Similar to bug 130118, when converting plain text files to a PS file, mpage tries to determine the charset of the file from the environment variables, and as a result it prevents multi-lingual documents from converting properly. If the user is in the zh_TW locale and the text document contains zh_CN, zh_TW, ja_JP and ko_KR characters, only the zh_TW characters are processed and the rest are blank or replaced by dots.

Version-Release number of selected component (if applicable):
mpage-2.5.4-2

How reproducible:
Always

Steps to Reproduce:
1. cat CJKtest.txt | mpage > test.ps
2. ggv test.ps

Actual results:
Only the TC (zh_TW) row is displayed properly; the rest are blank.

Expected results:
All the contents of the document are displayed.

Additional info:
Created attachment 113074 [details] text file containing all 4 CJK locale characters
Created attachment 113075 [details] ps file generated in the zh_TW.UTF-8 locale
So C, J and K individually are ok?
Yes, individually they are OK, as long as the locale matches the charset.
I have tested mpage-2.5.5 with this file; it determines the encoding as UTF-8 correctly. Unfortunately, evince can't open the resulting file; maybe it is a font problem.
FYI, mpage generates a PS file that relies on a CID-keyed font (or fonts emulated by gs) and a CMap to pick the glyphs up from the font in PS, and PS itself allows mixing languages in one file. Just supporting this feature may be easy, but one issue would be how to determine the font for unified ideographs in UTF-8, e.g. for CJK. A possible idea would be to refer to the current locale and sort out the font priority against it.
requested by Jens Petersen (#27995)
Let's face it, this is not going to happen; it would increase the complexity of mpage too much. Anyway, if anyone wants to spend their time and prepare a patch, they are welcome to do so.