Created attachment 771391
Example hack to work around the problem on Windows.

Description of problem:
The pull command for the gettext file format writes files in the host OS default encoding, which on Windows platforms is not UTF-8. This corrupts the translation strings on pull.

Version-Release number of selected component (if applicable):
3.0.0

How reproducible:
100%

Steps to Reproduce:
1. On Windows, push a gettext project containing non-ASCII UTF-8 characters (some Japanese, for example).
2. On Windows, pull the gettext project.

Actual results:
The pulled .po file has '?' where the Japanese characters should be.

Expected results:
The same characters that were pushed.

Additional info:
See the attachment for a temporary workaround. I am not suggesting it be used, of course (it's horrid); I have attached it purely for further context. Ideally, the desired encoding for pulled files could be configured on the Zanata web page or through an argument to the pull command.
Hi Matthew,

The database locale must be configured to use UTF-8. If your database is already UTF-8, please tell us:
1. The name and version of the database you used.
2. The client type (maven, python, or java) and its version.
3. The Zanata server version.
4. Your Windows region setting (locale) and Windows version.

With this information we have a better chance of reproducing the bug.

Regards,
Thank you. I have found out I was being misled by running the Zanata client in Eclipse versus the command line. From Eclipse, things worked; from the command line on Windows it was failing. Eclipse was setting the Java encoding to UTF-8. I added -Dfile.encoding=UTF-8 to my command line and things now work fine there too. So I think you can ignore my bug report.

Best regards,
Matt.
It looks like we still have a few places in the client which use the class FileWriter, which uses the platform default encoding. We should change these to specify the encoding explicitly. We actually write out "UTF-8" in the PO header, so the fact that we don't write UTF-8 is a bug.
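For illustration, here is a minimal sketch of the kind of change meant here, assuming plain JDK 7+ APIs. The class and method names (Utf8PoWriter, writeUtf8) are hypothetical and are not taken from the Zanata client code; the point is only that the charset is passed explicitly instead of relying on FileWriter's default.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8PoWriter {

    // Hypothetical helper (not from the Zanata code base): writes text as
    // UTF-8 regardless of the platform default encoding.
    static void writeUtf8(Path target, String content) throws IOException {
        // Unlike new FileWriter(file), this overload takes an explicit Charset.
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write(content);
        }
    }

    public static void main(String[] args) throws IOException {
        Path po = Files.createTempFile("example", ".po");
        // Unicode escapes keep this source file ASCII-only, so compiling it
        // does not depend on the platform encoding either.
        String japanese = "\u3053\u3093\u306b\u3061\u306f";
        writeUtf8(po, "msgid \"Hello\"\nmsgstr \"" + japanese + "\"\n");

        // Read the bytes back as UTF-8 and confirm the text survived.
        String roundTrip = new String(Files.readAllBytes(po), StandardCharsets.UTF_8);
        System.out.println(roundTrip.contains(japanese));
        Files.delete(po);
    }
}
```

With this approach the output no longer depends on -Dfile.encoding, which matches what the PO header already declares.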
Test items:
1. By default, the pulled .po file should be in UTF-8 encoding.
2. When a locale and character encoding are specified, the output file should be in the specified encoding.
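One possible way to automate the encoding check in these test items is to decode the pulled file's bytes strictly, so that malformed input raises an error instead of being silently replaced. This is only a sketch using standard JDK classes; EncodingCheck and decodesCleanly are hypothetical names, not part of any existing test suite.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Hypothetical helper: returns true if the bytes are valid in the given
    // charset. REPORT makes the decoder throw instead of substituting
    // replacement characters for bad input.
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "e with acute accent", encoded two different ways.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] latin1Bytes = "\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(decodesCleanly(utf8Bytes, StandardCharsets.UTF_8));
        System.out.println(decodesCleanly(latin1Bytes, StandardCharsets.UTF_8));
    }
}
```

Note that this only catches bytes written in the wrong encoding. Characters that were already replaced by '?' during the write are plain ASCII and decode cleanly, so a test for the original symptom would also need to compare the pulled strings against the pushed ones.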
*** Bug 915886 has been marked as a duplicate of this bug. ***
Migrated; check JIRA for bug status: http://zanata.atlassian.net/browse/ZNTA-356