Red Hat Bugzilla – Bug 143323
Set charset to UTF-8 (or another charset according to the user's locale) in HTML generated by xmlto
Last modified: 2007-11-30 17:10:57 EST
Description of problem:
xmlto always generates HTML files with the iso-8859-1 charset from XML.
But in non-iso-8859-1 environments, we usually write XML in UTF-8 or
another charset. Please correct the XSLT so that it sets a suitable charset based on the XML.
Also, non-ASCII characters are replaced by character entities when using
xmlto. Please correct that as well.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Write XML using a non-iso-8859-1 charset (e.g. Japanese).
2. Generate HTML from the above XML with xmlto.
Actual results:
The HTML has the wrong charset,
and all non-ASCII characters become character entities.
Expected results:
Set a suitable charset (UTF-8 or others) in the HTML.
Additional info:
xmlto uses docbook-style-xsl.
> character entities.
-> character entity references
So the character encoding is correct for the document in that it accurately
describes the character encoding it contains, but it's not the best (most
efficient or appropriate) encoding to use -- is that right?
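The point in the comment above can be illustrated with a minimal Python sketch. It does not use xmlto or the DocBook stylesheets. It only shows that the same text can be carried either as numeric character references inside an ISO-8859-1 document or as literal characters in UTF-8, and that both decode to the same content. The sample string is taken from the report; the variable names are illustrative.

```python
import html

text = "製作著作"  # sample Japanese text taken from the report

# ISO-8859-1 cannot represent these characters, so each one falls back
# to a numeric character reference such as &#35069;
as_latin1 = text.encode("iso-8859-1", errors="xmlcharrefreplace")

# UTF-8 stores the same characters directly.
as_utf8 = text.encode("utf-8")

# Both byte strings yield the same text once the references are resolved.
decoded_latin1 = html.unescape(as_latin1.decode("iso-8859-1"))
decoded_utf8 = as_utf8.decode("utf-8")

print(as_latin1)
print(len(as_latin1), len(as_utf8))
```

So the ISO-8859-1 output is valid, but for text like this it is larger and unreadable in the HTML source.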
I agree with you. I hope for output like the example below:
<?xml version = '1.0' encoding = 'UTF-8'?>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
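As a rough sketch of the requested behaviour (not of xmlto itself, which goes through the XSLT stylesheets), the following Python function emits a page whose declared charset matches the encoding of its bytes. The function name `render_html` and the page skeleton are hypothetical and exist only for this illustration.

```python
import html


def render_html(title, charset):
    """Return an HTML page as bytes whose meta charset matches its encoding."""
    page = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
        f"<title>{html.escape(title)}</title>"
        "</head><body></body></html>"
    )
    # Characters the target charset cannot represent fall back to numeric
    # character references, so the output is valid for any charset.
    return page.encode(charset, errors="xmlcharrefreplace")


utf8_page = render_html("製作著作", "UTF-8")
latin1_page = render_html("製作著作", "ISO-8859-1")
print(utf8_page.decode("utf-8"))
print(latin1_page.decode("iso-8859-1"))
```

With `UTF-8` the Japanese title appears literally in the output. With `ISO-8859-1` it is written as numeric character references, which is the behaviour this report complains about.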
The following URL is a translation of fedora-docs.
Japanese text becomes numeric character references when xmlto is used.
class="copyright">製作著作 © 2003 Red Hat,
Inc., Tammy Fox</p></d
Created attachment 109065
displayed by a browser.
I forgot the URL.
Tadashi: I think we are all in agreement here. Comment #3 describes the desired
However, just because a document is encoded in UTF-8 for input should not
dictate the output charset: I think LC_CTYPE should be used for that.
In fact, in later releases, it is: see bug #126921.
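To make the LC_CTYPE suggestion concrete, here is a small Python sketch of deriving an output charset from the locale environment variables. It is an illustration only and is not taken from xmlto. The helper names and the `US-ASCII` fallback are assumptions. C programs would more typically call `nl_langinfo(CODESET)` after `setlocale`.

```python
import os


def charset_from_locale(locale_name, default="US-ASCII"):
    """Extract the codeset from a locale name such as 'ja_JP.UTF-8'."""
    # Drop an optional "@modifier" suffix, then take the part after ".".
    base = locale_name.split("@", 1)[0]
    if "." in base:
        codeset = base.split(".", 1)[1]
        if codeset:
            return codeset
    # Names like "C" or "POSIX" carry no codeset; use the assumed fallback.
    return default


def current_output_charset(environ=os.environ):
    """Pick the charset using POSIX precedence: LC_ALL, LC_CTYPE, LANG."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(var)
        if value:  # empty values are treated as unset
            return charset_from_locale(value)
    return charset_from_locale("")


print(current_output_charset({"LANG": "en_US.ISO-8859-1", "LC_CTYPE": "ja_JP.UTF-8"}))
```

Here the output charset follows the user's environment, not the encoding of the input file, which is the distinction drawn in the comment above.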