Red Hat Bugzilla – Bug 475684
Find solution for using Glossaries with publican
Last modified: 2011-08-22 19:53:15 EDT
Description of problem:
The latest version of publican (0.39) has banned the use of glosslist tags, which makes glossaries impossible (or at least very difficult) to use and causes books that contain them to fail to build.
Reasoning: Considered unprofessional, difficult to translate, sorting issues.
Arguments against: Being considered unprofessional is an opinion. Glossaries are a useful resource in technical doc. If necessary they can remain untranslated.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
*ERROR: BUILD FAILED! Banned tag found*
glosslist: This tag set imposes English-language order on glossaries, making them useless when translated.
Remove all glosslist tags before attempting to build.
make: *** [xml-en-US] Error 4
I think glosslist should be allowed and glossdiv should be banned instead, because a glosslist is, IMHO, an unordered list.
I've added Manuel as a CC for his input from a translator's perspective.
Manuel, would a UTS #10 collation routine conforming to http://www.unicode.org/reports/tr10/ be sufficient to sort a flat glossary?
i.e. a glosslist or glossary without glossdivs?
e.g. of the form:
Term1: definition1
Term2: definition2
Term3: definition3
Where TermX may or may not be translated, but definitionX is always translated.
Oops, I buggered up the flags, still need info from Manuel.
It would be sufficient for a phonetic alphabet. I'm not sure how it would work in other systems. How does this algorithm sort non-alphabetical writing systems?
I have added Chester for his opinion.
(In reply to comment #2)
> Manuel, would a UTS #10 collation routine conforming to
> http://www.unicode.org/reports/tr10/ be sufficient to sort a flat glossary?
> i.e. a glosslist or glossary without glossdivs?
> e.g. of the form:
> Term1: definition1
> Term2: definition2
> Term3: definition3
> Where TermX may or may not be translated, but definitionX is always translated.
> Cheers, Jeff.
Non-alphabetical characters are in "Symbols". I guess it's in ASCII order.
To me glossary is quite important.
(In reply to comment #4)
> It would be sufficient for a phonetic alphabet. I'm not sure how it would work
> in other systems. How does this algorithm sort non-alphabetical writing
> systems?
> I have added Chester for his opinion.
I had a chat to Asgeir and he believes that TR10 collation should be sufficient to sort mixed language content in the correct order.
1: Package Unicode::Collate
2: Use Unicode::Collate in cleanXml to sort the glosslist on glossterm after the translated XML has been cleaned.
3: Remove the ban on glosslist
4: Consider banning glossdiv as it can have mixed language content at multiple levels. This breaks l10n layout.
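As a rough illustration of step 2, here is a minimal sketch of sorting a flat glosslist on glossterm. It is written in Python rather than publican's Perl, and it is not publican's cleanXml code. The names sort_glosslist, term_text and fallback_key are hypothetical. The default key is only a normalised, case-folded stand-in and is NOT a UTS #10 collation; a real implementation would pass in a key built from a proper collator such as Unicode::Collate.

```python
import unicodedata
import xml.etree.ElementTree as ET


def fallback_key(term):
    """Stand-in sort key (assumption): NOT UTS #10, only NFKD plus case folding."""
    return unicodedata.normalize("NFKD", term).casefold()


def term_text(entry):
    """Return the full text of a glossentry's glossterm, or '' if it has none."""
    term = entry.find("glossterm")
    return "" if term is None else "".join(term.itertext())


def sort_glosslist(xml_text, key=fallback_key):
    """Sort the glossentry children of every glosslist by their glossterm text."""
    root = ET.fromstring(xml_text)
    # Collect the lists first so the tree is not modified while it is being walked.
    for glosslist in list(root.iter("glosslist")):
        entries = [child for child in glosslist if child.tag == "glossentry"]
        others = [child for child in glosslist if child.tag != "glossentry"]
        entries.sort(key=lambda entry: key(term_text(entry)))
        # Keep non-entry children (e.g. a title) first, then the sorted entries.
        glosslist[:] = others + entries
    return ET.tostring(root, encoding="unicode")
```

Because the key function is a parameter, the sort runs once over the already-translated XML and the collation rule can be swapped per language without touching the tree handling.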
There is no time frame for this ATM due to work on RHTS.
Can we remove the ban sooner rather than later?
I ask because, afaik, the only (RH) books that use glossaries are not translated yet. Is it possible to apply these bans on a brand basis? This way I can still update publican to take advantage of fixes and enhancements without removing my glossary.
There are glossaries in the IDM doc, none of which is translated. IPA doc is scheduled for translation in the next release, but that is not for some time. The rest of the IDM doc (Directory Server, Cert. System, etc.) is not scheduled for translation.
I *think* the oVirt doc uses glossaries, but I don't know what translation plans exist.
(In reply to comment #7)
> Can we remove the ban sooner rather than later?
lol no. The mere possibility that there _may_ be a fix at some unknown time in the future is not a sane reason to change anything now.
> I ask because, afaik, the only (RH) books that use glossaries are not
> translated yet. Is it possible to apply these bans on a brand basis?
Brands already control this by setting STRICT. Red Hat brands set STRICT; the common, fedora, etc. brands do not. Non-STRICT brands get a warning instead of an error about these things.
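To make the STRICT distinction concrete, here is a minimal sketch in Python. It is not publican's actual code (publican is written in Perl); check_banned_tags, BannedTagError and the BANNED_TAGS table are hypothetical names, and only the glosslist message is taken from the build error quoted in the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical table of banned tags; the message text is quoted from the build error.
BANNED_TAGS = {
    "glosslist": "This tag set imposes English-language order on glossaries, "
                 "making them useless when translated.",
}


class BannedTagError(Exception):
    """Raised when a STRICT brand meets a banned tag."""


def check_banned_tags(xml_text, strict):
    """Return warning strings for banned tags; raise instead when strict is true."""
    root = ET.fromstring(xml_text)
    found = sorted({el.tag for el in root.iter() if el.tag in BANNED_TAGS})
    messages = [f"{tag}: {BANNED_TAGS[tag]}" for tag in found]
    if messages and strict:
        # STRICT brands: the build stops.
        raise BannedTagError("Banned tag found\n" + "\n".join(messages))
    # Non-STRICT brands: the caller prints these as warnings and carries on.
    return messages
```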
This is not something that is getting fixed. This is something that is getting banned due to opinions. The mere possibility that someone might add a glossary to a book that is going to get translated is not a sane reason to ban the necessary tags in the first place.
I thought STRICT settings were involved in how this was treated but wasn't sure, thanks.
(In reply to comment #9)
> This is not something that is getting fixed.
It breaks our ability to translate content, so from any perspective that doesn't ignore translation it is broken and needs to be fixed.
> This is something that is getting
> banned due to opinions.
I think it's highly insulting that you insinuate we have not done due diligence on this functionality. It takes into account all aspects of the Documentation work flow: our customers' expectations, and the real history and decisions of the past that have positively and negatively affected the Docs team and Red Hat.
> The mere possibility that someone might add a glossary
> to a book that is going to get translated is not a sane reason to ban the
> necessary tags in the first place.
It is not acceptable to break translation work flow regardless of the current translation status of a particular work.
It is a sane policy given the volume of content, the size of the team we work in, and that ignoring translation work flow has bitten us in the ass previously and it cost us significantly to rectify that short-sightedness.
I suppose if you don't have to care about the other people in the team and you choose to ignore that this exact same attitude has occurred before and cost us dearly, then sure, maybe we are just being silly.
*** Bug 485949 has been marked as a duplicate of this bug. ***
I don't particularly care about other people! So, can we please allow glossaries? Is there an ETA for that, even for the glosslist compromise?
Glossaries are very useful, whether it's for new products like IPA or RH Virtual Directory or long-standing and intricate products like RHEL itself. They're a great reference for every level of user. I personally use them all the time. That is my opinion. Your opinion is that they aren't worth the effort because of the amount of time they take. Great. We have two opinions.
Is it not possible simply to not translate the glossaries or to leave them out of translated docs? It seems there can be a procedural resolution rather than flat out prohibiting glossaries.
(In reply to comment #12)
> I don't particularly care about other people!
Welcome to the public mailing list.
> So, can we please allow glossaries?
They break translation. Until there is a solution that doesn't breach the stated ECS policy that breaking translation is _never_ acceptable, they will remain disabled for STRICT brands.
> Is there an ETA for that, even for the glosslist compromise?
> Glossaries are very useful, whether it's for new products like IPA or RH
> Virtual Directory or long-standing and intricate products like RHEL itself.
> They're a great reference for every level of user.
No one is arguing they aren't useful or desirable.
> I personally use them all
> the time. That is my opinion. Your opinion is that they aren't worth the effort
> because of the amount of time they take. Great. We have two opinions.
No, we have a dozen opinions, and one policy that breaking translation is never acceptable.
> Is it not possible simply to not translate the glossaries or to leave them out
> of translated docs? It seems there can be a procedural resolution rather than
> flat out prohibiting glossaries.
I have been informed by management that treating translated content as of secondary importance or excluding content from translations is not acceptable. Your manager is aware of the effects of these policies and you should take up the prioritisation of these issues with them directly.
Sigh. I thought the facetiousness was implied in "I don't care about people, so can I have my glossary now." Next time, I'll use a /sarc tag. (Unless those are banned, too...)
Still awaiting disposition from blocker.
That's cool. The promise of a resolution being in the works is good for now. Thanks for keeping the bug updated.
I removed the blocker because:
1: newer XSL is available in the docs brew root and yum repo
2: publican 1.3 (due next week) will have glossary.sort enabled for all formats
Still requires testing on a translated glossary.
I had time to experiment a bit with this a few weeks ago; here's what I found, with help from translators:
<glossentry>s inside a <glossary> get sorted correctly (at least superficially) for languages that use the Latin and Cyrillic alphabets. Languages with different writing systems present different problems:
<glossentry>s appear in no discernible pattern. They're probably being sorted according to Unicode codepoint.
A glossary in a Japanese technical publication could include up to four different writing systems: Latin, Katakana, Hiragana, and Kanji. Terms presented in Latin script should be separated from those presented in the three Japanese writing systems (already sorted correctly), but terms in Katakana, Hiragana, and Kanji should be interspersed according to their pronunciation. At present, we're getting all the Katakana first, then all the Hiragana, then all the Kanji. Katakana and Hiragana are syllabic scripts that represent the same 50 syllables; sorting them shouldn't be difficult and can probably be achieved easily in an update to the docbook locale. The problem is that a single Kanji character can represent one, two, or more syllables and its pronunciation (and therefore sort order) can change when combined with other Kanji.
all Indic languages
Korean and the various Indic languages that we support use syllabic scripts; if they aren't already working correctly, I think that should be easily fixed in the locale.
I note that these sorting issues affect not only glossaries, but any books that have indexes as well.
Not all languages sort the Latin alphabet the same way, particularly when it comes to handling accented characters or characters outside the "basic Latin" group. I didn't explore what happens at these edges.
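For the Japanese case described above, here is a minimal Python sketch of how Katakana and Hiragana terms could be interspersed, under these assumptions: Katakana is folded onto Hiragana by code point offset, and Kanji terms are sorted through a reading table that someone has to supply by hand. The names katakana_to_hiragana, reading_key and readings are hypothetical, this is not what the docbook locale does, and code point order within Hiragana only approximates dictionary order.

```python
def katakana_to_hiragana(text):
    """Fold Katakana U+30A1..U+30F6 onto the matching Hiragana (0x60 lower)."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def reading_key(term, readings=None):
    """Sort key: the term's kana reading (if supplied), folded to Hiragana.

    `readings` is a hypothetical, hand-maintained map from a term to its
    pronunciation in kana; it is needed for Kanji because the pronunciation
    cannot be derived from the characters alone.
    """
    reading = (readings or {}).get(term, term)
    return katakana_to_hiragana(reading)


def sort_japanese_terms(terms, readings=None):
    """Return the terms ordered by pronunciation rather than by script."""
    return sorted(terms, key=lambda term: reading_key(term, readings))
```

Calling sort_japanese_terms(terms, readings={"漢字": "かんじ"}) places 漢字 among the kana terms by its reading; without an entry in the table, a Kanji term falls back to its own code points and sorts after the kana.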
Hi Rudi, can we get access to this glossary? Also can we get it in an ordered list in the correct sorting order?
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.