Bug 456188
| Summary: | Incorrect characters for Romanian in gucharmap | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Răzvan Sandu <rsandu2004> |
| Component: | gucharmap | Assignee: | Matthias Clasen <mclasen> |
| Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Docs Contact: | |
| Priority: | low | | |
| Version: | rawhide | CC: | alexxed, dcantrell, katzj, marius.stracna, notting |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| URL: | http://www.secarica.ro | | |
| Doc Type: | Bug Fix | Last Closed: | 2008-08-25 19:00:34 UTC |
Description
Răzvan Sandu
2008-07-22 01:59:27 UTC
Not sure what you are complaining about here. Certainly both variants of the character are present. Are you complaining that the Unicode standard got them wrong? There is nothing we can do about that here; you'll have to complain to the Unicode Consortium at www.unicode.org

Thank you for your response!

When one selects, for example, U+0163 (small t with cedilla below), the text in gucharmap (bottom line) identifies it as a character used in Romanian. The four characters with cedilla below are simply *not a part* of the Romanian language/alphabet - only the comma-below ones are.

This confusion is due to a very old bug in Windows implementations (pre-Linux, before 1993...). It was corrected in Windows Vista; patches are also available for pre-Vista versions of Windows. The bug in Windows led to a very large number of Romanian documents, webpages, UIs, etc. containing the wrong characters (cedilla-below instead of comma-below ones); correctly generated documents are displayed incorrectly because of the lack of fonts that are correct for Romanian, etc.

Please help eliminate for good this confusion about the Romanian language and documents, which is not easily "visible" to non-Romanian speakers/developers.

Best regards,
Răzvan

How does ancillary (informative only, no actual usage of it) text change anything about the characters used in documents?

By being misleading (i.e. by helping to maintain a very old and "popular" confusion among Romanian users, especially non-technical ones).

If a non-technical user is not offered (or is not aware of) other direct means of inserting Romanian-specific characters into his documents, such as a national keyboard layout, he looks for an easy-to-use way to do it. He finds gucharmap among the graphical tools in his standard Gnome menus and copies and pastes these characters into the document. If he uses the cedilla-below characters (because of the informative text below) instead of the comma-below ones, he inserts the wrong Unicode characters into the document.
*Especially* because this confusion is very old and affects a large number of users, we seek ways to eliminate it in all aspects/components of Linux (distro doesn't matter): fonts, tools, keyboard maps, etc.

Thanks again for your kind help,
Răzvan

I'm sorry, but you'll have to complain to the Unicode Consortium. The text that gucharmap displays in the statusbar is taken directly from the Unicode character database.

Hello,

If one compares these two files from the Unicode Consortium:

http://www.unicode.org/charts/PDF/U0180.pdf (Latin Extended-B)
http://www.unicode.org/charts/PDF/U0100.pdf (Latin Extended-A)

one will note (in the first document, page 6) that the comma-below characters are given as the *preferred* ones for the Romanian language. They still haven't *entirely* removed the historical error in the second document, since it still lists the cedilla-below characters as *valid* (i.e. acceptable) for the Romanian language. This is simply a bug in the standard, for which I'll file a bug report.

According to the Romanian Academy rules (and to the common, day-to-day practice in Romanian schools, which every Romanian pupil knows), the only acceptable characters are s and t with *comma* below. Of course, this is not so evident to non-Romanian speakers, so the bug in the Unicode standard is easy to understand.

Please also note that this bug is *very* old. It dates from 1988, I think, when Microsoft made the first implementations for the Romanian language in DOS/Windows, without any possibility of officially consulting the Romanian (Communist) Academy or the authorities of the era. But there is no valid reason to perpetuate this bug today (i.e. to continue to produce *new* documents and webpages with the wrong characters).

The problem *was fixed* in Windows Vista; drivers for a correct Romanian keyboard for pre-Vista versions of Windows are available at http://www.secarica.ro. An *official* national standard for the Romanian keyboard, with comma-below characters, does exist: the SR 13392:2004 standard.
So please help us fix this longstanding bug (and the *confusion* related to it among users), even if the official correction in the Unicode documents will still take a while...

Many thanks,
Răzvan

gucharmap pulls its data directly from the Unicode standard. That's not going to change...

Hello,

For the record, here's the official answer I received from the Unicode Consortium when I reported this issue:

-------------------------------------------------------------------
Dear Mr. Sandu,

My apologies for taking so long to respond to your email. Unfortunately there is nothing we can do about this. It isn't a "subtle error" and it is perfectly obvious to us non-Romanian speakers by the way -- in fact this is a *notorious* issue, and was long ago decided by ISO ballot in WG2. Please see page 228 in TUS 5.0 (http://www.unicode.org/versions/Unicode5.0.0/ch07.pdf), which talks about this issue, and is the best we can do at this point. We already have appropriate annotations for all these characters.

Best regards,
---------------------------
Magda Danish
Sr. Administrative Director
The Unicode Consortium
650.693.3921
magda
-------------------------------------------------------------------

Now the presence of these "foreign" characters in Romanian setups seems to be a *well-known issue* that extends far beyond the gucharmap issue I reported in this bug.

Until the Unicode Consortium fixes the actual text of the standard (which may take years...), can we make the Fedora Project developers aware of this and eliminate this bug (foreign characters in a language...) at least in the Fedora distro, everywhere?

Thanks a lot,
Răzvan

This is not a distribution-level decision, period.

If the correct characters are not in a font? Fix the font package.

If the Unicode standard isn't correct? Fix the standard.

(If the packagers aren't willing to fork the standard from upstream, well, I understand that. Ergo, restoring previous component and resolution.)
If existing documents use the wrong characters... well, there's not much we can do to help that at the distribution level.
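
For readers who want to see the two character sets side by side, here is a minimal sketch that is not part of the original thread. It uses Python's standard `unicodedata` module to print the Unicode character database names of the four cedilla-below code points and their comma-below counterparts, and shows one possible way to convert existing text. The helper names `describe` and `to_comma_below` are hypothetical, chosen for illustration. The conversion assumes the text is known to be Romanian (s with cedilla is a legitimate letter in other languages, such as Turkish) and it only handles precomposed characters.

```python
import unicodedata

# Cedilla-below code points (Latin Extended-A) mapped to the
# comma-below code points (Latin Extended-B) discussed in this bug.
CEDILLA_TO_COMMA = {
    0x015E: 0x0218,  # capital S
    0x015F: 0x0219,  # small s
    0x0162: 0x021A,  # capital T
    0x0163: 0x021B,  # small t
}


def describe(code_point: int) -> str:
    """Return the code point and its name from the Unicode character database."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


def to_comma_below(text: str) -> str:
    """Replace precomposed cedilla-below s/t with the comma-below forms.

    Assumption: the caller already knows the text is Romanian.
    """
    return text.translate(CEDILLA_TO_COMMA)


if __name__ == "__main__":
    for cedilla, comma in CEDILLA_TO_COMMA.items():
        print(describe(cedilla), "->", describe(comma))
    # A Romanian word typed with the cedilla-below characters.
    print(to_comma_below("\u015ftiin\u0163\u0103"))
```

Running the script lists each pair by its official name, which is the same naming data a character map reads from the Unicode character database, and prints the sample word with the comma-below characters substituted.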