Description of problem:

Selecting these languages results in an invalid locale:

Belarusian  be
Bosnian     bs
Basque      eu
Armenian    hy
Georgian    ka
Serbian     sr

These alternative languages from the language menu result in a valid locale:

Basque (Spain)    eu_ES
Serbian (Serbia)  sr_RS

There are no alternatives for Belarusian, Bosnian, Armenian, Georgian.

There are 74 languages in the language menu. I did not test every one ...

Tested by repeatedly running clean, minimal installs from the DVD.

Command-line:
$ qemu-kvm -m 2048 -hda f18-test-1.img -cdrom ~/xfr/fedora/F18/F18-Beta/TC4/Fedora-18-Beta-TC4-x86_64-DVD.iso -usb -vga qxl -boot menu=on -usbdevice mouse

Version-Release number of selected component (if applicable):
anaconda 18.16
Fedora-18-Beta-TC4-x86_64-DVD.iso

How reproducible:
Always.

Steps to Reproduce:
1. Do a clean, minimal install with one of the languages listed above.
2. After rebooting, login as root.
3. # locale
   # cat /etc/sysconfig/i18n
   # cat /etc/locale.conf

Actual results:
During login, "cannot change locale" messages are displayed. Attachment 620140 [details] is a screenshot with examples.
The 'locale' command returns "No such file or directory" errors.
The files /etc/sysconfig/i18n and /etc/locale.conf have invalid locales.

Expected results:
No messages are displayed during login.
No errors are returned by the 'locale' command.
The files /etc/sysconfig/i18n and /etc/locale.conf have valid locales.

Additional info:
Valid locales are the names of text files in /usr/share/i18n/locales/.

The GNU documentation says this:
2.3.1 Locale Names
"A locale name usually has the form ‘ll_CC’. Here ‘ll’ is an ISO 639 two-letter language code, and ‘CC’ is an ISO 3166 two-letter country code."
http://www.gnu.org/software/gettext/manual/html_node/Locale-Names.html#Locale-Names

See also:
Bug 858591 - anaconda setting invalid system locale xx.UTF-8 not xx_YY.UTF-8
Proposing as a Beta blocker to make sure we consider this. However I'd actually vote for NTH status rather than blocker, on the basis that the set of affected languages is pretty small in terms of absolute size and number of affected users.
Specifically, the affected languages are Belarusian (be), Bosnian (bs), Armenian (hy), Georgian (ka). There appear to be alternatives for Basque (eu) and Serbian (sr) in the menu, namely 'Basque (Spain)' (eu_ES) and 'Serbian (Serbia)' (sr_RS), respectively.

For reference, the last column of this table shows locales from /usr/share/i18n/locales/:

Belarusian  be  be_BY, be_BY@latin
Bosnian     bs  bs_BA
Basque      eu  eu_ES, eu_ES@euro
Armenian    hy  hy_AM
Georgian    ka  ka_GE
Serbian     sr  sr_ME, sr_RS, sr_RS@latin

Hand-edited from this:

$ ls -1 `cat invalid-locales-18.16-1.txt | sed -e 's@^@/usr/share/i18n/locales/@' -e 's/$/_*/'`
/usr/share/i18n/locales/be_BY
/usr/share/i18n/locales/be_BY@latin
/usr/share/i18n/locales/bs_BA
/usr/share/i18n/locales/eu_ES
/usr/share/i18n/locales/eu_ES@euro
/usr/share/i18n/locales/hy_AM
/usr/share/i18n/locales/ka_GE
/usr/share/i18n/locales/sr_ME
/usr/share/i18n/locales/sr_RS
/usr/share/i18n/locales/sr_RS@latin
Chris: Would it be possible to dump a table of the languages and the corresponding locales to a file? There are 74 languages by my count, and it would be a lot easier to inspect a text file than to run through 74 test installs just to see what gets configured. Also, I experimented with raising an exception when an invalid locale was configured, but it became apparent that recovery (e.g. cleaning up filesystems) was going to require experts ...
Assuming I didn't make any mistakes ...

Anaconda Master translation coverage at Transifex:

be         5%
bs        12%
eu_ES      0%
eu        13%
hy         3%
ka        28%
sr@latin  28%
sr        28%

anaconda translations in F17:

$ find `cat invalid-locales-18.16-1.txt | sed -e 's@^@/usr/share/locale/@' -e 's/$/*/'` -name 'anaconda.mo' | xargs ls -1 --size
 16 /usr/share/locale/be/LC_MESSAGES/anaconda.mo
 40 /usr/share/locale/bs/LC_MESSAGES/anaconda.mo
  4 /usr/share/locale/eu_ES/LC_MESSAGES/anaconda.mo
 20 /usr/share/locale/eu/LC_MESSAGES/anaconda.mo
  4 /usr/share/locale/hy/LC_MESSAGES/anaconda.mo
 12 /usr/share/locale/ka/LC_MESSAGES/anaconda.mo
100 /usr/share/locale/sr@latin/LC_MESSAGES/anaconda.mo
128 /usr/share/locale/sr/LC_MESSAGES/anaconda.mo

anaconda translation links at Transifex:

https://fedora.transifex.com/projects/p/fedora/language/be/?project=2059
https://fedora.transifex.com/projects/p/fedora/language/bs/?project=2059
https://fedora.transifex.com/projects/p/fedora/language/eu_ES/?project=2059
https://fedora.transifex.com/projects/p/fedora/language/eu/?project=2059
https://fedora.transifex.com/projects/p/fedora/language/hy/?project=2059
https://fedora.transifex.com/projects/p/fedora/language/ka/?project=2059
https://fedora.transifex.com/projects/p/fedora/language/sr@latin/?project=2059
https://fedora.transifex.com/projects/p/fedora/language/sr/?project=2059
(In reply to comment #3)
> Chris: Would it be possible to dump a table of the languages and the
> corresponding locales to a file? There are 74 languages by my count, and it
> would be a lot easier to inspect a text file than to run through 74 test
> installs just to see what gets configured.

The languages Steve mentions which result in an invalid locale are all missing in “mangleMap”:

mfabian@ari:~/rpmsources/fedora/anaconda/anaconda-18.16 (f18)
$ grep 'mangleMap =' ./pyanaconda/localization.py -A17
    mangleMap = {"af": "af_ZA", "am": "am_ET", "ar": "ar_SA", "as": "as_IN",
                 "ast": "ast_ES", "bg": "bg_BG", "bn": "bn_BD", "ca": "ca_ES",
                 "cs": "cs_CZ", "cy": "cy_GB", "da": "da_DK", "de": "de_DE",
                 "el": "el_GR", "en": "en_US", "es": "es_ES", "et": "et_EE",
                 "fa": "fa_IR", "fi": "fi_FI", "fr": "fr_FR", "gl": "gl_ES",
                 "gu": "gu_IN", "he": "he_IL", "hi": "hi_IN", "hr": "hr_HR",
                 "hu": "hu_HU", "id": "id_ID", "ilo": "ilo_PH", "is": "is_IS",
                 "it": "it_IT", "ja": "ja_JP", "kk": "kk_KZ", "kn": "kn_IN",
                 "ko": "ko_KR", "lt": "lt_LT", "lv": "lv_LV", "mai": "mai_IN",
                 "mk": "mk_MK", "ml": "ml_IN", "mr": "mr_IN", "ms": "ms_MY",
                 "nb": "nb_NO", "nds": "nds_DE", "ne": "ne_NP", "nl": "nl_NL",
                 "nn": "nn_NO", "nso": "nso_ZA", "or": "or_IN", "pa": "pa_IN",
                 "pl": "pl_PL", "pt": "pt_PT", "ro": "ro_RO", "ru": "ru_RU",
                 "si": "si_LK", "sk": "sk_SK", "sl": "sl_SI", "sq": "sq_AL",
                 "sr": "sr_RS", "sv": "sv_SE", "ta": "ta_IN", "te": "te_IN",
                 "tg": "tg_TJ", "th": "th_TH", "tr": "tr_TR", "uk": "uk_UA",
                 "ur": "ur_PK", "vi": "vi_VN", "zu": "zu_ZA"}
mfabian@ari:~/rpmsources/fedora/anaconda/anaconda-18.16 (f18)
$

> Also, I experimented with raising an exception when an invalid locale was
> configured, but it became apparent that recovery (e.g. cleaning up
> filesystems) was going to require experts ...

Wouldn't it be better to set a valid fallback locale like en_US.UTF-8 if a language cannot be found in mangleMap instead of setting an invalid locale?
Discussed at 2012-10-17 blocker review meeting: http://meetbot.fedoraproject.org/fedora-qa/2012-10-17/f18beta-blocker-review-4.2012-10-17-16.00.log.txt . Agreed that this is a conditional breakage of the criteria, but the likely impact of this bug (the total number of users of the affected languages) is too small for it to qualify as a blocker - especially since the affected translations are apparently highly incomplete anyway (thanks Steve) so in practice you couldn't use them without knowing English. It is rejected as a blocker, accepted as NTH.
(In reply to comment #5)
> Wouldn't it be better to set a valid fallback locale like en_US.UTF-8 if
> a language cannot be found in mangleMap instead of setting an invalid locale?

Yes +1 for this.
Created attachment 629895 [details]
anaconda-18.18-more-locales-fix.patch

Thanks, Steve, for the careful analysis.

This patch adds the 5 missing locales from comment 2 to the locale mangleMap and makes it fall back to en_US instead of an invalid locale if the language code is not listed in the map.

(It might not do the right thing for sr@latin yet but I think this is good enough for now anyway.)
Thanks for your patch.

- return mangleMap.get(inLocale, inLocale)
+ return mangleMap.get(inLocale, "en_US")

1. We might want to know if a locale cannot be found, so writing something to a log file might be a good idea.

2. How would users be notified that the "en_US" fallback locale has been configured instead of a locale that corresponds to the language they selected from the menu?
The patch wouldn't apply against 18.18-1 until I reversed the patch from Attachment 621937 [details]:

$ patch -b -R < anaconda-fix-kk-tg-locales.patch
$ patch -b < anaconda-18.18-more-locales-fix-1.patch

http://git.fedorahosted.org/git/anaconda.git
- return mangleMap.get(inLocale, inLocale)
+ return mangleMap.get(inLocale, "en_US")

I don't think this will work. The list 'languages' can have 'langcode's not in mangleMap, because mangleMap is not a complete list. All of those will get mapped to "en_US".

$ less -N anaconda-18.18-1/pyanaconda/localization.py
165     for langcode in languages:
166         try:
167             localedata = babel.Locale.parse(mangleLocale(langcode))
168         except babel.core.UnknownLocaleError:
169             continue
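One way to reconcile the fallback with the concern above is to fall back only for bare language codes and to pass through names that already carry a territory, logging every fallback. The following is a minimal sketch, not the code from any attached patch: MANGLE_MAP is a three-entry stand-in for the real mangleMap, and the function name, logger name and pass-through rule are assumptions made for illustration.

```python
import logging

log = logging.getLogger("localization-sketch")  # hypothetical logger name

# Tiny illustrative subset; the real mangleMap in localization.py is longer.
MANGLE_MAP = {"be": "be_BY", "de": "de_DE", "sr": "sr_RS"}
FALLBACK_LOCALE = "en_US"


def mangle_locale(in_locale, mangle_map=MANGLE_MAP, fallback=FALLBACK_LOCALE):
    """Map a language code to a locale name with a logged, valid fallback."""
    if in_locale in mangle_map:
        return mangle_map[in_locale]
    if "_" in in_locale:
        # Assumption: a name such as 'eu_ES' already has a territory,
        # so it is returned unchanged instead of being forced to en_US.
        return in_locale
    log.warning("no locale mapping for %r, falling back to %s",
                in_locale, fallback)
    return fallback
```

With this rule 'be' maps to 'be_BY', 'eu_ES' is left alone, and an unmapped bare code such as 'xx' yields 'en_US' plus a log entry. It does not answer the question of how the user is told about the fallback.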
Created attachment 630267 [details]
mangle-test-2.py mangleMap locale validation test

The attached mangle-test-2.py validates the locales in mangleMap against the locale file names in /usr/share/i18n/locales/. The executive summary is that they are all valid except for ilo_PH and a fake test locale.

Usage: ./mangle-test-2.py
Output is a report to stdout.

The copy of mangleMap used has the patch from Jens applied.

Python programmers working with locales might be interested in the locale.normalize() function. It would work in place of mangleMap, if the locale_alias table it uses were complete.

locale.normalize(localename)
    Returns a normalized locale code for the given locale name. The returned locale code is formatted for use with setlocale(). If normalization fails, the original name is returned unchanged.
    If the given encoding is not known, the function defaults to the default encoding for the locale code just like setlocale().
http://docs.python.org/library/locale.html

The normalize() function and the locale_alias table are here:

$ rpm -qf /usr/lib64/python2.7/locale.py
python-libs-2.7.3-7.2.fc17.x86_64
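For readers without the attachment, the kind of check described above can be sketched as follows. This is not mangle-test-2.py itself; the function names are hypothetical, and the only detail taken from this report is that valid locales are the file names in /usr/share/i18n/locales/.

```python
import os

LOCALE_DIR = "/usr/share/i18n/locales"  # directory named in this report


def available_locale_names(locale_dir=LOCALE_DIR):
    """Return the set of locale file names, or an empty set if unreadable."""
    try:
        return set(os.listdir(locale_dir))
    except OSError:
        return set()


def invalid_mappings(mangle_map, valid_names):
    """Return sorted (language, locale) pairs whose locale has no file."""
    return sorted((lang, loc) for lang, loc in mangle_map.items()
                  if loc not in valid_names)
```

A typical call would be invalid_mappings(mangleMap, available_locale_names()). Passing the set of valid names explicitly keeps the check usable on systems where the glibc locale sources are not installed.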
Created attachment 630273 [details] mangle-test-report-1.txt report from the previously attached mangle-test-2.py This report shows the results of running mangle-test-2.py on an updated F17 system with: $ rpm -q python-libs glibc-common python-libs-2.7.3-7.2.fc17.x86_64 glibc-common-2.15-57.fc17.x86_64 The rightmost column allows you to compare mangleMap with locale.normalize().
I ran seven test installs with Jens' pyanaconda/localization.py patch (without the "en_US" change) from the F18-Beta-TC6 Live CD.

The results are an improvement. Now, Belarusian, Bosnian, Armenian, Georgian are listed with countries in the language menu, and a valid locale is configured for them. Basque is no longer listed in the menu, although 'Basque (Spain)' is still listed.

The one remaining problem is that Serbian still appears in the menu, and 'sr' is configured for the locale.

Menu Item                         Locale
Belarusian (Belarus)              be_BY
Bosnian (Bosnia and Herzegovina)  bs_BA
Basque                            [not in menu]
Basque (Spain)                    eu_ES
Armenian (Armenia)                hy_AM
Georgian (Georgia)                ka_GE
Serbian                           sr
Serbian (Serbia)                  sr_RS

Command-line:
$ qemu-kvm -m 2048 -hda f18-test-2.img -cdrom ~/xfr/fedora/F18/F18-Beta/TC6/Fedora-18-Beta-TC6-x86_64-Live-Desktop.iso -usb -vga qxl -boot menu=on -usbdevice mouse
Thank you, Steve, nice work.

If you have time, perhaps it is worth seeing how much of the mangleMap could be replaced by something like:

locale.normalize(inLocale + '.utf8')

It sounds like it could simplify things a bit. It seems to do the right thing for me:

>>> locale.normalize('sr@latin')
'sr_RS.UTF-8@latin'
>>> locale.normalize('de.UTF-8')
'de_DE.UTF-8'

Otherwise not really sure what to do about sr@latin at this stage: it only seems to be 27% translated though for anaconda...

(In reply to comment #9)
> - return mangleMap.get(inLocale, inLocale)
> + return mangleMap.get(inLocale, "en_US")
>
> 1. We might want to know if a locale cannot be found, so writing something
> to a log file might be a good idea.

Good point - I think the lang/locale setting should always be logged anyway.

> 2. How would users be notified that the "en_US" fallback locale has been
> configured instead of a locale that corresponds to the language they
> selected from the menu?

Right - there could be a popup dialog saying that anaconda doesn't know the locale for the lang, but actually I would really prefer anaconda just didn't list langs without a valid associated locale. Otherwise I think it is better to fall back to a valid locale (en_US.utf8) than to set an incorrect system one.
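A rough sketch of the locale.normalize() idea, under stated assumptions: the helper name is hypothetical, the "has a territory" regex is a simplification of what counts as a valid locale, and the result depends on the locale_alias table shipped with the running Python, so no particular output is guaranteed for any given language.

```python
import locale
import re

# Simplified assumption: a usable result starts with ll_CC or lll_CC.
_HAS_TERRITORY = re.compile(r"^[a-z]{2,3}_[A-Z]{2}")


def normalize_with_territory(lang_code, encoding="utf8"):
    """Return locale.normalize()'s result, or None if it lacks a territory."""
    # Keep a modifier such as '@latin' after the encoding: sr.utf8@latin
    base, sep, modifier = lang_code.partition("@")
    name = "%s.%s%s%s" % (base, encoding, sep, modifier)
    normalized = locale.normalize(name)
    if _HAS_TERRITORY.match(normalized):
        return normalized
    # normalize() returns its input unchanged when it has no alias.
    return None
```

Returning None for unknown codes lets the caller decide between hiding the language from the menu and falling back to a valid locale, the two options discussed above.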
Part of the problem is that 'sr@latin' is getting changed to 'sr':

$ python
Python 2.7.3 (default, Jul 24 2012, 10:05:38)
[GCC 4.7.0 20120507 (Red Hat 4.7.0-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import babel
>>> babel.parse_locale('sr')
('sr', None, None, None)
>>> babel.parse_locale('sr@latin')
('sr', None, None, None)
>>> babel.parse_locale('sr_RS')
('sr', 'RS', None, None)
>>>

The result is that 'sr' is returned *twice* by get_available_translations(), because babel.Locale.parse() calls babel.parse_locale(). See Line 147 below and /usr/lib/python2.7/site-packages/babel/core.py.

$ less -N anaconda-18.12-1/pyanaconda/localization.py
...
133 def get_available_translations(domain=None, localedir=None):
134     domain = domain or gettext._current_domain
135     localedir = localedir or gettext._default_localedir
136
137     langdict = babel.Locale('en', 'US').languages
138     messagefiles = gettext.find(domain, localedir, langdict.keys(), all=True)
139     languages = [path.split(os.path.sep)[-3] for path in messagefiles]
140
141     # usually there are no message files for en
142     if 'en' not in languages:
143         languages.append('en')
144
145     for langcode in languages:
146         try:
147             localedata = babel.Locale.parse(langcode)
148         except babel.core.UnknownLocaleError:
149             continue
150
151         yield LocaleInfo(localedata)
...
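To see what is lost when the modifier is dropped, a POSIX-style name can be split without babel. This is only an illustrative sketch: the regex and the names LocaleParts and split_locale_name are hypothetical, and it deliberately handles only the ll[_CC][.codeset][@modifier] form, not babel-style names such as 'sr_Latn'.

```python
import re
from collections import namedtuple

LocaleParts = namedtuple("LocaleParts", "language territory codeset modifier")

_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def split_locale_name(name):
    """Split 'll[_CC][.codeset][@modifier]' into its four parts."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError("not a POSIX-style locale name: %r" % name)
    return LocaleParts(**match.groupdict())
```

With this helper 'sr' and 'sr@latin' stay distinguishable, because the second yields modifier 'latin' while the first yields None.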
Thanks, Jens.

The locale 'sr_RS.UTF-8@latin' seems to be valid:

$ LANG='sr_RS.UTF-8@latin' locale
LANG=sr_RS.UTF-8@latin
...

There is a locale file /usr/share/i18n/locales/sr_RS@latin.

$ grep title /usr/share/i18n/locales/sr_RS@latin
title "Serbian Latin locale for Serbia"

What would the language menu show? 'Serbian Latin (Serbia)'?
(In reply to comment #17) ... > What would the language menu show? 'Serbian Latin (Serbia)'? At Transifex, it is called 'Serbian (Latin)': https://fedora.transifex.com/projects/p/fedora/language/sr@latin/
Created attachment 632405 [details]
localization-test-report-1.txt showing locales returned by locale.normalize()

This report shows locales for which there are anaconda translations for F17. The 'short_name' column has the strings returned by get_available_translations(). The 'locale.normalize()' column has the strings returned by locale.normalize(short_name).

There are 73 translations. Of these, only 5 have a valid short_name, while 68 have a valid normalized name. (These numbers are at the end of the report.)

There are three anomalies:
1. 'sr' is returned twice by get_available_translations().
2. The babel module raises this exception: "unknown locale 'ar_AA'".
3. The actual number of anaconda.mo files is 81:
   $ find /usr/share/locale -name 'anaconda.mo' | wc -l
   81

Testing was done with pyanaconda/localization.py from anaconda-18.12-1. That version does not have the mangleMap.

anaconda-17.29-1.fc17.x86_64
glibc-common-2.15-57.fc17.x86_64
python-babel-0.9.6-3.fc17.noarch
python-libs-2.7.3-7.2.fc17.x86_64
Created attachment 632419 [details] localization-test-2.py locale validation test
(In reply to comment #19)
> 3. The actual number of anaconda.mo files is 81:
> $ find /usr/share/locale -name 'anaconda.mo' | wc -l
> 81

We appear to be losing some translations.

These are returned by locale.normalize():
'bn_IN.UTF-8'
'mai_IN.UTF-8'
'sr_RS.UTF-8@latin'
'zh_TW.big5'

'<' anaconda.mo translation files in /usr/share/locale
'>' Languages returned by get_available_translations()

$ diff lang-trans-1.txt lang-anac-1.txt
5,6d4
< ast
< bal
10d7
< bn_IN
19c16
< en@boldquot
---
> en
21d17
< en@quot
37d32
< ilo
46d40
< mai
52d45
< nds
69c62
< sr@latin
---
> sr
80d72
< zh_TW
==

The input files were generated with:

$ find /usr/share/locale -name 'anaconda.mo' | cut -d '/' -f 5 | sort > lang-trans-1.txt

$ egrep '^ *[0-9]*:' localization-test-report-1.txt | grep -v 'babel:' | sed -e 's/ */:/g' -e 's/\.UTF-8//' | cut -d ':' -f 4 > lang-anac-1.txt

(localization-test-report-1.txt is machine readable, but just barely ... :-))
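The same comparison can be expressed as two set differences, which avoids the intermediate text files. A minimal sketch with a hypothetical function name; the sample values in the usage note are taken from the diff above, not from a fresh run.

```python
def compare_language_lists(translation_dirs, menu_languages):
    """Return (translations missing from the menu, menu entries without a translation)."""
    have = set(translation_dirs)
    shown = set(menu_languages)
    return sorted(have - shown), sorted(shown - have)
```

For example, compare_language_lists(['ast', 'sr', 'sr@latin', 'zh_TW'], ['en', 'sr']) reports 'ast', 'sr@latin' and 'zh_TW' as missing from the menu and 'en' as a menu entry without a translation directory.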
As a wise old man once said, every time you use a non-unified diff, a puppy dies.
locale.normalize() doesn't normalize some languages that have a corresponding locale in /usr/share/i18n/locales/.

$ ls -1 `echo 'ast:bal:bn:ilo:mai:nds:sr:zh' | sed -e 's@:@\n@g' | sed -e 's@^@/usr/share/i18n/locales/@' -e 's@$@*@'`
ls: cannot access /usr/share/i18n/locales/bal*: No such file or directory
ls: cannot access /usr/share/i18n/locales/ilo*: No such file or directory
/usr/share/i18n/locales/ast_ES
/usr/share/i18n/locales/bn_BD
/usr/share/i18n/locales/bn_IN
/usr/share/i18n/locales/mai_IN
/usr/share/i18n/locales/nds_DE
/usr/share/i18n/locales/nds_NL
/usr/share/i18n/locales/sr_ME
/usr/share/i18n/locales/sr_RS
/usr/share/i18n/locales/sr_RS@latin
/usr/share/i18n/locales/zh_CN
/usr/share/i18n/locales/zh_HK
/usr/share/i18n/locales/zh_SG
/usr/share/i18n/locales/zh_TW
(In reply to comment #22) > As a wise old man once said, every time you use a non-unified diff, a puppy > dies. The "diff -u" was longer and unreadable ... so I liberated myself from habit. :-)
(In reply to comment #23)
> locale.normalize() doesn't normalize some languages that have a
> corresponding locale in /usr/share/i18n/locales/.
>
> $ ls -1 `echo 'ast:bal:bn:ilo:mai:nds:sr:zh' | sed -e 's@:@\n@g' | sed -e
> 's@^@/usr/share/i18n/locales/@' -e 's@$@*@'`

mai, sr, and zh work for me, at least on F17, e.g.:

$ python
>>> import locale
>>> locale.normalize('sr.utf8')
'sr_RS.UTF-8'
>>> locale.normalize('zh.utf8')
'zh_CN.UTF-8'

Without getting too philosophical here, there is also "Perfect is the enemy of good". :)

Anyway I think we have to be a bit pragmatic here - I am not sure if there is time to integrate and test a perfect solution for F18 and quite honestly the number of people doing installs in some of these languages is going to be rather small. But of course I am all for improving things to get a better F18 installer. :)

Do you want to post an improved patch so that it can get reviewed and hopefully integrated into anaconda for testing?
IMO, pyanaconda/localization.py should not be touched at all at this point. However, we can analyze and document the problems.

Some of the translations that are missing cause babel to raise an UnknownLocaleError exception:

translation  UnknownLocaleError
ast          1
bal          1
bn_IN        0
ilo          1
mai          1
nds          1
sr@latin     0
zh_TW        0

Test with:

>>> import babel
>>> l = babel.Locale.parse('ast')

# if there is no exception:
>>> l = babel.Locale.parse('bn_IN')
>>> l.__dict__
{'_Locale__data': None, 'territory': 'IN', 'variant': None, 'language': 'bn', 'script': None}
Created attachment 635283 [details]
localization-test-3.py locale validation test for pyanaconda/localization.py

Updated locale validation test for pyanaconda/localization.py:

Usage: ./localization-test-3.py [localization_module[.py]]

This version adds two features:

1. The name of the file under test can be given as an argument:
   $ ./localization-test-3.py localization_18_21_1.py
   If no argument is given, the file name defaults to "localization.py".

2. Records are tagged for machine-processing:
   D: Documentation
   L: Locale
   S: Summary
   E: Error
Created attachment 635315 [details]
localization-test-report-2.txt showing results of latest commit with mangleMap patch

Here is a locale validation test report for the latest commit[1] with mangleMap patched to include the additional mappings from Jens' patch.

There are 75 locales[2] with only one invalid locale: 'sr'.

There are three missing languages for which there is an anaconda translation and a locale file:

translation in      locale file in
/usr/share/locale   /usr/share/i18n/locales/

ast                 ast_ES
bal                 none
ilo                 none
mai                 mai_IN
nds                 nds_DE, nds_NL

Report generated with:
$ ./localization-test-3.py localization_a5ca147a7738ffc7257bd857ea99eac9ea7642a6_2.py > localization-test-report-2.txt

[1] Use a slightly different method to get supported languages (#858801, tagoh).
http://git.fedorahosted.org/cgit/anaconda.git/commit/pyanaconda/localization.py?id=a5ca147a7738ffc7257bd857ea99eac9ea7642a6

[2] Tested with F17.
Created attachment 635331 [details] localization-manglemap-1.patch Apply this patch, which is based on the one from Jens, to commit a5ca147a7738ffc7257bd857ea99eac9ea7642a6 to obtain localization.py generating 74 valid locales out of 75 total. Use a slightly different method to get supported languages (#858801, tagoh). http://git.fedorahosted.org/cgit/anaconda.git/commit/pyanaconda/localization.py?id=a5ca147a7738ffc7257bd857ea99eac9ea7642a6
CLDR 22.1, which was released 2012-10-26, has support for 'ast', but not the other four languages for which there are anaconda translations.[1]

There is an open babel ticket for an upgrade to CLDR 21 that is dated 08/19/12.[2]

So, I would say that with the addition of the modified patch from Jens and a possible resolution of Bug 871464, this bug will be fixed as well as it can be.

translation in      locale file in              CLDR 22.1
/usr/share/locale   /usr/share/i18n/locales/

ast                 ast_ES                      yes
bal                 none                        no
ilo                 none                        no
mai                 mai_IN                      no
nds                 nds_DE, nds_NL              no

[1] CLDR Releases/Downloads
http://cldr.unicode.org/index/downloads

[2] Ticket #312 (new enhancement) Upgrade to CLDR 21
http://babel.edgewall.org/ticket/312
Created attachment 636385 [details]
localization-manglemap-sr-latin-1.patch

This proposed patch for pyanaconda/localization.py:
1. Adds 'Serbian (Latin)' to the languages menu.
2. Writes 'sr_RS.UTF-8@latin' as the locale.
3. Includes the mangleMap changes from Jens.

While working on this, I found that LocaleInfo.__repr__() may return invalid locales. The function appends the encoding, but that format is rejected by the locale command:

$ LANG=sr_RS locale # error messages
$ LANG=sr_RS.UTF-8@latin locale # no error messages

The latter is also returned by locale.normalize(): Comment 15 (Thanks, Jens).

Patch is against:
http://git.fedorahosted.org/cgit/anaconda.git/commit/pyanaconda/localization.py?id=a5ca147a7738ffc7257bd857ea99eac9ea7642a6

Tested by installing with:
1. anaconda-0:18.21-1.fc18.x86_64
2. patched pyanaconda/localization.py
3. Fedora-18-Beta-TC6-x86_64-Live-Desktop.iso

Command-line:
$ qemu-kvm -m 4096 -hda f18-test-2.img -cdrom ~/xfr/fedora/F18/F18-Beta/TC6/Fedora-18-Beta-TC6-x86_64-Live-Desktop.iso -usb -vga qxl -boot menu=on -usbdevice mouse
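glibc locale names follow the pattern language[_territory][.codeset][@modifier], so the encoding has to be placed before any modifier rather than appended at the end. The ordering can be sketched as below; this is an illustration with a hypothetical function name, not the code of LocaleInfo.__repr__() or of the attached patch.

```python
def format_locale_name(language, territory=None, codeset=None, modifier=None):
    """Build 'language[_territory][.codeset][@modifier]' in glibc order."""
    name = language
    if territory:
        name += "_" + territory
    if codeset:
        name += "." + codeset   # the codeset goes before the modifier
    if modifier:
        name += "@" + modifier  # the modifier always comes last
    return name
```

Called as format_locale_name('sr', 'RS', 'UTF-8', 'latin'), this produces 'sr_RS.UTF-8@latin', the form reported above as accepted by the locale command.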
Created attachment 637082 [details]
Patch for LocaleInfo.__repr__

What about using this patch? If I understand it correctly, we would only need to add the mangleMap changes and put "sr_RS@latin" instead of "sr_Latn" in the mangleMap on top of it. No mangleReprMap is needed in that case.

In [8]: loc = babel.Locale.parse("sr_RS.UTF-8@latin")

In [9]: loc.language
Out[9]: 'sr'

In [10]: loc.territory
Out[10]: 'RS'

In [11]: loc.english_name
Out[11]: u'Serbian (Serbia)'

In [12]: print loc.display_name
Српски (Србија)

In [13]: print loc.script
None
^^^^^^^ bug in babel?
Babel explicitly ignores modifiers of all types:
Bug 871464 - [sr@latin] script not parsed as 'Latn'

IMO, Babel should support both parsing and formatting of locale strings, but it cannot, because it throws away modifiers, so 'sr@latin' and 'sr' produce the same Locale object. Only with 'sr_Latn' does the 'script' attribute have a value:

>>> import babel
>>> babel.Locale.parse('sr@latin').__dict__
{'_Locale__data': None, 'territory': None, 'variant': None, 'language': 'sr', 'script': None}
>>> babel.Locale.parse('sr').__dict__
{'_Locale__data': None, 'territory': None, 'variant': None, 'language': 'sr', 'script': None}
>>> babel.Locale.parse('sr_Latn').__dict__
{'_Locale__data': None, 'territory': None, 'variant': None, 'language': 'sr', 'script': 'Latn'}
(In reply to comment #33)
> Babel explicitly ignores modifiers of all types:
> Bug 871464 - [sr@latin] script not parsed as 'Latn'
...

'sr_RS.UTF-8@latin' and 'sr_RS.UTF-8' produce the same Locale object. Babel also throws away the encoding.

>>> babel.Locale.parse('sr_RS.UTF-8@latin').__dict__
{'_Locale__data': None, 'territory': 'RS', 'variant': None, 'language': 'sr', 'script': None}
>>> babel.Locale.parse('sr_RS.UTF-8').__dict__
{'_Locale__data': None, 'territory': 'RS', 'variant': None, 'language': 'sr', 'script': None}
I applied your patch to localization_18_23_1.py:

$ ./localization-test-3.py localization_18_23_1_EXP_1.py
D: summary: pyanaconda/localization.py locale validation test
D: module: localization_18_23_1_EXP_1
D: timestamp: 2012-11-02 17:29:38 UTC
D: n: count
D: short_name: from LocaleInfo object
D: locale.normalize(): value of normalize(short_name) from locale module
D: v1: short_name is valid if 1, invalid if 0
D: v2: normalize(short_name) is valid if 1, invalid if 0
D: english_name: from LocaleInfo object
D: note: A valid locale is defined as having a language and a territory.
L: n: short_name locale.normalize() v1 v2 english_name
Traceback (most recent call last):
  File "./localization-test-3.py", line 65, in <module>
    for loc_1 in ll.get_available_translations(domain='anaconda', localedir='/usr/share/locale'):
  File "/home/stephent/src/exp/anaconda-languages/localization_18_23_1_EXP_1.py", line 190, in get_available_translations
    localedata.script(script)
TypeError: 'NoneType' object is not callable
==

$ less -N localization_18_23_1_EXP_1.py
...
187         # BUG: babel.Locale.parse does not parse @script
188         script = _get_locale_script(langcode)
189         if script:
190             localedata.script(script)
191
192         yield localedata
...
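The traceback is what Python produces when a plain attribute holding None is called like a method; the __dict__ dumps earlier in this report show 'script': None as an ordinary attribute. The sketch below reproduces the failure with a hypothetical stand-in class (FakeLocale is not babel's Locale) and shows attribute assignment as one possible alternative. Whether assignment is the right fix for the patch is an assumption.

```python
class FakeLocale(object):
    """Hypothetical stand-in for a parsed locale with a plain 'script' attribute."""

    def __init__(self, language, territory=None, script=None):
        self.language = language
        self.territory = territory
        self.script = script


def call_script_raises(localedata, script):
    """Return True if calling the attribute fails as in the traceback."""
    try:
        localedata.script(script)  # script is None, so this is None(...)
    except TypeError:
        return True
    return False


def apply_script(localedata, script):
    """Assign the script instead of calling the attribute."""
    if script:
        localedata.script = script
    return localedata
```

For a FakeLocale('sr', 'RS'), call_script_raises(loc, 'Latn') reports the TypeError, while apply_script(loc, 'Latn') leaves loc.script set to 'Latn'.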
Yeah, I've also encountered that traceback during testing.
Created attachment 637222 [details] PatchV2 for the LocaleInfo.__repr__ Could you please try it with this one? There were more problems in the previous version of the patch.
Created attachment 637237 [details]
localization-test-report-18_23_1_EXP_2.txt

The locale is 'sr.UTF-8@latin', not 'sr_RS.UTF-8@latin'.
The english_name is 'Serbian', not 'Serbian (Latin)'.

==
D: summary: pyanaconda/localization.py locale validation test
D: module: localization_18_23_1_EXP_2
D: timestamp: 2012-11-02 20:01:18 UTC
...
L: 61: sq_AL.UTF-8 sq_AL.UTF-8 1 1 'Albanian (Albania)'
L: 62: sr.UTF-8@latin sr_RS.utf_8_latin 0 1 'Serbian'
L: 63: sr_RS.UTF-8 sr_RS.UTF-8 1 1 'Serbian (Serbia)'
L: 64: sv_SE.UTF-8 sv_SE.UTF-8 1 1 'Swedish (Sweden)'
...
==

$ LANG='sr.UTF-8@latin' locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=sr.UTF-8@latin
LC_CTYPE="sr.UTF-8@latin"
LC_NUMERIC="sr.UTF-8@latin"
LC_TIME="sr.UTF-8@latin"
LC_COLLATE="sr.UTF-8@latin"
LC_MONETARY="sr.UTF-8@latin"
LC_MESSAGES="sr.UTF-8@latin"
LC_PAPER="sr.UTF-8@latin"
LC_NAME="sr.UTF-8@latin"
LC_ADDRESS="sr.UTF-8@latin"
LC_TELEPHONE="sr.UTF-8@latin"
LC_MEASUREMENT="sr.UTF-8@latin"
LC_IDENTIFICATION="sr.UTF-8@latin"
LC_ALL=
(In reply to comment #38)
> Created attachment 637237 [details]
> localization-test-report-18_23_1_EXP_2.txt
>
> The locale is 'sr.UTF-8@latin', not 'sr_RS.UTF-8@latin'

The mangleMap update is needed. And the same goes for:

> $ LANG='sr.UTF-8@latin' locale

> The english_name is 'Serbian', not 'Serbian (Latin)'.

This really doesn't have a reasonable solution without an additional mapping. But it really should be resolved in babel and not in Anaconda. Also, "Serbian (Latin)" doesn't fit the "language (territory)" pattern used by the rest of the list.

I'm posting your patch to anaconda-patches. If it gets approved for F18, I will push it. Otherwise I'd like to push my patch (with the mangleMap update) for F18 and hope that it will fulfill the requirements of the NTH flag. Having "Serbian (Serbia)" twice is not such a big deal.

One thing that I don't understand -- I thought that one could see which one is Latin by trying to select one or the other, but both appear to have the same translations (both using Cyrillic, not Latin).
anaconda-18.24-1.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/anaconda-18.24-1.fc18
Bug 872786 - [sr@latin] 'Serbian (Latin)' not listed in languages menu; 'Serbian (Serbia)' listed twice Since valid locales are configured by both 'Serbian (Serbia)' menu entries, I opened a new bug. Thanks for getting the patches in.
Created attachment 637385 [details] localization-test-report-18_24_1.txt showing 75 out of 75 valid locales This locale validation test report for anaconda-18.24-1 shows 75 out of 75 valid locales. There are two remaining issues that do not directly pertain to this bug: 1. 'Serbian (Serbia)' is listed twice with different locales (Bug 872786). 2. 'Basque (Spain)' is listed twice with the same locale. (The installer languages menu lists 'Basque (Spain)' once, however.)
Package anaconda-18.24-1.fc18: * should fix your issue, * was pushed to the Fedora 18 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing anaconda-18.24-1.fc18' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2012-17543/anaconda-18.24-1.fc18 then log in and leave karma (feedback).
anaconda-18.25-1.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/anaconda-18.25-1.fc18
anaconda-18.26-1.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/anaconda-18.26-1.fc18
anaconda-18.27-1.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/anaconda-18.27-1.fc18
18.26 went stable. Closing. (Bodhi closing of bugs when updates go stable is currently broken).