Bug 1924207

Summary: glibc: Letter 'w' or 'W' not in range [a-z] or [A-Z] in sv_SE locale
Product: Red Hat Enterprise Linux 9 Reporter: Paulo Andrade <pandrade>
Component: glibc Assignee: glibc team <glibc-bugzilla>
Status: CLOSED CURRENTRELEASE QA Contact: Sergey Kolosov <skolosov>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 9.0 CC: ashankar, codonell, dj, fweimer, mnewsome, pfrankli, sipoyare
Target Milestone: beta Keywords: Bugfix, Triaged
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: glibc-2.33.9000-36.el9 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-12-07 21:42:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1958224    
Bug Blocks:    

Description Paulo Andrade 2021-02-02 19:17:12 UTC
This appears to be a side effect of https://sourceware.org/git/?p=glibc.git;a=commit;f=localedata/locales/sv_SE;h=159738548130d5ac4fe6178977e940ed5f8cfdc4

Probably an unexpected effect of the IGNORE keyword.

Example test case:

LANG=sv_SE.iso88591 && [[ w =~ [a-z] ]] && echo ok || echo nok

It should print 'ok', but on RHEL 8 and other recent systems it prints 'nok'.
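
To compare behaviour across locales, the test case above can be wrapped in a small helper. This is only a sketch: check_range is a hypothetical name, not part of bash or glibc, and the result for sv_SE.iso88591 depends on the glibc version and on whether that locale is installed (bash prints a warning and keeps its current locale if it is not).

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a character matches the regex [a-z]
# under a given locale. The check runs in a subshell so that the caller's
# locale settings are left untouched.
check_range() {
    local loc=$1 ch=$2
    (
        export LC_ALL=$loc
        if [[ $ch =~ [a-z] ]]; then
            echo ok
        else
            echo nok
        fi
    )
}

# Run the same check in the C locale and in the affected Swedish locale.
for loc in C sv_SE.iso88591; do
    printf '%s: %s\n' "$loc" "$(check_range "$loc" w)"
done
```

In the C locale the range is evaluated by byte value, so 'w' always matches there; a 'nok' on the sv_SE line reproduces the problem described in this report.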

Comment 1 Carlos O'Donell 2021-02-03 02:03:26 UTC
This is a known upstream bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=25036

The addition of the new rules to sort &v<<<V<<w<<<W causes the "collation element order" to change, which affects 'a-z' sorting and moves the characters out of the range for sv_SE.

Swedish doesn't use this rule any more so we can fix upstream and RHEL8 by removing this specific rule.

Comment 3 DJ Delorie 2021-04-06 18:44:36 UTC
Moving to RHEL 9.  This fix would change the collation order, which is not allowed in an 8.Y release, as it would break existing databases.

Comment 4 Carlos O'Donell 2021-04-30 14:02:30 UTC
This is fixed upstream now, and we will work to ensure it makes it into RHEL 9.

As a workaround the customer could compile their own variant of the locale with the upstream fix.
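
A rough sketch of that workaround follows, assuming a copy of the sv_SE locale source with the upstream change applied is already at hand. The function names, the source path and the output directory are all hypothetical; only localedef's -i and -f options and the LOCPATH environment variable are standard glibc facilities.

```shell
#!/usr/bin/env bash
# Sketch only: the source path and output directory are assumptions.

# Compile a locale source file into a private directory.
#   $1: path to the patched sv_SE locale source (hypothetical, e.g. ./sv_SE)
#   $2: directory that will hold the compiled locale
build_private_locale() {
    local src=$1 outdir=$2
    mkdir -p "$outdir" || return
    # -i names the locale source and -f the charmap. An output path that
    # contains a slash is treated as the directory to write the compiled
    # locale categories into.
    localedef -i "$src" -f ISO-8859-1 "$outdir/sv_SE.ISO-8859-1"
}

# Re-run the test case from the description against the private locale.
#   $1: the directory that was passed to build_private_locale
test_private_locale() {
    local outdir=$1
    LOCPATH=$outdir LC_ALL=sv_SE.ISO-8859-1 \
        bash -c '[[ w =~ [a-z] ]] && echo ok || echo nok'
}
```

For example, build_private_locale ./sv_SE "$HOME/locales" followed by test_private_locale "$HOME/locales". Because the compiled locale is only picked up by processes that set LOCPATH, the system-wide collation order that existing databases depend on stays unchanged.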

The upstream fix is here and will be part of glibc 2.34, which is scheduled for release on 2021-08-01:

commit ebde2baeb535661019b8f774a906d6abd332f3b8
Author: Sebastian Rasmussen <sebras>
Date:   Thu Mar 18 17:21:43 2021 -0400

    Update sv_SE to treat 'W' as a distinct character (Bug 25036)
    
    The 13th edition of Svenska Akademiens ordlista lists 'W' as a
    distinct letter that sorts after 'V'. We adjust the sv_SE locale
    (and tests) to match this updated and "reformed" language change.
    This harmonizes us with CLDR 1.5.0 (2007) for sv_SE sorting of
    the letter 'W'.
    
    No regressions on x86_64, and locale sorting tests all pass.
    
    Co-authored-by: Carlos O'Donell <carlos>