Bug 1924207 - glibc: Letter 'w' or 'W' not in range [a-z] or [A-Z] in sv_SE locale
Summary: glibc: Letter 'w' or 'W' not in range [a-z] or [A-Z] in sv_SE locale
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: glibc
Version: 9.0
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: beta
Target Release: ---
Assignee: glibc team
QA Contact: Sergey Kolosov
URL:
Whiteboard:
Depends On: 1958224
Blocks:
Reported: 2021-02-02 19:17 UTC by Paulo Andrade
Modified: 2023-07-18 14:29 UTC (History)

Fixed In Version: glibc-2.33.9000-36.el9
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-07 21:42:00 UTC
Type: Bug
Target Upstream Version:
Embargoed:

Links
System ID Private Priority Status Summary Last Updated
Sourceware 25036 0 P2 UNCONFIRMED Update collation order for Swedish 2021-02-19 14:56:00 UTC

Description Paulo Andrade 2021-02-02 19:17:12 UTC
This appears to be a side effect of the following upstream commit: https://sourceware.org/git/?p=glibc.git;a=commit;f=localedata/locales/sv_SE;h=159738548130d5ac4fe6178977e940ed5f8cfdc4

Probably an unexpected effect of the IGNORE keyword.

Example test case:

LANG=sv_SE.iso88591 && [[ w =~ [a-z] ]] && echo ok || echo nok

It should print 'ok', but on RHEL 8 and other recent systems it prints 'nok'.
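
To see which letters are affected, the test case above can be extended to every ASCII lowercase letter. This is only a sketch: the function name letters_outside_range is hypothetical and not part of the original report, and it assumes bash is available and that the locale under test is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original report): print the ASCII
# lowercase letters that do NOT match [a-z] under the given locale.
letters_outside_range() {
  local loc=$1 ch missing=""
  for ch in {a..z}; do
    # Run the match in a child bash so the locale is in effect from startup.
    if ! LC_ALL=$loc bash -c '[[ $1 =~ [a-z] ]]' _ "$ch" 2>/dev/null; then
      missing+=$ch
    fi
  done
  # Print "none" when every letter matched the range.
  printf '%s\n' "${missing:-none}"
}

# Baseline: in the C locale every letter is inside the range.
letters_outside_range C
# On an affected system, compare against the locale from the report:
# letters_outside_range sv_SE.iso88591
```

If the requested locale is not installed, bash falls back to the C locale, so a result of "none" is only meaningful after confirming the locale exists (for example with `locale -a`).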

Comment 1 Carlos O'Donell 2021-02-03 02:03:26 UTC
This is a known upstream bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=25036

The addition of the new rule that sorts &v<<<V<<w<<<W changes the "collation element order", which affects how the 'a-z' range is evaluated and moves 'w' and 'W' out of that range for sv_SE.

Swedish no longer uses this rule, so we can fix this upstream and in RHEL 8 by removing this specific rule.

Comment 3 DJ Delorie 2021-04-06 18:44:36 UTC
Moving to RHEL 9.  This fix would change the collation order, which is not allowed in an 8.Y release because it would break existing databases.

Comment 4 Carlos O'Donell 2021-04-30 14:02:30 UTC
This is now fixed upstream, and we will work to ensure the fix makes it into RHEL 9.

As a workaround the customer could compile their own variant of the locale with the upstream fix.
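
A rough sketch of that workaround follows. It assumes the fix is confined to the sv_SE locale source file, that a fixed copy of that file has been obtained from upstream glibc, and that localedef is installed. The function name build_private_locale and all paths are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compile a private copy of a locale from a source file.
#   $1 = path to the locale source file (e.g. a fixed sv_SE from upstream)
#   $2 = output directory, later used as LOCPATH
build_private_locale() {
  local src=$1 outdir=$2
  mkdir -p "$outdir" || return 1
  # -i names the locale source file, -f the charmap; the last argument is
  # the output path. localedef may return a non-zero status even when the
  # output was written (see localedef(1)), so check the result directory.
  localedef -i "$src" -f ISO-8859-1 "$outdir/sv_SE.ISO-8859-1"
}

# Example usage (paths are assumptions, adjust as needed):
# build_private_locale ./sv_SE "$HOME/locales"
# LOCPATH=$HOME/locales LC_ALL=sv_SE.ISO-8859-1 \
#   bash -c '[[ w =~ [a-z] ]] && echo ok || echo nok'
```

Pointing LOCPATH at a private directory leaves the system locales untouched, so existing data sorted under the old collation order is not affected.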

The upstream fix is shown below and will be part of glibc 2.34, which is scheduled for release on 2021-08-01:

commit ebde2baeb535661019b8f774a906d6abd332f3b8
Author: Sebastian Rasmussen <sebras>
Date:   Thu Mar 18 17:21:43 2021 -0400

    Update sv_SE to treate 'W' as a distinct character (Bug 25036)
    
    The 13th edition of Svenska Akademiens ordlista lists 'W' as a
    distinct letter that sorts after 'V'. We adjust the sv_SE locale
    (and tests) to match this updated and "reformed" language change.
    This harmonizes us with CLDR 1.5.0 (2007) for sv_SE sorting of
    the letter 'W'.
    
    No regressions on x86_64, and locale sorting tests all pass.
    
    Co-authored-by: Carlos O'Donell <carlos>

