This appears to be a side effect of the following glibc commit:
https://sourceware.org/git/?p=glibc.git;a=commit;f=localedata/locales/sv_SE;h=159738548130d5ac4fe6178977e940ed5f8cfdc4
It is probably an unintended effect of the IGNORE keyword.
Example test case:
    LANG=sv_SE.iso88591 && [[ w =~ [a-z] ]] && echo ok || echo nok
It should print 'ok', but on RHEL 8 and other recent systems it prints 'nok'.
This is a known upstream bug: https://sourceware.org/bugzilla/show_bug.cgi?id=25036
The new rule that sorts &v<<<V<<w<<<W changes the "collation element order". Range expressions such as 'a-z' are evaluated against that order, so the change moves the affected characters out of the range for sv_SE. Swedish doesn't use this rule any more, so we can fix upstream and RHEL 8 by removing this specific rule.
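To see which letters are affected on a given system, something like the following sketch may help. It assumes bash is installed and that the locale named on the command line exists; the function name letters_outside_range is made up for illustration. If the locale is missing, the child bash falls back to the C locale and nothing is reported.

```shell
# Print the ASCII lowercase letters that do NOT match [a-z] under a locale.
# Hypothetical helper; not part of glibc or bash.
letters_outside_range() {
    loc=$1
    out=
    for c in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
        # Run each match in a child bash so the locale is in effect from startup.
        if ! LC_ALL="$loc" bash -c '[[ $1 =~ ^[a-z]$ ]]' _ "$c" 2>/dev/null; then
            out="$out$c"
        fi
    done
    # Empty output means every letter matched the range.
    printf '%s\n' "$out"
}
```

Running `letters_outside_range sv_SE.iso88591` on an affected system should list at least 'w', going by the test case in this report; `letters_outside_range C` should print an empty line.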
Moving to RHEL 9. The fix would change the collation order, which is not allowed in an 8.Y release because it would break existing databases.
This is fixed upstream now, and we will work to ensure it makes it into RHEL 9. As a workaround, the customer could compile their own variant of the locale with the upstream fix.
The upstream fix is below and will be part of glibc 2.34, releasing 2021-08-01:
    commit ebde2baeb535661019b8f774a906d6abd332f3b8
    Author: Sebastian Rasmussen <sebras>
    Date: Thu Mar 18 17:21:43 2021 -0400

        Update sv_SE to treat 'W' as a distinct character (Bug 25036)

        The 13th edition of Svenska Akademiens ordlista lists 'W' as a
        distinct letter that sorts after 'V'. We adjust the sv_SE locale
        (and tests) to match this updated and "reformed" language change.
        This harmonizes us with CLDR 1.5.0 (2007) for sv_SE sorting of
        the letter 'W'.

        No regressions on x86_64, and locale sorting tests all pass.

        Co-authored-by: Carlos O'Donell <carlos>
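A rough sketch of that workaround, under several assumptions: the customer has already prepared a copy of the sv_SE locale source with the upstream change applied by hand, the ISO-8859-1 charmap is installed, and all paths shown are hypothetical. The function name build_patched_locale is made up for illustration, and the patched source may need adjusting to compile against the locale data shipped with an older glibc.

```shell
# Compile a patched sv_SE locale source into a private directory.
# Hypothetical helper; arguments and paths are illustrative only.
build_patched_locale() {
    src=$1      # patched copy of the sv_SE locale source file
    outdir=$2   # directory that will hold the compiled locale
    [ -f "$src" ] || { echo "missing locale source: $src" >&2; return 1; }
    mkdir -p "$outdir" || return 1
    # -i names the locale source, -f the charmap; an output name that
    # contains a '/' is written to that path instead of the system location.
    localedef -i "$src" -f ISO-8859-1 "$outdir/sv_SE.ISO-8859-1"
}

# Usage (hypothetical paths):
#   build_patched_locale "$HOME/locales/sv_SE" "$HOME/locales/compiled"
#   LOCPATH="$HOME/locales/compiled" LC_ALL=sv_SE.ISO-8859-1 \
#       bash -c '[[ w =~ [a-z] ]] && echo ok || echo nok'
```

Pointing LOCPATH at the private directory keeps the system locale untouched, so existing databases that rely on the shipped collation order are not affected unless they are started with the same environment.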