Bug 2230217
| Summary: | Please enable CONFIG_UNICODE kernel option | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Martin Schwenke <martin> |
| Component: | kernel | Assignee: | fs-maint Bot <fs-maint> |
| kernel sub component: | File Systems | QA Contact: | Boyang Xue <bxue> |
| Status: | NEW --- | Docs Contact: | |
| Severity: | medium | ||
| Priority: | unspecified | CC: | asn, dchinner, dhowells, esandeen, madam, mszeredi, swhiteho, xzhou |
| Version: | 9.1 | Flags: | swhiteho: needinfo? (madam) esandeen: needinfo? (dchinner) |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Martin Schwenke
2023-08-09 02:13:26 UTC
Yes, this is useful for Samba as it improves performance for those shares! If Windows looks for a file foo, we do not have to check all permutations. See the following options in `man smb.conf` to improve performance:

- case sensitive
- default case
- preserve case
- short preserve case

See also https://wiki.samba.org/index.php/Performance_Tuning#Directories_with_a_Large_Number_of_Files

(Re-adding these notes as non-private comments, sorry about that.)
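To illustrate the cost being described, here is a minimal Python sketch of a userspace case-insensitive lookup. It is not Samba's code; `find_case_insensitive` and `demo` are hypothetical names made up for this example. The point it tries to show is that when the filesystem is case-sensitive and the exact name is missing, every lookup falls back to scanning the whole directory, which is what kernel-side case folding would avoid.

```python
import os
import tempfile


def find_case_insensitive(directory, wanted):
    """Return the on-disk name matching `wanted` ignoring case, or None.

    Hypothetical helper: a stand-in for what a userspace server has to do
    when the filesystem itself cannot do case-insensitive lookups.
    """
    # Fast path: the exact name exists, so one lookup is enough.
    if os.path.lexists(os.path.join(directory, wanted)):
        return wanted
    # Slow path: fold and compare every entry in the directory.
    folded = wanted.casefold()
    for entry in os.listdir(directory):
        if entry.casefold() == folded:
            return entry
    return None


def demo():
    """Create one file and look up a differently cased and a missing name."""
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "Foo.TXT"), "w").close()
        return (
            find_case_insensitive(d, "foo.txt"),
            find_case_insensitive(d, "bar.txt"),
        )
```

Note that the lookup for a name that does not exist at all is the worst case: the whole directory is read before the function can answer, which is why large directories hurt the most.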
XFS has had ascii-ci capability for a long time, for just this purpose (samba). However, it was recently deprecated due to problems:
commit 7ba83850ca2691865713b307ed001bde5fddb084
Author: Darrick J. Wong <djwong>
Date: Tue Apr 11 19:05:19 2023 -0700

    xfs: deprecate the ascii-ci feature

    This feature is a mess -- the hash function has been broken for the
    entire 15 years of its existence if you create names with extended ascii
    bytes; metadump name obfuscation has silently failed for just as long;
    and the feature clashes horribly with the UTF8 encodings that most
    systems use today. There is exactly one fstest for this feature.

    In other words, this feature is crap. Let's deprecate it now so we can
    remove it from the codebase in 2030.

    Signed-off-by: Darrick J. Wong <djwong>
    Reviewed-by: Christoph Hellwig <hch>

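As a rough illustration of why ASCII-only case folding clashes with UTF-8 names, the Python sketch below compares two foldings of the same pair of file names. It is not the XFS implementation: `ascii_fold` and `unicode_fold` are hypothetical helpers, and Python's `str.casefold()` is only a stand-in for the kernel's UTF-8 tables (it does no normalization).

```python
def ascii_fold(name: bytes) -> bytes:
    """Lowercase only the bytes A-Z; every other byte is left untouched."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in name)


def unicode_fold(name: bytes) -> str:
    """Decode the name as UTF-8 and apply Unicode case folding."""
    return name.decode("utf-8").casefold()


# "CAFE.txt" and "cafe.txt" with an accented E (U+00C9 / U+00E9),
# encoded as UTF-8 the way most systems store file names today.
upper = "CAF\u00c9.txt".encode("utf-8")
lower = "caf\u00e9.txt".encode("utf-8")

# ASCII folding cannot see that the two multi-byte sequences are the
# same letter in different case, so the names stay distinct.
ascii_match = ascii_fold(upper) == ascii_fold(lower)

# Unicode-aware folding treats them as the same name.
unicode_match = unicode_fold(upper) == unicode_fold(lower)
```

With these inputs `ascii_match` is `False` and `unicode_match` is `True`, so a client that expects the two names to refer to the same file would only be satisfied by the Unicode-aware comparison.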
The more involved CONFIG_UNICODE implementation actually started with XFS as well, as noted in the upstream commit that got merged:
commit 955405d1174eebcd1b89ab335f720adc27d52b67
Author: Gabriel Krisman Bertazi <krisman>
Date: Thu Apr 25 13:38:44 2019 -0400

    unicode: introduce UTF-8 character database

The original XFS RFCs for this feature can be found at:
V1: https://lore.kernel.org/linux-xfs/20140911203735.GA19952@sgi.com/
V2: https://lore.kernel.org/linux-xfs/20140918195650.GI19952@sgi.com/
V3: https://lore.kernel.org/linux-xfs/20141003214758.GY1865@sgi.com/
The EXT4 RFC based on the above can be found here:
https://lore.kernel.org/linux-ext4/20180112071234.29470-1-krisman@collabora.co.uk/
These contain a lot of the information about how this works and the rationale for it.
I don't remember why the work died on the vine for XFS - Dave?
One other thing to note: today, enabling CONFIG_UNICODE primarily affects ext4, and also f2fs (which we don't build or ship) and ksmbd. In the past this would have been a kABI issue, but in RHEL 9 that is not a problem since we reset anyway.
I'm also not sure how robust the ext4 implementation is; ext4 has a habit of merging new features that are not quite complete. At a minimum we'd want to look for robust test coverage before enabling and supporting this.
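For anyone wanting to check whether a given kernel build has the option enabled, a small Python sketch along these lines could parse a kernel config file. `kernel_option` is a hypothetical helper, and the `sample` text is made up for illustration; it is not copied from any RHEL build.

```python
import re


def kernel_option(config_text, name):
    """Return the value of a kernel config option ('y', 'm', ...) or None.

    Lines such as '# CONFIG_FOO is not set' count as unset and yield None.
    """
    pattern = re.compile(r"^%s=(.*)$" % re.escape(name), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# Made-up config fragment, for illustration only.
sample = """\
CONFIG_EXT4_FS=m
# CONFIG_UNICODE is not set
CONFIG_XFS_FS=m
"""

unicode_setting = kernel_option(sample, "CONFIG_UNICODE")
ext4_setting = kernel_option(sample, "CONFIG_EXT4_FS")
```

On a real system the text would typically come from a file such as `/boot/config-` followed by the running kernel release (`os.uname().release`); that path is a common distribution convention and an assumption here, not something stated in this bug.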