Bug 82032
Summary: | huge sort speed regression | ||||||
---|---|---|---|---|---|---|---|
Product: | [Retired] Red Hat Public Beta | Reporter: | Pádraig Brady <p> | ||||
Component: | coreutils | Assignee: | Tim Waugh <twaugh> | ||||
Status: | CLOSED RAWHIDE | QA Contact: | |||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | phoebe | CC: | behdad, chinen | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2003-03-12 10:02:01 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Pádraig Brady
2003-01-16 15:54:38 UTC
With LC_CTYPE=C it is much faster. In UTF-8 charset it just has much more work to do. We patch coreutils to make it handle multi-byte encodings; this also makes it L18NUX compliant. Wow, that's some speed difference though! 585 times slower? 1m11s to sort 1526 items is pretty much useless IMHO and at least shouldn't be the default? Yes my LANG was en_IE.UTF-8 and when I changed it to C or LC_CTYPE=C it reverted to the previous operation. Hello, My name is Mitsuru Chinen. I'm a staff of OpenI18N (aka LI18NUX). I found this problem by chance. This problem was caused by the 2 matters. 1) strlen() was called too many times in spite of being unnecessary. 2) wctype() was called too many times in spite of being unnecessary. If you modify the coreutis-4.5.3-i18n.patch with the following methods, I18N sort utility becomes much faster. (However new I18N sort is still 8 time slower than original one.) 1) After all patches in rpm package are applied, you can find the following sentence at ismbblank() in sort.c. | mblength = mbrtowc (&wc, str, strlen (str), &state); strlen() spends a lot of time. It should be replaced with MB_LEN_MAX, There are similar defects in coreutis-4.5.3-i18n.patch. You can find those with `grep "mbrtowc.*strlen"'. 2) After all patches in rpm package are applied, you can find the following sentence at ismbblank() in sort.c. | return (iswctype (wc, wctype ("blank"))); wctype spends a lot of time, too. iswctype (wc, wctype ("blank")) should be replaced with iswblank (wc) There are similar defects in coreutis-4.5.3-i18n.patch, too. You can find those with `grep "iswctype.*wctype"'. Thank you, > strlen() spends a lot of time. It should be replaced with MB_LEN_MAX It can't be outright replaced with MB_LEN_MAX; otherwise mbrtowc risks reading past the end of the string in the case that no complete character is found. But it could be replaced withs strnlen (str, MB_LEN_MAX). Is that what you meant? > iswctype (wc, wctype ("blank")) Oops, good catch! Thanks. You are right. I could not find such a risk out. I agree with you: strnlen (str, MB_LEN_MAX) is correct one. Thank you, Created attachment 90549 [details]
coreutils-4.5.3-i18n.patch
Actually it turns out that most callers of ismbblank() already know what the
length is, so I adjusted its prototype to include that.
Here's a new version of coreutils-4.5.3-i18n.patch with the changes you
suggested and the change I just mentioned.
I reviewed your new patch. It's very nice. Not concerned with this problem, I found a tiny bug. In getmonth_mb(), | month_wcs = (wchar_t *) alloca (len + 1); shoud be | month_wcs = (wchar_t *) alloca (len + 1) * sizeof (wchar_t); I think you should close this bug now. Thanks so much! ITYM: month_wcs = (wchar_t *) alloca ((len + 1) * sizeof (wchar_t)); Thanks, also fixed. Fixed package is coreutils-4.5.3-21, which will appear in rawhide in due course. I tested the patch (rawhide doesn't have it yet) and it's a LOT better. In fact 4500% better from my tests. It's still 1300% slower than LC_CTYPE=C though, maybe room for more improvments? $ ./sort_test sorting 1534 lines timing: sort -k2,2n -k4,4nr -k3,3 -u real 1m11.922s user 2m23.500s sys 0m0.060s same with LC_CTYPE=C real 0m0.063s user 0m0.080s sys 0m0.030s timing: ~/sort -k2,2n -k4,4nr -k3,3 -u real 0m0.804s user 0m1.580s sys 0m0.040s Note the real time is correct, the user seems to be multiplied by 2 (sometimes 4) for some reason? btw coreutils is currently at version 4.5.9 > maybe room for more improvments? Perhaps. Patches welcome. > btw coreutils is currently at version 4.5.9 The last version that Jim Meyering suggested that distributors use is 4.5.3; also 4.5.11 is targetted as being a bona fide stable release, so I'm holding out for that. |