Bug 177246
| Field | Value |
|---|---|
| Summary | Appalling Performance with UTF-8 |
| Product | [Fedora] Fedora |
| Component | sed |
| Version | 4 |
| Hardware | i686 |
| OS | Linux |
| Status | CLOSED RAWHIDE |
| Severity | medium |
| Priority | medium |
| Reporter | JW <ohtmvyyn> |
| Assignee | Petr Machata <pmachata> |
| QA Contact | Ben Levenson <benl> |
| CC | mnewsome, pbonzini |
| Fixed In Version | FC-4 |
| Doc Type | Bug Fix |
| Last Closed | 2006-08-03 14:00:26 UTC |

Description
JW
2006-01-08 07:16:33 UTC
Created attachment 133474: fix for this problem

In str_append, sed does some extra processing of UTF-8 strings. It looks like
a) there is a bug in that code, in that the first character is processed again
and again, and b) even if there weren't, the processing has no actual effect.
Taking this code out restores the lost performance. The test suite still passes.
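
For illustration only, here is a minimal sketch of the kind of scan described above: walking newly appended bytes with `mbrlen` so that a conversion state stays current. It is not sed's actual code. The function name `scan_mbstate` and its return value are hypothetical, and the sketch assumes the buffer is scanned in the current locale's encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not taken from sed: walk `len` bytes of `buf`,
   updating the conversion state `*st`, and return the number of mbrlen
   calls made. */
size_t scan_mbstate(const char *buf, size_t len, mbstate_t *st)
{
    size_t calls = 0;

    assert(buf != NULL && st != NULL);
    while (len > 0) {
        size_t n = mbrlen(buf, len, st);
        calls++;

        if (n == (size_t)-2)        /* incomplete character: state holds it */
            break;
        if (n == (size_t)-1) {      /* invalid sequence: reset, skip a byte */
            memset(st, 0, sizeof *st);
            n = 1;
        } else if (n == 0) {        /* embedded NUL occupies one byte */
            n = 1;
        }

        /* Advance past the character just measured. Measuring the same
           position on every iteration is the kind of defect the report
           describes. */
        buf += n;
        len -= n;
    }
    return calls;
}
```

Even when written correctly, a scan like this costs one library call per character on every append, which is why skipping it where the state cannot change matters for performance.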

The purpose of the code is to update to->mbstate. This is indeed a no-op for UTF-8, but the code needs to be there for other encodings. It's OK with me to fix the bug that Petr reported and to make the code conditional on a multibyte, non-UTF-8 encoding (rather than on a multibyte encoding only).

Created attachment 133551: correct patch

This patch only disables the code in question for UTF-8, and fixes the bug for
other multi-byte locales.
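
As a rough sketch of the condition discussed above (multibyte, but not UTF-8), one could test the locale's codeset name as follows. This is an assumption about how such a check might look, not the contents of the attached patch. All function names here are hypothetical, and `nl_langinfo(CODESET)` is the POSIX call assumed for obtaining the encoding name.

```c
#include <assert.h>
#include <langinfo.h>
#include <stdbool.h>
#include <stdlib.h>
#include <strings.h>

/* Hypothetical helper: true if the codeset name denotes UTF-8. */
bool is_utf8_codeset(const char *codeset)
{
    assert(codeset != NULL);
    return strcasecmp(codeset, "UTF-8") == 0
        || strcasecmp(codeset, "UTF8") == 0;
}

/* Hypothetical helper: state tracking is only needed for multibyte
   encodings other than UTF-8. */
bool needs_mbstate_tracking(size_t mb_cur_max, const char *codeset)
{
    return mb_cur_max > 1 && !is_utf8_codeset(codeset);
}

/* Convenience wrapper for the current locale. The program is expected to
   have called setlocale() beforehand; otherwise the "C" locale applies. */
bool current_locale_needs_mbstate_tracking(void)
{
    return needs_mbstate_tracking(MB_CUR_MAX, nl_langinfo(CODESET));
}
```

A caller would evaluate this once at startup and skip the per-append scan whenever it returns false, which covers both single-byte locales and UTF-8.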

Thanks. I applied the patch verbatim and did a build; it should be available shortly.