Bug 55005 - strdup could be faster
Status: CLOSED NOTABUG
Product: Red Hat Linux
Classification: Retired
Component: glibc
Version: 7.1
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Jakub Jelinek
QA Contact: Aaron Brown
Keywords: FutureFeature
Reported: 2001-10-24 04:50 EDT by David Baron
Modified: 2016-11-24 10:18 EST (History)
Doc Type: Enhancement
Last Closed: 2001-10-24 04:51:26 EDT

Attachments
strdup_perf.c (2.04 KB, text/plain)
2001-10-24 04:51 EDT, David Baron

Description David Baron 2001-10-24 04:50:33 EDT
strdup could be made faster.  When comparing some Mozilla string code, which
uses memcpy, I found that calling strlen, malloc, and memcpy is faster than
calling strdup.  Probably there's some optimization in memcpy (moving more
bits per instruction?) that's not being used for strdup.  If it were,
strdup would be faster.

I'll attach a testcase with a my_strdup that beats the libc strdup on
my 1GHz Pentium III processor; the results are below.  The test makes 3
million calls to each function on a string a little over 400 characters
long.  my_strdup also beats strdup on a string of 31 characters (see the
testcase for the strings).

strdup:     10630 ms
my_strdup:   6218 ms
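
For readers without the attachment, the strlen + malloc + memcpy approach described above might look roughly like the sketch below. This is a hypothetical reconstruction, not the code from strdup_perf.c; only the name my_strdup comes from the report, and the NULL check and assert are assumptions added for safety.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reconstruction: the attached strdup_perf.c is not shown here,
 * so this only illustrates the strlen + malloc + memcpy idea from the report. */
char *my_strdup(const char *s)
{
    assert(s != NULL);              /* like strdup, a NULL input is not supported */

    size_t size = strlen(s) + 1;    /* +1 for the terminating NUL byte */
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;                /* allocation failed; the caller must check */

    memcpy(copy, s, size);          /* the length is known, so memcpy can copy it */
    return copy;
}
```

As with strdup, the caller owns the returned buffer and releases it with free.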
Comment 1 David Baron 2001-10-24 04:51:22 EDT
Created attachment 34862
strdup_perf.c
Comment 2 Jakub Jelinek 2001-10-24 05:54:50 EDT
You know how imprecise that measurement is, especially when the loop runs
long enough for the process to be rescheduled, right?
I've modified the test to measure the minimum number of ticks for a single
strdup or my_strdup call across those 3 million invocations.  The real timing
depends heavily on the flags used to compile strdup_perf.c
(especially whether -O2 or -O3 is used and whether -fno-builtin is used).
If optimization is enabled and -fno-builtin is not specified, gcc itself
optimizes the strdup call away (but of course not my_strdup): since it knows
the length of the string, it just calls malloc and memcpy (or, if memcpy is
worth expanding inline, only calls malloc).
So I usually get numbers like ~550 ticks for the builtin strdup and ~2600 ticks
for either the non-builtin strdup or my_strdup.
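
The modified test is not attached to this bug, so the following is only a minimal sketch of the "minimum over many calls" idea. It assumes a POSIX system and reports nanoseconds from clock_gettime rather than CPU ticks; the names now_ns, dup_fn, and min_call_ns are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime and strdup declarations */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical function-pointer type matching strdup and my_strdup. */
typedef char *(*dup_fn)(const char *);

/* Monotonic clock reading in nanoseconds (an assumption; the comment above
 * measured CPU ticks, which this sketch does not reproduce). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Time each call separately and keep the fastest one, so that a call
 * interrupted by a reschedule does not inflate the result. */
uint64_t min_call_ns(dup_fn fn, const char *s, long iterations)
{
    assert(iterations > 0);

    uint64_t best = UINT64_MAX;
    for (long i = 0; i < iterations; i++) {
        uint64_t start = now_ns();
        char *copy = fn(s);
        uint64_t elapsed = now_ns() - start;
        free(copy);                 /* free(NULL) is a no-op if fn failed */
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

A call such as min_call_ns(strdup, some_string, 3000000) would then give a per-call lower bound that is far less sensitive to scheduling noise than a total over the whole loop. The compiler flags discussed above still affect what actually gets measured.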
