Bug 55005 - strdup could be faster
Summary: strdup could be faster
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: glibc
Version: 7.1
Hardware: i386
OS: Linux
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Aaron Brown
Keywords: FutureFeature
Depends On:
Reported: 2001-10-24 08:50 UTC by David Baron
Modified: 2016-11-24 15:18 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2001-10-24 08:51:26 UTC

Attachments
strdup_perf.c (2.04 KB, text/plain)
2001-10-24 08:51 UTC, David Baron

Description David Baron 2001-10-24 08:50:33 UTC
strdup could be made faster.  While comparing some Mozilla string code, which
uses memcpy, I found that calling strlen, malloc, and memcpy is faster than
calling strdup.  Probably memcpy has some optimization (moving more bits per
instruction?) that strdup is not using.  If strdup used it, strdup would be
faster.
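
For illustration, the strlen/malloc/memcpy sequence described above could look roughly like the sketch below.  This is an assumed reconstruction, not the attached strdup_perf.c; the name my_strdup is borrowed from the report, and the body is only one plausible way to write it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical reconstruction of my_strdup; the attached testcase may differ. */
char *my_strdup(const char *s)
{
    size_t n = strlen(s) + 1;   /* length including the terminating NUL */
    char *copy = malloc(n);

    if (copy == NULL)
        return NULL;            /* same failure convention as strdup */
    memcpy(copy, s, n);         /* one bulk copy, terminator included */
    return copy;
}
```

As with strdup, the caller owns the returned buffer and releases it with free.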

I'll attach a testcase with a my_strdup that beats the libc strdup on my
1GHz Pentium III processor, with the results below.  The test makes 3
million calls to each on a string of a little over 400 characters, although
my_strdup also beats strdup for a string of 31 characters (see testcase for the

10630 ms
6218 ms

Comment 1 David Baron 2001-10-24 08:51:22 UTC
Created attachment 34862

Comment 2 Jakub Jelinek 2001-10-24 09:54:50 UTC
You know how imprecise that measurement is, especially when the loop runs
long enough to be rescheduled, right?
I've modified the test to measure the minimum number of ticks for a single
strdup or my_strdup call across those 3 million invocations.  The real timing
depends heavily on the flags used to compile strdup_perf.c (especially whether
-O2 or -O3 is used, and whether -fno-builtin is used).
When optimizing without -fno-builtin, gcc itself optimizes strdup away (but of
course not my_strdup): it knows the length of the string, so it just calls
malloc and memcpy (or, if memcpy is worth expanding inline, only malloc).
So I usually get numbers like ~550 ticks for the builtin strdup and ~2600 ticks
for either the non-builtin strdup or my_strdup.
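
A minimal sketch of the minimum-per-call measurement described above, under stated assumptions: it is not the modified test from this comment, it reads wall-clock nanoseconds through C11 timespec_get as a portable stand-in for CPU ticks, and the names now_ns, min_call_ns, and copy_with_memcpy are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef char *(*dup_fn)(const char *);

/* Assumption: wall-clock nanoseconds stand in for the CPU tick counter. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* Hypothetical stand-in for my_strdup: strlen + malloc + memcpy. */
char *copy_with_memcpy(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Smallest observed duration of a single dup(s) call over `calls` runs.
 * Taking the minimum discards runs inflated by rescheduling or cache
 * misses.  Returns UINT64_MAX if no call succeeded. */
uint64_t min_call_ns(dup_fn dup, const char *s, long calls)
{
    uint64_t best = UINT64_MAX;

    for (long i = 0; i < calls; i++) {
        uint64_t start = now_ns();
        char *copy = dup(s);
        uint64_t elapsed = now_ns() - start;

        if (copy == NULL)
            continue;           /* skip failed allocations */
        free(copy);
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Passing the function under test as a pointer keeps the call out of line in typical builds, so the figure reflects the library routine rather than a compiler expansion.  The timer overhead is included in every sample, so only the difference between two functions is meaningful.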
