strdup could be made faster. While comparing some Mozilla string code
(which uses memcpy), I found that calling strlen, malloc, and memcpy is
faster than calling strdup. Probably there's some optimization in memcpy
(moving more bits per instruction?) that's not being used for strdup. If it
were, strdup would be faster.
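For reference, the strlen/malloc/memcpy sequence described above can be sketched roughly as follows. This is only an illustration, not the code from the attached testcase: the body of my_strdup here is an assumption about what such a function typically looks like.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical my_strdup, assumed for illustration; the attached
 * testcase may differ. Copies s into freshly allocated memory using
 * strlen + malloc + memcpy instead of calling strdup. */
char *my_strdup(const char *s)
{
    size_t len = strlen(s) + 1;   /* +1 for the terminating NUL */
    char *copy = malloc(len);

    if (copy == NULL)
        return NULL;              /* allocation failed, same as strdup */

    memcpy(copy, s, len);         /* copies the NUL along with the text */
    return copy;
}
```

As with strdup, the caller is responsible for passing the result to free.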
I'll attach a testcase that has a my_strdup that beats the libc strdup on
my 1GHz Pentium III processor, with the results below. The test makes 3
million calls to each on a string a little over 400 characters long, although
my_strdup also beats strdup for a string of 31 characters (see testcase for the
Created attachment 34862
You know how imprecise that measurement is, especially when the loop runs
long enough for the process to be rescheduled, right?
I've modified the test to measure the minimum number of ticks for a single
strdup or my_strdup call over those 3 million invocations. The actual timing
depends heavily on the flags used to compile strdup_perf.c (especially
whether -O2 or -O3 is used, and whether -fno-builtin is used).
If optimization is enabled and -fno-builtin is not specified, gcc itself
optimizes the strdup call away (but of course not my_strdup): it knows the
length of the string,
so just calls malloc and memcpy (or if memcpy is worth expanding inline, only
So I usually get numbers like ~550 ticks for the builtin strdup and ~2600
ticks for either the non-builtin strdup or my_strdup.
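A rough sketch of the minimum-per-call measurement, for anyone who wants to reproduce it. This is not the modified strdup_perf.c: the helper names now_ns and min_call_ns are made up here, and it uses clock_gettime (nanoseconds) rather than CPU ticks, so the absolute numbers will not be comparable to the ones above.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime and strdup */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: time each dup(s) call separately and return the
 * smallest time seen, so a call interrupted by a reschedule does not
 * distort the result. Returns UINT64_MAX if no call succeeded. */
uint64_t min_call_ns(char *(*dup)(const char *), const char *s,
                     long iterations)
{
    uint64_t best = UINT64_MAX;

    for (long i = 0; i < iterations; i++) {
        uint64_t start = now_ns();
        char *copy = dup(s);
        uint64_t elapsed = now_ns() - start;

        if (copy == NULL)
            continue;             /* skip failed allocations */
        free(copy);               /* freed outside the timed region */

        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Calling it as min_call_ns(strdup, str, 3000000) goes through a function pointer, which in practice usually means the libc strdup gets called rather than gcc's builtin expansion; that is an observation about typical compiler behaviour, not a guarantee. The timer overhead is included in every sample, so only differences between runs are meaningful.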