strdup could be made faster. While comparing against some Mozilla string code that uses memcpy, I found that calling strlen, malloc, and memcpy is faster than calling strdup. Probably memcpy has some optimization (moving more bits per instruction?) that strdup isn't using; if it were, strdup would be faster. I'll attach a testcase with a my_strdup that beats the libc strdup on my 1GHz Pentium III processor; the results are below. The test makes 3 million calls to each function on a string a little over 400 characters long, although my_strdup also beats strdup on a 31-character string (see the testcase for the strings).
strdup: 10630 ms
my_strdup: 6218 ms
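For readers without the attachment, the strlen + malloc + memcpy approach described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from strdup_perf.c; the body of my_strdup here is an assumption based only on the description in this comment.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical reconstruction: the attached my_strdup may differ.
 * Duplicates s by measuring it once, allocating, and block-copying. */
char *my_strdup(const char *s)
{
    size_t len = strlen(s) + 1;   /* +1 for the terminating NUL */
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);     /* copies the NUL as well */
    return copy;                  /* NULL if malloc failed */
}
```

As with strdup, the caller owns the returned buffer and releases it with free().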
Created attachment 34862: strdup_perf.c
You know how imprecise that measurement is, right? Especially when the loop runs long enough for the process to be rescheduled. I've modified the test to measure the minimum number of ticks for a single strdup or my_strdup call across those 3 million invocations. The real timing depends heavily on the flags used to compile strdup_perf.c, especially whether -O2 (or -O3) is used and whether -fno-builtin is given. When optimizing without -fno-builtin, gcc itself optimizes the strdup call away (but of course not my_strdup): since it knows the length of the string, it just calls malloc and memcpy (or, if memcpy is worth expanding inline, only malloc). So I usually get numbers like ~550 ticks for the builtin strdup and ~2600 ticks for either the non-builtin strdup or my_strdup.
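The minimum-per-call idea can be sketched as below. This is not the modified test itself: it assumes POSIX clock_gettime with CLOCK_MONOTONIC and reports nanoseconds rather than CPU ticks, and the names dup_fn, now_ns, and min_call_ns are hypothetical. Taking the minimum over many calls discards samples inflated by rescheduling or cache misses.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper type: any strdup-like function. */
typedef char *(*dup_fn)(const char *);

/* Monotonic clock reading in nanoseconds (assumes POSIX clock_gettime). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* Returns the smallest time observed for a single fn(s) call. */
uint64_t min_call_ns(dup_fn fn, const char *s, long iterations)
{
    uint64_t best = UINT64_MAX;

    for (long i = 0; i < iterations; i++) {
        uint64_t start = now_ns();
        char *copy = fn(s);
        uint64_t elapsed = now_ns() - start;

        free(copy);               /* freeing is kept outside the timed span */
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

A call such as min_call_ns(strdup, str, 3000000) would then give one figure per function. Because the call goes through a function pointer, the compiler may not be able to substitute its builtin strdup expansion, so the result is likely closer to the -fno-builtin case than to the builtin one.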