Description of problem: slapi_ch_free()'s prototype makes it unportable to hosts where some pointer types have different representation than void*: slapi_ch_free((void **)&ptr); will access ptr as a void*, regardless of ptr's real type. Pity to do that for something as trivial as nulling the pointer after freeing it. (Actually I encountered this in OpenLDAP which also implements slapi, but there isn't much point in "fixing" your API there when that makes it incompatible with yours.) Chris Torek gives an example of how this can break even for a mostly normal architecture: http://groups.google.com/groups?selm=98amku$3bp$1@elf.bsdi.com The Data General Eclipse had word pointers as the native pointer format, so casting int* to char* (this was before void*:-) had to left-shift the address. If int*foo; contained address 666, that was word#666. free(*(void**)foo) would free byte#666 = word#333. The void** type of the function for slapi_be_getentrypoint() is similarly broken, but if it knows the type of the function pointer which the argument points to, then it can convert the argument back to a pointer to that function pointer type before using it. The void*[] array of function pointers for slapi_apib_...() is worse, since a void* may not be able to hold a function pointer according to the C standard. It would have been OK to use typedef (*slapi_fn)(void); and make the api an array of function pointers cast to slapi_fn. This is probably not a problem now since dlsym() works, but Open Group apparently plans to provide a replacement function for this reason - see RATIONALE in http://www.opengroup.org/onlinepubs/009695399/functions/dlsym.html
On which hosts/platforms/OS (other than Data General) are pointer types different than void *? On those hosts, how do you find out what the type is? Or is the pointer type different depending on what the base type is? Nulling the pointer after freeing it is not trivial - it has saved hundreds of hours of debugging and prevented countless other hard to find bugs. Sure, we could have made a macro to do this instead. The proper fix would be to introduce slapi_ch_free_xxx functions for the various types. This is already done for the most common case, slapi_ch_free_string. Most of the other types have their own free or done function e.g. slapi_entry_free, slapi_sdn_free, etc. We could not remove or change the signature to slapi_ch_free because that would break binary compatability, regardless of how incorrect it actually is, since it "just works" on all of our supported platforms.
> On which hosts/platforms/OS (other than Data General) are pointer > types different than void *? It's been years since I followed that kind of thing, but this has a few old examples: http://www.psgd.org/paul/docs/cstyle/cstyle16.htm type Cray-2 Unisys 1100 _____________________________ char* 64 72 int* 64(24) 72 int(*)() 64 576 The 64(24) means there was a compile-time option to pick the pointer model to use. When I paid attention there were discussions of how transition from 32-bit to 64-bit might cause similar issues to the one I described for Data General Eclipse. I haven't kept track, but one variant Google found was: http://h71000.www7.hp.com/commercial/c/docs/5492profile_008.html HP has a #pragma pointer_size which allows 32-bit and 64-bit pointer models to coexist. 32-bit pointers are automatically promoted to 64-bit if they are used as specified. Of course, a simple answer to that one may be "don't do that, then". I have no idea how good answer that is in this case. (MSDOS pointers (at least in the far/huge memory models) was one major example of breaking a number of non-standard assumptions about pointers, though I don't know if it would have broken this API. Unless there was a memory model with 32-bit function pointers and 16-bit data pointers. Don't remember at the moment.) Still, I'm not concerned about a particular host at this time. Currently I live on Linux myself:-) I'm concerned about a hopefully growing code base which uses an API which enforces invalid code. If assumptions the server is making internally break for a major enough architecture that you want to fix it, then you can fix that. But you can't fix code which other people have been writing for years to this API. And the same goes for OpenLDAP of course. So it's better to fix the API early and hope any such major portability problem is years in the future. > On those hosts, how do you find out what the type is? Or is the > pointer type different depending on what the base type is? Yes, in general the program just has to know the type. Just as it must know whether it's accessing an int or a long. > Nulling the pointer after freeing it is not trivial - it has saved > hundreds of hours of debugging and prevented countless other hard to > find bugs. Sure, we could have made a macro to do this instead. I see. A macro must evaluate the argument twice though, that can invite breakage too. But with a capital-letter name that's hopefully warning enough. > The proper fix would be to introduce slapi_ch_free_xxx functions for > the various types. This is already done for the most common case, > slapi_ch_free_string. Most of the other types have their own free or > done function e.g. slapi_entry_free, slapi_sdn_free, etc. Fine. I did find a number of things that should be freed with slapi_ch_free in the doc though. > We could not remove or change the signature to slapi_ch_free because > that would break binary compatability, regardless of how incorrect it > actually is, since it "just works" on all of our supported platforms. True enough, and the existence of machines where it fails is in any case no reason to break applications on machines where it works. But it would make sense to deprecate these functions and state that they will/may be removed in some future release.
Upstream ticket: https://fedorahosted.org/389/ticket/155
Update: I think optimizations could manifest this bug on existing systems for years now: Strict type aliasing analysis + link-time optimization. I don't use slapi myself, but anyway: The compiler may optimize on the assumption that code with undefined behavior is not executed - in particular, that objects are accessed through the correct type or some exceptions like char: void slapi_ch_free(void **pptr) { free(*pptr); *pptr = NULL; } void foo(double *dp) { if (dp != NULL) { slapi_ch_free((void **) &dp); assert(dp == NULL); } ... } The compiler may optimize the assert into assert(0): slapi_ch_free() accesses nothing as a double* or char. So dp cannot have been (validly) modified. Doing so yields undefined behavior, which in any case the compiler can optimize into whatever it finds convenient. And with link-time optimization, we can't protect code from this optimization by keeping foo and slapi_ch_free in different source files. Though I suppose slapi_ch_free() can protect it by reading and zeroing *pptr as a char array, in a way which the compiler cannot figure out just reads the pointer as-is and then puts a null pointer there. (The compiler can also note that dp is NULLed and optimize away the assert, and in fact gcc does, but it is not required to have that level of cleverness just beacuse it implements strict aliasing analysys.)
Argh. I wrote: "So dp cannot have been (validly) modified. Doing so yields undefined behavior, which..." I mean - "OTOH modifying it invalidly yields undefined behaivor, which..."
Closing this bug since we moved to the ticket system: https://fedorahosted.org/389/ticket/155