Bug 163622 - slapi void** function arguments not generally portable
slapi void** function arguments not generally portable
Status: CLOSED DEFERRED
Product: 389
Classification: Community
Component: Server - Plugins (Show other bugs)
7.1
All Linux
medium Severity medium
: ---
: ---
Assigned To: Rich Megginson
Ben Levenson
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2005-07-19 12:24 EDT by Hallvard B Furuseth
Modified: 2015-11-19 18:17 EST (History)
1 user (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-19 18:17:16 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Hallvard B Furuseth 2005-07-19 12:24:13 EDT
Description of problem:

slapi_ch_free()'s prototype makes it unportable to hosts where
some pointer types have different representation than void*:
    slapi_ch_free((void **)&ptr);
will access ptr as a void*, regardless of ptr's real type.
Pity to do that for something as trivial as nulling the pointer
after freeing it.

(Actually I encountered this in OpenLDAP which also implements
slapi, but there isn't much point in "fixing" your API there
when that makes it incompatible with yours.)

Chris Torek gives an example of how this can break even for a
mostly normal architecture:
  http://groups.google.com/groups?selm=98amku$3bp$1@elf.bsdi.com
The Data General Eclipse had word pointers as the native pointer
format, so casting int* to char* (this was before void*:-) had to
left-shift the address.  If int*foo; contained address 666, that
was word#666.  free(*(void**)foo) would free byte#666 = word#333.

The void** type of the function for slapi_be_getentrypoint() is
similarly broken, but if it knows the type of the function pointer
which the argument points to, then it can convert the argument
back to a pointer to that function pointer type before using it.

The void*[] array of function pointers for slapi_apib_...() is
worse, since a void* may not be able to hold a function pointer
according to the C standard.  It would have been OK to use
    typedef (*slapi_fn)(void);
and make the api an array of function pointers cast to slapi_fn.

This is probably not a problem now since dlsym() works, but
Open Group apparently plans to provide a replacement function
for this reason - see RATIONALE in
   http://www.opengroup.org/onlinepubs/009695399/functions/dlsym.html
Comment 1 Rich Megginson 2005-07-19 12:36:10 EDT
On which hosts/platforms/OS (other than Data General) are pointer types
different than void *?  On those hosts, how do you find out what the type is? 
Or is the pointer type different depending on what the base type is?

Nulling the pointer after freeing it is not trivial - it has saved hundreds of
hours of debugging and prevented countless other hard to find bugs.  Sure, we
could have made a macro to do this instead.

The proper fix would be to introduce slapi_ch_free_xxx functions for the various
types.  This is already done for the most common case, slapi_ch_free_string. 
Most of the other types have their own free or done function e.g.
slapi_entry_free, slapi_sdn_free, etc.  We could not remove or change the
signature to slapi_ch_free because that would break binary compatability,
regardless of how incorrect it actually is, since it "just works" on all of our
supported platforms.
Comment 2 Hallvard B Furuseth 2005-07-19 15:54:02 EDT
> On which hosts/platforms/OS (other than Data General) are pointer
> types different than void *?

It's been years since I followed that kind of thing, but this has a
few old examples:
   http://www.psgd.org/paul/docs/cstyle/cstyle16.htm
type      Cray-2  Unisys 1100
_____________________________
char*         64      72
int*      64(24)      72
int(*)()      64     576
The 64(24) means there was a compile-time option to pick the pointer
model to use.

When I paid attention there were discussions of how transition
from 32-bit to 64-bit might cause similar issues to the one I
described for Data General Eclipse.  I haven't kept track, but one
variant Google found was:
   http://h71000.www7.hp.com/commercial/c/docs/5492profile_008.html
HP has a #pragma pointer_size which allows 32-bit and 64-bit
pointer models to coexist.  32-bit pointers are automatically
promoted to 64-bit if they are used as specified.  Of course, a
simple answer to that one may be "don't do that, then".  I have no
idea how good answer that is in this case.

(MSDOS pointers (at least in the far/huge memory models) was one
major example of breaking a number of non-standard assumptions
about pointers, though I don't know if it would have broken this
API.  Unless there was a memory model with 32-bit function
pointers and 16-bit data pointers.  Don't remember at the moment.)

Still, I'm not concerned about a particular host at this time.
Currently I live on Linux myself:-) I'm concerned about a
hopefully growing code base which uses an API which enforces
invalid code.  If assumptions the server is making internally
break for a major enough architecture that you want to fix it,
then you can fix that.  But you can't fix code which other people
have been writing for years to this API.  And the same goes for
OpenLDAP of course.  So it's better to fix the API early and hope
any such major portability problem is years in the future.

> On those hosts, how do you find out what the type is?  Or is the
> pointer type different depending on what the base type is?

Yes, in general the program just has to know the type.  Just as it
must know whether it's accessing an int or a long.

> Nulling the pointer after freeing it is not trivial - it has saved
> hundreds of hours of debugging and prevented countless other hard to
> find bugs.  Sure, we could have made a macro to do this instead.

I see.  A macro must evaluate the argument twice though, that can
invite breakage too.  But with a capital-letter name that's
hopefully warning enough.

> The proper fix would be to introduce slapi_ch_free_xxx functions for
> the various types.  This is already done for the most common case,
> slapi_ch_free_string.  Most of the other types have their own free or
> done function e.g.  slapi_entry_free, slapi_sdn_free, etc.

Fine.  I did find a number of things that should be freed with
slapi_ch_free in the doc though.

> We could not remove or change the signature to slapi_ch_free because
> that would break binary compatability, regardless of how incorrect it
> actually is, since it "just works" on all of our supported platforms.

True enough, and the existence of machines where it fails is in
any case no reason to break applications on machines where it works.

But it would make sense to deprecate these functions and state
that they will/may be removed in some future release.
Comment 3 Martin Kosek 2012-01-04 08:51:15 EST
Upstream ticket:
https://fedorahosted.org/389/ticket/155
Comment 4 Hallvard B Furuseth 2012-01-04 11:04:48 EST
Update: I think optimizations could manifest this bug on existing
systems for years now:  Strict type aliasing analysis + link-time
optimization.

I don't use slapi myself, but anyway:

The compiler may optimize on the assumption that code with undefined
behavior is not executed - in particular, that objects are accessed
through the correct type or some exceptions like char:

    void slapi_ch_free(void **pptr) {
        free(*pptr);
        *pptr = NULL;
    }

    void foo(double *dp) {
        if (dp != NULL) {
            slapi_ch_free((void **) &dp);
            assert(dp == NULL);
        }
        ...
    }

The compiler may optimize the assert into assert(0):  slapi_ch_free()
accesses nothing as a double* or char.  So dp cannot have been (validly)
modified.  Doing so yields undefined behavior, which in any case the
compiler can optimize into whatever it finds convenient.

And with link-time optimization, we can't protect code from this
optimization by keeping foo and slapi_ch_free in different source files.

Though I suppose slapi_ch_free() can protect it by reading and zeroing
*pptr as a char array, in a way which the compiler cannot figure out
just reads the pointer as-is and then puts a null pointer there.

(The compiler can also note that dp is NULLed and optimize away the
assert, and in fact gcc does, but it is not required to have that level
of cleverness just beacuse it implements strict aliasing analysys.)
Comment 5 Hallvard B Furuseth 2012-01-04 11:07:40 EST
Argh.  I wrote:

"So dp cannot have been (validly) modified.  Doing so yields undefined behavior, which..."

I mean - "OTOH modifying it invalidly yields undefined behaivor, which..."
Comment 7 Noriko Hosoi 2015-11-19 18:17:16 EST
Closing this bug since we moved to the ticket system:
https://fedorahosted.org/389/ticket/155

Note You need to log in before you can comment on or make changes to this bug.