Bug 996021

Summary: [RFE][python-sdk] Non standard reference management
Product: Red Hat Enterprise Virtualization Manager
Reporter: Alon Bar-Lev <alonbl>
Component: ovirt-engine-sdk
Assignee: Juan Hernández <juan.hernandez>
Status: CLOSED WONTFIX
QA Contact: Shai Revivo <srevivo>
Severity: high
Docs Contact:
Priority: unspecified
Version: unspecified
CC: acathrow, alonbl, bazulay, didi, emesika, iheim, mishka8520, oramraz, oschreib, pstehlik, Rhev-m-bugs, sbonazzo, smizrahi, yeylon
Target Milestone: ---
Keywords: FutureFeature, Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-23 11:32:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  code for reproducing the issue (no flags)
  minimal test for reproducing the issue (no flags)
  destroyer.py (no flags)

Description Alon Bar-Lev 2013-08-12 08:53:54 UTC
OUTLINE

For some reason there is a restriction on how API objects are to be used; this restriction violates standard Python object semantics.

EXPECTED BEHAVIOR

The API should not be a singleton, nor restrict the reference count of its objects.
Two instances of the API object should be completely independent.
It is up to the programmer to decide which instance he wants to keep and when to free resources.

Reference[1].

DISCUSSION

---
Michael Pasternak:

AFAIU ovirt-live is reusing the same pointer to the sdk proxy for a new instance; i do allow this mode, but force releasing the resources used by the previous proxy instance

if you want to keep the old proxy (as is) along with the new one, just create the new sdk proxy at a new pointer (python variable)
---
Alon Bar-Lev:

Nobody of us understands the above statement; Ohad will try to figure this out offline with you.

We want two separate functions within same python script to be able to use the sdk in order to interact with engine without any dependency nor conflict between these two functions.

So if we are doing something wrong it should be fixed, otherwise we should discuss the limitation.
---
Michael Pasternak:

i do not limit anything, there are two scenarios

1)

1. a = API()
2. a = API()

2)

1. a = API()
2. b = API()

we're talking about scenario #1, in order for #1.2 to work first you should call a.disconnect() to release the resources,

the reason for that is to avoid having zombie sessions in the restapi, this is about #1,

about #2, unfortunately i have no control over this scenario and rely on the programming skills of the developer to call .disconnect() when his program exits,

otherwise, if he created an endless session upon sdk construction, it will stay in the restapi forever, consuming resources and being a potential security breach.
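
In code, the distinction looks like this (a sketch only; the import path and constructor arguments below are those of the python SDK of that era, with placeholder credentials):

from ovirtsdk.api import API

a = API(url='https://engine.example/api', username='admin@internal',
        password='secret')
# ... work with the first proxy ...
a.disconnect()   # releases the session held in the restapi
a = API(url='https://engine.example/api', username='admin@internal',
        password='secret')
# ... work with the new proxy ...
a.disconnect()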

hope it's clear now.
---
Alon Bar-Lev:

> hope it's clear now.

sorry... but no...

what is the difference between:

a = object()
b = object()
a.do()
b.do()

and:

a = object()
a.do()
a = object()
a.do()

both should behave exactly the same. how do you know which variable name you are using? from your example, unless you are doing some magic, there should be no difference; python will call __del__ at some point in time for both objects in both examples.
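
This is easy to check with plain python objects (a standalone sketch; CPython's refcounting calls __del__ as soon as the last reference goes away):

class Obj(object):
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print('__del__ of %s' % self.name)

a = Obj('first')
a = Obj('second')   # rebinding drops 'first' to refcount 0: its __del__ runs
b = Obj('third')
del a               # __del__ of 'second'
del b               # __del__ of 'third'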
---
Michael Pasternak:

you are confusing the live instance i've mentioned in #2 with the dead instance i've mentioned in #1,

python will never call any destructor on #2 as it is live from the program's PoV,

while in scenario #1 i do know (yes, i have such an ability) that the old instance dies now, because you assign a new one to the same pointer, so i intentionally force calling disconnect()

of course i could do it implicitly behind the scenes, but i wanted to let programmers have control.
---
Alon Bar-Lev:

I still don't understand why:

a = API()
a.do()
b = API()
a.do()
b.do()

should not work, why it is related to live/dead instances, and why the user should care?

if the user wants to leak resources it is his decision, but regardless, the above should work.
---

[1] http://gerrit.ovirt.org/#/c/17847/

Comment 1 Sandro Bonazzola 2013-08-12 09:52:42 UTC
Created attachment 785630 [details]
code for reproducing the issue

Comment 2 Sandro Bonazzola 2013-08-12 10:04:45 UTC
Created attachment 785632 [details]
minimal test for reproducing the issue

Comment 3 Michael Pasternak 2013-08-12 10:47:39 UTC
(In reply to Alon Bar-Lev from comment #0)
> OUTLINE
> 
> For some reason there is a restriction on how API objects are to be used;
> this restriction violates standard Python object semantics.
> 
> EXPECTED BEHAVIOR
> 
> The API should not be a singleton,

it is not.

> nor restrict the reference count of its objects.

it is not.

> Two instances of the API object should be completely independent.

they are.

> It is up to the programmer to decide which instance he wants to keep and
> when to free resources.

this is the reason for [*] (see below)

> 
> Reference[1].
> 
> DISCUSSION
> 
> ---
> Michael Pasternak:
> 
> AFAIU ovirt-live is reusing the same pointer to the sdk proxy for a new
> instance; i do allow this mode, but force releasing the resources used by
> the previous proxy instance
> 
> if you want to keep the old proxy (as is) along with the new one, just
> create the new sdk proxy at a new pointer (python variable)
> ---
> Alon Bar-Lev:
> 
> Nobody of us understands the above statement; Ohad will try to figure this
> out offline with you.
> 
> We want two separate functions within same python script to be able to use
> the sdk in order to interact with engine without any dependency nor conflict
> between these two functions.
> 
> So if we are doing something wrong it should be fixed, otherwise we should
> discuss the limitation.
> ---
> Michael Pasternak:
> 
> i do not limit anything, there are two scenarios
> 
> 1)
> 
> 1. a = API()
> 2. a = API()
> 
> 2)
> 
> 1. a = API()
> 2. b = API()
> 
> we're talking about scenario #1, in order for #1.2 to work first you should
> call a.disconnect() to release the resources,
> 
> the reason for that is to avoid having zombie sessions in the restapi, this
> is about #1,
> 
> about #2, unfortunately i have no control over this scenario and rely on the
> programming skills of the developer to call .disconnect() when his program
> exits,
> 
> otherwise, if he created an endless session upon sdk construction, it will
> stay in the restapi forever, consuming resources and being a potential security breach.
> 
> hope it's clear now.
> ---
> Alon Bar-Lev:
> 
> > hope it's clear now.
> 
> sorry... but no...
> 
> what is the difference between:
> 
> a = object()
> b = object()
> a.do()
> b.do()
> 
> and:
> 
> a = object()
> a.do()
> a = object()
> a.do()
> 
> both should behave exactly the same. how do you know which variable name
> you are using? from your example, unless you are doing some magic, there
> should be no difference; python will call __del__ at some point in time for
> both objects in both examples.
> ---
> Michael Pasternak:
> 
> you are confusing the live instance i've mentioned in #2 with the dead
> instance i've mentioned in #1,
> 
> python will never call any destructor on #2 as it is live from the program's PoV,
> 
> while in scenario #1 i do know (yes, i have such an ability) that the old
> instance dies now, because you assign a new one to the same pointer, so i
> intentionally force calling disconnect()
> 
> of course i could do it implicitly behind the scenes, but i wanted to let
> programmers have control.

[*] ^^^

> ---
> Alon Bar-Lev:
> 
> I still don't understand why:
> 
> a = API()
> a.do()
> b = API()
> a.do()
> b.do()
> 
> should not work, why it is related to live/dead instances, and why the
> user should care?
> 
> if the user wants to leak resources it is his decision, but regardless, the
> above should work.

Alon, it ^ is working, those are completely independent python objects, and
behave as such,

i think you misunderstood the discussion, only scenario that has to be
handled is:

[1] a = API()
[2] a = API()

where two python objects are assigned to the same variable, and
to make it possible, first you need to release the resources at [1]
by calling a.disconnect(),

python garbage collection of the first instance doesn't help, as it releases
the resource/s in python only, while the sdk consumes resources at the
engine/restapi as well (like the authentication session, which may stay for good
if not released, and this is a much bigger leak than client-local resource/s).

triggering engine/restapi resource release in API.__del__ is not good, because in
this case python's cycle GC won't handle this object and it will be released
only when its ref-count == 0, which is dangerous when it is cross-referenced,
i.e. it may never be collected at all.
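
concretely, on python 2 (the interpreter of the time) a cycle whose objects
define __del__ is never freed automatically - it is parked in gc.garbage
instead (standalone sketch):

import gc

class Node(object):
    def __del__(self):
        print('collected')   # never printed for the cycle below on python 2

a = Node()
b = Node()
a.peer = b
b.peer = a      # reference cycle between two objects that define __del__
del a, b
gc.collect()
print(gc.garbage)   # on python 2: both Node objects, uncollectable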

therefore, as owners of both sides (server/client), we may force clients
to release the resources (when we know for sure that the object is dying) based on
constraints we have - this is business logic and not a python thing.

> ---
> 
> [1] http://gerrit.ovirt.org/#/c/17847/

Comment 4 Michael Pasternak 2013-08-12 10:52:15 UTC
besides that, "releasing resources" upon exit is a well understood common practice.

Comment 5 Alon Bar-Lev 2013-08-12 10:59:10 UTC
(In reply to Michael Pasternak from comment #3) 
> Alon, it ^ is working, those are completely independent python objects, and
> behave as such,
>
> i think you misunderstood the discussion, only scenario that has to be
> handled is:
> 
> [1] a = API()
> [2] a = API()
> 
> where two python objects are assigned to the same variable, and
> to make it possible, first you need to release the resources at [1]
> by calling a.disconnect(),

a variable in python is a complete void.
The above example is creating two objects and reusing the variable while marking the first at refcount 0.
 
> python garbage collection of the first instance doesn't help, as it releases
> the resource/s in python only, while the sdk consumes resources at the
> engine/restapi as well (like the authentication session, which may stay for good
> if not released, and this is a much bigger leak than client-local resource/s).

I do not understand... do you claim that python leaks resources by default behavior?

The following example of:
---
a = API()
a = API()
---

Behaves exactly as:
---
a = API()
b = a
a = API()
---

Or:
---
a = API()
b = API()
a = b
---
 
> triggering engine/restapi resource release in API.__del__ is not good, because
> in this case python's cycle GC won't handle this object and it will be released
> only when its ref-count == 0, which is dangerous when it is cross-referenced,
> i.e. it may never be collected at all.

I do not follow... do you mean that python garbage collection is faulty for any sequence? I really don't understand.... of course objects are released when they are at refcount == 0.
What do you mean by cross reference?
Why is the ovirt API so special that it needs a special behavior?

> therefore, as owners of both sides (server/client), we may force clients
> to release the resources (when we know for sure that the object is dying)
> based on constraints we have - this is business logic and not a python thing.

This is the bug, and should be fixed.
The garbage collection should work properly and as expected within the framework of language.
We should not force anything, unless explicitly required by language (example: C API).

Thanks,
Alon

Comment 6 Alon Bar-Lev 2013-08-12 11:03:56 UTC
(In reply to Michael Pasternak from comment #4)
> besides that, "releasing resources" upon exist is a well understood common
> practice.

Please show me that common practice, and define what "exit" is...

There are three standard mechanisms in python to release resources:

1. with statement:

with API(....) as api:
    # do something with api object
    # this api should be released at end of with

2. scope:

def f():
    api = API(....)
    # api object will be released at end of scope
    # or afterwards when python finds time

3. explicit:

def f():
    api = API(....)
    del api

And there is one that is not standard and not best practice:

def f():
    api = API(...)
    try:
        pass
    finally:
        api.free()

The use of this model is discouraged and error prone; the chance of having a resource leak because of the non-standard model is higher than with any of the above.
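
For illustration, this is roughly what first-class 'with' support could look like on the SDK side (hypothetical sketch; the SDK did not implement the context-manager protocol at the time of this comment):

class API(object):
    def __init__(self):
        pass                   # stands in for the real connect logic
    def disconnect(self):
        print('disconnected')
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        self.disconnect()      # always runs, even if the body raised
        return False           # do not swallow exceptions

with API() as api:
    pass                       # prints 'disconnected' on exit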

Comment 7 Michael Pasternak 2013-08-12 11:53:14 UTC
(In reply to Alon Bar-Lev from comment #5)
> (In reply to Michael Pasternak from comment #3) 
> > Alon, it ^ is working, those are completely independent python objects, and
> > behave as such,
> >
> > i think you misunderstood the discussion, only scenario that has to be
> > handled is:
> > 
> > [1] a = API()
> > [2] a = API()
> > 
> > where two python objects are assigned to the same variable, and
> > to make it possible, first you need to release the resources at [1]
> > by calling a.disconnect(),
> 
> a variable in python is a complete void.
> The above example is creating two objects and reusing the variable while
> marking the first at refcount 0.
>  
> > python garbage collection of the first instance doesn't help, as it releases
> > the resource/s in python only, while the sdk consumes resources at the
> > engine/restapi as well (like the authentication session, which may stay for
> > good if not released, and this is a much bigger leak than client-local resource/s).
> 
> I do not understand... do you claim that python leaks resources by default
> behavior?

i *did not* say that! what i'm saying is that i don't care about the python object,
python GC will clean it; what i care about is the session acquired on the server
side, and the only way to release it is to perform an http query to the server
asking to release the session that is no longer used.

(there is an option in the backend/api to create an endless session; an ordinary session
dies when its idle TTL expires, and the mentioned functionality addresses the former)

> 
> The following example of:
> ---
> a = API()
> a = API()
> ---
> 
> Behaves exactly as:
> ---
> a = API()
> b = a
> a = API()
> ---
> 
> Or:
> ---
> a = API()
> b = API()
> a = b

how is this related?

> ---
>  
> > triggering engine/restapi resource release in API.__del__ is not good, because
> > in this case python's cycle GC won't handle this object and it will be released
> > only when its ref-count == 0, which is dangerous when it is cross-referenced,
> > i.e. it may never be collected at all.
> 
> I do not follow... do you mean that python garbage collection is faulty for
> any sequence? I really don't understand.... of course objects are released
> when they are at refcount == 0.
> What do you mean by cross reference?

a cycle reference, where refcount != 0 (forever). all i'm saying is that i can't
put the .disconnect() call into API.__del__, because it will change the way the object
is collected in python, so it has to be explicitly called.

> Why is the ovirt API so special that it needs a special behavior?
> 
> > therefore, as owners of both sides (server/client), we may force clients
> > to release the resources (when we know for sure that the object is dying)
> > based on constraints we have - this is business logic and not a python thing.
> 
> This is the bug, and should be fixed.
> The garbage collection should work properly and as expected within the
> framework of language.

again, it has no impact on the standard python object life-cycle

> We should not force anything, unless explicitly required by language
> (example: C API).

or by business logic.

> 
> Thanks,
> Alon

Comment 8 Alon Bar-Lev 2013-08-12 11:58:25 UTC
I do not understand.... sorry I don't.

There is expected behavior of object life cycle management including python gc.

As far as I see sdk violates this.

And I ask this to be fixed.

Thanks.

Comment 9 Michael Pasternak 2013-08-12 12:10:21 UTC
(In reply to Alon Bar-Lev from comment #8)
> I do not understand.... sorry I don't.

see this as a business-logic constraint, like authentication or putting a host
into maintenance before removing it; it's pretty much the same.

> 
> There is expected behavior of object life cycle management including python
> gc.
> 
> As far as I see sdk violates this.

it doesn't; the python object life cycle remains the same. all i'm asking is to
release the resources acquired on the server side by calling the method that does that,

but even after calling .disconnect() on the API instance, actual python 
object remains in memory and will be collected by GC as any other.

> 
> And I ask this to be fixed.
> 
> Thanks.

Comment 10 Michael Pasternak 2013-08-13 10:35:46 UTC
Alon,

Hope it's clear now. the mentioned functionality is *not* about python nor about "reference" management; it's just that i block construction of the sdk proxy object in the API()
constructor in a very specific use-case, for the sake of the business logic mentioned above,

i.e. this is not "Non standard reference management", not even "reference
management" at all; there is no such thing in the sdk.

the sdk behaves as any other python object, collected by python GC when needed, etc.;
therefore i'm about to close this bug,

please let me know if you need any more details.

thanks.

Comment 11 Michael Pasternak 2013-08-14 08:10:48 UTC
Alon,

Despite the fact that i disagree that this is a bug or something that has to be
done, i found an "easy" solution to perform cleanup/disconnect silently [1]
to ease your (specific [2]) use-case,

but please note, in any other scenario (not [2]) you still have to call
.disconnect() on the sdk when you no longer need it.

[1] http://gerrit.ovirt.org/#/c/18059/
[2] 

engine_api = API()
engine_api = API()
engine_api = API()
...

Comment 12 Alon Bar-Lev 2013-08-14 09:07:59 UTC
Thank you.

In constructor you added:

# Implicitly disconnect and perform cleanup
# when detected instance of the SDK proxy with
# ref-count == 0
if context.manager.has_key(self.__id):
    self.disconnect()

This implies that there is some state saved between references of separate objects, which causes dependencies between objects.

What you have done is hiding the problem by explicitly disconnecting at object creation, instead of having a unique id for each construction and cleaning up in the destructor.

Now that I can read the code snippet, the problem is that you are using the python id() function:
---
 id(object)

    Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
---

Notice: "Two objects with non-overlapping lifetimes may have the same id() value."

This means that you cannot use id to store object references.

I suggest you use uuid instead, something like uuid.uuid4().int if you want to keep the integer use.
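
That is, roughly (sketch):

import uuid

class API(object):
    def __init__(self):
        # unique per construction and never reused, unlike id(self)
        self.__id = uuid.uuid4().int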

And add destructor to the class to cleanup resources.

On another thought, if you clean up in the object destructor, then you probably have a unique id at that point anyway, as the object is still alive.

Thank you.

Comment 13 Michael Pasternak 2013-08-14 09:34:43 UTC
(In reply to Alon Bar-Lev from comment #12)
> Thank you.
> 
> In constructor you added:
> 
> # Implicitly disconnect and perform cleanup
> # when detected instance of the SDK proxy with
> # ref-count == 0
> if context.manager.has_key(self.__id):
>     self.disconnect()
> 
> This implies that there is some state saved between references of separate
> objects, which causes dependencies between objects.

not a dependency, but knowledge; explanations below ...

> 
> What you have done is hiding the problem by explicitly disconnecting at object
> creation, instead of having a unique id for each construction and cleaning up
> in the destructor.

this is not correct; i'm using a simple but effective memory monitoring
mechanism, explanations below.

> 
> Now that I can read the code snippet, the problem is that you are using the
> python id() function:
> ---
>  id(object)
> 
>     Return the “identity” of an object. This is an integer (or long integer)
> which is guaranteed to be unique and constant for this object during its
> lifetime. Two objects with non-overlapping lifetimes may have the same id()
> value.
> ---
> 
> Notice: "Two objects with non-overlapping lifetimes may have the same id()
> value."

this is exactly what i need: this is a mechanism to recognize that the sdk
has a not-disconnected context, and this is how i know that the old instance is
dead - otherwise i wouldn't get the same memory allocation ...

> 
> This means that you cannot use id to store object references.

of course i can, - objects have non-overlapping lifetimes, i.e. i
will never have two live objects with the same id, this is guaranteed ...

> 
> I suggest you use uuid instead, something like uuid.uuid4().int if you want
> to keep the integer use.
> 
> And add destructor to the class to cleanup resources.

i cannot use a destructor for that; as i mentioned, overriding the destructor
in python is very dangerous as it potentially leaks resources,

e.g. it will not be collected by GC as a regular object, but only when
refcount=0, and if there is a cycle-reference to the sdk object, its refcount
will never be 0, i.e. it won't be collected at all and disconnect() will
never be called.

> 
> On another thought, if you clean up in the object destructor, then you
> probably have a unique id at that point anyway, as the object is still alive.

as explained, i always have a unique id, as there are no two live objects with
the same id at the same time; if i had the id in the cache at the time of
construction, it means that proper cleanup did not take place, and this is why
the error gets thrown, and this is why you've filed this BZ (that you cannot
create a new sdk before the old one was cleared)

the patch mentioned above now does implicit cache cleanup with disconnect()

> 
> Thank you.

Comment 14 Alon Bar-Lev 2013-08-14 10:07:05 UTC
> e.g. it will not be collected by GC as a regular object, but only when
> refcount=0, and if there is a cycle-reference to the sdk object, its refcount
> will never be 0, i.e. it won't be collected at all and disconnect() will
> never be called.

I still do not understand but please consider using weak references[1].

[1] http://docs.python.org/2/library/weakref.html
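
A weak reference can carry a callback that fires when the referent is collected, without giving the object a __del__ (standalone sketch):

import weakref

class Proxy(object):
    pass

def cleanup(ref):
    # invoked by the interpreter once the Proxy instance is collected
    print('releasing server-side session')

p = Proxy()
_watcher = weakref.ref(p, cleanup)   # the weakref itself must stay alive
del p                                # prints 'releasing server-side session'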

Comment 15 Michael Pasternak 2013-08-14 10:19:03 UTC
(In reply to Alon Bar-Lev from comment #14)
> > e.g. it will not be collected by GC as a regular object, but only when
> > refcount=0, and if there is a cycle-reference to the sdk object, its refcount
> > will never be 0, i.e. it won't be collected at all and disconnect() will
> > never be called.
> 
> I still do not understand but please consider using weak references[1].
> 
> [1] http://docs.python.org/2/library/weakref.html

i would, but unfortunately i have no control over client's code :-),

btw just FYI, in my TODO list at the SDK wiki i have a task to implement
support for __exit__ so you could use a 'with' to manage the instance lifetime;
it's better than __del__ cause it does not involve GC, but its disadvantage
is that developers simply may not use it, - then it's pointless,

anyway, i guess now i'll prioritize it for you, and from the next drop you
will be able to create instances in a 'with' like you do with files.

Comment 16 Michael Pasternak 2013-08-14 10:23:54 UTC
(In reply to Michael Pasternak from comment #15)
> (In reply to Alon Bar-Lev from comment #14)
> > > e.g. it will not be collected by GC as a regular object, but only when
> > > refcount=0, and if there is a cycle-reference to the sdk object, its refcount
> > > will never be 0, i.e. it won't be collected at all and disconnect() will
> > > never be called.
> > 
> > I still do not understand but please consider using weak references[1].
> > 
> > [1] http://docs.python.org/2/library/weakref.html
> 
> i would, but unfortunately i have no control over client's code :-),
> 

s/client's/user's/

Comment 17 Alon Bar-Lev 2013-08-14 10:29:28 UTC
I am sorry... but I do not understand yet again.

with statement is not good for long living objects:

class A(object):

    def __init__(self):
        self._api = ...

    def method1(self):
        self._api.do1()

    def method2(self):
        self._api.do2()

Using the with statement to free resources at each method is wrong, as it actually takes more resources because you actually perform an http call.

We need destructor use to cleanup resources and not rely on future construction of an object to clean up a previous instance.

Of course we do not control user's code, but we do need to cleanup resources when object dies.

I do not agree that this issue is fixed.

Comment 18 Michael Pasternak 2013-08-14 10:40:27 UTC
(In reply to Alon Bar-Lev from comment #17)
> I am sorry... but I do not understand yet again.
> 
> with statement is not good for long living objects:

why is that? it will only call __exit__ at the end of the scope ...;
you do not acquire any resources like streams/connections etc.

> 
> class A(object):
> 
>     def __init__(self):
>         self._api = ...
> 
>     def method1(self):
>         self._api.do1()
> 
>     def method2(self):
>         self._api.do2()
> 
> Using the with statement to free resources at each method is wrong, as it
> actually takes more resources because you actually perform an http call.

? what resource? the http call is managed out of the scope of 'with'; 'with'
only ensures that API.__exit__ gets called at the end of the scope.

if it does not fit your way of object definition (in the constructor), don't
use it; it was just an example of another life-cycle management approach.

> 
> We need destructor use to cleanup resources and not rely on future
> construction of an object to clean up a previous instance.
> 
> Of course we do not control user's code, but we do need to cleanup resources
> when object dies.

you are again ignoring my arguments, - let's stay with different opinions on this.

> 
> I do not agree that this issue is fixed.

Comment 21 Michael Pasternak 2013-08-15 11:10:29 UTC
(In reply to Alon Bar-Lev from comment #17)
> We need destructor use to cleanup resources

destructors are not designed for that; they are designed to be used by the "garbage
collector", and the "garbage collector" is designed to free memory, not to close
sockets etc., - these are time-consuming/heavy operations,

just imagine GC stopping the app for the collect/copy cycle and closing a
pool of N connections/sockets where some of them can throw exceptions/be stuck
on i/o, etc...

Comment 22 Alon Bar-Lev 2013-08-15 11:24:42 UTC
(In reply to Michael Pasternak from comment #21)
> (In reply to Alon Bar-Lev from comment #17)
> > We need destructor use to cleanup resources
> 
> destructors are not designed for that; they are designed to be used by the
> "garbage collector", and the "garbage collector" is designed to free memory,
> not to close sockets etc., - these are time-consuming/heavy operations,
> 
> just imagine GC stopping the app for the collect/copy cycle and closing a
> pool of N connections/sockets where some of them can throw exceptions/be stuck
> on i/o, etc...

I am trying to imagine... it is very hard for me because of my lack of imagination I guess, and the inability to understand why we reinvent the wheel... and prefer a resource leak for the sake of arguments I cannot process.

From Python-2.7.3 source:

./Lib/socket.py:

    def __del__(self):
        try:
            self.close()
        except:
            # close() may fail if __init__ didn't complete
            pass

./Lib/tarfile.py:

    def __del__(self):
        if hasattr(self, "closed") and not self.closed:
            self.close()


./Lib/tempfile.py:

        def __del__(self):
            self.close()

./Lib/urllib.py:

    def __del__(self):
        self.close()

And more.

Comment 23 Michael Pasternak 2013-08-15 14:38:06 UTC
(In reply to Alon Bar-Lev from comment #22)
> (In reply to Michael Pasternak from comment #21)
> > (In reply to Alon Bar-Lev from comment #17)
> > > We need destructor use to cleanup resources
> > 
> > destructors are not designed for that; they are designed to be used by the
> > "garbage collector", and the "garbage collector" is designed to free memory,
> > not to close sockets etc., - these are time-consuming/heavy operations,
> > 
> > just imagine GC stopping the app for the collect/copy cycle and closing a
> > pool of N connections/sockets where some of them can throw exceptions/be stuck
> > on i/o, etc...
> 
> I am trying to imagine... it is very hard for me because of my lack of
> imagination I guess,

Alon,

i know who you are, i saw your code; without any doubt you're an
incredibly good architect and developer, and the last thing i'd like to
do is offend you,

but this is a technical discussion where we discuss/argue/and learn from
each other; in my view a healthy discussion should look like this.

i apologize if you've got me wrong.

> and the inability to understand why we reinvent the wheel... and
> prefer a resource leak for the sake of arguments I cannot process.
> 
> From Python-2.7.3 source:
> 
> ./Lib/socket.py:
> 
>     def __del__(self):
>         try:
>             self.close()
>         except:
>             # close() may fail if __init__ didn't complete
>             pass
> 
> ./Lib/tarfile.py:
> 
>     def __del__(self):
>         if hasattr(self,"closed") and not self.closed:
>             self.close()
> 
> 
> ./Lib/tempfile.py:
> 
>         def __del__(self):
>             self.close()
> 
> ./Lib/urllib.py:
> 
>     def __del__(self):
>         self.close()
> 
> And more.

unfortunately the world is not perfect; such code probably will be here and
there, but it is wrong (there are tons of discussions around this) as it
sacrifices the performance/stability of all python applications running on a same
box to cover for developers that do not follow documentation/performance
considerations/design patterns/etc., which leads more than once to bugs like [1],

also you have no guarantee __del__ will ever be executed, so best practice
is still having the object defined in a 'with' clause where __exit__ is triggered
at the end of the scope.

[1] http://mail.python.org/pipermail/python-bugs-list/2004-February/022210.html
    http://bugs.python.org/issue9543
    http://bugs.jython.org/issue544891

Comment 24 Alon Bar-Lev 2013-08-15 14:51:42 UTC
(In reply to Michael Pasternak from comment #23)
> unfortunately the world is not perfect; such code probably will be here and
> there, but it is wrong (there are tons of discussions around this)

"wrong" in that regard is subjective; you (and maybe others) think it is wrong, while other people think it is right.

the options to deal with unreleased object resources:

1. do nothing, allow resource leak, blame programmer or anybody else, but not the library.

2. as there is a 99% chance the destructor will be called and release resources, so in most cases there will be no resource leak.

> also you have no guarantee __del__ will ever be executed, so best practice
> is still having the object defined in a 'with' clause where __exit__ is triggered
> at the end of the scope. 

Again, the with statement is not something that is good for all cases, see comment#17.

Your disconnect is not light:
    proxy.request(
        method='GET',
        url='/api',
        last=True
    )

So multiple __enter__, __exit__ will produce huge overhead.

---

Now, lets resume the original bug.

What happened is that you cached objects based on their id(), and as an object was deleted you got a duplicate id() for a new object.

The fact you got a duplicate id() means that the destructor was called; otherwise there would be no duplicate id. 

Your solution was to clean up the cache when an object is created, so you have a 0.001% chance of releasing resources of previous dead objects; as soon as there is one more unrelated object creation you miss that duplication and hold the resource forever.

I would like to see resources freed in the destructor so you have a 99% chance of releasing resources.

Comment 25 Alon Bar-Lev 2013-08-15 22:24:20 UTC
Created attachment 787086 [details]
destroyer.py

Just for amusement... as in this case a regular destructor can provide a solution.

I've written a small class to allow destructor logic even in cyclic reference model.

Tested on python-2.7 and python-3.2.

Maybe you find this easier to integrate.

Enjoy.
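
(destroyer.py itself is not reproduced in this report; the usual shape of such
a helper is a weakref callback that captures the cleanup state it needs outside
the watched object, so it fires even when the instance sits in a reference
cycle. The following is a sketch of that idea, not the attached code:)

import gc
import weakref

class Destroyer(object):
    # Run a callback with pre-captured state once the target is collected.
    _refs = set()   # keeps the weakrefs themselves alive

    @classmethod
    def watch(cls, obj, callback, *args):
        def _fire(ref, callback=callback, args=args):
            cls._refs.discard(ref)
            callback(*args)
        cls._refs.add(weakref.ref(obj, _fire))

class Session(object):
    pass

def release(session_id):
    print('disconnecting session %s' % session_id)

s = Session()
Destroyer.watch(s, release, 42)   # captured state must not reference s itself
s.loop = s                        # a cycle: a __del__ would block collection
del s
gc.collect()                      # the callback fires once the cycle is freed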

Comment 26 Michael Pasternak 2013-08-25 14:54:26 UTC
(In reply to Alon Bar-Lev from comment #24)
> (In reply to Michael Pasternak from comment #23)
> > unfortunately the world is not perfect; such code probably will be here and
> > there, but it is wrong (there are tons of discussions around this)
> 
> "wrong" in that regard is subjective; you (and maybe others) think it is wrong,
> while other people think it is right.

"think is right" is not a technical argument, people used to "think" that
world is rests on the back of elephants, below are strict technical arguments
proving that implementing objects destruction that involves other object/s
initiation in destructor is conceptually wrong.

> 
> the options to deal with unreleased object resources:
> 
> 1. do nothing, allow resource leak, blame programmer or anybody else, but
> not the library.
> 
> 2. as there is a 99% chance the destructor will be called and release resources,
> so in most cases there will be no resource leak.

1. this is not correct; you have no guarantee that the destructor will ever be
called, see the python spec [1],

btw 'del x' also doesn't directly call x.__del__(); it gets called only when x's
ref-count is 0, so you can't just do 'del engine_api' in your code (as is
done in the attached reproducer) - see the sketch after the reference below,

2. "It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits." [1],

3. when __del__() is invoked in response to a module being deleted (e.g., when execution of the program is done), other globals referenced by the __del__() method may already have been deleted or be in the process of being torn down (e.g. the import machinery shutting down), so the sdk can hit NullPointerException/s in
__del__() because the cache may already be deleted.

4. any live variable holding a reference to one of the root-collections
defined in the sdk API proxy will prevent API.__del__() from being called, because
the ref-count will never go to 0.

5. there are other situations that may prevent the reference count of an object from going to zero, including: circular references between objects (e.g., a doubly-linked list or a tree data structure with parent and child pointers); a reference to the object on the stack frame of a function that caught an exception (the traceback stored in sys.exc_traceback keeps the stack frame alive); or a reference to the object on the stack frame that raised an unhandled exception in interactive mode (the traceback stored in sys.last_traceback keeps the stack frame alive).


[1] http://docs.python.org/2/reference/datamodel.html
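
a standalone illustration of the 'del x' point above: del removes one
reference, and __del__ waits for the last one (CPython):

class R(object):
    def __del__(self):
        print('__del__ called')

x = R()
y = x        # a second reference
del x        # nothing printed: the refcount is still 1
del y        # now '__del__ called' is printed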

> 
> > also you have no guarantee __del__ will ever be executed, so best practice
> > is still having the object defined in a 'with' clause where __exit__ is triggered
> > at the end of the scope. 
> 
> Again, the with statement is not something that is good for all cases, see
> comment#17.
> 
> Your disconnect is not light:

not sure i understand what "light" means

>     proxy.request(
>         method='GET',
>         url='/api',
>         last=True
>     )

this is exactly what i've mentioned before: this .disconnect() code performs an
http request to the server to clean resources on the server side, so it means
calling .disconnect() in the destructor will make the garbage collector
create/initiate new resources (a connection that was collected/idle), while by
design the GC should delete resources ....

> 
> So multiple __enter__, __exit__ will produce huge overhead.

not sure i understand; what "multiple __enter__, __exit__" are you referring to?
x=API() and x.disconnect() will happen only once per 'with' clause,

of course for N 'with' clauses it will happen N times, but this is true
for any other impl as well.

> 
> ---
> 
> Now, lets resume the original bug.
> 
> What happened is that you cached objects based on their id(), and as an object
> was deleted you got a duplicate id() for a new object.

i didn't get any duplicate id; i just got valuable information at the time
of construction of the new SDK proxy, e.g. that i have not-cleaned content for
the old instance, and because i have this id of the dead sdk instance, i
can locate the cache for this instance (server details/credentials/etc.),
perform real cleanup for this instance against the remote server it was connected
to, and then drop this cache and put a new one there for the new instance.

> 
> The fact you got a duplicate id() means that the destructor was called;
> otherwise there would be no duplicate id.

it's true, but only for this very specific use-case, - when you assign a new
instance to the old variable; in general i cannot rely on the destructor
because of the reasons mentioned above, - best practice is still invoking
.disconnect() manually.

> 
> Your solution was to clean up the cache when an object is created, so you have
> a 0.001% chance of releasing resources of previous dead objects; as soon as

because i'm able to recognize with 100% probability *only one* scenario,
where i know 100% that the old instance of the object is dead (otherwise i
wouldn't get the same memory allocation)

> there is one more unrelated object creation you miss that duplication and
> hold the resource forever.

because you don't clean up after yourself in your code [2] ...

[2] http://gerrit.ovirt.org/#/c/17847/

> 
> I would like to see resources freed in the destructor so you have a 99% chance
> of releasing resources

no, you have 100% if you implement a cleanup procedure in your code and do not rely
on something that may never happen.

Comment 27 Michael Pasternak 2013-08-25 14:57:29 UTC
(In reply to Alon Bar-Lev from comment #25)
> Created attachment 787086 [details]
> destroyer.py
> 
> Just for amusement... as in this case a regular destructor can provide a
> solution.
> 
> I've written a small class to allow destructor logic even in cyclic
> reference model.
> 
> Tested on python-2.7 and python-3.2.
> 
> Maybe you find this easier to integrate.
> 
> Enjoy.

Thanks Alon,

But it's not enough,  because of:

1. #4, #5 in Comment 26.
2. circular references which are garbage are detected when the option cycle detector is enabled (it’s on by default), but can only be cleaned up if there are no Python-level __del__() methods involved, so it's kinda reinventing the wheel.

Comment 28 Alon Bar-Lev 2013-08-25 15:12:23 UTC
You repeat yourself over and over protecting your approach of leaking resources at all costs at any time, insisting that this approach is correct only because there is a chance of this happening.

I won't get into the argument over and over.

But the basic fact is that one object instance cleans resources allocated by another object.

This is not standard behavior.

If you think that sdk objects should leak resources, then do not use the id() method to generate ids, use uuid for each object so you do not have conflict between instances, see comment#12.

(In reply to Michael Pasternak from comment #27)
> But it's not enough,  because of:
> 1. #4, #5 in Comment 26.
> 2. circular references which are garbage are detected when the option cycle
> detector is enabled (it’s on by default), but can only be cleaned up if
> there are no Python-level __del__() methods involved, so it's kinda
> reinventing the wheel.

Have you actually run the source and checked before responding?

Comment 29 Michael Pasternak 2013-08-25 15:39:56 UTC
Hi Alon,

(In reply to Alon Bar-Lev from comment #28)
> You repeat yourself over and over protecting your approach of leaking
> resources at all costs at any time, insisting that this approach is correct
> only because there is a chance of this happening.

i'm not insisting on leaking resources; i'm insisting that relying on the destructor
to do the cleanup for you is wrong because it may never be called, and that
to make it 100% work you have no other choice than calling disconnect() in a
'finally' or using 'with' where it will be called implicitly.

> 
> I won't get into the argument over and over.

what arguments? maybe i missed something, but i didn't see any;
if you can "defeat" #4, #5 from Comment 26, i promise to consider using the
destructor.

> 
> But the basic fact is that one object instance cleans resources allocated by
> another object.
> 
> This is not standard behavior.
> 
> If you think that sdk objects should leak resources, then do not use the
> id() method to generate ids, use uuid for each object so you do not have
> conflict between instances, see comment#12.

alon, using id() is not leaking any resources; it only gives the ability
to locate the content of a dead instance and clean the resources on both server
and client sides that it used to use.

> 
> (In reply to Michael Pasternak from comment #27)
> > But it's not enough,  because of:
> > 1. #4, #5 in Comment 26.
> > 2. circular references which are garbage are detected when the option cycle
> > detector is enabled (it’s on by default), but can only be cleaned up if
> > there are no Python-level __del__() methods involved, so it's kinda
> > reinventing the wheel.
> 
> Have you actually run the source and checked before responding?

i admit i didn't, but i guess it cannot solve #4, #5 from Comment 26, also
how can you make sure that SDK instance created this way is not collected
during your application lifetime when you still need it?

thanks.

Comment 30 Alon Bar-Lev 2013-08-25 16:08:39 UTC
This is getting ridiculous.
Stop changing the subject until you understand what needs to be done.

(In reply to Michael Pasternak from comment #29)
> Hi Alon,
> 
> (In reply to Alon Bar-Lev from comment #28)
> > You repeat yourself over and over protecting your approach of leaking
> > resources at all costs at any time, insisting that this approach is correct
> > only because there is a chance of this happening.
> 
> i'm not insisting on leaking resources; i'm insisting that relying on the destructor
> to do the cleanup for you is wrong because it may never be called, and that
> to make it 100% work you have no other choice than calling disconnect() in a
> 'finally' or using 'with' where it will be called implicitly.

Then you need to connect/disconnect every time you use the sdk object, which is unacceptable.
 
> > 
> > I won't get into the argument over and over.
> 
> what arguments? maybe i missed something, but i didn't see any;
> if you can "defeat" #4, #5 from Comment 26, i promise to consider using the
> destructor.

I wrote that I won't.
 
> > 
> > But the basic fact is that one object instance cleans resources allocated by
> > another object.
> > 
> > This is not standard behavior.
> > 
> > If you think that sdk objects should leak resources, then do not use the
> > id() method to generate ids, use uuid for each object so you do not have
> > conflict between instances, see comment#12.
> 
> alon, using id() is not leaking any resources; it only gives the ability
> to locate the content of a dead instance and clean the resources on both server
> and client sides that it used to use.

The use of id() forces a dependency between two independent instances.

This dependency should not exist, and it caused the original bug.

The solution applied in comment#11 is incorrect: instead of solving the problem with a method that eliminates the dependencies between instances, you created even more dependency.

> > 
> > (In reply to Michael Pasternak from comment #27)
> > > But it's not enough,  because of:
> > > 1. #4, #5 in Comment 26.
> > > 2. circular references which are garbage are detected when the option cycle
> > > detector is enabled (it’s on by default), but can only be cleaned up if
> > > there are no Python-level __del__() methods involved, so it's kinda
> > > reinventing the wheel.
> > 
> > Have you actually run the source and checked before responding?
> 
> i admit i didn't, but i guess it cannot solve #4, #5 from Comment 26, also
> how can you make sure that SDK instance created this way is not collected
> during your application lifetime when you still need it?

Please don't guess.

Comment 31 Michael Pasternak 2013-08-25 20:15:34 UTC
Alon,

(In reply to Alon Bar-Lev from comment #30)
> This is getting ridiculous.
> Stop changing the subject until you understand what needs to be done.

i completely understand what you're asking for:

1. stop using object memory address as object id
2. always cleanup object at destructor

i disagree to both because:

1. this is done intentionally, for being able to recognize dead instances
of the sdk object that did not release their resources; this happens only
when you do:

a = API()
a = API()

the exact scenario mentioned in the reproducer, and this is the only use-case that i'm
willing to support for cleanup, as only in this case do i know for sure that the old
object has refcount=0 and was collected by GC (i.e. i'm reusing its memory now),

before, it was done by throwing an error (mentioned in the reproducer), but in Comment 11 i've posted a patch that cleans up the resources silently,

in all other use-cases i expect the developer to trigger the .disconnect() method,
which does the same.

2. there are tons of comments in this bug explaining why doing this
would be an abuse of the GC; the last one is at the end of this comment.

> 
> (In reply to Michael Pasternak from comment #29)
> > Hi Alon,
> > 
> > (In reply to Alon Bar-Lev from comment #28)
> > > You repeat yourself over and over protecting your approach of leaking
> > > resources at all costs at any time, insisting that this approach is correct
> > > only because there is a chance of this happening.
> > 
> > i'm not insisting on leaking resources; i'm insisting that relying on the destructor
> > to do the cleanup for you is wrong because it may never be called, and that
> > to make it 100% work you have no other choice than calling disconnect() in a
> > 'finally' or using 'with' where it will be called implicitly.
> 
> Then you need to connect/disconnect every time you use the sdk object, which
> is unacceptable.

sorry, but this is not correct; connect() happens when the sdk instance is created, and disconnect() happens only once, when you leave the 'with' scope, i.e.
your sdk instance is no longer needed.

>  
> > > 
> > > I won't get into the argument over and over.
> > 
> > what arguments? maybe i missed something, but i didn't see any;
> > if you can "defeat" #4, #5 from Comment 26, i promise to consider using the
> > destructor.
> 
> I wrote that I won't.

then this discussion is not taking us anywhere; you ignored real technical
arguments / python specs and gave nothing in return,

unfortunately i have no other tools to convince you.

>  
> > > 
> > > But the basic fact is that one object instance cleans resources allocated by
> > > another object.
> > > 
> > > This is not standard behavior.
> > > 
> > > If you think that sdk objects should leak resources, then do not use the
> > > id() method to generate ids, use uuid for each object so you do not have
> > > conflict between instances, see comment#12.
> > 
> > alon, using id() is not leaking any resources; it only gives the ability
> > to locate the content of a dead instance and clean the resources on both server
> > and client sides that it used to use.
> 
> The use of id() forces a dependency between two independent instances.
> 
> This dependency should not exist, and it caused the original bug.

again, there is *no* dependency between sdk objects, they are completely
agnostic to each other.

> 
> > > 
> > > (In reply to Michael Pasternak from comment #27)
> > > > But it's not enough,  because of:
> > > > 1. #4, #5 in Comment 26.
> > > > 2. circular references which are garbage are detected when the option cycle
> > > > detector is enabled (it’s on by default), but can only be cleaned up if
> > > > there are no Python-level __del__() methods involved, so it's kinda
> > > > reinventing the wheel.
> > > 
> > > Have you actually run the source and checked before responding?
> > 
> > i admit i didn't, but i guess it cannot solve #4, #5 from Comment 26, also
> > how can you make sure that SDK instance created this way is not collected
> > during your application lifetime when you still need it?
> 
> Please don't guess.

??, this is not an answer. you may rewrite python to deal with #5, but you
can't collect the object if another object references it, as mentioned in #4,
even if the variable holding the original ref is a weakref (and it's a valid use-case
for the object being "disconnect()'ed").

you also ignored my question of how you plan to prevent the object from being
collected during your application's lifetime (when you still need it) if the only
reference to it is a weakref,

also, your example [1] that forcibly invokes gc.collect() is good only for the
case at hand, because:

1. in the real world no one invokes the GC manually, and you can't ask developers
to do that at the end of their programs, and you can't expect the GC to always run
at the end.

2. even if it were correct, main() may take ages to finish, while you call
gc.collect() only at the end of the program; when using standard object
life-cycle management (with/finally) you could release resources during the
application's lifetime by completely controlling their life-cycles, without
relying on something you have no control over (the GC).

3. as Comment 21 says, .disconnect() can get stuck (as it performs an http request)
if the server is busy/etc. (today our apache proxy TTL is 5 minutes IIRC), which
will stall all garbage collection / all python apps if .disconnect() gets called
from the destructor,

[1]

    try:
        main()
    finally:
        gc.collect()

maybe there are a few use-cases where you want/can/have to try and cover up
for the lazy developers that do not clean up after themselves (when it's
cheap & safe); usually this is true when not cleaning these resources would
prevent the app from exiting, or cause data corruption or similar,

but definitely this is not such a case, as it creates more harm than good.

Comment 33 Juan Hernández 2014-02-04 22:24:46 UTC
The reproducer in comment 2 works fine for me. Reopen if you can provide a reproducer with the current version.

Comment 34 Alon Bar-Lev 2014-02-04 22:29:00 UTC
(In reply to Juan Hernández from comment #33)
> The reproducer in comment 2 works fine for me. Reopen if you can provide a
> reproducer with the current version.

It is working, but the solution is invalid.

There should be explicit cleanup of objects, not cleanup of a reference when there happens to be a random reuse of the same id.

Please refer to comment#12.

Comment 35 Barak 2014-02-24 09:31:15 UTC
Moving to 3.5, as this was not resolved in 3.4 timeframe

Comment 36 Saggi Mizrahi 2014-02-24 19:09:19 UTC
I'll first admit I've only skimmed the bug comments since life is just too short.

I'll just say how things should be.
I don't care who is right or how things are now.

For connection and disconnection you have 2 valid ways to do it.
It depends on how you want to use the connections.
If you want short-lived connections or to implement a connection pool, the way to do it is to have a factory method.

You create a class api.ConnectionProvider. It gets all the information required to log in and you have a method that returns an active connection that supports the context manager syntax.

p = api.ConnectionProvider(user, password, ssl, etc)

with p.createConnection() as conn:
   conn.do()

This means that you have short lived connections using the same credentials.
You also want to implement conn.close() in case for some reason the context style can't work (passing to a callback, starting another thread, etc...).

If you want to have a single long-lived connection, you ditch the provider and have all the information be provided in the connect() method of the Connection() object.

You need to implement a close() method to close the socket.

In both cases Connection objects can't be reused.
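
A sketch of the provider shape described above (all names are illustrative, not an existing API):

class Connection(object):
    def __init__(self, url, username, password):
        self.url = url           # a real implementation would log in here
    def do(self):
        print('request against %s' % self.url)
    def close(self):
        print('connection closed')
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        self.close()             # runs even if the body raised
        return False

class ConnectionProvider(object):
    def __init__(self, url, username, password):
        self._args = (url, username, password)
    def createConnection(self):
        return Connection(*self._args)   # fresh, single-use connection

p = ConnectionProvider('https://engine.example/api', 'admin', 'secret')
with p.createConnection() as conn:
    conn.do()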

As for destruction. You probably don't need to have a dtor. Sockets already have a dtor so you know they'll be closed() when they are collected. I very much doubt you need to implement a destructor and I will need to have someone tell me what resource needs to be freed.

As for worrying about garbage collection.

Don't!

As long as you don't leak, it's the application writers responsibility to take care of when objects are collected.

In general Python has ref-counting, so unless you have a circular reference your object will be freed ASAP. If you do have a circular reference then the GC kicks in. In general code shouldn't have circular references. Some internal objects might reference each other in a loop and you need to use weak references for that.

The only way to "leak" in Python when using only standard python objects is to have a circular reference when there is more than one object with a destructor in the loop. This is why you should avoid implementing destructors if you don't absolutely need them.

Just to make things clear. If you have a class which has a member that is a socket, you *don't* need to implement a destructor to explicitly close the socket. The socket already does it for itself.

Destructors are only for very low level objects (like sockets) and not for higher level composite objects.

I hope this clears up everything.

Comment 37 Saggi Mizrahi 2014-02-24 20:42:45 UTC
alonbl explained to me over an email that I misunderstood the area of contention.

Apparently there is a cache indexed by the id of a socket or something of the sort.

So the problem is that you have a global "cache" in the first place.
And if you have one, the problem is the fact that it's global.

In general I'd avoid putting caching/pooling in the API base.
Connection reuse strategy is an application level decision and not
a library level decision.

But, if you want to cache connections use the "provider" pattern and
store the connection pool there.

Have the pool only contain weakrefs [1].

Have the Connection object strongly reference a connection from the pool
and have close() just do:

def close(self):
   self._conn = None

This means if you forget to close() the strong ref is deleted
and the weakref system will take care of cleaning it up for you.
This means no need for a dtor.
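
Sketched out (illustrative names; weakref.WeakSet holds only weak references, so dropping the last strong reference is enough for collection):

import weakref

class _RawConn(object):
    def shutdown(self):
        print('socket closed')
    def __del__(self):
        self.shutdown()   # low-level single-resource object: a dtor is fine here

class Pool(object):
    def __init__(self):
        self._conns = weakref.WeakSet()   # the pool never keeps anything alive
    def get(self):
        raw = _RawConn()
        self._conns.add(raw)
        return raw

class Connection(object):
    def __init__(self, pool):
        self._conn = pool.get()   # the only strong reference
    def close(self):
        self._conn = None         # the weakref system cleans up the rest

pool = Pool()
c = Connection(pool)
c.close()   # 'socket closed': the raw connection is collected immediately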

I will need to actually see the code or have someone explain it to me
to have more specific input than that.

But I will strongly advise against over-designing the API since it
usually makes things too complex for simple users and too inflexible
for advanced users.

[1] http://docs.python.org/2/library/weakref.html

Comment 38 Alon Bar-Lev 2014-02-24 20:49:09 UTC
(In reply to Saggi Mizrahi from comment #37)
> Have the pool only contain weakrefs [1].

Right, this is what attachment#787086 [details] is all about, using weakref for cleanup.

> I will need to actually see the code or have someone explain it to me
> to have more specific input than that.
> 
> But I will strongly advise against over-designing the API since it
> usually makes things too complex for simple users and too inflexible
> for advanced users.

I agree, API should not cache sessions.

Thanks,
Alon
 
> [1] http://docs.python.org/2/library/weakref.html

Comment 40 Yedidyah Bar David 2014-05-15 23:00:35 UTC
Just adding a short comment after Alon pointed out that I needed [1] due to this bug: at least in the all-in-one case, the api call fails during instantiation:

for ... :
    sdk = None
    try:
        sdk = API(...)
    except:
        ...
        sleep(3)

So in the except clause, sdk is still None, and disconnect can't be called. And FWIW, the internal call to disconnect added during the long life of this bug also fails, with DisconnectedError, because the api is not connected yet...

[1] http://gerrit.ovirt.org/27732

I did not read all of the above comments, but it seems to me that even accepting the view that the sdk should protect the api from being called "too much", the above fix might be acceptable. Sorry if this was already discussed.

Well, actually, there might be another solution [2] to this problem. Didn't test it.

[2] http://gerrit.ovirt.org/27753

Comment 41 Michael Pasternak 2014-05-18 10:07:07 UTC
Hey Saggi,

Thanks for helping us, your opinion is much appreciated,

(In reply to Saggi Mizrahi from comment #37)
> alonbl explained to me over an email that I misunderstood the area of
> contention.
> 
> Apparently there is a cache indexed by the id of a socket or something of
> the sort.
> 
> So the problem is that you have a global "cache" in the first place.
> And if you have one, the problem is the fact that it's global.
> 
> In general I'd avoid putting caching/pooling in the API base.
> Connection reuse strategy is an application level decision and not
> a library level decision.
> 
> But, if you want to cache connections use the "provider" pattern and
> store the connection pool there.
> 
> Have the pool only contain weakrefs [1].
> 
> Have the Connection object strongly reference a connection from the pool
> and have close() just do:
> 
> def close(self):
>    self._conn = None

this is exactly how it works.

> 
> This means if you forget to close() the strong ref is deleted
> and the weakref system will take care of cleaning it up for you.
> This means no need for a dtor.

this ain't about pooling connections/releasing resources etc., it all works
in a standard way; the use-case we are discussing is releasing server resources
by the client upon close(), i.e. among the standard cleanup the client should perform
an http query to the server asking to release the resources he had acquired (session etc.),

it can happen when the user explicitly calls .disconnect() on the sdk instance or when
a 'with' exits, etc., but if none of these happened, client resources will
be cleaned by the GC, but the server's will leak,

to address this, i have a memory monitoring mechanism that implicitly invokes
disconnect() if the old sdk instance was not closed properly, and it is triggered
in the constructor of the next sdk instance (i.e. if the addr of the old sdk is reused for a new instance and it has leftovers, it was not closed properly and the cleanup will be done implicitly by this constructor),
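
in sketch form (illustrative code; the real SDK keeps per-instance context in a
cache keyed by the instance's id(), per the patch quoted in comment 12):

_contexts = {}   # id(instance) -> server-side session handle

class API(object):
    def __init__(self):
        key = id(self)
        if key in _contexts:
            # The same address can only be reused after a previous instance
            # died, so its session was never released: clean it up now.
            stale = _contexts.pop(key)
            print('implicit disconnect of stale session %r' % stale)
        _contexts[key] = 'session-for-%d' % key

    def disconnect(self):
        _contexts.pop(id(self), None)
        print('explicit disconnect')

a = API()
a = API()   # frees the first instance after the second is built
a = API()   # may now land on a freed address and trigger the implicit cleanup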

now Alon proposed removing this logic from the constructor and putting it in
the destructor instead, i.e. every destructor would perform an http
request upon GC,

hope it's clear now; what is your opinion on using a destructor for http
communication?

Comment 42 Red Hat Bugzilla 2023-09-14 01:49:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days