Bug 1312061

Summary: rcache leaks fd when using AuthContext with KRB5_AUTH_CONTEXT_DO_TIME
Product: Red Hat Enterprise Linux 6 Reporter: Jan Stancek <jstancek>
Component: python-krbVAssignee: John Dennis <jdennis>
Status: CLOSED WONTFIX QA Contact: Namita Soman <nsoman>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.7CC: dcallagh, dpal, jburke, jplans, nalin, pkis, rharwood
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-06 10:31:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jan Stancek 2016-02-25 16:14:23 UTC
Description of problem:
Python application based on python-krbV is leaking fds in /var/tmp/ directory named HTTP_XXX, until it eventually reaches 1024 limit and stops working.

After attaching gdb I could see that these are opened by ./src/lib/krb5/rcache/rc_io.c:krb5_rc_io_open_internal()

After digging a bit further I found following code at ./src/lib/krb5/krb/rd_req_dec.c:rd_req_decoded_opt() which seems to be responsible for opening this fd, when KRB5_AUTH_CONTEXT_DO_TIME flag is present:
------------------------------------------------------------------------------
   if (((*auth_context)->rcache == NULL)
        && ((*auth_context)->auth_context_flags & KRB5_AUTH_CONTEXT_DO_TIME)
        && server) {
        if ((retval = krb5_get_server_rcache(context, krb5_princ_component(context,server,0), &(*auth_context)->rcache)))
------------------------------------------------------------------------------
fd opened in this way ^^ doesn't seem to be ever closed.

One workaround I found is to create/attach rcache to AuthContext explicitly, then problem goes away.

Version-Release number of selected component (if applicable):
krb5-libs-1.10.3-42.el6.x86_64
krb5-libs-1.10.3-42.el6.i686
krb5-devel-1.10.3-42.el6.x86_64
krb5-debuginfo-1.10.3-42.el6.x86_64
krb5-workstation-1.10.3-42.el6.x86_64
python-krbV-1.0.90-3.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
Make an rd_req() call where AuthContext has KRB5_AUTH_CONTEXT_DO_TIME flag set and at same time it doesn't set any rcache via krb5_auth_con_setrcache().
(I'm sorry, but I don't have a reproducer in C.)

Actual results:
rcache fd is opened every time rd_req() is called when KRB5_AUTH_CONTEXT_DO_TIME flag is present and there's no rcache attached to AuthContext.

Expected results:
rcache fds should not pile up

Additional info:
https://git.fedorahosted.org/cgit/python-krbV.git

This is what python source on server side looks like:
------------------------------------------------------------------------------
    def login_krbv(self, krb_request, remote_addr):
        context = krbV.default_context()
        principal = self.krb_auth_principal
        keytabfile = self.krb_auth_keytabfile

        server_principal = krbV.Principal(name = principal, context = context)
        server_keytab = krbV.Keytab(name = keytabfile, context = context)

        auth_context = krbV.AuthContext(context = context)
        auth_context.flags = (krbV.KRB5_AUTH_CONTEXT_DO_SEQUENCE |
                              krbV.KRB5_AUTH_CONTEXT_DO_TIME)

        hostname = principal.split("/")[1]

        auth_context.addrs = (socket.gethostbyname(hostname), 0,
                              socket.gethostbyname(remote_addr), 0)

        # decode and read the authentication request
        decoded_request = base64.decodestring(krb_request)
        auth_tuple = context.rd_req(decoded_request,
                               server = server_principal,
                               keytab = server_keytab,
                               auth_context = auth_context,
                               options = krbV.AP_OPTS_MUTUAL_REQUIRED)
        auth_context, opts, server_principal, cache_credentials = auth_tuple
        cprinc = cache_credentials[2]

        # remove @REALM and return username
        return cprinc.name.split("@")[0]
------------------------------------------------------------------------------

And the workaround is:
------------------------------------------------------------------------------
diff --git a/server/autobkr/server/auth.py b/server/autobkr/server/auth.py
index 9ec9f203ba9e..c18027a0ab25 100644
--- a/server/autobkr/server/auth.py
+++ b/server/autobkr/server/auth.py
@@ -38,6 +38,8 @@ class KrbAuth():
         auth_context = krbV.AuthContext(context = context)
         auth_context.flags = (krbV.KRB5_AUTH_CONTEXT_DO_SEQUENCE |
                               krbV.KRB5_AUTH_CONTEXT_DO_TIME)
+        auth_context.rcache = context.default_rcache()
+
         hostname = principal.split("/")[1]
------------------------------------------------------------------------------

Comment 2 Jan Stancek 2016-02-25 16:29:27 UTC
I just realized, that maybe this is because python doesn't release auth_context objects immediately. And "context.default_rcache()" works around that because it keeps single instance around through entire life-time.

Comment 3 Robbie Harwood 2016-02-25 16:47:09 UTC
(In reply to Jan Stancek from comment #2)
> I just realized, that maybe this is because python doesn't release
> auth_context objects immediately. And "context.default_rcache()" works
> around that because it keeps single instance around through entire life-time.

That makes sense to me.  Adding in python-krbV to hopefully provide more information.

Comment 4 Jan Stancek 2016-02-25 17:58:11 UTC
After call rd_req, auth_context never gets released - gdb never triggers on krb5_auth_con_free().

I added following to end of login_krbv()
        ...

        import gc
        import sys
        gc.collect()

        print 'auth_context', auth_context
        print str(sys.getrefcount(auth_context))
        print str(auth_context)
        print str(gc.get_referrers(auth_context))

And this is what I get:
auth_context <krbV.AuthContext instance at 0x2c2ec20>
2
<krbV.AuthContext instance at 0x2c2ec20>
[<frame object at 0x7f766400a250>]

I don't get the first referrer. It looks like it's referencing itself. Perhaps it's coming from this code?

static PyObject*
Context_rd_req(PyObject *unself, PyObject *args, PyObject *kw)
...
  if(auth_context)
    {
      Py_INCREF(auth_context);
    }

Comment 5 Jan Stancek 2016-02-25 23:29:13 UTC
Here's a reproducer that should demonstrate the problem:
--------------------------------------------------------------------------
import krbV
import sys
import gc

def login_krbv():
    context = krbV.default_context()
    auth_context = krbV.AuthContext(context = context)

    for i in range(0, 3):
        print sys.getrefcount(auth_context._ac)
        print gc.get_referrers(auth_context._ac)

        auth_context.flags = (krbV.KRB5_AUTH_CONTEXT_DO_SEQUENCE |
                              krbV.KRB5_AUTH_CONTEXT_DO_TIME)


login_krbv()
--------------------------------------------------------------------------
python reproducer.py
2
[{'_ac': <PyCObject object at 0x7f9c882ec378>, 'context': <krbV.Context instance at 0x7f9c88236758>}]
3
[{'_ac': <PyCObject object at 0x7f9c882ec378>, 'context': <krbV.Context instance at 0x7f9c88236758>}]
4
[{'_ac': <PyCObject object at 0x7f9c882ec378>, 'context': <krbV.Context instance at 0x7f9c88236758>}]
--------------------------------------------------------------------------

If I run this via gdb with breakpoint set on destroy_ac(), it never triggers.
If I comment the 2 lines with "auth_context.flags = ...", breakpoint triggers.

Or with ltrace:
# ltrace --dl python reproducer.py 2>&1 | grep destroy_ac
#
# # After commenting out flags line
# ltrace --dl python reproducer.py 2>&1 | grep destroy_ac
(11597) destroy_ac

I'm guessing this is PyObject_GetAttrString() in Context_setattr() grabbing new reference that is never Py_DECREF-ed. If I add that, reproducer above no longer works:

AuthContext_setattr(PyObject *unself __UNUSED, PyObject *args)
...
      tmp = PyObject_GetAttrString(self, "_ac");
      if(tmp) {
        ac = PyCObject_AsVoidPtr(tmp);
        Py_DECREF(tmp);
      }

Comment 6 Jan Kurik 2017-12-06 10:31:17 UTC
Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/