Bug 976561 - uploading a large number of content units causes an OSError: Too many open files error on the server
Summary: uploading a large number of content units causes an OSError: Too many open fi...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Pulp
Classification: Retired
Component: z_other
Version: Master
Hardware: Unspecified
OS: Unspecified
urgent
unspecified
Target Milestone: ---
: 2.2.0
Assignee: Jason Connor
QA Contact: Preethi Thomas
URL:
Whiteboard:
: 982833 1026606 (view as bug list)
Depends On:
Blocks: 950743
TreeView+ depends on / blocked
 
Reported: 2013-06-20 19:22 UTC by Jeremy Cline
Modified: 2013-11-17 04:19 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-10 15:44:33 UTC
Embargoed:


Attachments (Terms of Use)
patch that removes the dispatch system from a repo update call (1.17 KB, patch)
2013-07-15 14:50 UTC, Michael Hrivnak
no flags Details | Diff
traceback from pulp.log (4.46 KB, text/plain)
2013-11-05 20:50 UTC, Dennis Gregorovic
no flags Details

Description Jeremy Cline 2013-06-20 19:22:31 UTC
Description of problem: When uploading a large amount of content to an iso or rpm repository, the uploads eventually fail with a pulp server error. It looks like a large number of connections are being established to the mongo database and are not closed.


Version-Release number of selected component (if applicable): master


How reproducible: 


Steps to Reproduce:
1. Generate a whole bunch of rpms: https://github.com/mccun934/fakerpmrepo-generator with the command ./generate-repo.py -n 2000 -o /path/to/dir
2. pulp-admin rpm repo create --repo-id=test_repo
3. pulp-admin rpm repo uploads rpm --repo-id=test_repo -d=/path/to/rpms

Actual results: The upload runs for a while and then every upload results in
An internal error occurred on the Pulp server. More information can be found in
the client log file ~/.pulp/admin.log.

As the upload continues the number of established connections to port 27017 (the default mongodb port) appears to grow until it stops at 8946 and the server errors start occuring.

[jcline@jcline-desk ~]$ sudo lsof | grep apache | grep ESTABLISHED | wc                                               
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs                                           
      Output information may be incomplete.                                                                           
   8946   97909 1404504

Expected results: success


Additional info:
The traceback from the server error

2013-06-20 15:13:05,993 pulp.server.webservices.middleware.exception:ERROR: Unhandled Exception
Traceback (most recent call last):
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/middleware/exception.py", line 44, in __call__
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/middleware/postponed.py", line 39, in __call__
  File "/usr/lib/python2.7/site-packages/web/application.py", line 279, in wsgi
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 26, in _handle_with_process
ors
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in process
  File "/usr/lib/python2.7/site-packages/web/application.py", line 566, in processor
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in <lambda>
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in process
  File "/usr/lib/python2.7/site-packages/web/application.py", line 581, in processor
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in <lambda>
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 25, in process
  File "/usr/lib/python2.7/site-packages/web/application.py", line 230, in handle
  File "/usr/lib/python2.7/site-packages/web/application.py", line 422, in _delegate
  File "/usr/lib/python2.7/site-packages/web/application.py", line 430, in <lambda>
  File "/usr/lib/python2.7/site-packages/web/application.py", line 455, in _delegate_sub_application
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 26, in _handle_with_process
ors
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in process
  File "/usr/lib/python2.7/site-packages/web/application.py", line 566, in processor
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in <lambda>
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in process
  File "/usr/lib/python2.7/site-packages/web/application.py", line 581, in processor
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in <lambda>
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 25, in process
  File "/usr/lib/python2.7/site-packages/web/application.py", line 230, in handle
  File "/usr/lib/python2.7/site-packages/web/application.py", line 405, in _delegate
  File "/usr/lib/python2.7/site-packages/web/application.py", line 396, in handle_class
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/controllers/decorators.py", line 194, in _auth_de
corator
  File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/controllers/decorators.py", line 87, in user_cert
_authentication
  File "/home/jcline/devel/pulp/platform/src/pulp/server/managers/auth/authentication.py", line 140, in check_user_cer
t
  File "/home/jcline/devel/pulp/platform/src/pulp/server/managers/auth/cert/cert_generator.py", line 112, in verify_ce
rt
  File "/usr/lib64/python2.7/subprocess.py", line 672, in __init__
  File "/usr/lib64/python2.7/subprocess.py", line 1038, in _get_handles
  File "/usr/lib64/python2.7/subprocess.py", line 1091, in pipe_cloexec
OSError: [Errno 24] Too many open files

Comment 1 Jeremy Cline 2013-06-20 20:54:05 UTC
I forgot to mention: after this, all pulp-admin commands result in the same error, and restarting apache fixes this.

Comment 2 Michael Hrivnak 2013-06-21 15:25:35 UTC
I'm hoping this is the result of some anomaly in his environment. Sayli, see if you can reproduce this, and if not, we'll move it off of 2.2.1.

Comment 3 Michael Hrivnak 2013-07-10 13:25:05 UTC
*** Bug 982833 has been marked as a duplicate of this bug. ***

Comment 4 Michael Hrivnak 2013-07-10 13:26:20 UTC
The duplicate bug report above includes some useful diagnosis.

Comment 5 Mike McCune 2013-07-10 20:28:53 UTC
See the dupe for clear reproducer scripts:

https://bugzilla.redhat.com/show_bug.cgi?id=982833

Comment 6 Michael Hrivnak 2013-07-15 14:48:31 UTC
This appears to be an issue in the coordinator or tasking system.

I was able to reproduce the problem by trying to update a repository that doesn't exist. It should be a very simple call that fails quickly. Each attempt results in an additional 30+ connections to mongo.

I am counting mongo connections with this command:

$ sudo lsof | grep apache| grep '27017 (ESTABLISHED)'| wc -l

I'll attach a patch momentarily that takes the dispatch system out of the workflow for the repo update call. After applying the patch, the same call to update a non-existing repo doesn't cause new connections. I am seeing initially that the number of connections can rise, but not for every call, and it levels off around 72. Without the patch, the number of connections grows for every single call and does not stop until the OS complains.

Comment 7 Michael Hrivnak 2013-07-15 14:50:14 UTC
Created attachment 773800 [details]
patch that removes the dispatch system from a repo update call

Comment 8 Jason Connor 2013-07-26 14:35:58 UTC
It turns out that the "automatic connection pooling" feature in pymongo isn't applicable to long-running or threaded applications. Pulp is both.

Added decorator to PulpCollection query methods that returns the connection socket to the pool by calling end_request at the end.

PR: https://github.com/pulp/pulp/pull/541

Comment 9 Jeff Ortel 2013-08-02 15:18:06 UTC
build: 2.2.0-0.23.beta.

Comment 10 Preethi Thomas 2013-08-06 20:05:56 UTC
Fails-qa
[root@cloud-qe-14 ~]# rpm -qa |grep pulp-server
pulp-server-2.2.0-0.24.beta.el6.noarch
[root@cloud-qe-14 ~]# 



Looks like uploading large number of content is still an issue


Steps

1. Created fake rpms /generate-repo.py -n 2000 -o /tmp/fake-rpm/

2.  pulp-admin rpm repo create --repo-id=test_repo

3.  pulp-admin rpm repo uploads rpm --repo-id=test_repo -d /tmp/fake-rpm/

[root@cloud-qe-14 ~]# lsof | grep apache| grep '27017 (ESTABLISHED)'| wc -l
6
[root@cloud-qe-14 ~]# lsof | grep apache | grep ESTABLISHED | wc                                
      7      70     902
[root@cloud-qe-14 ~]# 



towards the end the uploaded started to give the following error

Uploading: unconsideringly-5.1.4-1.noarch.rpm
An internal error occurred on the Pulp server. More information can be found in
the client log file ~/.pulp/admin.log.

Uploading: hypothec-8.2.0-1.noarch.rpm
An internal error occurred on the Pulp server. More information can be found in
the client log file ~/.pulp/admin.log.



2013-08-06 15:26:27,321 pulp.server.webservices.middleware.exception:ERROR: Unhandled Exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/exception.py", line 44, in __call__
    return self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/postponed.py", line 39, in __call__
    return self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 279, in wsgi
    result = self.handle_with_processors()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors
    return process(self.processors)
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor
    return handler()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor
    result = handler()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process
    return self.handle()
  File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 422, in _delegate
    return f()
  File "/usr/lib/python2.6/site-packages/web/application.py", line 430, in <lambda>
    f = lambda: self._delegate_sub_application(pat, what)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 455, in _delegate_sub_application
    return app.handle_with_processors()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors
    return process(self.processors)
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor
    return handler()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor
    result = handler()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process
    return self.handle()
  File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 405, in _delegate
    return handle_class(f)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 396, in handle_class
    return tocall(*args)
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 194, in _auth_decorator
    userid = authenticate_user()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 87, in user_cert_authentication
    userid = factory.authentication_manager().check_user_cert(cert_pem)
  File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/authentication.py", line 140, in check_user_cert
    if not cert_gen_manager.verify_cert(cert_pem):
  File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/cert/cert_generator.py", line 112, in verify_cert
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/usr/lib64/python2.6/subprocess.py", line 632, in __init__
    errread, errwrite) = self._get_handles(stdin, stdout, stderr)
  File "/usr/lib64/python2.6/subprocess.py", line 1055, in _get_handles
    p2cread, p2cwrite = os.pipe()
OSError: [Errno 24] Too many open files
[root@cloud-qe-14 ~]#

Comment 11 Michael Hrivnak 2013-08-07 17:51:27 UTC
I'm seeing that once the actual uploads start, there are tons of new open file handles appearing that "lsof" displays like this:

httpd      9749  9776   apache  229u  a_inode                0,9          0       6094 [eventpoll]
httpd      9749  9776   apache  230u  a_inode                0,9          0       6094 [eventpoll]
httpd      9749  9776   apache  231u  a_inode                0,9          0       6094 [eventpoll]
httpd      9749  9776   apache  232u  a_inode                0,9          0       6094 [eventpoll]
httpd      9749  9776   apache  233u  a_inode                0,9          0       6094 [eventpoll]
httpd      9749  9776   apache  234u  a_inode                0,9          0       6094 [eventpoll]
httpd      9749  9776   apache  235u  a_inode                0,9          0       6094 [eventpoll]
httpd      9749  9776   apache  236u  a_inode                0,9          0       6094 [eventpoll]

Comment 12 Michael Hrivnak 2013-08-08 13:51:55 UTC
In quick sniffing around, it looks like this may be happening for all API calls that involve async tasks, but not for API calls that don't use the task system.

Comment 14 Justin Sherrill 2013-08-13 14:37:47 UTC
I can confirm that this still occurs, its just takes much longer to reproduce than before.

Comment 15 Jason Connor 2013-08-15 15:20:44 UTC
Changed up the fix to decorate low-level connection methods instead of the collection methods. And fixed a bug in the coordinator that was grabbing a handle to a collection a second time.

Fixes merged into pulp-2.2 and master at
002dbcee7d3b9552e98a3d61a45ec75a4e264a80 and
b232c1fe98558471c999a4ee2579b24750f23a3d respectively

Comment 16 Jeff Ortel 2013-08-15 19:18:27 UTC
build: 2.2.0-0.26.beta

Comment 17 Preethi Thomas 2013-08-19 17:17:47 UTC
Looks like its still failing.

I tried to upload about 2000 rpms, and didn't get any errors. Although when I did a repo list the uploaded repo showed only around 1600 rpms.

Now did  second upload with 5000 rpms

Saw this in the pulp.log

2013-08-19 12:52:43,839 pulp.server.webservices.middleware.exception:ERROR: Unhandled Exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/exception.py", line 44, in __call__
    return self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/postponed.py", line 39, in __call__
    return self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 279, in wsgi
    result = self.handle_with_processors()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors
    return process(self.processors)
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor
    return handler()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor
    result = handler()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process
    return self.handle()
  File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 422, in _delegate
    return f()
  File "/usr/lib/python2.6/site-packages/web/application.py", line 430, in <lambda>
    f = lambda: self._delegate_sub_application(pat, what)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 455, in _delegate_sub_application
    return app.handle_with_processors()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors
    return process(self.processors)
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor
    return handler()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor
    result = handler()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>
    return p(lambda : process(processors))
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process
    return self.handle()
  File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 405, in _delegate
    return handle_class(f)
  File "/usr/lib/python2.6/site-packages/web/application.py", line 396, in handle_class
    return tocall(*args)
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 194, in _auth_decorator
    userid = authenticate_user()
  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 87, in user_cert_authentication
    userid = factory.authentication_manager().check_user_cert(cert_pem)
  File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/authentication.py", line 140, in check_user_cert
    if not cert_gen_manager.verify_cert(cert_pem):
  File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/cert/cert_generator.py", line 112, in verify_cert
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/usr/lib64/python2.6/subprocess.py", line 632, in __init__
    errread, errwrite) = self._get_handles(stdin, stdout, stderr)
  File "/usr/lib64/python2.6/subprocess.py", line 1055, in _get_handles
    p2cread, p2cwrite = os.pipe()
OSError: [Errno 24] Too many open files


And then 
these errors appeared 


Uploading: unbrilliantly-6.2.5-1.noarch.rpm
An internal error occurred on the Pulp server. More information can be found in
the client log file ~/.pulp/admin.log.

Uploading: macropsy-2.0.8-1.noarch.rpm
An internal error occurred on the Pulp server. More information can be found in
the client log file ~/.pulp/admin.log.

Uploading: vernacle-2.7.8-1.noarch.rpm
An internal error occurred on the Pulp server. More information can be found in
the client log file ~/.pulp/admin.log.

Uploading: nonprosaic-0.9.9-1.noarch.rpm
An internal error occurred on the Pulp server. More information can be found in
the client log file ~/.pulp/admin.log.

[root@cloud-qe-9 ~]# tail -f  ~/.pulp/admin.log
        data:      {}
        
2013-08-19 13:07:34,965 - ERROR - Exception occurred:
        href:      /pulp/api/v2/content/uploads/369726ba-287a-4163-b8fc-17fe821f2e5a/0/
        method:    PUT
        status:    500
        error:     Unhandled Exception
        traceback: [u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/exception.py", line 44, in __call__\n    return self.app(environ, start_response)\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/postponed.py", line 39, in __call__\n    return self.app(environ, start_response)\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 279, in wsgi\n    result = self.handle_with_processors()\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors\n    return process(self.processors)\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process\n    return p(lambda : process(processors))\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor\n    return handler()\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>\n    return p(lambda : process(processors))\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process\n    return p(lambda : process(processors))\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor\n    result = handler()\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>\n    return p(lambda : process(processors))\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process\n    return self.handle()\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle\n    return self._delegate(fn, self.fvars, args)\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 422, in _delegate\n    return f()\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 430, in <lambda>\n    f = lambda: self._delegate_sub_application(pat, what)\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 455, in _delegate_sub_application\n    return app.handle_with_processors()\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors\n    return process(self.processors)\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process\n    return p(lambda : process(processors))\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor\n    return handler()\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>\n    return p(lambda : process(processors))\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process\n    return p(lambda : process(processors))\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor\n    result = handler()\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>\n    return p(lambda : process(processors))\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process\n    return self.handle()\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle\n    return self._delegate(fn, self.fvars, args)\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 405, in _delegate\n    return handle_class(f)\n', u'  File "/usr/lib/python2.6/site-packages/web/application.py", line 396, in handle_class\n    return tocall(*args)\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 194, in _auth_decorator\n    userid = authenticate_user()\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 87, in user_cert_authentication\n    userid = factory.authentication_manager().check_user_cert(cert_pem)\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/authentication.py", line 140, in check_user_cert\n    if not cert_gen_manager.verify_cert(cert_pem):\n', u'  File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/cert/cert_generator.py", line 112, in verify_cert\n    stdout=subprocess.PIPE, stderr=subprocess.PIPE)\n', u'  File "/usr/lib64/python2.6/subprocess.py", line 632, in __init__\n    errread, errwrite) = self._get_handles(stdin, stdout, stderr)\n', u'  File "/usr/lib64/python2.6/subprocess.py", line 1055, in _get_handles\n    p2cread, p2cwrite = os.pipe()\n']
        data:      {}

Comment 18 Michael Hrivnak 2013-08-20 15:32:22 UTC
Confirmed that this is still a problem. I'm verifying by doing this:

$ ps -Af| grep wsgi
apache   10346 10345  1 11:07 ?        00:00:07 (wsgi:pulp)     -DFOREGROUND
mhrivnak 10626 10207  0 11:18 pts/2    00:00:00 grep --color=auto wsgi

With the PID of 10346, I now do this:

$ sudo lsof| grep 10346 | grep eventpoll | wc -l
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
162

Now doing operations like sync, publish, orphan remove, etc. will cause that number to increase.

Comment 19 Michael Hrivnak 2013-08-27 15:26:46 UTC
Fixed in 2.2.0-0.28.beta

Comment 20 Preethi Thomas 2013-08-27 19:03:07 UTC
[root@cloud-qe-9 ~]# rm -rf /var/lib/pulp/uploads/*
[root@cloud-qe-9 ~]# pulp-admin rpm repo create --repo-id bz976561Successfully created repository [bz976561]

[root@cloud-qe-9 ~]# ls -l /tmp/fake-rpm/ |wc   5003   45020  366996
[root@cloud-qe-9 ~]# pulp-admin rpm repo uploads rpm --repo-id=bz976561 -d /tmp/fake-rpm
+----------------------------------------------------------------------+
                              Unit Upload
+----------------------------------------------------------------------+

Extracting necessary metadata for each request...
[==================================================] 100%
Analyzing: quaighs-1.9.10-1.noarch.rpm
... completed

Comment 21 Preethi Thomas 2013-08-28 14:49:27 UTC
[root@cloud-qe-9 ~]# pulp-admin repo list

+----------------------------------------------------------------------+
                              Repositories
+----------------------------------------------------------------------+

Id:                  iso-repo
Display Name:        iso-repo
Description:         None
Content Unit Counts: 
  Iso: 3

Id:                  scientific-linux
Display Name:        scientific-linux
Description:         None
Content Unit Counts: 
  Distribution:     1
  Package Category: 15
  Package Group:    213
  Rpm:              6449

Id:                  upload
Display Name:        upload
Description:         None
Content Unit Counts: 

Id:                  test1
Display Name:        test1
Description:         None
Content Unit Counts: 
  Erratum:          4
  Package Category: 1
  Package Group:    2
  Rpm:              32

Id:                  bz976561
Display Name:        bz976561
Description:         None
Content Unit Counts: 
  Rpm: 5000


[root@cloud-qe-9 ~]# 
[root@cloud-qe-9 ~]# 


root@cloud-qe-9 ~]#  ps -Af| grep wsgi
apache    4958  4956  2 Aug27 ?        00:36:30 (wsgi:pulp)    
root     25674 25398  0 10:43 pts/1    00:00:00 grep wsgi
[root@cloud-qe-9 ~]# sudo lsof| grep 4958 | grep eventpoll | wc -l
0
[root@cloud-qe-9 ~]# 
[root@cloud-qe-9 ~]# sudo lsof| grep 4958 | grep eventpoll | wc -l
0
[root@cloud-qe-9 ~]#

Comment 22 Preethi Thomas 2013-08-29 12:32:56 UTC
verified
see https://bugzilla.redhat.com/show_bug.cgi?id=976561#c20 and

https://bugzilla.redhat.com/show_bug.cgi?id=976561#c21

for details on steps followed
[root@cloud-qe-9 ~]# rpm -qa pulp-server
pulp-server-2.2.0-0.28.beta.el6.noarch

Comment 23 Preethi Thomas 2013-09-10 15:44:33 UTC
2.2 released
http://repos.fedorapeople.org/repos/pulp/pulp/stable/2.2/

Comment 24 Jay Dobies 2013-11-05 13:58:18 UTC
*** Bug 1026606 has been marked as a duplicate of this bug. ***

Comment 25 Dennis Gregorovic 2013-11-05 20:47:48 UTC
This fixed the open mongo connections on my pulp server, but I'm still hitting the "too many open files" error.  This time it's coming from anonymous inodes.

# lsof | grep -i httpd | grep anon_ino | wc -l
996

Comment 27 Dennis Gregorovic 2013-11-05 20:50:20 UTC
Created attachment 820013 [details]
traceback from pulp.log

Comment 28 Michael Hrivnak 2013-11-06 14:40:14 UTC
Please add details about what versions of packages are installed. For example:

$ rpm -qa | grep pulp
$ rpm -q python-nectar
$ pip freeze

Do you have any patches applied to pulp?

What specific actions cause the number of anon_inode file handles to grow? How many new ones appear for each action?

Comment 29 Dennis Gregorovic 2013-11-06 15:27:07 UTC
# rpm -qa | grep pulp | sort
createrepo-0.9.9-21.2.pulp.git.0.1f5209f.el6eng.noarch
m2crypto-0.21.1.pulp-7.el6.x86_64
mod_wsgi-3.4-1.pulp.el6.x86_64
pulp-agent-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch
pulp-builtins-consumer-extensions-2.2.0-0.20.beta.git.0.f692dc9.el6eng.noarch
pulp-consumer-client-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch
pulp-rpm-plugins-2.2.0-1.el6eng.git.16.936f379.cdn.1.noarch
pulp-selinux-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch
pulp-server-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch
python-isodate-0.5.0-1.pulp.el6.noarch
python-oauth2-1.5.170-2.pulp.el6eng.noarch
python-pulp-agent-lib-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch
python-pulp-bindings-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch
python-pulp-client-lib-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch
python-pulp-common-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch
python-pulp-rpm-common-2.2.0-1.el6eng.git.16.936f379.cdn.1.noarch

# rpm -q python-nectar
python-nectar-1.1.6-1.el6sat.noarch

# /usr/bin/pip-python freeze
BeautifulSoup==3.0.8.1
Django==1.3.1
M2Crypto==0.21.1
PyPAM==0.5.0
PyYAML==3.09
backports.ssl-match-hostname==3.2a3
certmaster==0.28
chardet==2.0.1
distribute==0.6.10
ethtool==0.6
eventlet==0.9.17
func==0.30
greenlet==0.3.1
grinder==0.1.16
httplib2==0.6.0
iniparse==0.3.1
iotop==0.3.2
isodate==0.5.0
iwlib==1.0
nectar==1.1.6
oauth2==1.5.170
okaara==1.0.32
ordereddict==1.1
policycoreutils-default-encoding==0.1
psycopg2==2.0.13
pulp-rpm==2.0.0
pulp-rpm-plugins==2.2.0a
pyOpenSSL==0.10
pycurl==7.19.0
pygpgme==0.1
pyliblzma==0.5.3
pymongo==2.1.1
python-dmidecode==3.10.12
python-ldap==2.3.10
qpid-python==0.14
requests==2.0.0
rhnlib==2.5.22
rhsm==1.8.9
setools==1.0
simplejson==2.0.9
six==1.1.0
spacewalk-backend==1.2.13
urlgrabber==3.9.1
urllib3==1.5
web.py==0.32
yum-metadata-parser==1.1.2

I do have some patches applied, but nothing relevant as far as I can tell.  

The command I'm running is:

pulp-admin rpm repo uploads rpm --repo-id all-rpm-content --file <rpm>

It looks like 1 anon_inode file handle is leaked per rpm upload.

Comment 30 Michael Hrivnak 2013-11-06 21:26:19 UTC
The patches in use are linked below. None look like they could cause something like this. Obviously I don't think we need to test it with the patches, but we should at least verify that this is not an issue with 2.2 stable on rhel6.

http://pkgs.devel.redhat.com/cgit/rpms/pulp/tree/?h=eng-rhel-6
http://pkgs.devel.redhat.com/cgit/rpms/pulp-rpm/tree/?h=eng-rhel-6

Comment 31 Jay Dobies 2013-11-06 21:41:08 UTC
Please add some more information on the number of RPMs you're uploading (I know you said one per file, just trying to get an idea of scale) and what sort of instance you're running it on.

Comment 32 Michael Hrivnak 2013-11-12 17:44:24 UTC
The rpm-upload.patch does not apply to pulp 2.2 stable without conflict. How was that reconciled when you tested with 2.2 stable?

Comment 34 Jay Dobies 2013-11-13 15:02:50 UTC
I ran tests on the RPM builds for both 2.2.0 and 2.2.1-0.2.beta. 1000 RPMs uploaded while keeping a watch on the lsof command included above. I never saw the number increase above 0.

Comment 35 Dennis Gregorovic 2013-11-13 15:32:37 UTC
Could you provide a list of RPMs running on the server?  I wonder if this is an issue with kernel or a related package.

Comment 36 Dennis Gregorovic 2013-11-14 01:59:51 UTC
I've upgraded to RHEL 6.4 and so far the problem has gone away.  I'll run an import overnight and will see how it looks in the morning.

Comment 37 Dennis Gregorovic 2013-11-17 04:19:04 UTC
I haven't seen the anon_inode file descriptor leak since the upgrade to RHEL 6.4. So, it appears to have been an environment issue.  Thank you to everyone who took time to look into this.


Note You need to log in before you can comment on or make changes to this bug.