Description of problem: When uploading a large amount of content to an iso or rpm repository, the uploads eventually fail with a pulp server error. It looks like a large number of connections are being established to the mongo database and are not closed. Version-Release number of selected component (if applicable): master How reproducible: Steps to Reproduce: 1. Generate a whole bunch of rpms: https://github.com/mccun934/fakerpmrepo-generator with the command ./generate-repo.py -n 2000 -o /path/to/dir 2. pulp-admin rpm repo create --repo-id=test_repo 3. pulp-admin rpm repo uploads rpm --repo-id=test_repo -d=/path/to/rpms Actual results: The upload runs for a while and then every upload results in An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log. As the upload continues the number of established connections to port 27017 (the default mongodb port) appears to grow until it stops at 8946 and the server errors start occuring. [jcline@jcline-desk ~]$ sudo lsof | grep apache | grep ESTABLISHED | wc lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs Output information may be incomplete. 8946 97909 1404504 Expected results: success Additional info: The traceback from the server error 2013-06-20 15:13:05,993 pulp.server.webservices.middleware.exception:ERROR: Unhandled Exception Traceback (most recent call last): File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/middleware/exception.py", line 44, in __call__ File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/middleware/postponed.py", line 39, in __call__ File "/usr/lib/python2.7/site-packages/web/application.py", line 279, in wsgi File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 26, in _handle_with_process ors File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in process File "/usr/lib/python2.7/site-packages/web/application.py", line 566, in processor File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in <lambda> File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in process File "/usr/lib/python2.7/site-packages/web/application.py", line 581, in processor File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in <lambda> File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 25, in process File "/usr/lib/python2.7/site-packages/web/application.py", line 230, in handle File "/usr/lib/python2.7/site-packages/web/application.py", line 422, in _delegate File "/usr/lib/python2.7/site-packages/web/application.py", line 430, in <lambda> File "/usr/lib/python2.7/site-packages/web/application.py", line 455, in _delegate_sub_application File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 26, in _handle_with_process ors File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in process File "/usr/lib/python2.7/site-packages/web/application.py", line 566, in processor File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in <lambda> File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in process File "/usr/lib/python2.7/site-packages/web/application.py", line 581, in processor File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 23, in <lambda> File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/application.py", line 25, in process File "/usr/lib/python2.7/site-packages/web/application.py", line 230, in handle File "/usr/lib/python2.7/site-packages/web/application.py", line 405, in _delegate File "/usr/lib/python2.7/site-packages/web/application.py", line 396, in handle_class File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/controllers/decorators.py", line 194, in _auth_de corator File "/home/jcline/devel/pulp/platform/src/pulp/server/webservices/controllers/decorators.py", line 87, in user_cert _authentication File "/home/jcline/devel/pulp/platform/src/pulp/server/managers/auth/authentication.py", line 140, in check_user_cer t File "/home/jcline/devel/pulp/platform/src/pulp/server/managers/auth/cert/cert_generator.py", line 112, in verify_ce rt File "/usr/lib64/python2.7/subprocess.py", line 672, in __init__ File "/usr/lib64/python2.7/subprocess.py", line 1038, in _get_handles File "/usr/lib64/python2.7/subprocess.py", line 1091, in pipe_cloexec OSError: [Errno 24] Too many open files
I forgot to mention: after this, all pulp-admin commands result in the same error, and restarting apache fixes this.
I'm hoping this is the result of some anomaly in his environment. Sayli, see if you can reproduce this, and if not, we'll move it off of 2.2.1.
*** Bug 982833 has been marked as a duplicate of this bug. ***
The duplicate bug report above includes some useful diagnosis.
See the dupe for clear reproducer scripts: https://bugzilla.redhat.com/show_bug.cgi?id=982833
This appears to be an issue in the coordinator or tasking system. I was able to reproduce the problem by trying to update a repository that doesn't exist. It should be a very simple call that fails quickly. Each attempt results in an additional 30+ connections to mongo. I am counting mongo connections with this command: $ sudo lsof | grep apache| grep '27017 (ESTABLISHED)'| wc -l I'll attach a patch momentarily that takes the dispatch system out of the workflow for the repo update call. After applying the patch, the same call to update a non-existing repo doesn't cause new connections. I am seeing initially that the number of connections can rise, but not for every call, and it levels off around 72. Without the patch, the number of connections grows for every single call and does not stop until the OS complains.
Created attachment 773800 [details] patch that removes the dispatch system from a repo update call
It turns out that the "automatic connection pooling" feature in pymongo isn't applicable to long-running or threaded applications. Pulp is both. Added decorator to PulpCollection query methods that returns the connection socket to the pool by calling end_request at the end. PR: https://github.com/pulp/pulp/pull/541
build: 2.2.0-0.23.beta.
Fails-qa [root@cloud-qe-14 ~]# rpm -qa |grep pulp-server pulp-server-2.2.0-0.24.beta.el6.noarch [root@cloud-qe-14 ~]# Looks like uploading large number of content is still an issue Steps 1. Created fake rpms /generate-repo.py -n 2000 -o /tmp/fake-rpm/ 2. pulp-admin rpm repo create --repo-id=test_repo 3. pulp-admin rpm repo uploads rpm --repo-id=test_repo -d /tmp/fake-rpm/ [root@cloud-qe-14 ~]# lsof | grep apache| grep '27017 (ESTABLISHED)'| wc -l 6 [root@cloud-qe-14 ~]# lsof | grep apache | grep ESTABLISHED | wc 7 70 902 [root@cloud-qe-14 ~]# towards the end the uploaded started to give the following error Uploading: unconsideringly-5.1.4-1.noarch.rpm An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log. Uploading: hypothec-8.2.0-1.noarch.rpm An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log. 2013-08-06 15:26:27,321 pulp.server.webservices.middleware.exception:ERROR: Unhandled Exception Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/exception.py", line 44, in __call__ return self.app(environ, start_response) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/postponed.py", line 39, in __call__ return self.app(environ, start_response) File "/usr/lib/python2.6/site-packages/web/application.py", line 279, in wsgi result = self.handle_with_processors() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors return process(self.processors) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor return handler() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda> return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor result = handler() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda> return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process return self.handle() File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle return self._delegate(fn, self.fvars, args) File "/usr/lib/python2.6/site-packages/web/application.py", line 422, in _delegate return f() File "/usr/lib/python2.6/site-packages/web/application.py", line 430, in <lambda> f = lambda: self._delegate_sub_application(pat, what) File "/usr/lib/python2.6/site-packages/web/application.py", line 455, in _delegate_sub_application return app.handle_with_processors() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors return process(self.processors) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor return handler() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda> return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor result = handler() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda> return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process return self.handle() File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle return self._delegate(fn, self.fvars, args) File "/usr/lib/python2.6/site-packages/web/application.py", line 405, in _delegate return handle_class(f) File "/usr/lib/python2.6/site-packages/web/application.py", line 396, in handle_class return tocall(*args) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 194, in _auth_decorator userid = authenticate_user() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 87, in user_cert_authentication userid = factory.authentication_manager().check_user_cert(cert_pem) File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/authentication.py", line 140, in check_user_cert if not cert_gen_manager.verify_cert(cert_pem): File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/cert/cert_generator.py", line 112, in verify_cert stdout=subprocess.PIPE, stderr=subprocess.PIPE) File "/usr/lib64/python2.6/subprocess.py", line 632, in __init__ errread, errwrite) = self._get_handles(stdin, stdout, stderr) File "/usr/lib64/python2.6/subprocess.py", line 1055, in _get_handles p2cread, p2cwrite = os.pipe() OSError: [Errno 24] Too many open files [root@cloud-qe-14 ~]#
I'm seeing that once the actual uploads start, there are tons of new open file handles appearing that "lsof" displays like this: httpd 9749 9776 apache 229u a_inode 0,9 0 6094 [eventpoll] httpd 9749 9776 apache 230u a_inode 0,9 0 6094 [eventpoll] httpd 9749 9776 apache 231u a_inode 0,9 0 6094 [eventpoll] httpd 9749 9776 apache 232u a_inode 0,9 0 6094 [eventpoll] httpd 9749 9776 apache 233u a_inode 0,9 0 6094 [eventpoll] httpd 9749 9776 apache 234u a_inode 0,9 0 6094 [eventpoll] httpd 9749 9776 apache 235u a_inode 0,9 0 6094 [eventpoll] httpd 9749 9776 apache 236u a_inode 0,9 0 6094 [eventpoll]
In quick sniffing around, it looks like this may be happening for all API calls that involve async tasks, but not for API calls that don't use the task system.
I can confirm that this still occurs, its just takes much longer to reproduce than before.
Changed up the fix to decorate low-level connection methods instead of the collection methods. And fixed a bug in the coordinator that was grabbing a handle to a collection a second time. Fixes merged into pulp-2.2 and master at 002dbcee7d3b9552e98a3d61a45ec75a4e264a80 and b232c1fe98558471c999a4ee2579b24750f23a3d respectively
build: 2.2.0-0.26.beta
Looks like its still failing. I tried to upload about 2000 rpms, and didn't get any errors. Although when I did a repo list the uploaded repo showed only around 1600 rpms. Now did second upload with 5000 rpms Saw this in the pulp.log 2013-08-19 12:52:43,839 pulp.server.webservices.middleware.exception:ERROR: Unhandled Exception Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/exception.py", line 44, in __call__ return self.app(environ, start_response) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/postponed.py", line 39, in __call__ return self.app(environ, start_response) File "/usr/lib/python2.6/site-packages/web/application.py", line 279, in wsgi result = self.handle_with_processors() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors return process(self.processors) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor return handler() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda> return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor result = handler() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda> return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process return self.handle() File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle return self._delegate(fn, self.fvars, args) File "/usr/lib/python2.6/site-packages/web/application.py", line 422, in _delegate return f() File "/usr/lib/python2.6/site-packages/web/application.py", line 430, in <lambda> f = lambda: self._delegate_sub_application(pat, what) File "/usr/lib/python2.6/site-packages/web/application.py", line 455, in _delegate_sub_application return app.handle_with_processors() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors return process(self.processors) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor return handler() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda> return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor result = handler() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda> return p(lambda : process(processors)) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process return self.handle() File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle return self._delegate(fn, self.fvars, args) File "/usr/lib/python2.6/site-packages/web/application.py", line 405, in _delegate return handle_class(f) File "/usr/lib/python2.6/site-packages/web/application.py", line 396, in handle_class return tocall(*args) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 194, in _auth_decorator userid = authenticate_user() File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 87, in user_cert_authentication userid = factory.authentication_manager().check_user_cert(cert_pem) File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/authentication.py", line 140, in check_user_cert if not cert_gen_manager.verify_cert(cert_pem): File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/cert/cert_generator.py", line 112, in verify_cert stdout=subprocess.PIPE, stderr=subprocess.PIPE) File "/usr/lib64/python2.6/subprocess.py", line 632, in __init__ errread, errwrite) = self._get_handles(stdin, stdout, stderr) File "/usr/lib64/python2.6/subprocess.py", line 1055, in _get_handles p2cread, p2cwrite = os.pipe() OSError: [Errno 24] Too many open files And then these errors appeared Uploading: unbrilliantly-6.2.5-1.noarch.rpm An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log. Uploading: macropsy-2.0.8-1.noarch.rpm An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log. Uploading: vernacle-2.7.8-1.noarch.rpm An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log. Uploading: nonprosaic-0.9.9-1.noarch.rpm An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log. [root@cloud-qe-9 ~]# tail -f ~/.pulp/admin.log data: {} 2013-08-19 13:07:34,965 - ERROR - Exception occurred: href: /pulp/api/v2/content/uploads/369726ba-287a-4163-b8fc-17fe821f2e5a/0/ method: PUT status: 500 error: Unhandled Exception traceback: [u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/exception.py", line 44, in __call__\n return self.app(environ, start_response)\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/middleware/postponed.py", line 39, in __call__\n return self.app(environ, start_response)\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 279, in wsgi\n result = self.handle_with_processors()\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors\n return process(self.processors)\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process\n return p(lambda : process(processors))\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor\n return handler()\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>\n return p(lambda : process(processors))\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process\n return p(lambda : process(processors))\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor\n result = handler()\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>\n return p(lambda : process(processors))\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process\n return self.handle()\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle\n return self._delegate(fn, self.fvars, args)\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 422, in _delegate\n return f()\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 430, in <lambda>\n f = lambda: self._delegate_sub_application(pat, what)\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 455, in _delegate_sub_application\n return app.handle_with_processors()\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 26, in _handle_with_processors\n return process(self.processors)\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process\n return p(lambda : process(processors))\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 566, in processor\n return handler()\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>\n return p(lambda : process(processors))\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in process\n return p(lambda : process(processors))\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 581, in processor\n result = handler()\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 23, in <lambda>\n return p(lambda : process(processors))\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/application.py", line 25, in process\n return self.handle()\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 230, in handle\n return self._delegate(fn, self.fvars, args)\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 405, in _delegate\n return handle_class(f)\n', u' File "/usr/lib/python2.6/site-packages/web/application.py", line 396, in handle_class\n return tocall(*args)\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 194, in _auth_decorator\n userid = authenticate_user()\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/webservices/controllers/decorators.py", line 87, in user_cert_authentication\n userid = factory.authentication_manager().check_user_cert(cert_pem)\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/authentication.py", line 140, in check_user_cert\n if not cert_gen_manager.verify_cert(cert_pem):\n', u' File "/usr/lib/python2.6/site-packages/pulp/server/managers/auth/cert/cert_generator.py", line 112, in verify_cert\n stdout=subprocess.PIPE, stderr=subprocess.PIPE)\n', u' File "/usr/lib64/python2.6/subprocess.py", line 632, in __init__\n errread, errwrite) = self._get_handles(stdin, stdout, stderr)\n', u' File "/usr/lib64/python2.6/subprocess.py", line 1055, in _get_handles\n p2cread, p2cwrite = os.pipe()\n'] data: {}
Confirmed that this is still a problem. I'm verifying by doing this: $ ps -Af| grep wsgi apache 10346 10345 1 11:07 ? 00:00:07 (wsgi:pulp) -DFOREGROUND mhrivnak 10626 10207 0 11:18 pts/2 00:00:00 grep --color=auto wsgi With the PID of 10346, I now do this: $ sudo lsof| grep 10346 | grep eventpoll | wc -l lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs Output information may be incomplete. 162 Now doing operations like sync, publish, orphan remove, etc. will cause that number to increase.
Fixed in 2.2.0-0.28.beta
[root@cloud-qe-9 ~]# rm -rf /var/lib/pulp/uploads/* [root@cloud-qe-9 ~]# pulp-admin rpm repo create --repo-id bz976561Successfully created repository [bz976561] [root@cloud-qe-9 ~]# ls -l /tmp/fake-rpm/ |wc 5003 45020 366996 [root@cloud-qe-9 ~]# pulp-admin rpm repo uploads rpm --repo-id=bz976561 -d /tmp/fake-rpm +----------------------------------------------------------------------+ Unit Upload +----------------------------------------------------------------------+ Extracting necessary metadata for each request... [==================================================] 100% Analyzing: quaighs-1.9.10-1.noarch.rpm ... completed
[root@cloud-qe-9 ~]# pulp-admin repo list +----------------------------------------------------------------------+ Repositories +----------------------------------------------------------------------+ Id: iso-repo Display Name: iso-repo Description: None Content Unit Counts: Iso: 3 Id: scientific-linux Display Name: scientific-linux Description: None Content Unit Counts: Distribution: 1 Package Category: 15 Package Group: 213 Rpm: 6449 Id: upload Display Name: upload Description: None Content Unit Counts: Id: test1 Display Name: test1 Description: None Content Unit Counts: Erratum: 4 Package Category: 1 Package Group: 2 Rpm: 32 Id: bz976561 Display Name: bz976561 Description: None Content Unit Counts: Rpm: 5000 [root@cloud-qe-9 ~]# [root@cloud-qe-9 ~]# root@cloud-qe-9 ~]# ps -Af| grep wsgi apache 4958 4956 2 Aug27 ? 00:36:30 (wsgi:pulp) root 25674 25398 0 10:43 pts/1 00:00:00 grep wsgi [root@cloud-qe-9 ~]# sudo lsof| grep 4958 | grep eventpoll | wc -l 0 [root@cloud-qe-9 ~]# [root@cloud-qe-9 ~]# sudo lsof| grep 4958 | grep eventpoll | wc -l 0 [root@cloud-qe-9 ~]#
verified see https://bugzilla.redhat.com/show_bug.cgi?id=976561#c20 and https://bugzilla.redhat.com/show_bug.cgi?id=976561#c21 for details on steps followed [root@cloud-qe-9 ~]# rpm -qa pulp-server pulp-server-2.2.0-0.28.beta.el6.noarch
2.2 released http://repos.fedorapeople.org/repos/pulp/pulp/stable/2.2/
*** Bug 1026606 has been marked as a duplicate of this bug. ***
This fixed the open mongo connections on my pulp server, but I'm still hitting the "too many open files" error. This time it's coming from anonymous inodes. # lsof | grep -i httpd | grep anon_ino | wc -l 996
Created attachment 820013 [details] traceback from pulp.log
Please add details about what versions of packages are installed. For example: $ rpm -qa | grep pulp $ rpm -q python-nectar $ pip freeze Do you have any patches applied to pulp? What specific actions cause the number of anon_inode file handles to grow? How many new ones appear for each action?
# rpm -qa | grep pulp | sort createrepo-0.9.9-21.2.pulp.git.0.1f5209f.el6eng.noarch m2crypto-0.21.1.pulp-7.el6.x86_64 mod_wsgi-3.4-1.pulp.el6.x86_64 pulp-agent-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch pulp-builtins-consumer-extensions-2.2.0-0.20.beta.git.0.f692dc9.el6eng.noarch pulp-consumer-client-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch pulp-rpm-plugins-2.2.0-1.el6eng.git.16.936f379.cdn.1.noarch pulp-selinux-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch pulp-server-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch python-isodate-0.5.0-1.pulp.el6.noarch python-oauth2-1.5.170-2.pulp.el6eng.noarch python-pulp-agent-lib-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch python-pulp-bindings-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch python-pulp-client-lib-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch python-pulp-common-2.2.0-1.git.0.1c5236e.el6eng.cdn.1.noarch python-pulp-rpm-common-2.2.0-1.el6eng.git.16.936f379.cdn.1.noarch # rpm -q python-nectar python-nectar-1.1.6-1.el6sat.noarch # /usr/bin/pip-python freeze BeautifulSoup==3.0.8.1 Django==1.3.1 M2Crypto==0.21.1 PyPAM==0.5.0 PyYAML==3.09 backports.ssl-match-hostname==3.2a3 certmaster==0.28 chardet==2.0.1 distribute==0.6.10 ethtool==0.6 eventlet==0.9.17 func==0.30 greenlet==0.3.1 grinder==0.1.16 httplib2==0.6.0 iniparse==0.3.1 iotop==0.3.2 isodate==0.5.0 iwlib==1.0 nectar==1.1.6 oauth2==1.5.170 okaara==1.0.32 ordereddict==1.1 policycoreutils-default-encoding==0.1 psycopg2==2.0.13 pulp-rpm==2.0.0 pulp-rpm-plugins==2.2.0a pyOpenSSL==0.10 pycurl==7.19.0 pygpgme==0.1 pyliblzma==0.5.3 pymongo==2.1.1 python-dmidecode==3.10.12 python-ldap==2.3.10 qpid-python==0.14 requests==2.0.0 rhnlib==2.5.22 rhsm==1.8.9 setools==1.0 simplejson==2.0.9 six==1.1.0 spacewalk-backend==1.2.13 urlgrabber==3.9.1 urllib3==1.5 web.py==0.32 yum-metadata-parser==1.1.2 I do have some patches applied, but nothing relevant as far as I can tell. The command I'm running is: pulp-admin rpm repo uploads rpm --repo-id all-rpm-content --file <rpm> It looks like 1 anon_inode file handle is leaked per rpm upload.
The patches in use are linked below. None look like they could cause something like this. Obviously I don't think we need to test it with the patches, but we should at least verify that this is not an issue with 2.2 stable on rhel6. http://pkgs.devel.redhat.com/cgit/rpms/pulp/tree/?h=eng-rhel-6 http://pkgs.devel.redhat.com/cgit/rpms/pulp-rpm/tree/?h=eng-rhel-6
Please add some more information on the number of RPMs you're uploading (I know you said one per file, just trying to get an idea of scale) and what sort of instance you're running it on.
The rpm-upload.patch does not apply to pulp 2.2 stable without conflict. How was that reconciled when you tested with 2.2 stable?
I ran tests on the RPM builds for both 2.2.0 and 2.2.1-0.2.beta. 1000 RPMs uploaded while keeping a watch on the lsof command included above. I never saw the number increase above 0.
Could you provide a list of RPMs running on the server? I wonder if this is an issue with kernel or a related package.
I've upgraded to RHEL 6.4 and so far the problem has gone away. I'll run an import overnight and will see how it looks in the morning.
I haven't seen the anon_inode file descriptor leak since the upgrade to RHEL 6.4. So, it appears to have been an environment issue. Thank you to everyone who took time to look into this.