Description of problem:
The following unit tests from endpoints.py currently fail with qpid-cpp-mrg-0.14-14:

  qpid.tests.messaging.endpoints.SetupTests.testOpenCloseResourceLeaks
  qpid.tests.messaging.endpoints.SetupTests.testOpenFailResourceLeaks

The tests fail while opening the broker connection, with:

  ValueError: filedescriptor out of range in select()

Please see Additional info for more details.
This was seen on RHEL5 and RHEL6 (x86_64 and i386).

Version-Release number of selected component (if applicable):
python-qpid-0.14-6.el5
python-qpid-qmf-0.14-4.el5
qpid-cpp-client-0.14-14.el5
qpid-cpp-client-devel-0.14-14.el5
qpid-cpp-client-devel-docs-0.14-14.el5
qpid-cpp-client-rdma-0.14-14.el5
qpid-cpp-client-ssl-0.14-14.el5
qpid-cpp-mrg-debuginfo-0.14-14.el5
qpid-cpp-server-0.14-14.el5
qpid-cpp-server-cluster-0.14-14.el5
qpid-cpp-server-devel-0.14-14.el5
qpid-cpp-server-rdma-0.14-14.el5
qpid-cpp-server-ssl-0.14-14.el5
qpid-cpp-server-store-0.14-14.el5
qpid-cpp-server-xml-0.14-14.el5
qpid-java-client-0.14-3.el5
qpid-java-common-0.14-3.el5
qpid-java-example-0.14-3.el5
qpid-qmf-0.14-4.el5
qpid-qmf-debuginfo-0.14-4.el5
qpid-qmf-devel-0.14-4.el5
qpid-tests-0.14-1.el5
qpid-tools-0.14-1.el5
rh-qpid-cpp-tests-0.14-14.el5
ruby-qpid-qmf-0.14-4.el5

How reproducible:
100%

Steps to Reproduce:
1. Start the broker and execute the tests:
   # qpid-python-test --broker localhost:5672 'qpid.tests.messaging.endpoints.SetupTests.testOpen*Leaks'

Actual results:
Two unit tests from python-qpid fail.

Expected results:
No error occurs when running these unit tests: the result is a pass, or an explanation is provided and the test is removed/skipped.

Additional info:
# qpid-python-test --broker localhost:5672 'qpid.tests.messaging.endpoints.SetupTests.testOpen*Leaks'
qpid.tests.messaging.endpoints.SetupTests.testOpenCloseResourceLeaks ........ start
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/usr/lib64/python2.4/threading.py", line 422, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.4/site-packages/qpid/selector.py", line 119, in run
    rd, wr, ex = select(self.reading, self.writing, (), timeout)
ValueError: filedescriptor out of range in select()

qpid.tests.messaging.endpoints.SetupTests.testOpenCloseResourceLeaks ........ fail
Error during test:
  Traceback (most recent call last):
    File "/usr/bin/qpid-python-test", line 311, in run
      phase()
    File "/usr/lib/python2.4/site-packages/qpid/tests/messaging/endpoints.py", line 87, in testOpenCloseResourceLeaks
      conn = Connection.establish(self.broker, **self.connection_options())
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 68, in establish
      conn.open()
    File "<string>", line 6, in open
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 244, in open
      self.attach()
    File "<string>", line 6, in attach
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 262, in attach
      self._ewait(lambda: self._transport_connected and not self._unlinked())
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 196, in _ewait
      result = self._wait(lambda: self.error or predicate(), timeout)
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 181, in _wait
      return self._waiter.wait(predicate, timeout=timeout)
    File "/usr/lib/python2.4/site-packages/qpid/concurrency.py", line 57, in wait
      self.condition.wait(3)
    File "/usr/lib/python2.4/site-packages/qpid/concurrency.py", line 96, in wait
      sw.wait(timeout)
    File "/usr/lib/python2.4/site-packages/qpid/compat.py", line 53, in wait
      ready, _, _ = select([self], [], [], timeout)
  ValueError: filedescriptor out of range in select()
qpid.tests.messaging.endpoints.SetupTests.testOpenFailResourceLeaks ........ fail
Error during test:
  Traceback (most recent call last):
    File "/usr/bin/qpid-python-test", line 311, in run
      phase()
    File "/usr/lib/python2.4/site-packages/qpid/tests/messaging/endpoints.py", line 103, in testOpenFailResourceLeaks
      conn._wait(lambda: False, timeout=0.001)
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 181, in _wait
      return self._waiter.wait(predicate, timeout=timeout)
    File "/usr/lib/python2.4/site-packages/qpid/concurrency.py", line 59, in wait
      self.condition.wait(timeout - passed)
    File "/usr/lib/python2.4/site-packages/qpid/concurrency.py", line 96, in wait
      sw.wait(timeout)
    File "/usr/lib/python2.4/site-packages/qpid/compat.py", line 53, in wait
      ready, _, _ = select([self], [], [], timeout)
  ValueError: filedescriptor out of range in select()
Totals: 2 tests, 0 passed, 0 skipped, 0 ignored, 2 failed
Petr, is this still an issue with 0.18?
Justin, I missed your needinfo question, my apologies.

I investigated this issue a bit and found that the test failures are caused by our QE test wrapper, which resets the maximum number of open file descriptors to 16384. With the system default value (1024) the tests pass. Further testing showed that the tests start to fail when the maximum number of open file descriptors is set higher than 1052 (which strikes me as a rather strange number).

In detail:
qpid.tests.messaging.endpoints.SetupTests.testOpenCloseResourceLeaks starts to fail when 'ulimit -n' is set to 1053 or higher.
qpid.tests.messaging.endpoints.SetupTests.testOpenFailResourceLeaks starts to report an error (but still passes) when 'ulimit -n' is set to 1054-1060, and starts to fail at 1061 and higher.

This holds across all supported OS and architecture combinations.

I can easily update our QE test to use the system default maximum of open fds for these unit tests, but I'm not sure whether this is a defect. Can someone assess, please?
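For anyone who wants to see the underlying behaviour without a broker, here is a minimal sketch (modern Python 3 on Linux, not the Python 2.4 from the report). It raises the soft fd limit, which is what a higher 'ulimit -n' does, duplicates a pipe onto a descriptor numbered 1024 or above, and hands it to select(). The function name select_rejects_high_fd and the value FD_SETSIZE = 1024 are assumptions made for illustration; Python does not expose FD_SETSIZE, and nothing here is taken from the qpid sources.

```python
import fcntl
import os
import resource
import select

FD_SETSIZE = 1024  # assumption: the usual Linux/glibc value, hard-coded here


def select_rejects_high_fd(min_fd=FD_SETSIZE):
    """Return True if select() refuses a descriptor numbered >= min_fd.

    Returns None when the environment does not allow such a descriptor
    to be created (for example, the hard fd limit is too low).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = min_fd + 16
    raised = False
    high = None
    r, w = os.pipe()
    try:
        if soft != resource.RLIM_INFINITY and soft < needed:
            if hard != resource.RLIM_INFINITY and hard < needed:
                return None
            # Same effect as a larger 'ulimit -n' for this process.
            resource.setrlimit(resource.RLIMIT_NOFILE, (needed, hard))
            raised = True
        # F_DUPFD returns the lowest free descriptor that is >= min_fd.
        high = fcntl.fcntl(r, fcntl.F_DUPFD, min_fd)
        try:
            select.select([high], [], [], 0)
        except ValueError:
            # "filedescriptor out of range in select()"
            return True
        return False
    except OSError:
        return None
    finally:
        for fd in (r, w, high):
            if fd is not None:
                os.close(fd)
        if raised:
            resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    print("select() rejects high fd:", select_rejects_high_fd())
```

With the default limit of 1024 a process can never hold a descriptor that high, which would be consistent with the tests passing under the default 'ulimit -n'. The sketch does not explain the exact 1053 and 1061 thresholds.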
Created attachment 686086
terminal transcript

Please see the terminal transcript for details.
(In reply to comment #3)

Petr, that's pretty weird! And worth investigating. I'm going to keep this as an issue for 2.4.
This limits the Python client to at most 1024 endpoints, and there is already a real user case behind this.

Class BaseWaiter in /usr/lib/python2.6/site-packages/qpid/compat.py must use epoll instead of select(), because the select() system call works on a fixed-size descriptor set whose size equals the FD_SETSIZE constant. On RHEL this is set to 1024, so any file descriptor numbered 1024 or higher is rejected.

Bumping severity/priority and notifying engineering via mail.
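To illustrate the direction suggested above, here is a minimal sketch of a single-descriptor wait built on epoll, which is not bounded by FD_SETSIZE. It is not the upstream fix and not the real BaseWaiter code. The helper name wait_readable is hypothetical, and the poll() fallback is an assumption added so the sketch also runs where select.epoll is unavailable.

```python
import os
import select


def wait_readable(fd, timeout=None):
    """Wait until fd is readable; return True if it is, False on timeout.

    timeout is in seconds; None blocks indefinitely.
    """
    if hasattr(select, "epoll"):
        ep = select.epoll()
        try:
            ep.register(fd, select.EPOLLIN)
            # epoll.poll() takes its timeout in seconds.
            events = ep.poll(timeout)
        finally:
            ep.close()
    else:
        # Assumed fallback for platforms without epoll.
        poller = select.poll()
        poller.register(fd, select.POLLIN)
        # poll() takes its timeout in milliseconds.
        timeout_ms = None if timeout is None else int(timeout * 1000)
        events = poller.poll(timeout_ms)
    return bool(events)


if __name__ == "__main__":
    r, w = os.pipe()
    print(wait_readable(r, 0.05))  # nothing written yet
    os.write(w, b"x")
    print(wait_readable(r, 1.0))   # one byte pending
    os.close(r)
    os.close(w)
```

Creating an epoll object per call keeps the sketch short. A real waiter would more likely keep one poller for its lifetime.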
Candidate fix pushed upstream: https://svn.apache.org/viewvc?view=revision&revision=r1573028
This issue has been fixed on RHEL6. On RHEL5 the issue is on the Python side and cannot be fixed on the MRG/Messaging side.

Packages used for testing:
python-qpid-0.18-11
python-qpid-qmf-0.18-23
qpid-cpp-0.18-23
qpid-qmf-0.18-23
qpid-tests-0.18-2
qpid-tools-0.18-10

-> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-0804.html