Description of problem:
After AIO resources are exhausted, an attempt to create a durable queue fails, but the queue is still listed as created.

# qpid-config add queue queue_aio_fail --argument durable=True
Failed: Exception: Exception from Agent: {u'error_code': 7, u'error_text': 'Queue queue_aio_fail: create() failed: jexception 0x0103 pmgr::initialize() threw JERR__AIO: AIO error. (io_queue_init() failed: errno=11 (Resource temporarily unavailable)) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:421)'}

# qpid-stat -q | grep queue_aio_fail
queue_aio_fail Y 0 0 0 0 0 0 0 0

A journal is also created for this queue:

root@mrg-jca-rhel6i_1:/var/dtests/node_data/clients# ls /var/lib/qpidd/qls/jrnl/queue_aio_fail/
7d6203b8-0fdf-45b4-a435-e9987592e221.jrnl

After a broker restart, the corrupted queue no longer appears.

Version-Release number of selected component (if applicable):
qpid-cpp-0.22-47

How reproducible:
100%

Steps to Reproduce:
1. exhaust fs.aio-max-nr
2. create a durable queue
   qpid-config add queue queue_aio_fail --argument durable=True
3. check queue for existence
   qpid-config queues queue_aio_fail
4. check journal

Actual results:
The queue is listed as created and the journal is created.

Expected results:
Neither the queue nor the journal is created.

Additional info:
root@mrg-jca-rhel6i_1:/var/dtests/node_data/clients# sysctl -a | grep fs.aio
fs.aio-nr = 65505
fs.aio-max-nr = 65536
root@mrg-jca-rhel6i_1:/var/dtests/node_data/clients# for i in `seq 1 2000`;do qpid-config add queue Q$i --argument durable=True; done
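To make the sysctl numbers above easier to read, here is a minimal sketch that parses that output and computes the remaining AIO headroom. The function names are hypothetical, and the per-queue cost of 33 is an assumption taken from the measurement reported later in this thread.

```python
# Assumption: one durable queue needs 33 AIO requests (figure reported later in this thread).
AIO_PER_DURABLE_QUEUE = 33

# Output of `sysctl -a | grep fs.aio` as pasted in the report.
SAMPLE = """fs.aio-nr = 65505
fs.aio-max-nr = 65536"""


def parse_aio_sysctl(text):
    """Return {'fs.aio-nr': int, 'fs.aio-max-nr': int} parsed from sysctl output text."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key in ("fs.aio-nr", "fs.aio-max-nr"):
            values[key] = int(value.strip())
    return values


def aio_headroom(text):
    """Number of AIO requests still available system-wide."""
    values = parse_aio_sysctl(text)
    return values["fs.aio-max-nr"] - values["fs.aio-nr"]


if __name__ == "__main__":
    free = aio_headroom(SAMPLE)
    # 65536 - 65505 = 31, which is less than 33, so the next create must fail.
    print(free, free >= AIO_PER_DURABLE_QUEUE)  # prints: 31 False
```

With only 31 requests left and 33 assumed per queue, the failure of the next durable queue creation is expected; the defect is what is left behind afterwards.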
Further information (as I was asked to review this for Vienna urgency):

qpidd survives this test scenario; after around 1982 queues have been created, it simply refuses to create new ones. Retested on the latest MRG/M 2.x (qpid-cpp-*0.18-25.el6.x86_64) with similar results, including 'failed queues' showing up in qpid-stat -q (see the bottom).

Note that this situation is IMHO similar to hitting the kernel open file limit (ulimit -n). The key point is that qpidd keeps running. As a result, I'm keeping 3.1 and will double-check with kpvdr/jross.

MRG/M 2.5 details:

[root@localhost gamma]# service qpidd restart
Stopping Qpid AMQP daemon: [ OK ]
Starting Qpid AMQP daemon: 2014-09-02 14:04:05 [Broker] debug Forked daemon child process
[ OK ]
[root@localhost gamma]# for i in `seq 1 2000`;do qpid-config add queue Q$i --argument durable=True; done
Failed: Exception: Exception from Agent: {u'error_code': 7, u'error_text': 'Queue Q157: create() failed: jexception 0x0103 pmgr::initialize() threw JERR__AIO: AIO error. (io_queue_init() failed: errno=11 (Resource temporarily unavailable)) (MessageStoreImpl.cpp:539)'}
Further clarification after yesterday's discussion on the 3.x call:

This defect does not track the request that qpidd reuse AIO resources or manage them more cleverly. The core of the problem is that durable queue creation is not atomic. Atomicity should be established: if one of the requirements is not met (here an AIO resource, but I believe there are a couple of others), then an exception should be raised (this already works) AND a rollback needs to trash all objects created so far (journal files, QMF objects, ...).
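The requested behaviour can be sketched as a generic create-with-rollback pattern. This is an illustration only, not qpidd code: the step names (register_qmf, create_journal, init_aio) and their order are hypothetical stand-ins for whatever the broker actually does during queue creation.

```python
import errno
from contextlib import ExitStack


def create_atomically(name, steps):
    """Run (do, undo) pairs in order; if any `do` raises, undo the completed
    steps in reverse order and let the exception propagate."""
    with ExitStack() as stack:
        for do, undo in steps:
            do(name)
            # Register the undo only after its step succeeded.
            stack.callback(undo, name)
        # Everything succeeded: detach the undo callbacks so nothing is rolled back.
        stack.pop_all()


if __name__ == "__main__":
    log = []

    # Hypothetical creation steps, each paired with its undo.
    def register_qmf(name): log.append("register_qmf")
    def unregister_qmf(name): log.append("unregister_qmf")
    def create_journal(name): log.append("create_journal")
    def remove_journal(name): log.append("remove_journal")

    def init_aio(name):
        # Simulates io_queue_init() failing with EAGAIN.
        raise OSError(errno.EAGAIN, "Resource temporarily unavailable")

    def release_aio(name): log.append("release_aio")

    steps = [(register_qmf, unregister_qmf),
             (create_journal, remove_journal),
             (init_aio, release_aio)]
    try:
        create_atomically("queue_aio_fail", steps)
    except OSError as exc:
        print("create failed:", exc.strerror)
    print(log)
```

In this sketch the journal and the QMF object are removed again when the AIO step fails, so no half-created queue remains visible, and the caller still sees the original exception.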
Hi Kim,
could you please provide a formula for how many AIO requests one durable queue requires?

We need to know this when scaling qpid, e.g. in Satellite.

Thanks in advance.
(In reply to Pavel Moravec from comment #5)
> Hi Kim,
> could you please provide formula how many AIO requests are required by one
> durable queue?
>
> This is necessary to know when scaling qpid e.g. in Satellite.
>
> Thanks in advance.

I checked this myself: one durable queue consumes 33 AIO requests, i.e. creating one durable queue increases fs.aio-nr by 33.
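For sizing purposes, the measurement above can be turned into simple arithmetic. This is a rough sketch that assumes the cost is a constant 33 AIO requests per durable queue, which may not hold for other versions or store configurations; the function names are hypothetical.

```python
# Assumption: constant per-queue cost, as measured in the comment above.
AIO_PER_DURABLE_QUEUE = 33


def max_durable_queues(aio_max_nr, aio_in_use=0, per_queue=AIO_PER_DURABLE_QUEUE):
    """How many durable queues fit under fs.aio-max-nr, given AIO requests
    already in use by other processes."""
    return max(0, (aio_max_nr - aio_in_use) // per_queue)


def required_aio_max_nr(queues, aio_in_use=0, per_queue=AIO_PER_DURABLE_QUEUE):
    """Smallest fs.aio-max-nr that allows `queues` durable queues."""
    return queues * per_queue + aio_in_use


if __name__ == "__main__":
    print(max_durable_queues(65536))   # prints: 1985
    print(required_aio_max_nr(2000))   # prints: 66000
```

With fs.aio-max-nr = 65536 this gives 1985 queues, and 1985 * 33 = 65505 matches the fs.aio-nr value shown in the description. Creating the 2000 queues from the reproducer would need fs.aio-max-nr of at least 66000, plus whatever other processes consume.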