Description of problem:
1) Started registering 70 hosts to the capsule concurrently.
2) Continued to scale in batches up to 2000 content hosts. Each capsule gets 1000 hosts.
3) After registering each host, started goferd and downgraded one package.
4) Above 2480 hosts, started noticing the error below, and new registrations were no longer accepted.

qpidd[23792]: 2016-05-12 20:46:28 [Management] error Detected two management objects with the same identifier: 0-0-1--77190(org.apache.qp...c9f7cc3504)
qpidd[23792]: 2016-05-12 20:46:28 [Management] error Detected two management objects with the same identifier: 0-0-1--77190(org.apache.qp...c9f7cc3504)
qpidd[23792]: 2016-05-12 20:46:29 [Management] error Detected two management objects with the same identifier: 0-0-1--77207(org.apache.qp...235b3e274e)
qpidd[23792]: 2016-05-12 20:46:29 [Management] error Detected two management objects with the same identifier: 0-0-1--77207(org.apache.qp...235b3e274e)
qpidd[23792]: 2016-05-12 20:46:29 [Management] error Detected two management objects with the same identifier: 0-0-1--77216(org.apache.qp...058afc6f0b)
qpidd[23792]: 2016-05-12 20:46:29 [Management] error Detected two management objects with the same identifier: 0-0-1--77216(org.apache.qp...058afc6f0b)
qpidd[23792]: 2016-05-12 20:46:30 [Management] error Detected two management objects with the same
qpidd[23792]: 2016-05-12 20:46:30 [Management] error Detected two management objects with the same identifier: 0-0-1--77225(org.apache.qp...4c481ed048)
qpidd[23792]: 2016-05-12 20:46:38 [Management] error Detected two management objects with the same identifier: 0-0-1--77234(org.apache.qp...2a6bb3f234)
qpidd[23792]: 2016-05-12 20:46:38 [Management] error Detected two management objects with the same identifier: 0-0-1--77234(org.apache.qp..

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
lsof | cut -d ' ' -f 1 | sort | uniq -c | sort -n | tail
   1011 Passenger
   1174 ruby-time
   1441 ruby
   6142 postgres
  10824 mongod
  13948 celery
  17000 diagnosti
  23296 java
  37710 qpidd
 491462 httpd
qpid-stat --ssl-certificate=/etc/pki/katello/qpid_client_striped.crt -b amqps://localhost:5671 -g
Broker Summary:
  uptime   cluster       connections  sessions  exchanges  queues
  =================================================================
  13m 18s  <standalone>  29           960       14         2,029

Aggregate Broker Statistics:
  Statistic                   Messages  Bytes
  ==================================================
  queue-depth                 17,422    83,736,178
  total-enqueues              17,871    88,172,269
  total-dequeues              449       4,436,091
  persistent-enqueues         17,422    83,736,178
  persistent-dequeues         0         0
  transactional-enqueues      0         0
  transactional-dequeues      0         0
  flow-to-disk-depth          0         0
  flow-to-disk-enqueues       0         0
  flow-to-disk-dequeues       0         0
  acquires                    449
  releases                    0
  discards-no-route           1,850
  discards-ttl-expired        0
  discards-limit-overflow     0
  discards-ring-overflow      0
  discards-lvq-replace        0
  discards-subscriber-reject  0
  discards-purged             0
  reroutes                    0
  abandoned                   0
  abandoned-via-alt           0
Created attachment 1156956 [details] foreman-debug
@pavel: in your rich QPID history, have you seen something like this, any suggestions?
(In reply to Ivan Necas from comment #6)
> @pavel: in your rich QPID history, have you seen something like this, any
> suggestions?

Fast answer: yes. qpid maintains management objects for queues, sessions, connections, etc. and recycles them every 10 seconds by default. If, in the meantime, an object is created, deleted, and created again with the same ID, the old management object has not been purged yet and qpid raises that error.

Workaround: lower the mgmt-pub-interval parameter from its default of 10s (the exact value needs tuning, I guess). Add it to /etc/qpid/qpidd.conf and restart the qpidd service.

Btw. which particular object is affected? "0-0-1--77190(org.apache.qp...c9f7cc3504)" - what's in the dots? AFAIK /var/log/messages should have the full object ID. I would be curious to know the object class (connection, session, ...?). Or ideally provide the complete logfile so that I can suggest a suitably low value for the mgmt-pub-interval parameter.
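The workaround above could be applied with something like the following sketch. The helper name `set_mgmt_pub_interval` is made up for illustration, and the value 5 in the commented usage line is only a placeholder, since the comment does not say which value is suitable. It assumes the config file uses plain `name=value` lines.

```shell
# Hypothetical helper (not part of qpid): set mgmt-pub-interval in a
# qpidd config file, replacing any existing setting for that option.
set_mgmt_pub_interval() {
    conf="$1"      # path to the config file, e.g. /etc/qpid/qpidd.conf
    seconds="$2"   # new publish interval in seconds (needs tuning)
    tmp=$(mktemp) || return 1
    # Copy every line except an existing mgmt-pub-interval setting.
    grep -v '^mgmt-pub-interval=' "$conf" > "$tmp" || true
    # Append the new setting.
    printf 'mgmt-pub-interval=%s\n' "$seconds" >> "$tmp"
    # Write back in place so ownership and permissions are preserved.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example usage (illustrative value; run as root, then restart the broker):
# set_mgmt_pub_interval /etc/qpid/qpidd.conf 5 && systemctl restart qpidd
```

Back up the config file before editing it, and expect to try a few values under load before settling on one.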
>
> Btw. what particular object is affected?
> "0-0-1--77190(org.apache.qp...c9f7cc3504)" - what's in the dots? AFAIK
> /var/log/messages should have full object ID. I would be curious to know the
> class object (connection, session, .. ?). Or ideally provide the complete
> logfile to let me suggest suitably low value for mgmt-pub-interval parameter.

Aug 25 03:46:40 ip-172-31-27-118 qpidd: 2016-08-25 03:46:40 [Management] error Detected two management objects with the same identifier: 0-3-1--687191(org.apache.qpid.broker:session:0x7faf185fd6d0)
Aug 25 03:46:40 ip-172-31-27-118 qpidd[8803]: 2016-08-25 03:46:40 [Management] error Detected two management objects with the same identifier: 0-3-1--687191(org.apache.qpid.broker:session:0x7faf185fd6d0)
Aug 25 03:46:40 ip-172-31-27-118 qpidd[8803]: 2016-08-25 03:46:40 [Management] error Detected two management objects with the same identifier: 0-3-1--687186(org.apache.qpid.broker:session:0x7fab8624b1a0)
Aug 25 03:46:40 ip-172-31-27-118 qpidd[8803]: 2016-08-25 03:46:40 [Management] error Detected two management objects with the same identifier: 0-3-1--687185(org.apache.qpid.broker:session:0x7fad31f589b0)
Aug 25 03:46:40 ip-172-31-27-118 qpidd[8803]: 2016-08-25 03:46:40 [Management] error Detected two management objects with the same identifier: 0-3-1--687183(org.apache.qpid.broker:session:0x7fae968e9cd0)
Aug 25 03:46:40 ip-172-31-27-118 qpidd: 2016-08-25 03:46:40 [Management] error Detected two management objects with the same identifier: 0-3-1--687186(org.apache.qpid.broker:session:0x7fab8624b1a0)
Aug 25 03:46:40 ip-172-31-27-118 qpidd: 2016-08-25 03:46:40 [Management] error Detected two management objects with the same identifier: 0-3-1--687185(org.apache.qpid.broker:session:0x7fad31f589b0)
Aug 25 03:46:40 ip-172-31-27-118 qpidd: 2016-08-25 03:46:40 [Management] error Detected two management objects with the same identifier: 0-3-1--687183(org.apache.qpid.broker:session:0x7fae968e9cd0)
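To see at a glance which object class the duplicate-identifier errors refer to, the log can be tallied per class. This is a rough sketch: the function name is made up, and it only matches lines that carry the full `org.apache.qpid.broker:<class>:<address>` identifier, so ellipsized lines are skipped.

```shell
# Hypothetical helper: count "two management objects" errors per object class
# (e.g. session, connection) in a log file such as /var/log/messages.
summarize_dup_mgmt_objects() {
    grep 'Detected two management objects with the same identifier' "$1" \
        | sed -n 's/.*(org\.apache\.qpid\.broker:\([a-z]*\):.*/\1/p' \
        | sort | uniq -c | sort -rn
}

# Example usage:
# summarize_dup_mgmt_objects /var/log/messages
```

Each output line is a count followed by the class name. Note that syslog may record the same event twice (once tagged `qpidd:` and once `qpidd[PID]:`, as in the excerpt above), so the counts can be inflated.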
>
> Btw. what particular object is affected?
> "0-0-1--77190(org.apache.qp...c9f7cc3504)" - what's in the dots? AFAIK
> /var/log/messages should have full object ID. I would be curious to know the
> class object (connection, session, .. ?). Or ideally provide the complete
> logfile to let me suggest suitably low value for mgmt-pub-interval parameter.

service qpidd status
Redirecting to /bin/systemctl status qpidd.service
● qpidd.service - An AMQP message broker daemon.
   Loaded: loaded (/usr/lib/systemd/system/qpidd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/qpidd.service.d
           └─limits.conf
   Active: active (running) since Thu 2016-08-25 04:37:49 EDT; 2min 15s ago
     Docs: man:qpidd(1)
           http://qpid.apache.org/
 Main PID: 92464 (qpidd)
   CGroup: /system.slice/qpidd.service
           └─92464 /usr/sbin/qpidd --config /etc/qpid/qpidd.conf

Aug 25 04:37:49 ip-10-1-10-1.us-west-2.compute.internal systemd[1]: Started An AMQP message broker daemon..
Aug 25 04:37:49 ip-10-1-10-1.us-west-2.compute.internal systemd[1]: Starting An AMQP message broker daemon....
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: session-busy: Session already attached: guest...ager.cpp:55)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: session-busy: Session already attached: guest...ager.cpp:55)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/...dler.cpp:39)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/...dler.cpp:39)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/...dler.cpp:39)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/...dler.cpp:39)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/...dler.cpp:39)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/...dler.cpp:39)
Hint: Some lines were ellipsized, use -l to show in full.
[root@ip-10-1-10-1 ~]#
Not sure if needed, but with "-l":

[root@ip-10-1-10-1 ~]# systemctl status qpidd -l
● qpidd.service - An AMQP message broker daemon.
   Loaded: loaded (/usr/lib/systemd/system/qpidd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/qpidd.service.d
           └─limits.conf
   Active: deactivating (stop-sigterm) since Thu 2016-08-25 04:46:48 EDT; 4s ago
     Docs: man:qpidd(1)
           http://qpid.apache.org/
 Main PID: 92464 (qpidd)
   CGroup: /system.slice/qpidd.service
           └─92464 /usr/sbin/qpidd --config /etc/qpid/qpidd.conf

Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/qpid/amqp_0_10/SessionHandler.cpp:39)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/qpid/amqp_0_10/SessionHandler.cpp:39)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/qpid/amqp_0_10/SessionHandler.cpp:39)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/qpid/amqp_0_10/SessionHandler.cpp:39)
Aug 25 04:39:14 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:39:14 [Broker] error Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/qpid/amqp_0_10/SessionHandler.cpp:39)
Aug 25 04:42:53 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:42:53 [Protocol] error Connection qpid.10.1.10.1:5671-10.1.10.1:53790 timed out: closing
Aug 25 04:42:53 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:42:53 [Protocol] error Connection qpid.10.1.10.1:5671-10.1.10.1:53790 timed out: closing
Aug 25 04:43:04 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:43:04 [Protocol] error Connection qpid.10.1.10.1:5671-10.1.10.1:53953 timed out: closing
Aug 25 04:43:04 ip-10-1-10-1.us-west-2.compute.internal qpidd[92464]: 2016-08-25 04:43:04 [Protocol] error Connection qpid.10.1.10.1:5671-10.1.10.1:53953 timed out: closing
Aug 25 04:46:48 ip-10-1-10-1.us-west-2.compute.internal systemd[1]: Stopping An AMQP message broker daemon....
Hmm, are we talking here only about:

Connection qpid.10.1.10.1:5671-10.1.10.1:53790 timed out: closing

or is:

Channel exception: not-attached: Channel 2 is not attached (/builddir/build/BUILD/qpid-cpp-0.30/src/qpid/amqp_0_10/SessionHandler.cpp:39)

part of the problem/discussion as well?
Moving severity to high.
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.