Created attachment 1000528 [details] candlepin.log Description of problem: Candlepin error after the content hosts are registered in Satellite6 Version-Release number of selected component (if applicable): Satellite-6.1.0-RHEL-6-20150303.0 How reproducible: Always Steps to Reproduce: 1.Install sat6 2.Register and subscribe few content hosts to satellite6 3.Review /var/log/candlepin/candlepin.log Actual results: (see the attached log) 2015-03-11 10:43:06,234 [=, org=] ERROR org.candlepin.audit.AMQPBusPublisher - Unable to send event: Event [id=null, target=CONSUMER, type=CREATED, time=Wed Mar 11 10:42:03 EDT 2015, entity=8a9084974be35df8014c094904510bfb] ... ... Expected results: No error Additional info:
Created attachment 1000529 [details] foreman-debug
For me I am seeing candlepin complain but I don't have any content hosts. Seems that I trigger this issue by: * Importing a valid manifest * Visit the Content > Red Hat Repositories page and expanding the Red Hat Enterprise Linux Server product Log shows: ==> /var/log/candlepin/candlepin.log <== 2015-03-11 16:59:35,129 [=, org=] ERROR org.candlepin.audit.AMQPBusPublisher - Unable to send event: Event [id=null, target=POOL, type=CREATED, time=Wed Mar 11 16:58:34 EDT 2015, entity=8a90e0064c0a34be014c0aa1bcd60192] javax.jms.JMSException: Exception when sending message:timed out waiting for completion at org.apache.qpid.client.BasicMessageProducer_0_10.sendMessage(BasicMessageProducer_0_10.java:260) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.client.BasicMessageProducer.sendImpl(BasicMessageProducer.java:528) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.client.BasicMessageProducer.send(BasicMessageProducer.java:326) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.client.TopicPublisherAdapter.send(TopicPublisherAdapter.java:120) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] at org.candlepin.audit.AMQPBusPublisher.onEvent(AMQPBusPublisher.java:60) ~[AMQPBusPublisher.class:na] at org.candlepin.audit.ListenerWrapper.onMessage(ListenerWrapper.java:52) [ListenerWrapper.class:na] at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1117) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1] at org.hornetq.core.client.impl.ClientConsumerImpl.access$500(ClientConsumerImpl.java:57) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1] at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1252) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1] at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:107) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75] Caused by: org.apache.qpid.transport.SessionException: timed out waiting for completion at org.apache.qpid.transport.Session.invoke(Session.java:747) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.transport.Session.invoke(Session.java:627) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.transport.SessionInvoker.messageTransfer(SessionInvoker.java:93) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.client.BasicMessageProducer_0_10.sendMessage(BasicMessageProducer_0_10.java:246) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] ... 12 common frames omitted 2015-03-11 16:59:35,290 [req=4b45db65-b67f-41b2-a76e-fc3a8146bcc7, org=] INFO org.candlepin.common.filter.LoggingFilter - Request: verb=GET, uri=/candlepin/pools/8a90e0064c0a34be014c0aa1bd59021a 2015-03-11 16:59:35,362 [req=4b45db65-b67f-41b2-a76e-fc3a8146bcc7, org=Default_Organization] INFO org.candlepin.common.filter.LoggingFilter - Response: status=200, content-type="application/json", time=72 ==> /var/log/candlepin/error.log <== 2015-03-11 16:59:35,129 [=, org=] ERROR org.candlepin.audit.AMQPBusPublisher - Unable to send event: Event [id=null, target=POOL, type=CREATED, time=Wed Mar 11 16:58:34 EDT 2015, entity=8a90e0064c0a34be014c0aa1bcd60192] javax.jms.JMSException: Exception when sending message:timed out waiting for completion at org.apache.qpid.client.BasicMessageProducer_0_10.sendMessage(BasicMessageProducer_0_10.java:260) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.client.BasicMessageProducer.sendImpl(BasicMessageProducer.java:528) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.client.BasicMessageProducer.send(BasicMessageProducer.java:326) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.client.TopicPublisherAdapter.send(TopicPublisherAdapter.java:120) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] at org.candlepin.audit.AMQPBusPublisher.onEvent(AMQPBusPublisher.java:60) ~[AMQPBusPublisher.class:na] at org.candlepin.audit.ListenerWrapper.onMessage(ListenerWrapper.java:52) [ListenerWrapper.class:na] at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1117) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1] at org.hornetq.core.client.impl.ClientConsumerImpl.access$500(ClientConsumerImpl.java:57) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1] at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1252) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1] at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:107) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75] Caused by: org.apache.qpid.transport.SessionException: timed out waiting for completion at org.apache.qpid.transport.Session.invoke(Session.java:747) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.transport.Session.invoke(Session.java:627) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.transport.SessionInvoker.messageTransfer(SessionInvoker.java:93) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1] at org.apache.qpid.client.BasicMessageProducer_0_10.sendMessage(BasicMessageProducer_0_10.java:246) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1] ... 12 common frames omitted ==> /var/log/foreman/production.log <== 2015-03-11 16:59:35 [I] re-indexing pool 8a90e0064c0a34be014c0aa1bd59021a. ==> /var/log/candlepin/candlepin.log <== 2015-03-11 17:00:00,030 [job=ExpiredPoolsJob-17e91971-84c2-4640-be9d-488dfbd15747, org=] INFO org.candlepin.pinsetter.tasks.ExpiredPoolsJob - Executing ExpiredPoolsJob 2015-03-11 17:00:00,039 [job=ExpiredPoolsJob-17e91971-84c2-4640-be9d-488dfbd15747, org=] INFO org.candlepin.controller.CandlepinPoolManager - Expired pools: 0
This is likely being caused by tomcat being started before the message bus is ready. We had a similar issue with gutterball and was addressed by waiting/retrying connection for x amount of time on start up. We could do something like this for candlepin as well.
Initially I thought this BZ was similar to the one that was addressed in gutterball, but it does not appear to be. This will have to be addressed in a different way.
*** Bug 1202896 has been marked as a duplicate of this bug. ***
Marking this as a blocker since the blocker bug https://bugzilla.redhat.com/show_bug.cgi?id=1202896 is closed as a duplicate of this bug.
If we do not fix this we get hit with: " Subscriptions page shows incorrect counts" that was found in the above mentioned closed dupe. marking as a beta blocker
@omaciel - Could you please attach the manifest that you are using to trigger those errors? @sthirugn - Could you provide more info on how you are registering the content-hosts? Initially we were thinking that this could be due to dropped connections, but I'm starting to become suspicious that it may be related to the size of the message (event) that is being sent. Is there a test machine somewhere that I could use to reproduce this issue?
(In reply to Michael Stead from comment #10) > @sthirugn - Could you provide more info on how you are registering the > content-hosts? Nothing fancy, just the usual steps: 1. Install sat6 cert rpm -Uvh http://$sat6host/pub/katello-ca-consumer-latest.noarch.rpm 2. Register/subscribe via activation key: subscription-manager register --org="Default_Organization" --activationkey=$ak
The time outs appear to be caused by sending a large message and is reproducible by importing a manifest containing a large subscription such as the Employee SKU. It usually presents itself 60 seconds after the import completes. The issues seen while registering appear to be a side affect of the initial error that occurs post import. We are hopeful that the commit outlined below will address the time out issues. It seems to fix the issue on a locally installed candlepin. Once the PR is reviewed, it will be merged into master and the BZ status will be updated. Addressed by PR: https://github.com/candlepin/candlepin/pull/859 in commit https://github.com/candlepin/candlepin/commit/2f14efc1988602609c76af178f47fc42e3839f74 == commit message == The Qpid client was throwing SessionExceptions complaining that it "timed out waiting for completion". By adding the sync_publisher='persistent' option to the connection URL, we can force Qpid to sync after ever persistent object is sent. The end result is that the client blocks until the bus confirms receipt of the message. Since the session is maintained due to the blocking, we don't have the issue of timing out waiting to establish a session. See https://qpid.apache.org/releases/qpid-0.14/books/Programming-In-Apache-Qpid/html/ch03s02.html
PR merged into master. We will get a new build ready ASAP.
Ready in candlepin-0.9.48-1.
Verified. Satellite-6.1.0-RHEL-6-20150331.1 # rpm -qa | grep candlepin candlepin-scl-runtime-1-5.el6_4.noarch candlepin-selinux-0.9.48-1.el6.noarch candlepin-scl-rhino-1.7R3-1.el6_4.noarch candlepin-scl-quartz-2.1.5-5.el6_4.noarch candlepin-scl-1-5.el6_4.noarch candlepin-tomcat6-0.9.48-1.el6.noarch candlepin-0.9.48-1.el6.noarch candlepin-common-1.0.22-1.el6.noarch But https://bugzilla.redhat.com/show_bug.cgi?id=1208100 is still open.
*** Bug 1115602 has been marked as a duplicate of this bug. ***
This bug is slated to be released with Satellite 6.1.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2015:1592