Bug 1200877 - Candlepin raising "Unable to send event" error
Summary: Candlepin raising "Unable to send event" error
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Candlepin
Version: 6.1.0
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: Unspecified
Assignee: Michael Stead
QA Contact: sthirugn@redhat.com
URL:
Whiteboard:
: 1115602 1202896 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-03-11 14:48 UTC by sthirugn@redhat.com
Modified: 2020-08-13 08:13 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-12 05:29:41 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
candlepin.log (6.94 KB, text/plain)
2015-03-11 14:48 UTC, sthirugn@redhat.com
no flags Details
foreman-debug (645.78 KB, application/x-xz)
2015-03-11 14:51 UTC, sthirugn@redhat.com
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1208100 0 unspecified CLOSED content-host unregister returns success but shows dynflow validation error 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHSA-2015:1592 0 normal SHIPPED_LIVE Important: Red Hat Satellite 6.1.1 on RHEL 6 2015-08-12 09:04:35 UTC

Internal Links: 1208100

Description sthirugn@redhat.com 2015-03-11 14:48:53 UTC
Created attachment 1000528 [details]
candlepin.log

Description of problem:
Candlepin error after the content hosts are registered in Satellite6

Version-Release number of selected component (if applicable):
Satellite-6.1.0-RHEL-6-20150303.0

How reproducible:
Always

Steps to Reproduce:
1.Install sat6
2.Register and subscribe few content hosts to satellite6
3.Review /var/log/candlepin/candlepin.log

Actual results:
(see the attached log)
2015-03-11 10:43:06,234 [=, org=] ERROR org.candlepin.audit.AMQPBusPublisher - Unable to send event: Event [id=null, target=CONSUMER, type=CREATED, time=Wed Mar 11 10:42:03 EDT 2015, entity=8a9084974be35df8014c094904510bfb]
...
...

Expected results:
No error

Additional info:

Comment 1 sthirugn@redhat.com 2015-03-11 14:51:39 UTC
Created attachment 1000529 [details]
foreman-debug

Comment 3 Og Maciel 2015-03-11 21:05:58 UTC
For me I am seeing candlepin complain but I don't have any content hosts. Seems that I trigger this issue by:

* Importing a valid manifest
* Visit the Content > Red Hat Repositories page and expanding the Red Hat Enterprise Linux Server product

Log shows:

==> /var/log/candlepin/candlepin.log <==
2015-03-11 16:59:35,129 [=, org=] ERROR org.candlepin.audit.AMQPBusPublisher - Unable to send event: Event [id=null, target=POOL, type=CREATED, time=Wed Mar 11 16:58:34 EDT 2015, entity=8a90e0064c0a34be014c0aa1bcd60192]
javax.jms.JMSException: Exception when sending message:timed out waiting for completion
	at org.apache.qpid.client.BasicMessageProducer_0_10.sendMessage(BasicMessageProducer_0_10.java:260) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.client.BasicMessageProducer.sendImpl(BasicMessageProducer.java:528) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.client.BasicMessageProducer.send(BasicMessageProducer.java:326) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.client.TopicPublisherAdapter.send(TopicPublisherAdapter.java:120) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	at org.candlepin.audit.AMQPBusPublisher.onEvent(AMQPBusPublisher.java:60) ~[AMQPBusPublisher.class:na]
	at org.candlepin.audit.ListenerWrapper.onMessage(ListenerWrapper.java:52) [ListenerWrapper.class:na]
	at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1117) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1]
	at org.hornetq.core.client.impl.ClientConsumerImpl.access$500(ClientConsumerImpl.java:57) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1]
	at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1252) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1]
	at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:107) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
Caused by: org.apache.qpid.transport.SessionException: timed out waiting for completion
	at org.apache.qpid.transport.Session.invoke(Session.java:747) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.transport.Session.invoke(Session.java:627) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.transport.SessionInvoker.messageTransfer(SessionInvoker.java:93) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.client.BasicMessageProducer_0_10.sendMessage(BasicMessageProducer_0_10.java:246) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	... 12 common frames omitted
2015-03-11 16:59:35,290 [req=4b45db65-b67f-41b2-a76e-fc3a8146bcc7, org=] INFO  org.candlepin.common.filter.LoggingFilter - Request: verb=GET, uri=/candlepin/pools/8a90e0064c0a34be014c0aa1bd59021a
2015-03-11 16:59:35,362 [req=4b45db65-b67f-41b2-a76e-fc3a8146bcc7, org=Default_Organization] INFO  org.candlepin.common.filter.LoggingFilter - Response: status=200, content-type="application/json", time=72

==> /var/log/candlepin/error.log <==
2015-03-11 16:59:35,129 [=, org=] ERROR org.candlepin.audit.AMQPBusPublisher - Unable to send event: Event [id=null, target=POOL, type=CREATED, time=Wed Mar 11 16:58:34 EDT 2015, entity=8a90e0064c0a34be014c0aa1bcd60192]
javax.jms.JMSException: Exception when sending message:timed out waiting for completion
	at org.apache.qpid.client.BasicMessageProducer_0_10.sendMessage(BasicMessageProducer_0_10.java:260) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.client.BasicMessageProducer.sendImpl(BasicMessageProducer.java:528) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.client.BasicMessageProducer.send(BasicMessageProducer.java:326) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.client.TopicPublisherAdapter.send(TopicPublisherAdapter.java:120) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	at org.candlepin.audit.AMQPBusPublisher.onEvent(AMQPBusPublisher.java:60) ~[AMQPBusPublisher.class:na]
	at org.candlepin.audit.ListenerWrapper.onMessage(ListenerWrapper.java:52) [ListenerWrapper.class:na]
	at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1117) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1]
	at org.hornetq.core.client.impl.ClientConsumerImpl.access$500(ClientConsumerImpl.java:57) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1]
	at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1252) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1]
	at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:107) [hornetq-core-client-2.3.14.Final-redhat-1.jar:2.3.14.Final-redhat-1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
Caused by: org.apache.qpid.transport.SessionException: timed out waiting for completion
	at org.apache.qpid.transport.Session.invoke(Session.java:747) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.transport.Session.invoke(Session.java:627) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.transport.SessionInvoker.messageTransfer(SessionInvoker.java:93) ~[qpid-common-0.30.redhat-1.jar:0.30.redhat-1]
	at org.apache.qpid.client.BasicMessageProducer_0_10.sendMessage(BasicMessageProducer_0_10.java:246) ~[qpid-client-0.30.redhat-1.jar:0.30.redhat-1]
	... 12 common frames omitted

==> /var/log/foreman/production.log <==
2015-03-11 16:59:35 [I] re-indexing pool 8a90e0064c0a34be014c0aa1bd59021a.

==> /var/log/candlepin/candlepin.log <==
2015-03-11 17:00:00,030 [job=ExpiredPoolsJob-17e91971-84c2-4640-be9d-488dfbd15747, org=] INFO  org.candlepin.pinsetter.tasks.ExpiredPoolsJob - Executing ExpiredPoolsJob
2015-03-11 17:00:00,039 [job=ExpiredPoolsJob-17e91971-84c2-4640-be9d-488dfbd15747, org=] INFO  org.candlepin.controller.CandlepinPoolManager - Expired pools: 0

Comment 5 Michael Stead 2015-03-16 10:53:41 UTC
This is likely being caused by tomcat being started before the message bus is ready. We had a similar issue with gutterball and was addressed by waiting/retrying connection for x amount of time on start up.

We could do something like this for candlepin as well.

Comment 6 Michael Stead 2015-03-17 19:41:56 UTC
Initially I thought this BZ was similar to the one that was addressed in gutterball, but it does not appear to be.

This will have to be addressed in a different way.

Comment 7 Devan Goodwin 2015-03-25 12:19:21 UTC
*** Bug 1202896 has been marked as a duplicate of this bug. ***

Comment 8 sthirugn@redhat.com 2015-03-25 12:22:30 UTC
Marking this as a blocker since the blocker bug https://bugzilla.redhat.com/show_bug.cgi?id=1202896 is closed as a duplicate of this bug.

Comment 9 Mike McCune 2015-03-26 20:36:35 UTC
If we do not fix this we get hit with:

" Subscriptions page shows incorrect counts" that was found in the above mentioned closed dupe.

marking as a beta blocker

Comment 10 Michael Stead 2015-03-27 12:30:49 UTC
@omaciel - Could you please attach the manifest that you are using to trigger those errors?

@sthirugn - Could you provide more info on how you are registering the content-hosts?

Initially we were thinking that this could be due to dropped connections, but I'm starting to become suspicious that it may be related to the size of the message (event) that is being sent.

Is there a test machine somewhere that I could use to reproduce this issue?

Comment 11 sthirugn@redhat.com 2015-03-27 13:51:59 UTC
(In reply to Michael Stead from comment #10)
> @sthirugn - Could you provide more info on how you are registering the
> content-hosts?

Nothing fancy, just the usual steps:
1. Install sat6 cert
rpm -Uvh http://$sat6host/pub/katello-ca-consumer-latest.noarch.rpm

2. Register/subscribe via activation key:
subscription-manager register --org="Default_Organization" --activationkey=$ak

Comment 12 Michael Stead 2015-03-30 23:02:51 UTC
The time outs appear to be caused by sending a large message and is reproducible by importing a manifest containing a large subscription such as the Employee SKU. It usually presents itself 60 seconds after the import completes. The issues seen while registering appear to be a side affect of the initial error that occurs post import.

We are hopeful that the commit outlined below will address the time out issues. It seems to fix the issue on a locally installed candlepin.

Once the PR is reviewed, it will be merged into master and the BZ status will be updated.

Addressed by PR: https://github.com/candlepin/candlepin/pull/859 in commit https://github.com/candlepin/candlepin/commit/2f14efc1988602609c76af178f47fc42e3839f74

== commit message ==
The Qpid client was throwing SessionExceptions complaining that it
"timed out waiting for completion".  By adding the
sync_publisher='persistent' option to the connection URL, we can force
Qpid to sync after ever persistent object is sent.  The end result is
that the client blocks until the bus confirms receipt of the message.
Since the session is maintained due to the blocking, we don't have the
issue of timing out waiting to establish a session.

See
https://qpid.apache.org/releases/qpid-0.14/books/Programming-In-Apache-Qpid/html/ch03s02.html

Comment 13 Michael Stead 2015-03-31 13:02:16 UTC
PR merged into master. We will get a new build ready ASAP.

Comment 14 Devan Goodwin 2015-03-31 14:22:37 UTC
Ready in candlepin-0.9.48-1.

Comment 16 sthirugn@redhat.com 2015-04-01 14:27:07 UTC
Verified. 

Satellite-6.1.0-RHEL-6-20150331.1
# rpm -qa | grep candlepin
candlepin-scl-runtime-1-5.el6_4.noarch
candlepin-selinux-0.9.48-1.el6.noarch
candlepin-scl-rhino-1.7R3-1.el6_4.noarch
candlepin-scl-quartz-2.1.5-5.el6_4.noarch
candlepin-scl-1-5.el6_4.noarch
candlepin-tomcat6-0.9.48-1.el6.noarch
candlepin-0.9.48-1.el6.noarch
candlepin-common-1.0.22-1.el6.noarch

But https://bugzilla.redhat.com/show_bug.cgi?id=1208100 is still open.

Comment 17 Dustin Tsang 2015-04-10 18:03:17 UTC
*** Bug 1115602 has been marked as a duplicate of this bug. ***

Comment 18 Bryan Kearney 2015-08-11 13:37:20 UTC
This bug is slated to be released with Satellite 6.1.

Comment 19 errata-xmlrpc 2015-08-12 05:29:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:1592


Note You need to log in before you can comment on or make changes to this bug.