Bug 1599778 - [System] error JournalInactive:resource_manager couldn't setup next timer firing [NEEDINFO]
Summary: [System] error JournalInactive:resource_manager couldn't setup next timer firing
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Qpid
Version: 6.3.2
Hardware: Unspecified
OS: Unspecified
medium
low vote
Target Milestone: Unspecified
Assignee: Kim van der Riet
QA Contact: Radovan Drazny
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-07-10 15:01 UTC by Ben
Modified: 2019-09-09 17:41 UTC (History)
19 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
bbuckingham: needinfo? (kim.vdriet)


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3530241 None None None 2018-07-12 13:12:56 UTC
Apache JIRA QPID-8220 None None None 2018-07-24 13:18:57 UTC

Description Ben 2018-07-10 15:01:41 UTC
Description of problem:
Ever since the upgrade from 6.2.15 to 6.3.2 I've been seeing the below in /var/log/messages:

Jul 10 11:19:45 satellite1 qpidd: 2018-07-10 11:19:45 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -489803366ns[500ms]
Jul 10 11:19:45 satellite1 qpidd[17643]: 2018-07-10 11:19:45 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -489803366ns[500ms]
Jul 10 11:25:11 satellite1 qpidd: 2018-07-10 11:25:11 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -494655935ns[500ms]
Jul 10 11:25:11 satellite1 qpidd[17643]: 2018-07-10 11:25:11 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -494655935ns[500ms]
Jul 10 11:25:40 satellite1 qpidd[17643]: 2018-07-10 11:25:40 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -488106997ns[500ms]
Jul 10 11:25:40 satellite1 qpidd: 2018-07-10 11:25:40 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -488106997ns[500ms]
Jul 10 11:25:46 satellite1 qpidd[17643]: 2018-07-10 11:25:46 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -487929775ns[500ms]
Jul 10 11:25:46 satellite1 qpidd: 2018-07-10 11:25:46 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -487929775ns[500ms]
Jul 10 11:26:48 satellite1 qpidd[17643]: 2018-07-10 11:26:48 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -300645087ns[500ms]
Jul 10 11:26:48 satellite1 qpidd: 2018-07-10 11:26:48 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -300645087ns[500ms]
Jul 10 11:26:50 satellite1 qpidd[17643]: 2018-07-10 11:26:50 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -492408420ns[500ms]
Jul 10 11:26:50 satellite1 qpidd: 2018-07-10 11:26:50 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -492408420ns[500ms]
Jul 10 12:00:04 satellite1 qpidd[17643]: 2018-07-10 12:00:04 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -483756834ns[500ms]
Jul 10 12:00:04 satellite1 qpidd: 2018-07-10 12:00:04 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -483756834ns[500ms]
Jul 10 12:00:27 satellite1 qpidd[17643]: 2018-07-10 12:00:27 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -488598707ns[500ms]
Jul 10 12:00:27 satellite1 qpidd: 2018-07-10 12:00:27 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -488598707ns[500ms]
Jul 10 12:00:39 satellite1 qpidd[17643]: 2018-07-10 12:00:39 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -467732051ns[500ms]
Jul 10 12:00:39 satellite1 qpidd: 2018-07-10 12:00:39 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -467732051ns[500ms]
Jul 10 12:02:57 satellite1 qpidd: 2018-07-10 12:02:57 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -444784686ns[500ms]
Jul 10 12:02:57 satellite1 qpidd[17643]: 2018-07-10 12:02:57 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -444784686ns[500ms]
Jul 10 12:03:02 satellite1 server: Exception in thread "Dispatcher-0-Conn-551" java.lang.IllegalStateException: dispatcher is not started
Jul 10 12:03:02 satellite1 server: at org.apache.qpid.client.AMQSession.dispatch(AMQSession.java:3251)
Jul 10 12:03:02 satellite1 server: at org.apache.qpid.client.message.UnprocessedMessage.dispatch(UnprocessedMessage.java:54)
Jul 10 12:03:02 satellite1 server: at org.apache.qpid.client.AMQSession$Dispatcher.run(AMQSession.java:3403)
Jul 10 12:03:02 satellite1 server: at java.lang.Thread.run(Thread.java:748)
Jul 10 12:04:32 satellite1 qpidd[17643]: 2018-07-10 12:04:32 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -122488667ns[500ms]
Jul 10 12:04:32 satellite1 qpidd: 2018-07-10 12:04:32 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -122488667ns[500ms]
Jul 10 12:05:43 satellite1 kernel: sched: RT throttling activated
Jul 10 12:10:43 satellite1 qpidd: 2018-07-10 12:10:43 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -440677246ns[500ms]
Jul 10 12:10:43 satellite1 qpidd[17643]: 2018-07-10 12:10:43 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -440677246ns[500ms]
Jul 10 12:23:27 satellite1 qpidd[17643]: 2018-07-10 12:23:27 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -478566692ns[500ms]
Jul 10 12:23:27 satellite1 qpidd: 2018-07-10 12:23:27 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -478566692ns[500ms]
Jul 10 12:52:52 satellite1 qpidd: 2018-07-10 12:52:52 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -471807626ns[500ms]
Jul 10 12:52:52 satellite1 qpidd[17643]: 2018-07-10 12:52:52 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -471807626ns[500ms]
Jul 10 12:58:01 satellite1 qpidd: 2018-07-10 12:58:01 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -490003834ns[500ms]
Jul 10 12:58:01 satellite1 qpidd[17643]: 2018-07-10 12:58:01 [System] error JournalInactive:reserved_resource_worker-2@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -490003834ns[500ms]
Jul 10 13:05:43 satellite1 qpidd: 2018-07-10 13:05:43 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -45191188ns[500ms]
Jul 10 13:05:43 satellite1 qpidd[17643]: 2018-07-10 13:05:43 [System] error JournalInactive:reserved_resource_worker-0@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -45191188ns[500ms]
Jul 10 13:05:57 satellite1 pulp: celery.concurrency.asynpool:ERROR: Timed out waiting for UP message from <Worker(Worker-203, started daemon)>
Jul 10 14:46:27 satellite1 qpidd[17643]: 2018-07-10 14:46:27 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -419645658ns[500ms]
Jul 10 14:46:27 satellite1 qpidd: 2018-07-10 14:46:27 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -419645658ns[500ms]
Jul 10 15:04:54 satellite1 qpidd[17643]: 2018-07-10 15:04:54 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -360105284ns[500ms]
Jul 10 15:04:54 satellite1 qpidd: 2018-07-10 15:04:54 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -360105284ns[500ms]
Jul 10 15:05:47 satellite1 qpidd[17643]: 2018-07-10 15:05:47 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -338090542ns[500ms]
Jul 10 15:05:47 satellite1 qpidd: 2018-07-10 15:05:47 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -338090542ns[500ms]
Jul 10 15:05:53 satellite1 qpidd[17643]: 2018-07-10 15:05:53 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -314424617ns[500ms]
Jul 10 15:05:53 satellite1 qpidd: 2018-07-10 15:05:53 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -314424617ns[500ms]
Jul 10 15:06:22 satellite1 qpidd[17643]: 2018-07-10 15:06:22 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -496097906ns[500ms]
Jul 10 15:06:22 satellite1 qpidd: 2018-07-10 15:06:22 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -496097906ns[500ms]
Jul 10 15:06:26 satellite1 qpidd[17643]: 2018-07-10 15:06:26 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -255113198ns[500ms]
Jul 10 15:06:26 satellite1 qpidd: 2018-07-10 15:06:26 [System] error JournalInactive:resource_manager couldn't setup next timer firing: -255113198ns[500ms]

Version-Release number of selected component (if applicable):
RHEL7
Satellite 6.3.2 (update run on 2018-07-10)

How reproducible:
Possibly under load; I've been running lots of katello agent jobs to get satellite-6.3-tools-upgrade installed and katello-agent upgrades.  Although I only have 363 content hosts.

Steps to Reproduce:
1. Upgrade to 6.3.2
2. Run lots of jobs
3. Watch /var/log/messages

Actual results:
No idea if anything's actually breaking

Expected results:
Lack of scary messages

Additional info:
Please ask.

Comment 1 Ben 2018-07-10 15:03:11 UTC
Additional messages:

Jul 10 16:00:06 satellite1 qpidd: 2018-07-10 16:00:06 [System] error JournalInactive:katello_event_queue couldn't setup next timer firing: -481765625ns[500ms]
Jul 10 16:00:06 satellite1 qpidd[17643]: 2018-07-10 16:00:06 [System] error JournalInactive:katello_event_queue couldn't setup next timer firing: -481765625ns[500ms]

Comment 2 Ben 2018-07-10 15:38:39 UTC
Additional messages (but may not be connected):

Jul 10 16:27:25 satellite1 server: Exception in thread "Dispatcher-0-Conn-2117" java.lang.IllegalStateException: dispatcher is not started
Jul 10 16:27:25 satellite1 server: at org.apache.qpid.client.AMQSession.dispatch(AMQSession.java:3251)
Jul 10 16:27:25 satellite1 server: at org.apache.qpid.client.message.UnprocessedMessage.dispatch(UnprocessedMessage.java:54)
Jul 10 16:27:25 satellite1 server: at org.apache.qpid.client.AMQSession$Dispatcher.run(AMQSession.java:3403)
Jul 10 16:27:25 satellite1 server: at java.lang.Thread.run(Thread.java:748)

Comment 3 Ben 2018-07-11 11:21:12 UTC
More messages:

Jul 10 18:02:22 satellite1 puppet-master[8519]: Starting Puppet master version 3.8.6
Jul 10 18:02:22 satellite1 puppet-master[8542]: The environment must be purely alphanumeric, not ''
Jul 11 01:38:21 satellite1 qpidd: 2018-07-11 01:38:21 [System] error JournalInactive:celery couldn't setup next timer firing: -497884899ns[500ms]
Jul 11 01:38:21 satellite1 qpidd[17643]: 2018-07-11 01:38:21 [System] error JournalInactive:celery couldn't setup next timer firing: -497884899ns[500ms]
Jul 11 03:43:02 satellite1 su: (to foreman) root on none
Jul 11 10:50:19 satellite1 qpidd[17643]: 2018-07-11 10:50:19 [System] error JournalInactive:celery couldn't setup next timer firing: -447291083ns[500ms]
Jul 11 10:50:19 satellite1 qpidd: 2018-07-11 10:50:19 [System] error JournalInactive:celery couldn't setup next timer firing: -447291083ns[500ms]
Jul 11 11:27:26 satellite1 qpidd[17643]: 2018-07-11 11:27:26 [System] error JournalInactive:reserved_resource_worker-3@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -489401776ns[500ms]
Jul 11 11:27:26 satellite1 qpidd: 2018-07-11 11:27:26 [System] error JournalInactive:reserved_resource_worker-3@satellite1.srv.uis.private.cam.ac.uk.dq couldn't setup next timer firing: -489401776ns[500ms]
Jul 11 11:40:19 satellite1 qpidd: 2018-07-11 11:40:19 [System] error JournalInactive:celery couldn't setup next timer firing: -490328521ns[500ms]
Jul 11 11:40:19 satellite1 qpidd[17643]: 2018-07-11 11:40:19 [System] error JournalInactive:celery couldn't setup next timer firing: -490328521ns[500ms]

Comment 4 Pavel Moravec 2018-07-11 13:55:46 UTC
Hello,
I recommend raising support case rather than filing bugzilla. The above might not be a bug but e.g. configuration issue (i.e. system time updated at some unfortunate time, causing qpidd internal timers get confused?). To investigate that, please open a support case via

https://access.redhat.com

and attach sosreport for start.

(won't katello-service restart remedy the problem? that would be also worth to try / know)

Comment 5 Ben 2018-07-11 15:57:36 UTC
I did a katello-service restart at 15:03 or so and saw similar errors at 16:07.  I've opened a support call on your recommendation.

https://access.redhat.com/support/cases/#/case/02139101, if you can see it.

Comment 6 Pavel Moravec 2018-07-12 12:39:58 UTC
Kim,
do you have an idea of reproducer for the above error log? And could you confrm the error log is harm-less?

customer is on qpid-cpp-server-linearstore 1.36-18, so up-to-date *except for* the deadlock BZ https://bugzilla.redhat.com/show_bug.cgi?id=1588015#c8


My source code analysis when 
JournalInactive:..... couldn't setup next timer firing: -338090542ns[500ms]
can happen:

1) in https://github.com/apache/qpid-cpp/blob/master/src/qpid/sys/Timer.cpp#L82 :
- when the timer wants to setup next fire, nextFireTime > AbsTime::now(), meaning there is no reason to set up yet
- the negative values is a bug, there should be reverse comparison; then the numbers make sense (values up to 500ms what is default of journal-flush-timeout)

2) in https://github.com/apache/qpid-cpp/blob/master/src/qpid/linearstore/JournalImpl.cpp#L46 :
- setupNextFire was called _before_ the timer was really fired (that caused the 1) )
- more technically from Timer.cpp, setupNextFire or restart method was called before it should / concurrently (as sometimes the value delay is almost 500ms what the period is)
- that means, InactivityFireEvent::reset was called twice within the 500ms period without firing the timer meantime; both times in state either FIRED or FLUSHED
- the method body is under a lock, so state was changed from FIRED/FLUSHED to RUNNING, then somehow to FLUSHED before 2nd call of reset method
- so InactivityFireEvent::flush must be called a short time after InactivityFireEvent::fire, before timer was executed
- that is invoked from JournalImpl::flush only
- that is invoked from:
  - MessageStoreImpl::flush
  - TxnCtxt::jrnl_flush (transactions not used, not relevant)
  - jcntl::flush (??)

That brought me to the reproducer attempt in frequent sending sole messages like:

while true; do
		qpid-send -a DurableQueue -m 1 --content-size=1390 -durable=true --capacity=1
done

but without getting that error.


(I solely see that error on some our Satellites, but without an evident trigger)

Comment 7 Pavel Moravec 2018-07-13 09:48:53 UTC
Adding some debugs, here is a sequence of calls of InactivityFireEvent methods that lead to the error log ("starts" means method is called, "executed" means method acquired the _ifeStateLock lock):


- flush starts with _state=RUNNING
- fire starts with _state=RUNNING
- flush is executed, setting _state=FLUSHED
- reset starts with _state=FLUSHED
- reset is executed, setting _state=RUNNING (and nextFireTime="now+500ms")
- fire is executed, setting _state=FIRED
- 2nd reset starts with _state=FIRED
- 2nd reset is attempted to be executed - too soon after 1st reset, so it fails in setupNextFire

Comment 8 Pavel Moravec 2018-07-13 09:51:56 UTC
(In reply to Pavel Moravec from comment #7)
> Adding some debugs, here is a sequence of calls of InactivityFireEvent
> methods that lead to the error log ("starts" means method is called,
> "executed" means method acquired the _ifeStateLock lock):
> 
> 
> - flush starts with _state=RUNNING
> - fire starts with _state=RUNNING
> - flush is executed, setting _state=FLUSHED
> - reset starts with _state=FLUSHED
> - reset is executed, setting _state=RUNNING (and nextFireTime="now+500ms")
> - fire is executed, setting _state=FIRED
> - 2nd reset starts with _state=FIRED
> - 2nd reset is attempted to be executed - too soon after 1st reset, so it
> fails in setupNextFire

That means, the problem is when:

- Timer fires fire() method
- before the method is executed, inactivity fire event performs flush and reset
- fire is completed
- another reset comes too soon

Comment 9 Pavel Moravec 2018-07-13 10:49:07 UTC
OK, here is a VERY reliable reproducer, generating the error log within seconds:

# run qpidd with small jrnl flush timeout, to really really speed the reproducer
qpidd --journal-flush-timeout=50ms &

# receive messages from durable queue - just to prevent queue full
queue="DurableQueue; {create: always, node:{ type:queue, durable:true}}"
qpid-receive -a "$queue" --print-content=no -f &

# in a loop, send messages concurrently, triggering flush by each producer
while true; do 
  for i in $(seq 1 30); do
    qpid-send -a "$queue" --content-size=123456 --durable=true -m1 &
  done
  while [ $(pgrep qpid-send | wc -l) -gt 0 ]; do
    sleep 1
  done
done

Comment 10 Pavel Moravec 2018-07-13 11:06:03 UTC
FYI the error log does not appear in 1.36.0-9. So I assume it popped up due to QPID-7975 or QPID-7895.

Comment 11 Kim van der Riet 2018-07-16 18:18:16 UTC
I am starting to take a look at this, I will provide some feedback shortly when I have had a chance to pick at it for a little while...

Comment 12 Kim van der Riet 2018-07-18 17:20:06 UTC
I agree with Pavel's analysis. This error occurs when the following conditions are met:

1. The timer is in state RUNNING.

2. flush(), reset() and fire() events occur on separate threads almost simultaneously. The flush() event acquires _ifeStateLock first, followed by reset(), then fire().

3. A second reset() occurs within the time set on the timer in condition 2.

This will cause the following sequence:

a. flush() flushes the store, putting the timer into state FLUSHED.
b. flush() releases _ifeStateLock, allowing reset() to acquire the lock.
c. reset() calls Timer::restart(), adding Timer::period to Timer::nextFireTime. The timer is set into state RUNNING. reset() releases the lock, allowing fire() to acquire it.
d. fire() flushes the store again (inefficient, but benign), then timer is put into state FIRED. This is technically an errored state, as Timer::nextFireTime is no longer ready to fire.

If at any time while Timer::nextFireTime is still not yet ready to fire (ie within the period set in step c above):

e. reset() is called, and as the timer is in state FIRED, allows Timer::setupNextFire() to be called.  This results in the "couldn't setup next timer firing" error message described in this bugzilla.

From this analysis, I can see:
1. This is a race condition;
2. It was introduced when this state machine was added to JournalImpl::InactivityFireEvent to solve another issue;
3. This sequence of events does not affect the operation of the timer in any way, and is benign (other than the confusing error messages that show up in the broker log file).

I will look at a way to solve this, and I don't think it is serious from an operational point-of-view.

Comment 13 Kim van der Riet 2018-07-24 13:26:27 UTC
By changing the way the timer is set up for the next firing, this error message can be avoided. This is done by calling restart() instead of setUpNextFiring(). Initial testing has shown that there is no discernible difference in timer or journal behavior, other than the error message is no longer present. This solution also avoids any risky changes to the timer state machine.

Comment 14 Kim van der Riet 2018-07-24 13:36:52 UTC
Proposed fix committed:

Commit 428eafa1f65e22e04ea63b73aebcc192282df72c in qpid-cpp's branch refs/heads/master from Kim van der Riet
[ https://git-wip-us.apache.org/repos/asf?p=qpid-cpp.git;h=428eafa ]

QPID-8220: Proposed fix for timer error message issue, also minor change to error message itself to make less confusing

Comment 17 Ben 2018-08-17 13:22:45 UTC
Can you confirm the following messages are associated, please?

Aug 17 14:09:57 satellite1 server: Exception in thread "Dispatcher-0-Conn-1130" java.lang.IllegalStateException: dispatcher is not started
Aug 17 14:09:57 satellite1 server: at org.apache.qpid.client.AMQSession.dispatch(AMQSession.java:3251)
Aug 17 14:09:57 satellite1 server: at org.apache.qpid.client.message.UnprocessedMessage.dispatch(UnprocessedMessage.java:54)
Aug 17 14:09:57 satellite1 server: at org.apache.qpid.client.AMQSession$Dispatcher.run(AMQSession.java:3403)
Aug 17 14:09:57 satellite1 server: at java.lang.Thread.run(Thread.java:748)

Comment 18 Pavel Moravec 2018-08-20 06:03:15 UTC
(In reply to Ben from comment #17)
> Can you confirm the following messages are associated, please?
> 
> Aug 17 14:09:57 satellite1 server: Exception in thread
> "Dispatcher-0-Conn-1130" java.lang.IllegalStateException: dispatcher is not
> started
> Aug 17 14:09:57 satellite1 server: at
> org.apache.qpid.client.AMQSession.dispatch(AMQSession.java:3251)
> Aug 17 14:09:57 satellite1 server: at
> org.apache.qpid.client.message.UnprocessedMessage.
> dispatch(UnprocessedMessage.java:54)
> Aug 17 14:09:57 satellite1 server: at
> org.apache.qpid.client.AMQSession$Dispatcher.run(AMQSession.java:3403)
> Aug 17 14:09:57 satellite1 server: at java.lang.Thread.run(Thread.java:748)

It shouldnt be. Assuming this is from /var/log/messages on Satellite6 server, this is a message from a java client, that must be candlepin (or some non-Satellite client?). First it is werid that candlepin is logging to /var/log/messages at all, second the journal errors dont have any impact to broker or its behavour.

Comment 19 Ben 2018-08-20 15:12:51 UTC
Yup, this is in /var/log/messages.  And I assume it's Candlepin as there's nothing else on the box it could be.  I guess this is another issue, then?  Oh dear.

Comment 20 Pavel Moravec 2018-08-20 15:41:25 UTC
(In reply to Ben from comment #19)
> Yup, this is in /var/log/messages.  And I assume it's Candlepin as there's
> nothing else on the box it could be.  I guess this is another issue, then? 
> Oh dear.

It seems like a separate issue - please file new support case and attach a new sosreport there (an older one with such error shall be sufficient, but might not be in 100% cases).


Note You need to log in before you can comment on or make changes to this bug.