Bug 1009666 - Not all event sources are being polled by the agent
Not all event sources are being polled by the agent
Status: CLOSED CURRENTRELEASE
Product: JBoss Operations Network
Classification: JBoss
Component: Monitoring - Events (Show other bugs)
JON 3.1.2,JON 3.2
Unspecified Unspecified
unspecified Severity urgent
: ER02
: JON 3.2.0
Assigned To: John Sanda
Mike Foley
: NeedsTestCase
Depends On:
Blocks: 1009699
  Show dependency treegraph
 
Reported: 2013-09-18 16:02 EDT by Larry O'Leary
Modified: 2014-01-02 15:35 EST (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1009699 (view as bug list)
Environment:
Last Closed: 2014-01-02 15:35:13 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Displayed events from different sources. (220.76 KB, image/png)
2013-10-02 05:43 EDT, jvlasak
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 488173 None None None Never

  None (edit)
Description Larry O'Leary 2013-09-18 16:02:04 EDT
Description of problem:
If multiple event sources are configured on an agent, not all of them are polled. It is not clear what actually causes this issue but it happens every time. 

Version-Release number of selected component (if applicable):
4.4.0.JON312GA
4.7.0.JON [adad71f]

How reproducible:
Always

Steps to Reproduce:
1.  Start JBoss ON 3.2 system.
2.  Import RHQ Agent and RHQ Server into inventory.
3.  Create /tmp/generateLog using the script below.
4.  Then run /tmp/generateLog to generate our random log data
        
        chmod +x /tmp/generateLog
        for i in $(seq -w 0 10); do /tmp/generateLog /tmp/file-${i}.log 2>/dev/null & done

5.  Modify RHQ Agent resource configuration Event Report Max Per Source and Event Report Max Total to 1000 and 10000.
6.  Add 11 event logging sources to RHQ Server resource using generated log files /tmp/file-00.log, /tmp/file-01.log, /tmp/file-02.log, ..., /tmp/file-10.log
7.  Restart agent with DEBUG logging enabled.

File: /tmp/generateLog
------------------------------------------------------------------------
#!/bin/sh
#

maxSleepTimePerSleep=2
maxEntriesPerSleep=5
maxEntries=1000000
maxSleepTime=3600





logLevels=("FATAL"
"ERROR"
"WARN "
"INFO "
"DEBUG"
"TRACE")

logThreads=(main
$(echo Thread-{0..99})
$(echo Fake-{0..9}))

logCategories=(apache.cassandra.db.commitlog.CommitLogReplayer
org.jboss.as.server.deployment 
org.jboss.as.messaging 
org.some.other.category)

logMessages=("A log message"
"Another log message"
"A log message with some meaning"
"A log message without any meaning"
"To be a log message or not to be"
"To infinity and beyond"
"Every rose has its throne"
"Come out and pay"
"That thing wasn't supposed to do that."
"Now that's what I'm talking about"'!'
"Friends don't let friends eat grapefruit")

function printRandomLogLevel() {
    numLogLevels=${#logLevels[*]}
    echo -n ${logLevels[$((RANDOM%numLogLevels))]}
}

function printRandomLogThread() {
    numLogThreads=${#logThreads[*]}
    echo -n ${logThreads[$((RANDOM%numLogThreads))]}
}

function printRandomLogCategory() {
    numLogCategories=${#logCategories[*]}
    echo -n ${logCategories[$((RANDOM%numLogCategories))]}
}

function printRandomLogMessage() {
    numLogMessages=${#logMessages[*]}
    echo -n ${logMessages[$((RANDOM%numLogMessages))]}
}


function printLogTimestamp() {
    timestamp=$(date +"%F %T|%N")
    ms="$(printf "%03d" $(( 10#${timestamp#*|}/1000000 )))"
    echo -n "${timestamp%|*},${ms}"
}

function printRandomLogEntry() {
    echo "$(printLogTimestamp) $(printRandomLogLevel) [$(printRandomLogThread)] ($(printRandomLogCategory)) $(printRandomLogMessage)"
}

function printRandomLogEntries() {
    totalSleepTime=0
    totalEntries=0

    while [ true ]; do
        randomSleep=$((RANDOM%${maxSleepTimePerSleep}))
        [ $((${maxSleepTime}-(${totalSleepTime}+${randomSleep}))) -lt 0 ] && randomSleep=$((${randomSleep} +(${maxSleepTime}-(${totalSleepTime}+${randomSleep}))))
        echo >&2 Sleeping for ${randomSleep} seconds...
        sleep ${randomSleep}s
        totalSleepTime=$((totalSleepTime+$randomSleep))
        randomNumEntries=$((RANDOM%${maxEntriesPerSleep}))
        [ $((${maxEntries}-(${totalEntries}+${randomNumEntries}))) -lt 0 ] && randomNumEntries=$((${randomNumEntries} +(${maxEntries}-(${totalEntries}+${randomNumEntries}))))
        echo >&2 Printing ${randomNumEntries} log entries...
        for ((i=0; i<randomNumEntries; i++)); do
            [ $# -gt 0 ] && {
                echo $(printRandomLogEntry) >>"$1"
            } || {
                echo $(printRandomLogEntry)
            }
        done
        totalEntries=$((totalEntries+${randomNumEntries}))
        [ ${totalEntries} -ge ${maxEntries} -o ${totalSleepTime} -ge ${maxSleepTime} ] && break
    done
    echo >&2 Total entries is ${totalEntries} and total sleep time was ${totalSleepTime} seconds.
}

printRandomLogEntries $@
------------------------------------------------------------------------

Actual results:
Agent log (after restart and debug is enabled) will only contain log messages for a single log file source instead of all 11:

2013-09-18 14:54:59,669 DEBUG [EventManager.poller-1] (core.pluginapi.event.log.LogFileEventPoller)- /tmp/file-10.log: 24405 new bytes
2013-09-18 14:55:59,668 DEBUG [EventManager.poller-3] (core.pluginapi.event.log.LogFileEventPoller)- /tmp/file-10.log: 20126 new bytes
2013-09-18 14:56:59,668 DEBUG [EventManager.poller-1] (core.pluginapi.event.log.LogFileEventPoller)- /tmp/file-10.log: 23275 new bytes
2013-09-18 14:57:59,669 DEBUG [EventManager.poller-1] (core.pluginapi.event.log.LogFileEventPoller)- /tmp/file-10.log: 25691 new bytes


Expected results:
Agent log (after restart and debug is enabled) will contain log messages indicated events are being collected from all 11 files.

Additional Information:
If I disable an event source in the JON UI and then re-enable it, events will be collected for that source. This is true for all the sources. But, as soon as the agent is restarted this problem comes back.
Comment 1 Larry O'Leary 2013-09-18 17:53:49 EDT
jsanda did some digging into this and this appears to be the result of a Map being used instead of a List when the callback listeners are registered on agent startup:

org.rhq.core.pc.PluginContainer.addInitializationListener is invoked by  org.rhq.core.pc.event.EventContextImpl.registerEventPollerInternal using the EventContextImpl class name as the listener name. Because a Map is used in the underlying implementation only one event poller can be registered.

Originally, org.rhq.core.pc.PluginContainer.initListeners was of type List but in commit https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=93c0e5f42426e4955b07cc66be8feab7e0e39c73 this was changed to Map.

This is a pretty nasty bug that essentially means that only one log source can be monitored per agent once the agent is restarted.
Comment 2 John Sanda 2013-09-18 20:42:06 EDT
I have pushed the fix to the bug/1009666 branch. As soon as I get the go ahead, I will merge this into master.

https://git.fedorahosted.org/cgit/rhq/rhq.git/log/?h=bug/1009666
Comment 3 John Sanda 2013-09-19 10:00:39 EDT
Changes have been merged to master.

master commit hash: 3843704e61a4
Comment 4 jvlasak 2013-10-02 05:42:57 EDT
Bug verified.
For all 11 log sources were displayed events (see screenshot - contains only a few different sources). Events were created the same way also after restarting the agent.
Comment 5 jvlasak 2013-10-02 05:43:55 EDT
Created attachment 806345 [details]
Displayed events from different sources.

Note You need to log in before you can comment on or make changes to this bug.