Description of problem: If multiple event sources are configured on an agent, not all of them are polled. It is not clear what actually causes this issue but it happens every time. Version-Release number of selected component (if applicable): 4.4.0.JON312GA 4.7.0.JON [adad71f] How reproducible: Always Steps to Reproduce: 1. Start JBoss ON 3.2 system. 2. Import RHQ Agent and RHQ Server into inventory. 3. Create /tmp/generateLog using the script below. 4. Then run /tmp/generateLog to generate our random log data chmod +x /tmp/generateLog for i in $(seq -w 0 10); do /tmp/generateLog /tmp/file-${i}.log 2>/dev/null & done 5. Modify RHQ Agent resource configuration Event Report Max Per Source and Event Report Max Total to 1000 and 10000. 6. Add 11 event logging sources to RHQ Server resource using generated log files /tmp/file-00.log, /tmp/file-01.log, /tmp/file-02.log, ..., /tmp/file-10.log 7. Restart agent with DEBUG logging enabled. File: /tmp/generateLog ------------------------------------------------------------------------ #!/bin/sh # maxSleepTimePerSleep=2 maxEntriesPerSleep=5 maxEntries=1000000 maxSleepTime=3600 logLevels=("FATAL" "ERROR" "WARN " "INFO " "DEBUG" "TRACE") logThreads=(main $(echo Thread-{0..99}) $(echo Fake-{0..9})) logCategories=(apache.cassandra.db.commitlog.CommitLogReplayer org.jboss.as.server.deployment org.jboss.as.messaging org.some.other.category) logMessages=("A log message" "Another log message" "A log message with some meaning" "A log message without any meaning" "To be a log message or not to be" "To infinity and beyond" "Every rose has its throne" "Come out and pay" "That thing wasn't supposed to do that." "Now that's what I'm talking about"'!' "Friends don't let friends eat grapefruit") function printRandomLogLevel() { numLogLevels=${#logLevels[*]} echo -n ${logLevels[$((RANDOM%numLogLevels))]} } function printRandomLogThread() { numLogThreads=${#logThreads[*]} echo -n ${logThreads[$((RANDOM%numLogThreads))]} } function printRandomLogCategory() { numLogCategories=${#logCategories[*]} echo -n ${logCategories[$((RANDOM%numLogCategories))]} } function printRandomLogMessage() { numLogMessages=${#logMessages[*]} echo -n ${logMessages[$((RANDOM%numLogMessages))]} } function printLogTimestamp() { timestamp=$(date +"%F %T|%N") ms="$(printf "%03d" $(( 10#${timestamp#*|}/1000000 )))" echo -n "${timestamp%|*},${ms}" } function printRandomLogEntry() { echo "$(printLogTimestamp) $(printRandomLogLevel) [$(printRandomLogThread)] ($(printRandomLogCategory)) $(printRandomLogMessage)" } function printRandomLogEntries() { totalSleepTime=0 totalEntries=0 while [ true ]; do randomSleep=$((RANDOM%${maxSleepTimePerSleep})) [ $((${maxSleepTime}-(${totalSleepTime}+${randomSleep}))) -lt 0 ] && randomSleep=$((${randomSleep} +(${maxSleepTime}-(${totalSleepTime}+${randomSleep})))) echo >&2 Sleeping for ${randomSleep} seconds... sleep ${randomSleep}s totalSleepTime=$((totalSleepTime+$randomSleep)) randomNumEntries=$((RANDOM%${maxEntriesPerSleep})) [ $((${maxEntries}-(${totalEntries}+${randomNumEntries}))) -lt 0 ] && randomNumEntries=$((${randomNumEntries} +(${maxEntries}-(${totalEntries}+${randomNumEntries})))) echo >&2 Printing ${randomNumEntries} log entries... for ((i=0; i<randomNumEntries; i++)); do [ $# -gt 0 ] && { echo $(printRandomLogEntry) >>"$1" } || { echo $(printRandomLogEntry) } done totalEntries=$((totalEntries+${randomNumEntries})) [ ${totalEntries} -ge ${maxEntries} -o ${totalSleepTime} -ge ${maxSleepTime} ] && break done echo >&2 Total entries is ${totalEntries} and total sleep time was ${totalSleepTime} seconds. } printRandomLogEntries $@ ------------------------------------------------------------------------ Actual results: Agent log (after restart and debug is enabled) will only contain log messages for a single log file source instead of all 11: 2013-09-18 14:54:59,669 DEBUG [EventManager.poller-1] (core.pluginapi.event.log.LogFileEventPoller)- /tmp/file-10.log: 24405 new bytes 2013-09-18 14:55:59,668 DEBUG [EventManager.poller-3] (core.pluginapi.event.log.LogFileEventPoller)- /tmp/file-10.log: 20126 new bytes 2013-09-18 14:56:59,668 DEBUG [EventManager.poller-1] (core.pluginapi.event.log.LogFileEventPoller)- /tmp/file-10.log: 23275 new bytes 2013-09-18 14:57:59,669 DEBUG [EventManager.poller-1] (core.pluginapi.event.log.LogFileEventPoller)- /tmp/file-10.log: 25691 new bytes Expected results: Agent log (after restart and debug is enabled) will contain log messages indicated events are being collected from all 11 files. Additional Information: If I disable an event source in the JON UI and then re-enable it, events will be collected for that source. This is true for all the sources. But, as soon as the agent is restarted this problem comes back.
jsanda did some digging into this and this appears to be the result of a Map being used instead of a List when the callback listeners are registered on agent startup: org.rhq.core.pc.PluginContainer.addInitializationListener is invoked by org.rhq.core.pc.event.EventContextImpl.registerEventPollerInternal using the EventContextImpl class name as the listener name. Because a Map is used in the underlying implementation only one event poller can be registered. Originally, org.rhq.core.pc.PluginContainer.initListeners was of type List but in commit https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=93c0e5f42426e4955b07cc66be8feab7e0e39c73 this was changed to Map. This is a pretty nasty bug that essentially means that only one log source can be monitored per agent once the agent is restarted.
I have pushed the fix to the bug/1009666 branch. As soon as I get the go ahead, I will merge this into master. https://git.fedorahosted.org/cgit/rhq/rhq.git/log/?h=bug/1009666
Changes have been merged to master. master commit hash: 3843704e61a4
Bug verified. For all 11 log sources were displayed events (see screenshot - contains only a few different sources). Events were created the same way also after restarting the agent.
Created attachment 806345 [details] Displayed events from different sources.