Bug 725251

Summary: Debug logging is too verbose for medium to large JON server farms.
Product: [Other] RHQ Project Reporter: Simeon Pinder <spinder>
Component: Core Server Assignee: Nobody <nobody>
Status: NEW --- QA Contact:
Severity: low Docs Contact:
Priority: medium    
Version: unspecified CC: hrupp
Fixed In Version: Doc Type: Bug Fix
Bug Blocks: 625146    

Description Simeon Pinder 2011-07-24 17:32:49 UTC
Description of problem: Debug logging generates too many messages for large installations. Because all alerting and condition processing happens on the server side, with more than 100 agents the log files become difficult to follow, and it is hard to maintain relevant context.

As an example, with 138 agents and just nine simple alerts defined, each roll of the log files is around 50MB of text and covers only about 15 minutes of monitored time. With a more complicated monitoring setup or more aggressive alerting, log file processing becomes fairly tedious and grep calls can begin to slow down. Often you need to know the precise time window to focus on. When no telltale exceptions are thrown and you don't yet know what the problem is, the search becomes slow.

Are there logging mechanisms in other projects, such as Tomcat or Apache, from which we can learn better ways to group or break out logging? Another alternative is to provide better tooling examples or samples for log file mining.
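
As a possible starting point for the log file mining tooling mentioned above, here is a minimal Python sketch that keeps only the log entries inside a given time window, optionally filtered by a substring. It assumes each entry starts with a timestamp such as `2011-07-24 17:32:49,123` (a common log4j layout; the actual RHQ server pattern may differ). The function name `lines_in_window` is hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, Iterator

# Assumption: every log entry starts with "YYYY-MM-DD HH:MM:SS", e.g.
# "2011-07-24 17:32:49,123 DEBUG [category] message".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def lines_in_window(
    lines: Iterable[str],
    start: datetime,
    end: datetime,
    needle: str = "",
) -> Iterator[str]:
    """Yield entries stamped within [start, end] whose first line contains needle.

    Lines without a timestamp (for example stack trace frames) are treated as
    continuations and follow the decision made for the entry they belong to.
    """
    keep = False
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            keep = start <= stamp <= end and needle in line
        if keep:
            yield line
```

Passing an open file object as `lines` streams the file line by line, so a 50MB roll is not loaded into memory at once. Because continuation lines inherit the decision of their entry, multi-line stack traces stay attached to the message that produced them.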


Comment 1 Heiko W. Rupp 2011-07-24 18:38:02 UTC
Actually, writing all that log information also has an impact on raw server performance:
- creating the info can be expensive
- writing the log file involves writing to disk, which could become the bottleneck (and yes, I have seen that in the past with our internal unit tests and small file sizes for the size rolling appender).
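
To illustrate the first point, here is a minimal sketch (in Python rather than the server's Java, and not taken from the RHQ code base) of guarding an expensive debug message so that the detail is only built when debug logging is actually enabled. The helper name `debug_lazy` and its parameters are hypothetical. `Logger.isEnabledFor` is part of Python's standard `logging` module.

```python
import logging
from typing import Callable


def debug_lazy(
    log: logging.Logger,
    template: str,
    build_detail: Callable[[], object],
) -> bool:
    """Log at DEBUG level, calling build_detail() only if DEBUG is enabled.

    Returns True when the message was handed to the logger, False when the
    expensive detail was skipped entirely.
    """
    if log.isEnabledFor(logging.DEBUG):
        # The detail is computed here, and only here.
        log.debug(template, build_detail())
        return True
    return False
```

With the logger set to INFO or higher, `build_detail` is never called, so the cost of creating the message disappears along with the disk write.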