Bug 886690
| Summary: | Alert baseline - OOM | | |
|---|---|---|---|
| Product: | [JBoss] JBoss Operations Network | Reporter: | Viet Nguyen <vnguyen> |
| Component: | Performance | Assignee: | RHQ Project Maintainer <rhq-maint> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | JON 3.1.2 | CC: | hrupp, jshaughn, vnguyen |
| Target Milestone: | --- | | |
| Target Release: | JON 3.3.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-08-27 19:23:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
[While this is coming out of edge case testing, in a real scenario, there are some massively different issues when 5k alerts are created per hour.]

When the server goes OOM, can you please create a heap dump?

This has not been reported recently and significant work has been done in relevant areas since 3.1. Closing as working in current release. If it occurs again, we will need a heap dump. But as Heiko mentions, this is a somewhat manufactured test, given the number of alerts required.
Created attachment 662646
server log

Description of problem:

Hardware: Intel Xeon 2.4GHz RHEV Host

VMs:
- RHEL 6.3
- JON server: 4GB RAM, 2-dual core
- 2 agents on its own server: 4GB RAM, 2-dual core

Out of the box configuration (default jon/agent install, default postgresql install)

Set up simple alerts on filesystem, CPUs, networks to generate ~5000 alerts/hour (~100K/24hr)
Run for about 1 week

Observation
- First few days: Reports -> Recent Alerts: timed out when alerts > 100k. Back to normal when purge job ran.
- After about 1 week server log showed a lot of java.lang.OutOfMemoryError: GC overhead limit exceeded. Restarted the server. After about 20 minutes OOM error came back.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
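For reference, one way the heap dump requested in the comments could be captured is sketched below. This is a minimal illustration, not something taken from the JON or RHQ code base: the class name `HeapDumpSketch`, the helper `dumpHeap`, and the temp-file name are hypothetical, and the sketch assumes a HotSpot JVM on Java 7 or later. It dumps the heap of the JVM it runs in, so to be useful for this bug it would have to execute inside the server process (or the same MXBean would have to be reached over JMX). Alternatively, the standard HotSpot options `-XX:+HeapDumpOnOutOfMemoryError` and `-XX:HeapDumpPath=<dir>` make the JVM write a dump automatically when an OutOfMemoryError is thrown.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper class; the name is illustrative only.
public class HeapDumpSketch {

    /**
     * Writes a heap dump of the current JVM to the given path.
     * The target file must not already exist; recent JDKs also
     * require the name to end in ".hprof".
     */
    static File dumpHeap(String path, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, liveObjectsOnly);
        return new File(path);
    }

    public static void main(String[] args) throws IOException {
        // Reserve a unique file name, then delete the placeholder because
        // dumpHeap fails with an IOException if the file already exists.
        File target = File.createTempFile("jon-server-", ".hprof");
        if (!target.delete()) {
            throw new IOException("Could not remove placeholder " + target);
        }
        File dump = dumpHeap(target.getAbsolutePath(), true);
        System.out.println("Heap dump written to " + dump
                + " (" + dump.length() + " bytes)");
    }
}
```

Passing `true` for `liveObjectsOnly` restricts the dump to reachable objects, which usually keeps the file smaller; the resulting `.hprof` file can be opened in a heap analyzer and attached to the bug.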