Bug 1006619
| Summary: | Log events sources are unstable and events are lost due to Log4j log parsing not thread safe | | |
|---|---|---|---|
| Product: | [JBoss] JBoss Operations Network | Reporter: | Larry O'Leary <loleary> |
| Component: | Monitoring - Events | Assignee: | John Sanda <jsanda> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | JON 3.1.2 | CC: | hrupp, jsanda, mkoci, myarboro |
| Target Milestone: | ER01 | | |
| Target Release: | JON 3.2.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Clone Of: | | | |
| : | 1015715 (view as bug list) | Environment: | |
| Last Closed: | | Type: | Bug |
| Bug Depends On: | 846082 | | |
| Bug Blocks: | 1015715 | | |
Description
Larry O'Leary
2013-09-10 23:16:14 UTC
I created a test to reproduce the issues described in bug 846082. I reverted the changes in Log4JLogEntryProcessor, making the DateFormat fields static again. The test consistently fails. If you make them instance fields, the test consistently passes.

I did my work in the branch bug/1006619, which has been pushed to origin:
https://git.fedorahosted.org/cgit/rhq/rhq.git/log/?h=bug/1006619

For reference, here is what some of the exceptions look like:

java.lang.NumberFormatException: For input string: "E.423021313E4"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241)
    at java.lang.Double.parseDouble(Double.java:540)
    at java.text.DigitList.getDouble(DigitList.java:168)
    at java.text.DecimalFormat.parse(DecimalFormat.java:1321)
    at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:2088)
    at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1455)
    at java.text.DateFormat.parse(DateFormat.java:355)
    at org.rhq.core.pluginapi.event.log.Log4JLogEntryProcessor.parseDateString(Log4JLogEntryProcessor.java:117)
    at org.rhq.core.pluginapi.event.log.Log4JLogEntryProcessor.processPrimaryLine(Log4JLogEntryProcessor.java:94)
    at org.rhq.core.pluginapi.event.log.MultiLineLogEntryProcessor.processLine(MultiLineLogEntryProcessor.java:98)
    at org.rhq.core.pluginapi.event.log.MultiLineLogEntryProcessor.processLines(MultiLineLogEntryProcessor.java:70)
    at org.rhq.core.pluginapi.event.log.Log4JEventsTest$Producer.run(Log4JEventsTest.java:46)

Keep in mind that due to the lack of exception handling in 3.1.2, these errors would go completely unreported. The changes for bug 846082 add exception handling that captures any RuntimeExceptions.

Setting ON_QA

[22:44:25] <loleary> Well, 6619 is actually fixed by 846082... 6619 can go to ON_QA as it was already fixed (jsanda can confirm) in ER01.
[22:45:58] <loleary> The only reason 9666 related to 6619 was because it is preventing one from actually testing whether 6619 is fixed or not.

I found bug 1017214. What should be done with this bug? Thanks

I do not think bug 1017214 is related, so I will remove it from the dependency list.

Created attachment 812135
All events from the log files are reported on the web.
Verified.
Tested with 15 log files, with messages generated into the files simultaneously by a script. After the processes generating the messages were stopped, the number of events (levels considered: INFO, WARN, ERROR, FATAL) on the web matched the number in all the log files.
See the attached screenshot.
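
For readers following the root cause in the description: java.text.SimpleDateFormat is documented as not synchronized, so a static formatter shared by several log-reading threads can corrupt its parse state, which is consistent with the NumberFormatException shown above. Below is a minimal sketch of the instance-field approach. The class names DateFormatPerInstanceSketch and EntryProcessor, the sample timestamp, and the date pattern are assumptions for illustration only; this is not the actual Log4JLogEntryProcessor code or the Log4JEventsTest from the branch.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class DateFormatPerInstanceSketch {

    /** Hypothetical stand-in for a log entry processor; not the RHQ class. */
    static class EntryProcessor {
        // Instance field: each processor owns its own SimpleDateFormat, so no
        // two threads share the formatter's internal mutable state.
        // Declaring this field static would share one formatter across threads.
        private final SimpleDateFormat dateFormat =
            new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");

        Date parseDateString(String text) throws ParseException {
            return dateFormat.parse(text);
        }
    }

    /** Runs one processor per thread and returns the number of failed or wrong parses. */
    static int countFailures(int threads, int parsesPerThread)
            throws InterruptedException, ParseException {
        final String input = "2013-09-10 23:16:14,123"; // assumed sample timestamp
        // Reference value computed single-threaded before any worker starts.
        final long expected = new EntryProcessor().parseDateString(input).getTime();
        final AtomicInteger failures = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                EntryProcessor processor = new EntryProcessor(); // one per thread
                for (int i = 0; i < parsesPerThread; i++) {
                    try {
                        if (processor.parseDateString(input).getTime() != expected) {
                            failures.incrementAndGet();
                        }
                    } catch (ParseException | RuntimeException e) {
                        // Count the failure instead of letting it go unreported.
                        failures.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return failures.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("failed parses: " + countFailures(8, 10_000));
    }
}
```

Because every thread parses with its own formatter, the failure count should stay at zero. Catching RuntimeException alongside ParseException in the worker mirrors the point made in the description that, without such handling, these errors would go unreported.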