Bugzilla will be upgraded to version 5.0. The upgrade date is tentatively scheduled for 2 December 2018, pending final testing and feedback.
Bug 1358136 - [LOGGING] /var/log/ovirt-engine/ui.log is growing at a rate of 16mb/second
[LOGGING] /var/log/ovirt-engine/ui.log is growing at a rate of 16mb/second
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: General (Show other bugs)
3.6.7
All Linux
high Severity urgent (vote)
: ovirt-4.0.2
: 4.0.2.4
Assigned To: vszocs
Pavel Novotny
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2016-07-20 03:11 EDT by Gil Klein
Modified: 2016-08-22 12:22 EDT (History)
9 users (show)

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
This update prevents UI exceptions from repeatedly spamming the remote /var/log/ovirt-engine/ui.log after subsequent occurrences of the same error. Each remote operation is an HTTP request and consumes network resources. Any uncaught exception in the Administration Portal or User Portal is compared to the last one, and only logged remotely if it's not the same. All occurrences are still logged on the web browser.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-22 08:32:21 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: UX
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt‑4.0.z+
mgoldboi: exception+
rule-engine: planning_ack+
rule-engine: devel_ack+
pstehlik: testing_ack+


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 61237 master MERGED webadmin,userportal: Improve the way logging works 2016-07-26 09:48 EDT
oVirt gerrit 61411 ovirt-engine-4.0 MERGED webadmin,userportal: Improve the way logging works 2016-07-26 11:03 EDT
oVirt gerrit 61414 ovirt-engine-4.0.2 MERGED webadmin,userportal: Improve the way logging works 2016-07-26 11:59 EDT

  None (edit)
Description Gil Klein 2016-07-20 03:11:24 EDT
Description of problem:
/var/log/ovirt-engine/ui.log is growing at a rate of 16mb/second on a production system

Version-Release number of selected component (if applicable):
rhevm-3.6.7.5-0.1.el6.noarch

How reproducible:
100% on this system

Steps to Reproduce:
Monitor the /var/log/ovirt-engine/ui.log log file

Actual results:
# while [ 1 ]; do du -sh ui.log; sleep 60; done
86M	ui.log
102M	ui.log
118M	ui.log
134M	ui.log


Expected results:
Logs should suppress repeating messages or rotate quickly based on file size  

Additional info:

Log snippet:

2016-07-20 10:08:20,176 ERROR [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService] (ajp-/127.0.0.1:8702-13) [] Permutation name: 44BAC8E4AC393CDE74A1EDFCB3C555A1
2016-07-20 10:08:20,176 ERROR [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService] (ajp-/127.0.0.1:8702-13) [] Uncaught exception: : com.google.gwt.core.client.JavaScriptException: (NS_ERROR_NOT_INITIALIZED)
 __gwt$exception: <skipped>
 result: 3253927937
 filename: https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html
 lineNumber: 17734
 columnNumber: 0
 inner: null
 data: null
 stack: kv@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:17734:25
hu@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:9982:24
dwf@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:17259:24
O3@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:24153:7163
hwf/c.onreadystatechange<@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:16937:62
Yt@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:12103:26
au@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:21295:56
_t/<@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:20646:53
:
	at Unknown.kv(Unknown Source)
	at Unknown.hu(Unknown Source)
	at Unknown.dwf(Unknown Source)
	at Unknown.O3(Unknown Source)
	at Unknown.hwf/c.onreadystatechange<(Unknown Source)
	at Unknown.Yt(Unknown Source)
	at Unknown.au(Unknown Source)
	at Unknown._t/<(Unknown Source)
	at Unknown.anonymous(Unknown Source)
Comment 2 Yaniv Kaul 2016-07-21 03:23:05 EDT
1. I assume it also affect the server CPU usage.
2. Can you install debug symbols and report the issue specifically? This bug will be used for flood protection for this mechanism, but we should also look at the specific exception.
Comment 3 Michal Skrivanek 2016-07-21 03:30:19 EDT
can you add /var/log/httpd/ssl_access_log and check it's "POST /ovirt-engine/userportal/remote_logging" calls?
If so, you can identify the client causing that, restart that browser and the flooding should stop
Comment 5 Gil Klein 2016-07-21 04:18:20 EDT
(In reply to Yaniv Kaul from comment #2)
> 1. I assume it also affect the server CPU usage.
CPU is around 25%-30% 
> 2. Can you install debug symbols and report the issue specifically? This bug
> will be used for flood protection for this mechanism, but we should also
> look at the specific exception.
Sure, here it is with debug symbols installed:

2016-07-21 11:16:18,675 ERROR [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService] (ajp-/127.0.0.1:8702-7) [] Uncaught exception: : com.google.gwt.core.client.JavaScriptException: (NS_ERROR_NOT_INITIALIZED)
 __gwt$exception: <skipped>
 result: 3253927937
 filename: https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html
 lineNumber: 17734
 columnNumber: 0
 inner: null
 data: null
 stack: kv@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:17734:25
hu@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:9982:24
dwf@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:17259:24
O3@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:24153:7163
hwf/c.onreadystatechange<@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:16937:62
Yt@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:12103:26
au@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:21295:56
_t/<@https://rhevm-3.qa.lab.tlv.redhat.com/ovirt-engine/webadmin/44BAC8E4AC393CDE74A1EDFCB3C555A1.cache.html:20646:53
:
	at com.google.gwt.core.client.impl.UnloadSupport.setTimeout0(UnloadSupport.java:40) [gwt-servlet.jar:]
	at com.google.gwt.core.client.impl.Impl.setTimeout(Impl.java:285) [gwt-servlet.jar:]
	at com.google.gwt.xhr.client.XMLHttpRequest.$clearOnReadyStateChange(XMLHttpRequest.java:162) [gwt-servlet.jar:]
	at com.google.gwt.http.client.RequestBuilder$1.onReadyStateChange(RequestBuilder.java:409) [gwt-servlet.jar:]
	at Unknown.hwf/c.onreadystatechange<(Unknown Source)
	at com.google.gwt.core.client.impl.Impl.apply(Impl.java:296) [gwt-servlet.jar:]
	at com.google.gwt.core.client.impl.Impl.entry0(Impl.java:335) [gwt-servlet.jar:]
	at Unknown._t/<(Unknown Source)
	at Unknown.anonymous(Unknown Source)
Comment 6 vszocs 2016-07-22 07:11:28 EDT
I have a fix to prevent spamming ui.log, just need to test it.

The actual error seems to be: http://stackoverflow.com/q/26781855

TL;DR seems like a Firefox bug (race condition when canceling request while the response is coming in) -- can someone verify this doesn't reproduce in Chrome or IE?
Comment 7 vszocs 2016-07-22 09:11:06 EDT
Fix posted and verified on artificially triggered error that is always effectively the same.
Comment 8 Michal Skrivanek 2016-07-22 10:03:23 EDT
(In reply to vszocs from comment #6)
> TL;DR seems like a Firefox bug (race condition when canceling request while
> the response is coming in) -- can someone verify this doesn't reproduce in
> Chrome or IE?

Gil has the list of client IPs, worth checking with owners what browser they use
Comment 9 vszocs 2016-07-22 10:06:59 EDT
(In reply to Michal Skrivanek from comment #8)
> Gil has the list of client IPs, worth checking with owners what browser they
> use

OK, but in any case, the stack trace doesn't contain any reference to oVirt UI code, so the actual error encountered is definitely GWT core / browser related, not application code related.
Comment 12 Pavel Novotny 2016-08-17 10:59:19 EDT
Verified in rhevm-4.0.2.6-0.1.el7ev.noarch

I managed to repeatedly throw a UI exception in Webadmin
(by frantically clicking and drag&dropping in dialog host -> Setup Host Networks). 

In ui.log file, the exception is logged only once.
In browser console, the exception is logged every time and has a counter for second and further occurrence.


ui.log:
--~-- 
2016-08-17 16:10:52,000 ERROR [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService] (default task-59) [] Permutation name: E99F8DCCF76495B2EE9534AA6AA5E8A0
2016-08-17 16:10:52,001 ERROR [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService] (default task-59) [] Uncaught exception: com.google.gwt.event.shared.UmbrellaException: Exception caught: null Operands
        at java.lang.Throwable.fillInStackTrace(Throwable.java:114) [rt.jar:1.8.0_101]
        at java.lang.RuntimeException.RuntimeException(RuntimeException.java:32) [rt.jar:1.8.0_101]
--~--



browser console (Firefox 48):
--~--
Wed Aug 17 16:10:51 GMT+200 2016 remote
SEVERE: Uncaught exceptioncom.google.gwt.event.shared.UmbrellaException: Exception caught: null Operands
	at Unknown.Bv(Unknown Source)
	at Unknown.Jv(Unknown Source)
	at Unknown.D7(Unknown Source)
...
...
Wed Aug 17 16:11:03 GMT+200 2016 
SEVERE: Uncaught exception (2x)com.google.gwt.event.shared.UmbrellaException: Exception caught: null Operands
	at Unknown.Bv(Unknown Source)
	at Unknown.Jv(Unknown Source)
	at Unknown.D7(Unknown Source)
...
...
Wed Aug 17 16:11:28 GMT+200 2016 
SEVERE: Uncaught exception (3x)com.google.gwt.event.shared.UmbrellaException: Exception caught: null Operands
	at Unknown.Bv(Unknown Source)
	at Unknown.Jv(Unknown Source)
	at Unknown.D7(Unknown Source)
...
...
Wed Aug 17 16:11:53 GMT+200 2016 
SEVERE: Uncaught exception (4x)com.google.gwt.event.shared.UmbrellaException: Exception caught: null Operands
	at Unknown.Bv(Unknown Source)
	at Unknown.Jv(Unknown Source)
	at Unknown.D7(Unknown Source)
...
...
Wed Aug 17 16:14:03 GMT+200 2016 
SEVERE: Uncaught exception (5x)com.google.gwt.event.shared.UmbrellaException: Exception caught: null Operands
	at Unknown.Bv(Unknown Source)
	at Unknown.Jv(Unknown Source)
	at Unknown.D7(Unknown Source)
...
--~--
Comment 13 vszocs 2016-08-22 12:22:43 EDT
(In reply to Pavel Novotny from comment #12)
> I managed to repeatedly throw a UI exception in Webadmin
> (by frantically clicking and drag&dropping in dialog host -> Setup Host
> Networks). 
> 
> In ui.log file, the exception is logged only once.
> In browser console, the exception is logged every time and has a counter for
> second and further occurrence.

Hm, maybe we should open a bug on that (Setup Host Networks) dialog, but thanks for confirming this fix.

Note You need to log in before you can comment on or make changes to this bug.