Description of problem: A huge amount of lines from VmsStatisticsFetcher are continuously logged to engine.log under normal operation. This vastly increases the size of the log files and makes checking its contents frustrating. Version-Release number of selected component (if applicable): rhevm-3.6.7.5-0.1.el6 How reproducible: Always Steps to Reproduce: 1. Observe engine.log 2. 3. Actual results: A huge amount of lines like the following: 2016-08-16 13:09:12,149 INFO [org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher] (DefaultQuartzScheduler_Worker-38) [62369c0f] Fetched 1 VMs from VDS 'c6c2e51a-277b-4aaf-86bb-0099b ad0c826' 2016-08-16 13:09:29,013 INFO [org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher] (DefaultQuartzScheduler_Worker-15) [476c44e9] Fetched 1 VMs from VDS 'c6c2e51a-277b-4aaf-86bb-0099b ad0c826' Those account for most of the content in engine.log : # wc -l engine.log 54045 engine.log # grep org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher engine.log | wc -l 45378 Expected results: Maybe these events should be logged with DEBUG level (rather than INFO) so they are not written to engine.log with the default logging configuration. Additional info: These lines are being logged at INFO level: backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/VmsStatisticsFetcher.java 41 @Override 42 protected void onFetchVms() { 43 if (log.isDebugEnabled()) { 44 logBuilder = new StringBuilder(); 45 } 46 super.onFetchVms(); *47 log.info("Fetched {} VMs from VDS '{}'", 48 vdsmVms.size(), 49 vdsManager.getVdsId()); 50 if (log.isDebugEnabled()) { 51 log.debug(logBuilder.toString()); 52 } 53 }
*** This bug has been marked as a duplicate of bug 1366138 ***
(In reply to Arik from comment #1) > > *** This bug has been marked as a duplicate of bug 1366138 *** Arik - do we want to fix this in 3.6.9?
(In reply to Yaniv Kaul from comment #2) It is a really simple and easy to backport fix. If people find it an issue with high severity then sure.
Well, no. 3.6 is a version where this information could be most useful(reproduce known issues fixed in later versions), it's one line per each host every 15s. The impact is very low since in stable conditions the lines are exactly the same except timestamp so they are very well compressed.
The impact is that this, at least, doubles the size of engine.log. In my test environment those lines account for 84% of engine.log. (In reply to Michal Skrivanek from comment #5) > Well, no. 3.6 is a version where this information could be most > useful(reproduce known issues fixed in later versions) That's the very definition of DEBUG. What's the value of dumping all these details with INFO level? > it's one line per > each host every 15s. When there are 100s of hosts several lines per second.
(In reply to Julio Entrena Perez from comment #6) > The impact is that this, at least, doubles the size of engine.log. > In my test environment those lines account for 84% of engine.log. the compressed one? or you're talking about the current engine.log. Does it cause any issues?
(In reply to Michal Skrivanek from comment #8) > the compressed one? or you're talking about the current engine.log. The engine.log file attached to this bz. > Does it > cause any issues? Yes, it creates up to 10 to 12 lines per second (depending on the size of the environment). This results in most of the contents of engine.log being useless and results in increased disk usage. In some circumstances it might be useful to have this information, which is why we are not requesting to get rid of those lines, but to log them with DEBUG level instead, so they are not sent to engine.log by default but can still be enabled if required.
http://gerrit.ovirt.org/62636 shows that we are intending to still log some of those entries at INFO level. Could you please estimate the amount of lines what will still be logged in comparison with the actual approach? (e.g. we expect this to result in 2% of the current lines still being logged).
Julio, we found these printings helpful for the investigation of severe issues that were hard to reproduce. That is why changing them to lower log level is not a good option - telling a user that got into a split-brain that happens in a typical setup once in a year "great, now please change to debug level and wait for it to reproduce so we can investigate" is problematic. I do not think we can provide an estimation like you are asking for without further information about the setup: number of hosts, scheduling policy that is used and the frequency that VMs are being started+stopped+suspended+restored+migrated. But in a typical setup there should be significantly less printings - currently it is printed once in every 15 sec per-host. With the http://gerrit.ovirt.org/62636 it would be once in every 15 sec per-host *and only if the number of VMs running on it is different than it used to 15sec ago*
Verify with: Engine: rhevm-3.6.9-0.1.el6.noarch Steps: 1. Create 3 VMs and run them on one Host. 2. Monitor engine log 3. Stop On of the VMs Results: PASS the log in update only one time. Output: [root@violet-rhel yum.repos.d]# tail -f /var/log/ovirt-engine/engine.log | grep Fetched 2016-08-29 08:22:55,229 INFO [org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher] (DefaultQuartzScheduler_Worker-45) [] Fetched 3 VMs from VDS '8d1a80f2-b180-44e3-8e17-735e1a759cd7' --- after stop VM ---- 2016-08-29 08:28:46,758 INFO [org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher] (DefaultQuartzScheduler_Worker-14) [] Fetched 2 VMs from VDS '8d1a80f2-b180-44e3-8e17-735e1a759cd7'
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2016-1929.html