Created attachment 1202468
engine.log

Description of problem:
Looking at the RHEV.TLV setup, engine.log is flooded with exceptions for host monster01, and the same errors keep repeating. It takes only 1-2 unresponsive hosts (out of many!) to flood the logs.

Version-Release number of selected component (if applicable):
ovirt-engine-4.0.4.2-0.1.el7ev.noarch

2016-09-19 15:16:43,405 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (DefaultQuartzScheduler55) [46908fbb] Failed to fetch vms info for host 'monster01' - skipping VMs monitoring.
2016-09-19 15:16:43,886 WARN [org.ovirt.vdsm.jsonrpc.client.utils.retry.Retryable] (SSL Stomp Reactor) [] Retry failed
2016-09-19 15:16:43,887 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (DefaultQuartzScheduler11) [652074d9] Exception during connection
2016-09-19 15:16:43,889 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler11) [652074d9] Command 'GetCapabilitiesVDSCommand(HostName = monster01, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='4d81dd2e-e5ca-441c-a700-0c72b771c1c2', vds='Host[monster01,4d81dd2e-e5ca-441c-a700-0c72b771c1c2]'})' execution failed: java.net.UnknownHostException: monster01.eng.lab.tlv.redhat.com
2016-09-19 15:16:43,889 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (DefaultQuartzScheduler11) [652074d9] Failure to refresh Vds runtime info: java.net.UnknownHostException: monster01.eng.lab.tlv.redhat.com
2016-09-19 15:16:43,889 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (DefaultQuartzScheduler11) [652074d9] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.UnknownHostException: monster01.eng.lab.tlv.redhat.com
	at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createNetworkException(VdsBrokerCommand.java:157) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:120) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:73) [vdsbroker.jar:]
	at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
	at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:451) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:653) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:121) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refresh(HostMonitoring.java:85) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:238) [vdsbroker.jar:]
	at sun.reflect.GeneratedMethodAccessor94.invoke(Unknown Source) [:1.8.0_101]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_101]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_101]
	at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
	at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
	at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_101]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_101]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_101]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_101]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_101]
Caused by: java.net.UnknownHostException: monster01.eng.lab.tlv.redhat.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280) [rt.jar:1.8.0_101]
	at java.net.InetAddress.getAllByName(InetAddress.java:1192) [rt.jar:1.8.0_101]
	at java.net.InetAddress.getAllByName(InetAddress.java:1126) [rt.jar:1.8.0_101]
	at java.net.InetAddress.getByName(InetAddress.java:1076) [rt.jar:1.8.0_101]
	at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient$1.call(ReactorClient.java:121) [vdsm-jsonrpc-java-client.jar:]
	at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient$1.call(ReactorClient.java:117) [vdsm-jsonrpc-java-client.jar:]
	at org.ovirt.vdsm.jsonrpc.client.utils.retry.Retryable.call(Retryable.java:27) [vdsm-jsonrpc-java-client.jar:]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_101]
	at org.ovirt.vdsm.jsonrpc.client.utils.ReactorScheduler.performPendingOperations(ReactorScheduler.java:28) [vdsm-jsonrpc-java-client.jar:]
	at org.ovirt.vdsm.jsonrpc.client.reactors.Reactor.run(Reactor.java:61) [vdsm-jsonrpc-java-client.jar:]
It probably stops after identifying that the host isn't reachable, doesn't it?
(In reply to Oved Ourfali from comment #1)
> It probably stops after identifying that the host isn't reachable.
> Isn't it?

No:
...
2016-09-19 15:39:15,947 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler21) [] Command 'GetCapabilitiesVDSCommand(HostName = monster01, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='4d81dd2e-e5ca-441c-a700-0c72b771c1c2', vds='Host[monster01,4d81dd2e-e5ca-441c-a700-0c72b771c1c2]'})' execution failed: java.net.UnknownHostException: monster01.eng.lab.tlv.redhat.com: unknown error
2016-09-19 15:39:18,960 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler11) [652074d9] Command 'GetCapabilitiesVDSCommand(HostName = monster01, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='4d81dd2e-e5ca-441c-a700-0c72b771c1c2', vds='Host[monster01,4d81dd2e-e5ca-441c-a700-0c72b771c1c2]'})' execution failed: java.net.UnknownHostException: monster01.eng.lab.tlv.redhat.com
2016-09-19 15:39:21,975 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler6) [] Command 'GetCapabilitiesVDSCommand(HostName = monster01, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='4d81dd2e-e5ca-441c-a700-0c72b771c1c2', vds='Host[monster01,4d81dd2e-e5ca-441c-a700-0c72b771c1c2]'})' execution failed: java.net.UnknownHostException: monster01.eng.lab.tlv.redhat.com
2016-09-19 15:39:24,988 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler30) [69f21400] Command 'GetCapabilitiesVDSCommand(HostName = monster01, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='4d81dd2e-e5ca-441c-a700-0c72b771c1c2', vds='Host[monster01,4d81dd2e-e5ca-441c-a700-0c72b771c1c2]'})' execution failed: java.net.UnknownHostException: monster01.eng.lab.tlv.redhat.com
...

[ykaul@nott16 ovirt-engine]$ grep -c "GetCapabilitiesVDSCommand(HostName = monster01" engine.log
12644
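To go one step beyond the grep count, the repetition interval can be measured as well. Below is a minimal Python sketch, not part of the engine or of any fix: it counts failed GetCapabilitiesVDSCommand entries per host and reports the gap in seconds between consecutive failures. The names `LINE_RE`, `summarize` and `SAMPLE` are made up for this illustration, the regular expression assumes the line layout quoted in this comment, and the sample lines are abbreviated copies of the four entries above (cut after the host name).

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout: "<timestamp> ERROR ... GetCapabilitiesVDSCommand(HostName = <host>, ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ERROR .*"
    r"GetCapabilitiesVDSCommand\(HostName = (?P<host>[^,]+),"
)


def summarize(lines):
    """Return (failure count per host, gaps in seconds between failures per host)."""
    counts = Counter()
    last_seen = {}
    gaps = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a failed GetCapabilitiesVDSCommand entry
        host = match.group("host")
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        counts[host] += 1
        if host in last_seen:
            delta = (ts - last_seen[host]).total_seconds()
            gaps.setdefault(host, []).append(delta)
        last_seen[host] = ts
    return counts, gaps


# Abbreviated copies of the four log lines quoted in this comment.
_PREFIX = "ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]"
_CMD = "Command 'GetCapabilitiesVDSCommand(HostName = monster01,"
SAMPLE = [
    f"2016-09-19 15:39:15,947 {_PREFIX} (DefaultQuartzScheduler21) [] {_CMD}",
    f"2016-09-19 15:39:18,960 {_PREFIX} (DefaultQuartzScheduler11) [652074d9] {_CMD}",
    f"2016-09-19 15:39:21,975 {_PREFIX} (DefaultQuartzScheduler6) [] {_CMD}",
    f"2016-09-19 15:39:24,988 {_PREFIX} (DefaultQuartzScheduler30) [69f21400] {_CMD}",
]

if __name__ == "__main__":
    counts, gaps = summarize(SAMPLE)
    for host, count in counts.items():
        print(host, count, [round(g, 3) for g in gaps.get(host, [])])
```

To run it against the real file, pass an open file handle instead of the sample, e.g. `with open("engine.log") as f: summarize(f)`. On the four lines above the gaps come out at roughly three seconds each, which matches the timestamps quoted in this comment.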
Martin - let's add that to the logging items list.
Verified in rhevm-4.0.5.1-0.1.el7ev.noarch