| Summary: | (Stomp Reactor) Unable to process messages |
|---|---|
| Product: | [oVirt] ovirt-engine |
| Reporter: | jniederm |
| Component: | BLL.Infra |
| Assignee: | Piotr Kliczewski <pkliczew> |
| Status: | CLOSED DUPLICATE |
| QA Contact: | Pavel Stehlik <pstehlik> |
| Severity: | high |
| Docs Contact: | |
| Priority: | unspecified |
| Version: | 4.0.0 |
| CC: | bugs, jniederm, mperina, oourfali |
| Target Milestone: | ovirt-4.0.0-beta |
| Flags: | oourfali: ovirt-4.0.0?; rule-engine: planning_ack+; mperina: devel_ack+; rule-engine: testing_ack? |
| Target Release: | --- |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Whiteboard: | |
| Fixed In Version: | |
| Doc Type: | Bug Fix |
| Doc Text: | |
| Story Points: | --- |
| Clone Of: | |
| Clones: | 1330383 (view as bug list) |
| Environment: | |
| Last Closed: | 2016-05-20 11:53:46 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | Infra |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Bug Depends On: | |
| Bug Blocks: | 1330383 |
| Attachments: | |
Created attachment 1143898 [details]: logs

Please provide an engine log with debug level.

I updated my log level; I will send the log when it happens again.

Created attachment 1144800 [details]: engine.log in debug mode

It looks like you lost connectivity with host x.x.122.154. Please provide the vdsm log from this host around 2016-04-07 17:23:57,266.

Created attachment 1145077 [details]: vdsm.log from host x.x.122.154

Moving from 4.0 alpha to 4.0 beta, since 4.0 alpha has already been released and the bug is not ON_QA.

Please retest with the latest master. I think the issue was fixed as part of the setup networks fixes.

I've deployed the latest master (commit 96db8e5) and it's working. However, since there are no steps to reproduce, this doesn't carry much informational value.

Unable to reproduce the issue on the current master; according to the previous comments, it was most probably fixed by the patches for the setupNetworks/jsonrpc issues. Closing this as a duplicate of BZ 1323465.

*** This bug has been marked as a duplicate of bug 1323465 ***
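The triage above correlates an engine-side timeout with the vdsm log on the affected host around a given timestamp (2016-04-07 17:23:57,266). A minimal sketch of that correlation step, assuming the standard `YYYY-MM-DD HH:MM:SS,mmm` prefix used by both engine.log and vdsm.log (the function names and the 30-second window are illustrative, not part of this report):

```python
from datetime import datetime, timedelta

def parse_ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    try:
        return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")
    except ValueError:
        return None  # continuation lines (e.g. stack trace frames) have no timestamp

def lines_around(log_lines, anchor, window=timedelta(seconds=30)):
    """Return log lines whose timestamp falls within +/- window of anchor."""
    return [l for l in log_lines
            if (ts := parse_ts(l)) is not None and abs(ts - anchor) <= window]

# The engine-side anchor timestamp quoted in the comment above.
anchor = datetime.strptime("2016-04-07 17:23:57,266", "%Y-%m-%d %H:%M:%S,%f")
```

With vdsm.log read into a list of lines, `lines_around(vdsm_lines, anchor)` narrows the host-side log to the interval in which the engine reported the connectivity loss.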
Description of problem:

Sometimes the engine gets into a bad state in which it prints the following message to the log every 5 seconds. The host (IP 192.168.122.201) becomes "Not responsive" in webadmin. However, the host can still be reached over ssh, `systemctl status vdsmd` reports "active (running)", and remote vdsClient calls work.

2016-04-05 17:50:32,124 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (Stomp Reactor) [] Unable to process messages
2016-04-05 17:50:34,123 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-80) [] Command 'SpmStatusVDSCommand(HostName = fedora_host1, SpmStatusVDSCommandParameters:{runAsync='true', hostId='695c477c-81bb-4797-9c75-9dd7c7cca025', storagePoolId='00000001-0001-0001-0001-000000000028'})' execution failed: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection timeout
2016-04-05 17:50:35,126 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (Stomp Reactor) [] Connecting to /192.168.122.201
2016-04-05 17:50:35,129 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (Stomp Reactor) [] Unable to process messages
2016-04-05 17:50:37,128 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-25) [] Command 'GetCapabilitiesVDSCommand(HostName = fedora_host1, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='695c477c-81bb-4797-9c75-9dd7c7cca025', vds='Host[fedora_host1,695c477c-81bb-4797-9c75-9dd7c7cca025]'})' execution failed: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection timeout
2016-04-05 17:50:37,128 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (DefaultQuartzScheduler_Worker-25) [] Failure to refresh Vds runtime info: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection timeout
2016-04-05 17:50:37,128 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (DefaultQuartzScheduler_Worker-25) [] Exception:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection timeout
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createNetworkException(VdsBrokerCommand.java:157) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:120) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:73) [vdsbroker.jar:]
    at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
    at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:449) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:649) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:121) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refresh(HostMonitoring.java:85) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:234) [vdsbroker.jar:]
    at sun.reflect.GeneratedMethodAccessor82.invoke(Unknown Source) [:1.8.0_77]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_77]
    at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_77]
    at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
    at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
    at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:]
Caused by: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection timeout
    at org.ovirt.vdsm.jsonrpc.client.reactors.stomp.StompClient$1.execute(StompClient.java:56) [vdsm-jsonrpc-java-client.jar:]
    at org.ovirt.vdsm.jsonrpc.client.reactors.stomp.StompClient.postConnect(StompClient.java:90) [vdsm-jsonrpc-java-client.jar:]
    at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.connect(ReactorClient.java:142) [vdsm-jsonrpc-java-client.jar:]
    at org.ovirt.vdsm.jsonrpc.client.JsonRpcClient.getClient(JsonRpcClient.java:114) [vdsm-jsonrpc-java-client.jar:]
    at org.ovirt.vdsm.jsonrpc.client.JsonRpcClient.call(JsonRpcClient.java:73) [vdsm-jsonrpc-java-client.jar:]
    at org.ovirt.engine.core.vdsbroker.jsonrpc.FutureMap.<init>(FutureMap.java:68) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.jsonrpc.JsonRpcVdsServer.getCapabilities(JsonRpcVdsServer.java:255) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand.executeVdsBrokerCommand(GetCapabilitiesVDSCommand.java:15) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:110) [vdsbroker.jar:]
    ... 14 more

Version-Release number of selected component (if applicable):
engine 4.0 master, commit 4ffd5a4
vdsm Version: 4.17.999, Release: 837.gite31d74d.fc23

How reproducible:
Unknown.

Steps to Reproduce:
1. I don't know how to trigger it; it happens pretty randomly.

Additional info:
It can be worked around by restarting the engine. I've never seen the engine recover from this state on its own.
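The defining symptom here is the cadence: the Reactor error recurs at roughly the 5-second host-monitoring interval rather than appearing as a one-off timeout. A small sketch of a heuristic that flags this stuck state when scanning engine.log (the thresholds and function names are illustrative assumptions, not part of the report; the timestamp format is the one shown in the excerpt above):

```python
from datetime import datetime

# The repeating error message from the excerpt above.
ERROR_MARK = "Unable to process messages"

def repeat_intervals(log_lines):
    """Seconds between consecutive occurrences of the Reactor error."""
    stamps = [datetime.strptime(l[:23], "%Y-%m-%d %H:%M:%S,%f")
              for l in log_lines if ERROR_MARK in l]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]

def looks_stuck(log_lines, period=5.0, tolerance=1.5, min_repeats=3):
    """Heuristic: the error repeats steadily at about the monitoring period."""
    gaps = repeat_intervals(log_lines)
    return (len(gaps) >= min_repeats and
            all(abs(g - period) <= tolerance for g in gaps))
```

A steady train of such errors (as opposed to an isolated timeout) would have triggered the restart workaround mentioned above, since the engine was never observed to recover on its own.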