Description of problem:

We saw this as part of https://bugzilla.redhat.com/show_bug.cgi?id=1524119

This happens, for example, on a hosted engine environment when VDSM dies. The new jsonrpc logic tries to reconnect, flooding the logs in the process:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitor_base.py", line 115, in _worker
    self.action(self._options)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors/mgmt_bridge.py", line 44, in action
    logger=self._log
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 442, in connect_vdsm_json_rpc
    __vdsm_json_rpc_connect(logger, timeout)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 398, in __vdsm_json_rpc_connect
    timeout=VDSM_MAX_RETRY * VDSM_DELAY
RuntimeError: Couldn't connect to VDSM within 15 seconds

Client localhost:54321::WARNING::2017-12-12 13:46:53,507::betterAsyncore::177::vds.dispatcher::(log_info) unhandled close event
Client localhost:54321::WARNING::2017-12-12 13:46:53,508::betterAsyncore::177::vds.dispatcher::(log_info) unhandled close event
Client localhost:54321::WARNING::2017-12-12 13:46:53,509::betterAsyncore::177::vds.dispatcher::(log_info) unhandled close event
Client localhost:54321::WARNING::2017-12-12 13:46:53,511::betterAsyncore::177::vds.dispatcher::(log_info) unhandled close event
Client localhost:54321::WARNING::2017-12-12 13:46:53,512::betterAsyncore::177::vds.dispatcher::(log_info) unhandled close event

Notice that the timestamp increments are close to 1 ms (that is, one millisecond!) per line.

Btw: we were looking for a way to control the reconnect attempt frequency and to limit the reconnect attempt count, but found nothing. It would be wise to provide such tunables to allow proper error handling when VDSM is really and truly gone; a rough sketch of what such a tunable reconnect policy could look like follows at the end of this report.

Version-Release number of selected component (if applicable):
oVirt 4.2.0 RC1
vdsm 4.20.9

How reproducible:
- Always: start hosted engine and then kill VDSM
- Watch the hosted engine's broker.log

Additional info:
We saw something similar yesterday here https://bugzilla.redhat.com/show_bug.cgi?id=1524119#c7 (which was triggered by https://bugzilla.redhat.com/show_bug.cgi?id=1522878):

Client localhost:54321::DEBUG::2017-12-12 14:14:09,048::stomp::617::yajsonrpc.protocols.stomp.AsyncClient::(handle_timeout) Timeout occurred, trying to reconnect
Client localhost:54321::DEBUG::2017-12-12 14:14:09,049::stomp::617::yajsonrpc.protocols.stomp.AsyncClient::(handle_timeout) Timeout occurred, trying to reconnect
Client localhost:54321::DEBUG::2017-12-12 14:14:09,051::stomp::617::yajsonrpc.protocols.stomp.AsyncClient::(handle_timeout) Timeout occurred, trying to reconnect
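To make the request concrete, here is a minimal, purely illustrative Python sketch of the kind of tunable, capped reconnect policy we have in mind: a bounded number of attempts with an exponentially growing delay instead of retrying roughly every millisecond. None of these names (connect_with_backoff, MAX_RETRIES, BASE_DELAY, MAX_DELAY, the connect callable) exist in vdsm or yajsonrpc; they are assumptions for the example only.

# Hypothetical sketch only -- not the actual vdsm/yajsonrpc API.
import logging
import time

log = logging.getLogger(__name__)

# Imaginary tunables; in a real fix these would come from configuration.
MAX_RETRIES = 10      # give up (and surface an error) after this many attempts
BASE_DELAY = 1.0      # seconds before the first retry
MAX_DELAY = 30.0      # ceiling for the backoff interval


def connect_with_backoff(connect):
    """Call connect() until it succeeds or MAX_RETRIES is exhausted.

    connect is a placeholder for whatever actually establishes the
    jsonrpc connection; it is expected to raise on failure.
    """
    delay = BASE_DELAY
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return connect()
        except Exception as exc:
            log.warning("connect attempt %d/%d failed: %s",
                        attempt, MAX_RETRIES, exc)
            if attempt == MAX_RETRIES:
                raise RuntimeError(
                    "Couldn't connect to VDSM after %d attempts" % attempt)
            time.sleep(delay)
            delay = min(delay * 2, MAX_DELAY)  # exponential backoff, capped

With something like this, broker.log would grow by a handful of lines per outage instead of roughly a thousand per second, and the hard retry limit would let callers fail over cleanly once VDSM is really gone.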
*** Bug 1529458 has been marked as a duplicate of this bug. ***
*** Bug 1529465 has been marked as a duplicate of this bug. ***
Verified in ovirt-hosted-engine-setup-2.2.6-1.el7ev.noarch. /var/log/ovirt-hosted-engine-ha/broker.log now reports only "RuntimeError: Couldn't connect to VDSM within 60 seconds"; there is no jsonrpc reconnect flooding.
This bugzilla is included in the oVirt 4.2.1 release, published on Feb 12th 2018. Since the problem described in this bug report should be resolved in the oVirt 4.2.1 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.