Description of problem:

After restarting VDSM, the ha-agent loops on connecting to and disconnecting from VDSM with what looks like an immediate timeout. Everything else works fine (the ha-agent keeps another working, stable connection to VDSM that is used for the actual calls), but this client thread keeps looping over and over after the restart. The result is that the VDSM log is spammed with WARN messages approximately every ~10 s (the ha-agent monitoring loop cycle period):

2021-11-22 12:39:14,609+1000 INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:37988 (protocoldetector:61)
2021-11-22 12:39:14,612+1000 WARN (Reactor thread) [vds.dispatcher] unhandled write event (betterAsyncore:184)
2021-11-22 12:39:14,612+1000 INFO (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::1:37988 (protocoldetector:125)
2021-11-22 12:39:14,613+1000 INFO (Reactor thread) [Broker.StompAdapter] Processing CONNECT request (stompserver:95)
2021-11-22 12:39:14,613+1000 INFO (JsonRpc (StompReactor)) [Broker.StompAdapter] Subscribe command received (stompserver:124)
2021-11-22 12:39:16,615+1000 WARN (JsonRpc (StompReactor)) [vds.dispatcher] unhandled close event (betterAsyncore:184)

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-2.4.9-1.el8ev.noarch
vdsm-4.40.90.4-1.el8ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Have a working environment with Hosted-Engine, not in maintenance
2. Restart just the vdsmd service

Actual results:
* VDSM log spammed
* ha-agent opening and closing a connection to VDSM indefinitely

Expected results:
* No connection retry loop

Additional info:
* Enable debug logging on the ha-agent and you will see this at each loop:

MainThread::DEBUG::2021-11-22 12:36:54,825::hosted_engine::542::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Sleeping 10 seconds
Client localhost:54321::DEBUG::2021-11-22 12:37:02,455::stomp::398::stomp.AsyncDispatcher::(handle_connect) managed to connect successfully.
Client localhost:54321::DEBUG::2021-11-22 12:37:02,456::stomp::431::stomp.AsyncDispatcher::(handle_error) Communication error occurred.
Client localhost:54321::DEBUG::2021-11-22 12:37:02,456::stomp::431::stomp.AsyncDispatcher::(handle_error) Communication error occurred.
Client localhost:54321::DEBUG::2021-11-22 12:37:02,456::stompclient::128::yajsonrpc.protocols.stomp.AsyncClient::(handle_timeout) Timeout occurred, trying to reconnect
Client localhost:54321::DEBUG::2021-11-22 12:37:02,456::stomp::398::stomp.AsyncDispatcher::(handle_connect) managed to connect successfully.
Client localhost:54321::DEBUG::2021-11-22 12:37:02,457::betterAsyncore::154::vds.dispatcher::(send) SSL error sending to <yajsonrpc.betterAsyncore.Dispatcher connected ::1:54321 at 0x7f043d54c160>: The operation did not complete (read) (_ssl.c:2207)
[the identical SSL error line above is repeated 34 times in total, all between 12:37:02,457 and 12:37:02,458]
Client localhost:54321::DEBUG::2021-11-22 12:37:04,462::stompclient::128::yajsonrpc.protocols.stomp.AsyncClient::(handle_timeout) Timeout occurred, trying to reconnect
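For context: the repeated "The operation did not complete (read)" text is the message of Python's ssl.SSLWantReadError, raised when a non-blocking SSL layer needs more bytes from the peer before it can make progress, which is consistent with the agent driving a connection whose peer side is not answering. A minimal, certificate-free sketch (using ssl.MemoryBIO; this is only an illustration of the error class, not VDSM's own socket-wrapping code) that triggers the same condition:

```python
import ssl

# A client-side TLS handshake over in-memory BIOs: no server bytes are
# available yet, so the first do_handshake() cannot complete and raises
# SSLWantReadError -- the "operation did not complete (read)" error seen
# throughout the ha-agent debug log.
ctx = ssl.create_default_context()
ctx.check_hostname = False          # no real peer to verify here
ctx.verify_mode = ssl.CERT_NONE
incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
tls = ctx.wrap_bio(incoming, outgoing)

try:
    tls.do_handshake()              # writes ClientHello, then needs a reply
    raised = None
except ssl.SSLWantReadError as exc:
    raised = type(exc).__name__
print(raised)
```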
I think something goes wrong when closing the socket to the VDSM instance that restarted: netstat shows old connections stuck in FIN_WAIT2 state on the ha-agent side.
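The FIN_WAIT2 observation fits standard TCP close semantics: the side that calls close() first sends a FIN, and if the peer ACKs that FIN but never closes its own end, the closer's endpoint sits in FIN_WAIT2. A minimal, Linux-only sketch with plain TCP sockets (illustrative only, not VDSM or ha-agent code; it reads state 05 = FIN_WAIT2 from /proc/net/tcp):

```python
import socket
import time

# Server and client on loopback; the client closes first and the server
# deliberately never closes its accepted socket, leaving the client's
# endpoint half-closed.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", port))
conn, _ = srv.accept()

cli.close()        # client sends FIN; server ACKs it but never sends its own
time.sleep(0.3)

# /proc/net/tcp: column 2 is the remote address as "HEXIP:HEXPORT",
# column 3 the state; 05 means FIN_WAIT2.
hexport = "%04X" % port
with open("/proc/net/tcp") as f:
    states = [ln.split()[3] for ln in f
              if ln.split()[2].endswith(":" + hexport)]
in_fin_wait2 = "05" in states
print(in_fin_wait2)   # the closed client endpoint is stuck in FIN_WAIT2

conn.close()
srv.close()
```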
Moving back to POST, need to cherry-pick the patch for 4.4.
Please fill in the "Fixed In Version" field with the relevant fixed-in version.
Our current build contains ovirt-hosted-engine-ha-2.4.9-1.el8ev.noarch; we will wait a bit until 2.4.10-1 arrives.
ovirt-hosted-engine-ha-2.4.10-1.el8ev.noarch
ovirt-hosted-engine-setup-2.5.4-2.el8ev.noarch
Red Hat Enterprise Linux release 8.5 (Ootpa)
Linux 4.18.0-348.13.1.el8_5.x86_64 #1 SMP Tue Jan 18 16:53:27 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

# systemctl restart ovirt-hosted-engine-ha
# cat /etc/ovirt-hosted-engine-ha/agent-log.conf
[logger_root]
level=DEBUG
handlers=syslog,logfile
propagate=0

# systemctl restart vdsmd
# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-01-25 18:04:12 IST; 2min 2s ago
  Process: 264350 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 264464 (vdsmd)
    Tasks: 115 (limit: 175716)
   Memory: 102.7M
   CGroup: /system.slice/vdsmd.service
           ├─264464 /usr/bin/python3 /usr/share/vdsm/vdsmd
           ├─264692 /usr/libexec/ioprocess --read-pipe-fd 69 --write-pipe-fd 68 --max-threads 10 --max-queued-request>
           ├─265066 /usr/libexec/ioprocess --read-pipe-fd 104 --write-pipe-fd 103 --max-threads 10 --max-queued-reque>
           ├─265139 /usr/libexec/ioprocess --read-pipe-fd 132 --write-pipe-fd 131 --max-threads 10 --max-queued-reque>
           ├─265155 /usr/libexec/ioprocess --read-pipe-fd 139 --write-pipe-fd 138 --max-threads 10 --max-queued-reque>
           ├─265163 /usr/libexec/ioprocess --read-pipe-fd 145 --write-pipe-fd 144 --max-threads 10 --max-queued-reque>
           ├─265170 /usr/libexec/ioprocess --read-pipe-fd 151 --write-pipe-fd 150 --max-threads 10 --max-queued-reque>
           ├─265177 /usr/libexec/ioprocess --read-pipe-fd 157 --write-pipe-fd 156 --max-threads 10 --max-queued-reque>
           └─265184 /usr/libexec/ioprocess --read-pipe-fd 163 --write-pipe-fd 162 --max-threads 10 --max-queued-reque>

Jan 25 18:04:35 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN unhandled close event
Jan 25 18:04:39 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: ERROR ssl handshake: socket error, address: ::1
Jan 25 18:04:41 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:04:41 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:04:42 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:04:42 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:05:48 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:05:48 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:05:48 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:05:49 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>

In the vdsm log I no longer see the initial loop. The vdsm log is attached. Moving to VERIFIED.
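For future verification runs, the presence or absence of the loop can be checked mechanically by counting the warnings (the vdsm.log path below is the default location on a RHV host and is an assumption about the deployment). Since no live host is available here, the same filter is demonstrated against an inline excerpt from the original report:

```shell
# On a host under test you would run:
#   grep -cE 'unhandled (write|close) event' /var/log/vdsm/vdsm.log
# A count that keeps growing roughly every 10s means the loop is still there.
# The filter itself, shown on two lines taken from the original report:
printf '%s\n' \
  '2021-11-22 12:39:14,612+1000 WARN (Reactor thread) [vds.dispatcher] unhandled write event (betterAsyncore:184)' \
  '2021-11-22 12:39:16,615+1000 WARN (JsonRpc (StompReactor)) [vds.dispatcher] unhandled close event (betterAsyncore:184)' \
  | grep -cE 'unhandled (write|close) event'
# prints 2
```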
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHV RHEL Host (ovirt-host) [ovirt-4.4.10]), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0462
Due to QE capacity constraints, we are not going to cover this issue in our automation.