Bug 2025381 - VDSM logs are spammed due to multiple connection attempts and drops from ha-agent client
Summary: VDSM logs are spammed due to multiple connection attempts and drops from ha-agent client
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
Version: 4.4.9
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.4.10
Target Release: 4.4.10
Assignee: Yedidyah Bar David
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-11-22 02:43 UTC by Germano Veit Michel
Modified: 2022-08-07 10:36 UTC
CC List: 5 users

Fixed In Version: 2.4.10-1
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-02-08 10:08:35 UTC
oVirt Team: Integration
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-44058 0 None None None 2021-11-22 02:44:05 UTC
Red Hat Knowledge Base (Solution) 6530481 0 None None None 2021-11-22 03:30:58 UTC
Red Hat Product Errata RHBA-2022:0462 0 None None None 2022-02-08 10:08:41 UTC
oVirt gerrit 117779 0 master MERGED Properly close the connection to VDSM 2021-11-25 07:37:41 UTC
oVirt gerrit 118070 0 ovirt-4.4 MERGED Properly close the connection to VDSM 2021-12-16 13:24:04 UTC

Description Germano Veit Michel 2021-11-22 02:43:28 UTC
Description of problem:

Upon restarting VDSM, the ha-agent loops on connecting to and disconnecting from VDSM with what looks like an immediate timeout.
Everything else works fine (the ha-agent has another stable, working connection to VDSM, which is used for the actual calls), but this client thread keeps looping over and over after the restart.

The result is that the VDSM logs are spammed with WARN messages approximately every ~10s (the ha-agent monitoring loop cycle period):

2021-11-22 12:39:14,609+1000 INFO  (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:37988 (protocoldetector:61)
2021-11-22 12:39:14,612+1000 WARN  (Reactor thread) [vds.dispatcher] unhandled write event (betterAsyncore:184)
2021-11-22 12:39:14,612+1000 INFO  (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::1:37988 (protocoldetector:125)
2021-11-22 12:39:14,613+1000 INFO  (Reactor thread) [Broker.StompAdapter] Processing CONNECT request (stompserver:95)
2021-11-22 12:39:14,613+1000 INFO  (JsonRpc (StompReactor)) [Broker.StompAdapter] Subscribe command received (stompserver:124)
2021-11-22 12:39:16,615+1000 WARN  (JsonRpc (StompReactor)) [vds.dispatcher] unhandled close event (betterAsyncore:184)

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-2.4.9-1.el8ev.noarch
vdsm-4.40.90.4-1.el8ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Have a working Hosted-Engine environment, not in maintenance
2. Restart just vdsmd service

Actual results:
* VDSM logs spammed
* ha-agent opening and closing connection to VDSM indefinitely

Expected results:
* No connection retry loop

Additional info:
* Enable debug logging on the ha-agent and you will see this at each loop:
MainThread::DEBUG::2021-11-22 12:36:54,825::hosted_engine::542::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Sleeping 10 seconds
Client localhost:54321::DEBUG::2021-11-22 12:37:02,455::stomp::398::stomp.AsyncDispatcher::(handle_connect) managed to connect successfully.
Client localhost:54321::DEBUG::2021-11-22 12:37:02,456::stomp::431::stomp.AsyncDispatcher::(handle_error) Communication error occurred.
Client localhost:54321::DEBUG::2021-11-22 12:37:02,456::stomp::431::stomp.AsyncDispatcher::(handle_error) Communication error occurred.
Client localhost:54321::DEBUG::2021-11-22 12:37:02,456::stompclient::128::yajsonrpc.protocols.stomp.AsyncClient::(handle_timeout) Timeout occurred, trying to reconnect
Client localhost:54321::DEBUG::2021-11-22 12:37:02,456::stomp::398::stomp.AsyncDispatcher::(handle_connect) managed to connect successfully.
Client localhost:54321::DEBUG::2021-11-22 12:37:02,457::betterAsyncore::154::vds.dispatcher::(send) SSL error sending to <yajsonrpc.betterAsyncore.Dispatcher connected ::1:54321 at 0x7f043d54c160>: The operation did not complete (read) (_ssl.c:2207) 
Client localhost:54321::DEBUG::2021-11-22 12:37:02,457::betterAsyncore::154::vds.dispatcher::(send) SSL error sending to <yajsonrpc.betterAsyncore.Dispatcher connected ::1:54321 at 0x7f043d54c160>: The operation did not complete (read) (_ssl.c:2207) 
(the SSL error line above repeats another ~30 times within the same millisecond)
Client localhost:54321::DEBUG::2021-11-22 12:37:04,462::stompclient::128::yajsonrpc.protocols.stomp.AsyncClient::(handle_timeout) Timeout occurred, trying to reconnect
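
For context, a minimal self-contained sketch (hypothetical code, not the actual yajsonrpc/betterAsyncore sources; betterAsyncore builds on the stdlib asyncore module, which was removed in Python 3.12) of the suspected mechanism: a dispatcher that is never close()d stays registered in the global socket map, so the reactor keeps polling it and firing events after the peer has gone away:

import asyncore
import socket

class DemoClient(asyncore.dispatcher):
    def __init__(self, addr):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect(addr)

    def handle_connect(self):
        print("managed to connect successfully.")

    def handle_read(self):
        self.recv(4096)   # recv() detects EOF and triggers handle_close()

    def handle_close(self):
        # Peer went away (e.g. vdsmd restarted). Skipping close() here
        # leaves the dispatcher in asyncore.socket_map, where the reactor
        # keeps polling it and firing events -- the spam seen above.
        self.close()

# Throwaway server that accepts one connection and drops it immediately,
# mimicking a VDSM restart from the client's point of view.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
DemoClient(srv.getsockname())
srv.accept()[0].close()           # server side closes right away
asyncore.loop(timeout=0.1, count=10)
print("dispatchers still registered:", len(asyncore.socket_map))  # 0 when closed properly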

Comment 1 Germano Veit Michel 2021-11-22 02:45:40 UTC
I think something is wrong when closing the socket to the VDSM instance that restarted; in netstat there are old connections in the FIN_WAIT2 state on the ha-agent side.
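
A quick way to spot these leaked sockets from the ha-agent side (a hedged sketch; 54321 is the standard VDSM port, and the ss state filter is equivalent to grepping netstat output):

# Hypothetical helper: count half-closed connections toward VDSM (port 54321).
import subprocess

out = subprocess.run(
    ["ss", "-tn", "state", "fin-wait-2"],
    capture_output=True, text=True, check=True,
)
leaked = [line for line in out.stdout.splitlines() if ":54321" in line]
print("%d connection(s) stuck in FIN_WAIT2 toward VDSM" % len(leaked))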

Comment 3 Yedidyah Bar David 2021-12-16 12:59:19 UTC
Moving back to POST, need to cherry-pick the patch for 4.4.
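
The title of the merged patch ("Properly close the connection to VDSM") suggests the shape of the fix; a minimal illustrative sketch with hypothetical names, not the actual ovirt-hosted-engine-ha code:

# Hypothetical sketch of the fix pattern implied by the patch title;
# names are illustrative, not the real ovirt-hosted-engine-ha code.
_client = None

def get_vdsm_client(connect):
    """Return a connected client, first closing any stale one."""
    global _client
    if _client is not None:
        try:
            # Without an explicit close, the old dispatcher keeps its
            # half-open socket (the FIN_WAIT2 entries from comment 1)
            # and the reactor keeps retrying on it forever.
            _client.close()
        except Exception:
            pass  # best effort: the old socket may already be dead
        _client = None
    _client = connect()
    return _client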

Comment 6 Nikolai Sednev 2022-01-20 08:03:59 UTC
Please fill in the "Fixed In Version" field with the relevant version.

Comment 7 Nikolai Sednev 2022-01-23 09:37:07 UTC
Our current build contains ovirt-hosted-engine-ha-2.4.9-1.el8ev.noarch; I will wait a bit until 2.4.10-1 arrives.

Comment 9 Nikolai Sednev 2022-01-25 16:15:25 UTC
ovirt-hosted-engine-ha-2.4.10-1.el8ev.noarch
ovirt-hosted-engine-setup-2.5.4-2.el8ev.noarch
Red Hat Enterprise Linux release 8.5 (Ootpa)
Linux 4.18.0-348.13.1.el8_5.x86_64 #1 SMP Tue Jan 18 16:53:27 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

# systemctl restart ovirt-hosted-engine-ha
# cat /etc/ovirt-hosted-engine-ha/agent-log.conf
[logger_root]
level=DEBUG
handlers=syslog,logfile
propagate=0

# systemctl restart vdsmd

# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-01-25 18:04:12 IST; 2min 2s ago
  Process: 264350 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 264464 (vdsmd)
    Tasks: 115 (limit: 175716)
   Memory: 102.7M
   CGroup: /system.slice/vdsmd.service
           ├─264464 /usr/bin/python3 /usr/share/vdsm/vdsmd
           ├─264692 /usr/libexec/ioprocess --read-pipe-fd 69 --write-pipe-fd 68 --max-threads 10 --max-queued-request>
           ├─265066 /usr/libexec/ioprocess --read-pipe-fd 104 --write-pipe-fd 103 --max-threads 10 --max-queued-reque>
           ├─265139 /usr/libexec/ioprocess --read-pipe-fd 132 --write-pipe-fd 131 --max-threads 10 --max-queued-reque>
           ├─265155 /usr/libexec/ioprocess --read-pipe-fd 139 --write-pipe-fd 138 --max-threads 10 --max-queued-reque>
           ├─265163 /usr/libexec/ioprocess --read-pipe-fd 145 --write-pipe-fd 144 --max-threads 10 --max-queued-reque>
           ├─265170 /usr/libexec/ioprocess --read-pipe-fd 151 --write-pipe-fd 150 --max-threads 10 --max-queued-reque>
           ├─265177 /usr/libexec/ioprocess --read-pipe-fd 157 --write-pipe-fd 156 --max-threads 10 --max-queued-reque>
           └─265184 /usr/libexec/ioprocess --read-pipe-fd 163 --write-pipe-fd 162 --max-threads 10 --max-queued-reque>

Jan 25 18:04:35 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN unhandled close event
Jan 25 18:04:39 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: ERROR ssl handshake: socket error, address: ::1
Jan 25 18:04:41 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:04:41 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:04:42 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:04:42 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:05:48 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:05:48 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:05:48 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>
Jan 25 18:05:49 caracal01.lab.eng.tlv2.redhat.com vdsm[264464]: WARN Nic ovirtmgmt is not configured for IPv6 monitor>


In the vdsm log I no longer see the initial loop.
The vdsm log is attached.
Moving to verified.
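
A small regression-check sketch (log path and WARN patterns taken from this report), for confirming the loop stays gone after a vdsmd restart:

# Count the reconnect loop's WARN signatures in the VDSM log.
patterns = ("unhandled write event", "unhandled close event")
with open("/var/log/vdsm/vdsm.log") as log:
    hits = sum(1 for line in log if any(p in line for p in patterns))
print("%d loop signature(s) found" % hits)
# A few entries right at restart are expected; a fresh one every ~10
# seconds means the reconnect loop is back.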

Comment 15 errata-xmlrpc 2022-02-08 10:08:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV RHEL Host (ovirt-host) [ovirt-4.4.10]), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0462

Comment 16 meital avital 2022-08-07 10:36:06 UTC
Due to QE capacity, we are not going to cover this issue in our automation.

