Bug 1376843

Summary: [RFE] Implement a keep-alive with reconnect if needed logic for the python jsonrpc client
Product: [oVirt] vdsm
Reporter: Simone Tiraboschi <stirabos>
Component: Bindings-API
Assignee: Irit Goihman <igoihman>
Status: CLOSED CURRENTRELEASE
QA Contact: Jiri Belka <jbelka>
Severity: high
Docs Contact:
Priority: high
Version: 4.18.13
CC: bugs, igoihman, lsvaty, mperina, oourfali, pkliczew, sbonazzo, ybronhei, ylavi
Target Milestone: ovirt-4.2.0
Keywords: FutureFeature, RFE
Target Release: 4.20.4
Flags: rule-engine: ovirt-4.2+
rule-engine: exception+
jbelka: testing_plan_complete-
ylavi: planning_ack+
mperina: devel_ack+
lsvaty: testing_ack+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text: The Python JSON-RPC client will reconnect if the connection is interrupted.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-20 10:47:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1349829, 1417708, 1417709, 1471617

Description Simone Tiraboschi 2016-09-16 14:47:45 UTC
Description of problem:
Implement keep-alive logic, with reconnect when needed, for the Python JSON-RPC client. The Java JSON-RPC client already has this feature; the Python one does not.
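
For illustration only, the requested behavior can be sketched as a small wrapper. Everything in it is an assumption rather than vdsm or yajsonrpc code: `ReconnectingClient`, the `connect` factory, and the `ping`/`call` methods on the connection object are hypothetical names, and the sketch targets Python 3 so that it can use the built-in `ConnectionError`.

```python
import time


class ReconnectingClient(object):
    """Hypothetical wrapper; not part of vdsm or yajsonrpc."""

    def __init__(self, connect, max_attempts=3, delay=0.0):
        self._connect = connect            # callable returning a connection object
        self._max_attempts = max_attempts  # reconnect attempts before giving up
        self._delay = delay                # seconds to wait after a failed attempt
        self._conn = connect()

    def _reconnect(self):
        # Try to build a fresh connection, giving up after max_attempts failures.
        for _ in range(self._max_attempts):
            try:
                self._conn = self._connect()
                return
            except ConnectionError:
                time.sleep(self._delay)
        raise ConnectionError(
            "reconnect failed after %d attempts" % self._max_attempts)

    def keep_alive(self):
        # Heartbeat: probe the connection and reconnect if the probe fails.
        try:
            self._conn.ping()
        except ConnectionError:
            self._reconnect()

    def call(self, method, *args):
        # Run a request; on a connection error, reconnect and retry once.
        try:
            return self._conn.call(method, *args)
        except ConnectionError:
            self._reconnect()
            return self._conn.call(method, *args)
```

In this sketch `keep_alive()` would be invoked periodically (for example from a timer thread), while `call()` covers the case where the connection drops between two heartbeats.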

Comment 5 Martin Sivák 2017-06-13 15:02:38 UTC
Can you please also make sure that topic subscriptions are maintained (re-subscribed) after the reconnect event?

Comment 6 Yaniv Kaul 2017-06-14 20:44:20 UTC
(In reply to Martin Sivák from comment #5)
> Can you please also make sure that topic subscriptions are maintained
> (re-subscribed) after the reconnect event?

Setting NEEDINFO on Piotr.

Comment 7 Piotr Kliczewski 2017-06-16 06:59:39 UTC
It is how the engine client works and we intend to keep this behavior.
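
Illustrative sketch (not taken from the vdsm or engine source): one common way to keep topic subscriptions across a reconnect is to record each subscription on the client side and replay the records on the new connection. `SubscriptionRegistry` and the `subscribe`/`unsubscribe` methods assumed on the connection object are hypothetical names.

```python
class SubscriptionRegistry(object):
    """Hypothetical helper that remembers topic subscriptions."""

    def __init__(self):
        self._topics = {}  # topic name -> callback

    def subscribe(self, conn, topic, callback):
        # Record the subscription before forwarding it to the connection.
        self._topics[topic] = callback
        conn.subscribe(topic, callback)

    def unsubscribe(self, conn, topic):
        # Forget the topic so it is not replayed after a reconnect.
        self._topics.pop(topic, None)
        conn.unsubscribe(topic)

    def resubscribe_all(self, conn):
        # Replay every recorded subscription on a freshly opened connection.
        for topic, callback in list(self._topics.items()):
            conn.subscribe(topic, callback)
        return len(self._topics)
```

A reconnect routine would call `resubscribe_all(new_conn)` right after the new connection is established, so that callers do not need to subscribe again themselves.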

Comment 8 Sandro Bonazzola 2017-08-31 11:51:05 UTC
Will this make 4.1.6? We have BZ #1457471 and #1471617 depending on this

Comment 9 Martin Perina 2017-08-31 12:10:41 UTC
(In reply to Sandro Bonazzola from comment #8)
> Will this make 4.1.6? We have BZ #1457471 and #1471617 depending on this

We are trying our best to get this in, but due to difficult negotiation about those patches we will most probably be able to deliver them early next week.

Comment 10 Martin Perina 2017-09-22 13:33:14 UTC
This fix requires a lot of deep changes inside the VDSM client infrastructure, and it's still not ready even on master. So for now we are retargeting to 4.2; once it is finished on master, we will decide which 4.1.z release it will be backported to.

Comment 11 Jiri Belka 2017-11-03 15:37:15 UTC
Is this testable from a user-experience perspective?

Comment 12 Irit Goihman 2017-11-05 14:24:16 UTC
I tested it using these steps:

1. Change the vdsm log level to DEBUG.
2. Run this Python script:

import logging
import sys
from vdsm import client

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
cli = client.connect('localhost')
cli.Host.getCapabilities()
cli.close()

3. Restart the vdsmd service after the client connects and while a command is running.

You should see log messages showing the client timing out and attempting to reconnect. Once reconnecting succeeds, you should be able to run vdsm-client commands again.
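
One way to watch the reconnect from a script is to keep issuing the same command while vdsmd restarts and record which attempts succeed. The helper below is a minimal sketch under that assumption; `poll` and the logger name are hypothetical, and the broad `except` is deliberate because the exact exception type raised by the client during the outage is not confirmed here.

```python
import logging
import time

log = logging.getLogger("reconnect-test")


def poll(call, attempts=5, delay=0.0):
    """Invoke call() repeatedly and record whether each attempt succeeded."""
    outcomes = []
    for i in range(attempts):
        try:
            call()
        except Exception as e:  # exception type during an outage is an assumption
            log.debug("attempt %d failed: %s", i, e)
            outcomes.append(False)
        else:
            log.debug("attempt %d succeeded", i)
            outcomes.append(True)
        time.sleep(delay)  # give the service time to come back
    return outcomes
```

With the script from step 2, `call` could be `cli.Host.getCapabilities`, with `delay` set to a few seconds so that the attempts span the vdsmd restart.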

Comment 13 Jiri Belka 2017-11-13 13:28:07 UTC
ok, vdsm-4.20.3-178.gitee07ec4.el7.centos.x86_64

...
Traceback (most recent call last):
  File "/tmp/test.py", line 8, in <module>
    cli = client.connect('localhost')
  File "/usr/lib/python2.7/site-packages/vdsm/client.py", line 124, in connect
    raise ConnectionError(host, port, use_tls, timeout, e)
vdsm.client.ConnectionError: Connection to localhost:54321 with use_tls=True, timeout=60 failed: [Errno 111] Connection refused
DEBUG:root:START thread <Thread(Client localhost:54321, started daemon 140435456718592)> (func=<bound method Reactor.process_requests of <yajsonrpc.betterAsyncore.Reactor object at 0x7fb9babc6a50>>, args=(), kwargs={})
DEBUG:yajsonrpc.protocols.stomp.AsyncClient:Stomp connection established
DEBUG:jsonrpc.AsyncoreClient:Sending response
{u'systemProductName': u'RHEV Hypervisor', u'systemSerialNumber': u'4C4C4544-004D-4B10-8046-B4C04F313332', u'systemFamily': u'Red Hat Enterprise Linux', u'systemVersion': u'7.4-18.el7', u'systemUUID': u'C3EBA6D8-17EA-4942-80B7-CE66111D0B85', u'systemManufacturer': u'Red Hat'}

Comment 14 Sandro Bonazzola 2017-12-20 10:47:35 UTC
This bug is included in the oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in the oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.