Bug 1376843 - [RFE] Implement a keep-alive with reconnect if needed logic for the python jsonrpc client
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Bindings-API
Version: 4.18.13
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: 4.20.4
Assignee: Irit Goihman
QA Contact: Jiri Belka
URL:
Whiteboard:
Depends On:
Blocks: 1349829 1417708 1417709 1471617
 
Reported: 2016-09-16 14:47 UTC by Simone Tiraboschi
Modified: 2017-12-24 08:38 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
The Python JSON-RPC client will reconnect if the connection is interrupted.
Clone Of:
Environment:
Last Closed: 2017-12-20 10:47:35 UTC
oVirt Team: Infra
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: exception+
jbelka: testing_plan_complete-
ylavi: planning_ack+
mperina: devel_ack+
lsvaty: testing_ack+



Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 78218 0 'None' 'MERGED' 'stomp: handle incoming heartbeats' 2019-12-02 06:12:46 UTC
oVirt gerrit 78284 0 'None' 'MERGED' 'jsonrpc: fix typo in heartbeat header' 2019-12-02 06:12:46 UTC
oVirt gerrit 78293 0 'None' 'MERGED' 'stomp: stop using hard coded values for heartbeats' 2019-12-02 06:12:46 UTC
oVirt gerrit 78535 0 'None' 'MERGED' 'stomp: introduce restore_subscriptions in client' 2019-12-02 06:12:45 UTC
oVirt gerrit 78614 0 'None' 'MERGED' 'stomp: handle timeout in server side' 2019-12-02 06:12:45 UTC
oVirt gerrit 78713 0 'None' 'MERGED' 'stomp: add integration tests for client reconnect' 2019-12-02 06:12:45 UTC
oVirt gerrit 78914 0 'None' 'MERGED' 'jsonrpc: fix typo in heartbeat header' 2019-12-02 06:12:45 UTC
oVirt gerrit 78915 0 'None' 'MERGED' 'tests: rename mock classes in stomp tests' 2019-12-02 06:12:45 UTC
oVirt gerrit 78916 0 'None' 'MERGED' 'stomp: handle incoming heartbeats' 2019-12-02 06:12:45 UTC
oVirt gerrit 78917 0 'None' 'MERGED' 'stomp: handle timeout in server side' 2019-12-02 06:12:45 UTC
oVirt gerrit 78918 0 'None' 'MERGED' 'stomp: stop using hard coded values for heartbeats' 2019-12-02 06:12:44 UTC
oVirt gerrit 78919 0 'None' 'MERGED' 'stomp: introduce restore_subscriptions in client' 2019-12-02 06:12:44 UTC
oVirt gerrit 79384 0 'None' 'MERGED' 'stomp: fix AsyncDispatcher next_check_interval' 2019-12-02 06:12:44 UTC
oVirt gerrit 80911 0 'None' 'MERGED' 'stomp: fix AsyncDispatcher next_check_interval' 2019-12-02 06:12:44 UTC
oVirt gerrit 82564 0 'None' 'MERGED' 'betterAsyncore: add modified create_socket' 2019-12-02 06:12:44 UTC
oVirt gerrit 82565 0 'None' 'MERGED' 'AsyncDispatcher: next_check_val reflects reconnect timeout' 2019-12-02 06:12:44 UTC
oVirt gerrit 82566 0 'None' 'MERGED' 'AsyncDispatcher: add handle_error' 2019-12-02 06:12:44 UTC
oVirt gerrit 82567 0 'None' 'MERGED' 'stomp: implement client reconnect' 2019-12-02 06:12:44 UTC
oVirt gerrit 82841 0 'None' 'MERGED' 'asyncClient: save all requests before reconnecting' 2019-12-02 06:12:43 UTC
oVirt gerrit 83155 0 'None' 'MERGED' 'stomp: calls are now blocking' 2019-12-02 06:12:43 UTC

Description Simone Tiraboschi 2016-09-16 14:47:45 UTC
Description of problem:
Implement keep-alive logic, with reconnect when needed, for the Python jsonrpc client. The Java jsonrpc client already has this feature, but the Python one does not.
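
As a rough illustration of the requested behavior, here is a minimal sketch of keep-alive logic that triggers a reconnect when the peer goes silent. It is not the vdsm or yajsonrpc implementation: the class name KeepAliveMonitor, its parameters, and the reconnect callable are all hypothetical, and it assumes Python 3 for time.monotonic.

```python
import time


class KeepAliveMonitor(object):
    """Tracks incoming traffic and decides when a reconnect is needed.

    Hypothetical helper for illustration only; not part of vdsm.
    """

    def __init__(self, reconnect, heartbeat_interval=5.0, grace=2.0,
                 clock=time.monotonic):
        self._reconnect = reconnect  # callable that re-opens the connection
        # Allow `grace` missed heartbeat intervals before giving up.
        self._timeout = heartbeat_interval * grace
        self._clock = clock
        self._last_seen = clock()
        self.reconnects = 0

    def on_incoming(self):
        """Call whenever any frame (data or heartbeat) arrives."""
        self._last_seen = self._clock()

    def check(self):
        """Reconnect if the peer has been silent for too long.

        Returns True if a reconnect was attempted, False otherwise.
        """
        if self._clock() - self._last_seen < self._timeout:
            return False
        self._reconnect()
        self.reconnects += 1
        # Start a fresh silence window after reconnecting.
        self._last_seen = self._clock()
        return True
```

In this sketch an event loop would call check() periodically and on_incoming() from its read handler. Injecting the clock keeps the timing logic testable without sleeping.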

Comment 5 Martin Sivák 2017-06-13 15:02:38 UTC
Can you please also make sure that topic subscriptions are maintained (re-subscribed) after the reconnect event?

Comment 6 Yaniv Kaul 2017-06-14 20:44:20 UTC
(In reply to Martin Sivák from comment #5)
> Can you please also make sure that topic subscriptions are maintained
> (re-subscribed) after the reconnect event?

Setting NEEDINFO on Piotr.

Comment 7 Piotr Kliczewski 2017-06-16 06:59:39 UTC
It is how the engine client works and we intend to keep this behavior.
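
For readers unfamiliar with the re-subscription behavior discussed here, the idea can be sketched as follows. This is an assumption-laden illustration, not the vdsm code: ResubscribingClient and FakeTransport are hypothetical names, and the transport is assumed to expose only connect() and subscribe(destination).

```python
class ResubscribingClient(object):
    """Remembers subscriptions so they can be replayed after a reconnect.

    Hypothetical sketch; `transport` is any object with connect() and
    subscribe(destination) methods, not a real vdsm class.
    """

    def __init__(self, transport):
        self._transport = transport
        self._subscriptions = {}  # destination -> callback

    def subscribe(self, destination, callback):
        # Record the subscription locally before sending it out.
        self._subscriptions[destination] = callback
        self._transport.subscribe(destination)

    def reconnect(self):
        # A new connection starts with no subscriptions on the peer,
        # so replay every subscription we still know about.
        self._transport.connect()
        for destination in sorted(self._subscriptions):
            self._transport.subscribe(destination)

    def dispatch(self, destination, message):
        callback = self._subscriptions.get(destination)
        if callback is not None:
            callback(message)


class FakeTransport(object):
    """In-memory stand-in that forgets subscriptions on every connect."""

    def __init__(self):
        self.active = set()

    def connect(self):
        self.active = set()

    def subscribe(self, destination):
        self.active.add(destination)
```

Keeping the subscription table on the client side means callers do not have to notice the reconnect or subscribe again themselves.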

Comment 8 Sandro Bonazzola 2017-08-31 11:51:05 UTC
Will this make 4.1.6? We have BZ #1457471 and #1471617 depending on this

Comment 9 Martin Perina 2017-08-31 12:10:41 UTC
(In reply to Sandro Bonazzola from comment #8)
> Will this make 4.1.6? We have BZ #1457471 and #1471617 depending on this

We are doing our best to get this in, but because of difficult negotiations over those patches we will most probably only be able to deliver them early next week.

Comment 10 Martin Perina 2017-09-22 13:33:14 UTC
This fix requires a lot of deep changes inside the VDSM client infrastructure, and it's still not ready even on master. So for now we are retargeting to 4.2; once it is finished on master we will decide which 4.1.z release it will be backported to.

Comment 11 Jiri Belka 2017-11-03 15:37:15 UTC
Is this testable from a user-experience perspective?

Comment 12 Irit Goihman 2017-11-05 14:24:16 UTC
I tested it using these steps:

1. Change the vdsm log level to DEBUG.
2. Run this Python script:

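import logging
import sys

from vdsm import client

# Print the client's debug logs (timeouts, reconnect attempts) to stdout.
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)

# Connect to the local vdsm and issue a command.
cli = client.connect('localhost')
cli.Host.getCapabilities()
cli.close()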

3. Restart the vdsmd service after the client connects and while it is running a command.


You should see log messages showing the client timing out and attempting to reconnect, and you should be able to run vdsm-client commands once reconnecting succeeds.
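
To keep issuing a command across the restart instead of running it once, the test could wrap the call in a small retry loop. The helper below is a hypothetical sketch (call_with_retries is not a vdsm function) and catches every Exception for simplicity, which a real test might want to narrow.

```python
import time


def call_with_retries(func, attempts=5, delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    Returns the result of the first successful call; re-raises the last
    exception if every attempt fails. Hypothetical helper for illustration.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as e:
            last_error = e
            # Wait before the next try, but not after the final one.
            if attempt < attempts - 1:
                sleep(delay)
    raise last_error
```

With the cli object from step 2, this might be used as call_with_retries(cli.Host.getCapabilities) while vdsmd is being restarted.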

Comment 13 Jiri Belka 2017-11-13 13:28:07 UTC
ok, vdsm-4.20.3-178.gitee07ec4.el7.centos.x86_64

...
Traceback (most recent call last):
  File "/tmp/test.py", line 8, in <module>
    cli = client.connect('localhost')
  File "/usr/lib/python2.7/site-packages/vdsm/client.py", line 124, in connect
    raise ConnectionError(host, port, use_tls, timeout, e)
vdsm.client.ConnectionError: Connection to localhost:54321 with use_tls=True, timeout=60 failed: [Errno 111] Connection refused
DEBUG:root:START thread <Thread(Client localhost:54321, started daemon 140435456718592)> (func=<bound method Reactor.process_requests of <yajsonrpc.betterAsyncore.Reactor object at 0x7fb9babc6a50>>, args=(), kwargs={})
DEBUG:yajsonrpc.protocols.stomp.AsyncClient:Stomp connection established
DEBUG:jsonrpc.AsyncoreClient:Sending response
{u'systemProductName': u'RHEV Hypervisor', u'systemSerialNumber': u'4C4C4544-004D-4B10-8046-B4C04F313332', u'systemFamily': u'Red Hat Enterprise Linux', u'systemVersion': u'7.4-18.el7', u'systemUUID': u'C3EBA6D8-17EA-4942-80B7-CE66111D0B85', u'systemManufacturer': u'Red Hat'}

Comment 14 Sandro Bonazzola 2017-12-20 10:47:35 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

