Bug 1112120 - [RFE] JSON RPC broker should pass correlation id to VDSM
Summary: [RFE] JSON RPC broker should pass correlation id to VDSM
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Bindings-API
Version: ---
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ovirt-4.1.3
: 4.19.16
Assignee: Piotr Kliczewski
QA Contact: Pavel Novotny
URL:
Whiteboard:
Depends On:
Blocks: 1425219 1425910
TreeView+ depends on / blocked
 
Reported: 2014-06-23 07:49 UTC by Yair Zaslavsky
Modified: 2019-04-28 13:53 UTC (History)
19 users (show)

Fixed In Version:
Clone Of:
: 1425910 (view as bug list)
Environment:
Last Closed: 2017-07-06 13:16:59 UTC
oVirt Team: Infra
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: ovirt-4.2+
lsvaty: testing_plan_complete-
mgoldboi: planning_ack+
mperina: devel_ack+
lsvaty: testing_ack+


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 75333 0 master MERGED rpc: Version bump 2017-04-12 13:58:40 UTC
oVirt gerrit 75334 0 master MERGED send correlation id to the server 2017-04-11 14:40:55 UTC
oVirt gerrit 75335 0 master MERGED api: adding context for api calls 2017-05-09 23:19:25 UTC
oVirt gerrit 75458 0 master MERGED vdsbroker: correlation id for quartz based jobs 2017-04-13 22:48:14 UTC
oVirt gerrit 76748 0 master MERGED api: Move Context class to common.api 2017-05-12 11:02:10 UTC
oVirt gerrit 76749 0 master MERGED tests: Remove unneeded context initialization 2017-05-12 11:02:07 UTC
oVirt gerrit 76750 0 master MERGED tests: Add tests for logged methods 2017-05-15 08:46:43 UTC
oVirt gerrit 76751 0 master MERGED tests: Add failing test for @logged decorator 2017-05-15 08:46:39 UTC
oVirt gerrit 76752 0 master MERGED api: Handle calls without context 2017-05-15 08:46:36 UTC
oVirt gerrit 76753 0 master MERGED hsm: Unify storage and virt api logs 2017-05-19 20:24:02 UTC
oVirt gerrit 76770 0 master MERGED storage.dispatcher: Unify verbs errors 2017-05-19 20:23:59 UTC
oVirt gerrit 77171 0 ovirt-4.1 MERGED common: move thread local 2017-05-25 07:57:10 UTC
oVirt gerrit 77172 0 ovirt-4.1 MERGED api: adding context for api calls 2017-05-25 07:57:38 UTC
oVirt gerrit 77173 0 ovirt-4.1 MERGED api: log call context for api calls 2017-05-25 07:57:33 UTC
oVirt gerrit 77174 0 ovirt-4.1 MERGED api: Move Context class to common.api 2017-05-25 07:58:04 UTC
oVirt gerrit 77175 0 ovirt-4.1 MERGED tests: Remove unneeded context initialization 2017-05-25 07:57:54 UTC
oVirt gerrit 77176 0 ovirt-4.1 MERGED tests: Add tests for logged methods 2017-05-25 07:57:49 UTC
oVirt gerrit 77177 0 ovirt-4.1 MERGED tests: Add failing test for @logged decorator 2017-05-25 07:57:59 UTC
oVirt gerrit 77178 0 ovirt-4.1 MERGED api: Handle calls without context 2017-05-25 07:57:44 UTC
oVirt gerrit 77179 0 ovirt-4.1 MERGED hsm: Unify storage and virt api logs 2017-05-25 07:57:25 UTC
oVirt gerrit 77180 0 ovirt-4.1 MERGED storage.dispatcher: Unify verbs errors 2017-05-25 07:57:17 UTC
oVirt gerrit 77186 0 ovirt-4.1 MERGED common: missing context member 2017-05-25 07:58:09 UTC
oVirt gerrit 77187 0 ovirt-4.1 MERGED send correlation id to the server 2017-05-23 14:40:12 UTC
oVirt gerrit 77191 0 ovirt-engine-4.1 MERGED vdsbroker: correlation id for quartz based jobs 2017-05-24 09:03:01 UTC
oVirt gerrit 77203 0 ovirt-engine-4.1 MERGED rpc: Version bump 2017-05-24 11:29:26 UTC

Description Yair Zaslavsky 2014-06-23 07:49:08 UTC
Description of problem:

I was looking at the code to see how JSON RPC broker passes the correlation-id to vdsm - it doesn't. This should be fixed.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Saggi Mizrahi 2014-06-24 16:55:23 UTC
FYI using FlowID and correlation-id interchangeably.

Correlation was a mistake. I always said it's going to come a time where we will not support it anymore and that time has come.

Lets first start with why Correlation ID is completely useless.
The basic idea of passing one process's context to anothers' log is never a good idea.

It might seem nice at first. Being able to track calls that relate to a specific flow.

But VDSM doesn't care about engine flows. VDSM, for a lot of verbs (and many more to come) would actually try and batch jobs, Run tasks in parallel, get responses from cache and more. This means that most of the information is lost when looking at the correlation ID.

The correct way to filter is by Task ID, Domain ID, VM ID, etc... Those are context that VDSM understands and make's sure to log them whenever they are processes.

When this was originally suggested I was opposed to it and requested for a real use case for use of this information.

I was told that "it's good" but never received a concrete example.

Further more, in json-rpc *every* request has an ID and that ID can be whatever you want it to be. You could make IDs be "<FlowID>_<GUID>" you could even make it "<Flow_Name>_<Flow_ID>_<GUID>" so you have something even more descriptive.

In any case, I would still recommend ignoring FlowID when examining issues.

I have yet to see anyone use FlowID effectively, I have only seen people mistakenly use it and filter out important information.


As a side note, when event's are going to become more and more prevalent the stuff that are covered under the Task\Flow context are going to shrink. This is a good thing as Tasks are not entities in the system.

To sum up, unless I get a concrete use case with a problem that can't be easily solved without Correlation ID it's going to remain unsupported. If you don't want to argue you can put it as part of the RequestID but I strongly recommend you don't.

Comment 11 Saggi Mizrahi 2014-07-13 15:53:42 UTC
In general, I think the thing I have to stress is that unlike the XML-RPC interface in json RPC every request has it's own request ID.
This request ID is selected by the engine. This ID is recorder in both sides.

Currently it's just a GUID. That means that by grepping the flowID in the engine log you should be able to see all of the request IDs and grep for them in VDSM.

Also, since the engine chooses that ID it could be whatever is preferred.
One possible choice is having it be "<flowID>.<incremental integer>" so every call from a flow has an ID that is directly related to the flow in the engine.

That is completely unrelated to the storage TaskIDs which are currently selected by VDSM.

To sum up, if just having a GUID for every request logged in the engine log and VDSM log is enough that is already available. If you require the flowID to be mangled as part of the request ID it is possible and would require some coding.

Comment 13 Tomas Dosek 2014-07-16 17:53:33 UTC
Ok Saggi, thanks for making it clear. Yes, currently none of us uses correlation ID in vdsm as it has almost not practical use. Flow ID which would be able to provide direct link in vdsm events makes indeed more sense.

Comment 18 Barak 2015-07-06 12:30:47 UTC
we should aim to have a flowid in the vdsm log,
what is the ID to be used is the stuff there is no agreement on.
Let's redefine the problem we are trying to solve and continue from there.

Comment 19 Oved Ourfali 2015-07-06 12:59:30 UTC
(In reply to Barak from comment #18)
> we should aim to have a flowid in the vdsm log,
> what is the ID to be used is the stuff there is no agreement on.
> Let's redefine the problem we are trying to solve and continue from there.

Moving to 4.0 for consideration.

Comment 20 Red Hat Bugzilla Rules Engine 2015-10-19 10:59:33 UTC
Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.

Comment 21 Oved Ourfali 2016-03-10 11:21:13 UTC
As discussed in the past, this is not used, and we won't add this one at all.
Closing as WONTFIX.

Comment 22 Michal Skrivanek 2016-06-17 11:44:25 UTC
As far as I can tell and as bug 1339045 indicates, comment #18 never happened. There is a clear need of having a correlation between engine and vdsm sides

Comment 23 Michal Skrivanek 2016-06-17 11:44:49 UTC
*** Bug 1339045 has been marked as a duplicate of this bug. ***

Comment 25 Marina Kalinin 2017-02-22 17:52:18 UTC
Hey, reopening this, since this question come up more and more in all the teams. There is definitely a need for this feature to exist.

And no, with all the changes and improvements in 4.1 logging system, there is no implementation for correlation id to uniquely identify a flow top down from engine to vdsm. Or even between  the parent and children commands in the engine itself.

Comment 26 rhev-integ 2017-05-28 14:46:30 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Tag 'v4.19.16' doesn't contain patch 'https://gerrit.ovirt.org/77203']
gitweb: https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/v4.19.16

For more info please contact: infra

Comment 27 Martin Perina 2017-05-29 07:16:07 UTC
This is upstream RFE and it contains fixed for VDSM, vdsm-jsonrpc-java and engine. All patches needed patches are merged, moving to ON_QA manually.

Comment 28 Pavel Novotny 2017-06-15 16:11:21 UTC
Verified downstream in bug 1458571 comment 7:
> Verified in 
> rhevm-4.1.3.2-0.1.el7.noarch
> vdsm-4.19.18-1.el7ev.x86_64
> 
> The flow_id in a main VDSM task is now set to the correlation ID returned by
> engine.
> [snip]


Note You need to log in before you can comment on or make changes to this bug.