Red Hat Bugzilla – Bug 1460701
[RFE] Add support to search jobs by correlation_id
Last modified: 2017-12-20 06:17:59 EST
Description of problem:
Currently the correlation ID is not exposed via the REST API.
Exposing this field could later provide a way to follow up on a job's status.
IMHO, the Correlation-ID is an implementation detail. In other words, the real requirement here is to be able to poll commands managed by CoCo in some intelligent fashion, without relying on entity polling - if correlation id is the way to do it, that's fine. If you decide to go some other way, that's fine too.
Benny, I don't really understand what your need is. Can you elaborate on how you intend to use the correlation id to follow up on a job's status? It would be nice if you gave examples of the API calls that you would like to use, and how they would be combined.
The problem which I encountered was in the live storage migration scenario, which consists of 3 jobs:
1. Create a snapshot
2. Move the disk
3. Delete the snapshot
I have added a test for this in OST:
It seems that using the SDK to check the number of snapshots, the SD of the disk, and the status of the disk would be enough. However, because there is a memory lock on the disk, the fact that the 3 steps have finished successfully does not indicate that the job has completed, and this currently results in a race condition.
Currently the /jobs/:id endpoint looks like this:
<job href="/ovirt-engine/api/jobs/826193e8-26a3-426c-a42f-1bfb17b77db0" id="826193e8-26a3-426c-a42f-1bfb17b77db0">
<link href="/ovirt-engine/api/jobs/826193e8-26a3-426c-a42f-1bfb17b77db0/clear" rel="clear"/>
<link href="/ovirt-engine/api/jobs/826193e8-26a3-426c-a42f-1bfb17b77db0/end" rel="end"/>
Removing Snapshot Auto-generated for Live Storage Migration of VM vmo
<link href="/ovirt-engine/api/jobs/826193e8-26a3-426c-a42f-1bfb17b77db0/steps" rel="steps"/>
<owner href="/ovirt-engine/api/users/593fd8dd-03c9-0239-01ee-0000000003d0" id="593fd8dd-03c9-0239-01ee-0000000003d0"/>
After speaking with oliel I saw that the Correlation ID is passed in the response after sending a request using disk_service.move(...), for instance. If I could use it to query the jobs service, it would allow me to poll the job's status and correctly wait for its completion. I believe this could be useful for other scenarios as well.
I've also opened an RFE for tackling a similar issue - bug 1199011. It was closed as Moti preferred to use the existing Job polling type. But I'm not sure whether we can use it in the LSM flow (because of the memory lock mentioned in comment #3).
Note that the correlation id is generated by the engine only if it isn't provided by the caller. You can provide your own correlation id explicitly, with any API call:
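As a minimal sketch of what providing your own correlation id could look like at the HTTP level: the helper below appends a `correlation_id` query parameter to a request URL and generates an id when the caller gives none. The parameter name `correlation_id`, the helper name `add_correlation_id`, and the example host are assumptions made for illustration, so check them against the API documentation of your engine version.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_correlation_id(url, correlation_id=None):
    """Return (tagged_url, correlation_id) for an API request URL.

    Assumption: the engine accepts the id as a `correlation_id` query
    parameter. If no id is given, a random one is generated so that the
    caller always knows which id to look for afterwards.
    """
    if correlation_id is None:
        correlation_id = uuid.uuid4().hex
    parts = urlsplit(url)
    # Preserve any query parameters that are already present.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("correlation_id", correlation_id))
    tagged = urlunsplit(parts._replace(query=urlencode(query)))
    return tagged, correlation_id


if __name__ == "__main__":
    # Hypothetical host and disk id, used only for illustration.
    url, cid = add_correlation_id(
        "https://engine.example/ovirt-engine/api/disks/123/move",
        "lsm-test-1",
    )
    print(url, cid)
```

Choosing the id on the client side means the caller does not have to parse it out of the response before it can start looking for related jobs or events.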
If I understand correctly, you want to be able to find the jobs that have a certain correlation id. If we want to do so, then we need more than just adding the attribute to the API job type: we also need to implement search for jobs:
Otherwise you would need to retrieve all the jobs and then look for those that have the relevant correlation id. I think that we don't have search capability for jobs in the backend, so that would also need to be added.
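To make the "retrieve all the jobs and filter" alternative concrete, here is a small sketch of the client-side filtering. It assumes a hypothetical `correlation_id` attribute on the job, which the API job type does not expose at this point in the discussion; the `Job` class and both function names are illustrative and not part of the SDK.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Job:
    """Illustrative stand-in for an API job; not the SDK type."""
    id: str
    description: str
    status: str  # e.g. "started", "finished", "failed"
    correlation_id: Optional[str] = None  # hypothetical attribute


def jobs_for_correlation_id(jobs: Iterable[Job], correlation_id: str) -> List[Job]:
    """Filter an already retrieved list of jobs on the client side."""
    return [job for job in jobs if job.correlation_id == correlation_id]


def all_finished(jobs: Iterable[Job], correlation_id: str) -> bool:
    """True only if at least one job matches and every match has finished.

    An empty match is treated as "not finished", so a flow that stops
    creating jobs is noticed as a timeout instead of a false success.
    """
    matching = jobs_for_correlation_id(jobs, correlation_id)
    return bool(matching) and all(job.status == "finished" for job in matching)


if __name__ == "__main__":
    jobs = [
        Job("1", "Creating snapshot", "finished", "lsm-test-1"),
        Job("2", "Moving disk", "started", "lsm-test-1"),
        Job("3", "Unrelated job", "finished", "other"),
    ]
    print(all_finished(jobs, "lsm-test-1"))
```

Filtering on the client requires fetching every job on each poll, which is the cost that a server-side search would avoid.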
We can implement these two things, the new attribute and the search, but I still wonder how reliable this can be. We don't have any guarantee that a job will be created for a task. The next commit of the engine may change the way that live migration is implemented, to use a mechanism that doesn't use jobs, and then your code to check the status will silently fail. The presence of jobs for a task isn't part of the contract of the API, so you had better avoid relying on them.
Allon, what would be the right way to poll for live storage migration completion with the current API?
(In reply to Juan Hernández from comment #6)
> Allon, what would be the right way to poll for live storage migration
> completion with the current API?
I don't think there is one - hence this BZ.
I'm not sure correlation ID is the way to go, but theoretically, any long running operation should be a job, regardless of whether it's backed by an SPM task or not.
Ideally, I'd like the return value of an action to hold the job id, and be able to poll it until it completes.
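The polling loop described here could look roughly like the sketch below. It is independent of the SDK: `get_status` is a hypothetical callable that would fetch the job by id and return its status as a lowercase string, and the set of terminal statuses is an assumption based on the job states the engine reports.

```python
import time

# Assumed terminal job states; anything else is treated as "still running".
TERMINAL_STATUSES = {"finished", "failed", "aborted"}


def wait_for_job(get_status, timeout=600.0, interval=2.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the job reaches a terminal status.

    get_status is a caller-supplied function (hypothetical here) that
    returns the current job status. Raises TimeoutError if no terminal
    status is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not complete within %.0f seconds" % timeout)
        sleep(interval)


if __name__ == "__main__":
    # Simulated job that finishes on the third poll.
    statuses = iter(["started", "started", "finished"])
    print(wait_for_job(lambda: next(statuses), interval=0.0))
```

Returning the terminal status instead of a boolean lets the caller distinguish a finished job from a failed or aborted one.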
So, we need to add the capability to search based on "correlation_id". That needs to be implemented in the backend and the API won't need to be changed. Benny, please confirm that will solve your issue. Then we can move the bug to the backend search component.
Created attachment 1295730 [details]
Search events by correlation id
AFAIK searching events by correlation_id is already supported (see the attached screenshot).
I think that we should close this RFE, Martin?
This RFE was added actually long ago by  , the correlation id also is supposed to be exposed to the API as well
Based on the above, targeting to 4.2 and moving to MODIFIED
Eli/Martin, someone is missing something here (possibly me, of course). Those patches are about searching **audit logs**.
The requirement here is to search **jobs** as a mechanism to monitor long running operations (e.g. live storage migration).
Could you either explain how I can use the current REST API/SDK to do so, or move this RFE back to NEW/ASSIGNED?
After offline discussion fixing the title and moving back to NEW
(In reply to Martin Perina from comment #15)
> After offline discussion fixing the title and moving back to NEW
This will require adding Jobs as a searchable entity
Verified on ovirt-engine-4.2.0-0.0.master.20171113223918.git25568c3.el7.centos.noarch
Events can now be searched for by correlation id through the REST API; there is no such search in the UI events view, though.
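For illustration, an events search request of this kind could be built as in the sketch below. The search expression `correlation_id=<id>`, the `events` collection path relative to the API root, and the example host are assumptions that were not confirmed in this bug, so verify them against the engine you are testing.

```python
from urllib.parse import urlencode


def events_search_url(api_root, correlation_id):
    """Build a URL that searches events by correlation id.

    Assumptions: the events collection lives at <api_root>/events and
    accepts a `search` parameter with the expression correlation_id=<id>.
    """
    query = urlencode({"search": "correlation_id=%s" % correlation_id})
    return "%s/events?%s" % (api_root.rstrip("/"), query)


if __name__ == "__main__":
    # Hypothetical host, used only for illustration.
    print(events_search_url("https://engine.example/ovirt-engine/api", "lsm-test-1"))
```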
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.
Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.
If the solution does not work for you, please open a new bug report.