Created attachment 616919 [details] logs Description of problem: When export template is finished, there should be created event about this with same correlation id as start has. Response body for action request is: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <action> <async>false</async> <grace_period> <expiry>10</expiry> </grace_period> <storage_domain> <name>ExportDomainRest</name> </storage_domain> <exclusive>false</exclusive> <status> <state>complete</state> </status> </action> <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <events> <event id="1846" href="/api/events/1846"> <description>Starting to export Template rest_templ1 to ExportDomainRest</description> <code>1164</code> <severity>normal</severity> <time>2012-09-25T07:13:19.060+02:00</time> <correlation_id>519</correlation_id> <user id="9b9002d1-ec33-4083-8a7b-31f6b8931648" href="/api/users/9b9002d1-ec33-4083-8a7b-31f6b8931648"/> <storage_domain id="f3e8db7f-69f9-4fcc-8564-b21b10390c91" href="/api/storagedomains/f3e8db7f-69f9-4fcc-8564-b21b10390c91"/> <template id="5f593530-db21-4f9f-bc0c-45b6b29a359b" href="/api/templates/5f593530-db21-4f9f-bc0c-45b6b29a359b"/> <data_center id="510bb763-77c3-47ac-8a0e-b42d145e8c3c" href="/api/datacenters/510bb763-77c3-47ac-8a0e-b42d145e8c3c"/> </event> </events> In above events collection should be two events - started task, finished task Version-Release number of selected component (if applicable): master branch revision 9bc69502cb44370a856afc359b699614144de265 How reproducible: 20% Steps to Reproduce: 1. Export template with passed correlation id 2. After export task is completed, check events with given correlation id Additional info: The export was started at about 2012-09-25 07:13:18,290 Logs are attached
For comparison, this is how <events> collection should look: <events> <event id="1882" href="/api/events/1882"> <description>Template rest_templ1 was exported successfully to ExportDomainRest</description> <code>1156</code> <severity>normal</severity> <time>2012-09-25T01:05:57.420+02:00</time> <correlation_id>519</correlation_id> <user id="9b9002d1-ec33-4083-8a7b-31f6b8931648" href="/api/users/9b9002d1-ec33-4083-8a7b-31f6b8931648"/> <storage_domain id="1be2d15f-6f4d-416f-95d7-965cb00d1a0e" href="/api/storagedomains/1be2d15f-6f4d-416f-95d7-965cb00d1a0e"/> <template id="8939f177-2aee-4ccd-8d60-3e7317b827b6" href="/api/templates/8939f177-2aee-4ccd-8d60-3e7317b827b6"/> <data_center id="422a4b9d-f3c0-485f-809c-e5c5a68a89ef" href="/api/datacenters/422a4b9d-f3c0-485f-809c-e5c5a68a89ef"/> </event> <event id="1880" href="/api/events/1880"> <description>Starting to export Template rest_templ1 to ExportDomainRest</description> <code>1164</code> <severity>normal</severity> <time>2012-09-25T01:05:55.372+02:00</time> <correlation_id>519</correlation_id> <user id="9b9002d1-ec33-4083-8a7b-31f6b8931648" href="/api/users/9b9002d1-ec33-4083-8a7b-31f6b8931648"/> <storage_domain id="1be2d15f-6f4d-416f-95d7-965cb00d1a0e" href="/api/storagedomains/1be2d15f-6f4d-416f-95d7-965cb00d1a0e"/> <template id="8939f177-2aee-4ccd-8d60-3e7317b827b6" href="/api/templates/8939f177-2aee-4ccd-8d60-3e7317b827b6"/> <data_center id="422a4b9d-f3c0-485f-809c-e5c5a68a89ef" href="/api/datacenters/422a4b9d-f3c0-485f-809c-e5c5a68a89ef"/> </event> </events>
Hasn't reproduced in last 20 tests, closing as current release
Created attachment 700041 [details] engine log Seems to happen again, SI27 - SI27.1 Response body for action request is: <action> <async>false</async> <grace_period> <expiry>10</expiry> </grace_period> <storage_domain> <name>ExportDomainRest</name> </storage_domain> <exclusive>false</exclusive> <status> <state>complete</state> </status> </action> Response code is valid: [200, 201] Action status is valid: ['complete'] Response body for QUERY request is: <events> <event href="/api/events/1796" id="1796"> <description>Starting to export Template rest_templ1 to ExportDomainRest</description> <code>1164</code> <severity>normal</severity> <time>2013-02-20T00:24:46.010+02:00</time> <correlation_id>519</correlation_id> <user href="/api/users/fdfc627c-d875-11e0-90f0-83df133b58cc" id="fdfc627c-d875-11e0-90f0-83df133b58cc"/> <storage_domain href="/api/storagedomains/7bba6760-1789-4d6d-a0f9-48ad9e3d1947" id="7bba6760-1789-4d6d-a0f9-48ad9e3d1947"/> <template href="/api/templates/24e8c8a6-5fa1-469a-9aaf-3bfd3a359366" id="24e8c8a6-5fa1-469a-9aaf-3bfd3a359366"/> <data_center href="/api/datacenters/85086c9f-c6ab-4b5c-99de-16eefbf078b1" id="85086c9f-c6ab-4b5c-99de-16eefbf078b1"/> </event> </events>
an example: http://jenkins.qa.lab.tlv.redhat.com:8080/view/Core+Tools/view/3.1/job/3.1-automation_restapi_reg_vms_rhevh/63/testReport/junit/Templates/211-Export%20template/Export_template/
The problem is with the assumption that the end-action event will exist immediately after the control is gained-back from REST, when invoking an action asynchronically. REST is monitoring the VDSM-tasks to track the action's status, and when they're done, REST return a response and the caller gets the control back. The engine adds the end-action event after the VDSM tasks are finished and is not synchronized with the response that is returned by the REST api, thus it can lead to a race - the REST can return a respond that the action is completed, and the query on the events will be made before the end-action event is added. It was planned to change the REST monitoring to be on the job state instead of the VDSM tasks, such change will solve this race because the response will be returned after the end-action event is added.
Arik, did you make a BZ as per comment #7
No, I've checked with the guys working on REST and it seems that the required infrastrcuture for monitoring the job id instead of the VDSM tasks is already there and should work, so I'll try to use it for the export template command and see if it solves this bug. so this bug can remain assigned on me.
We have the same issue for remove VM command . Should we open separate bug per action or you are planning to solve this issue in some general way for all actions? http://jenkins.qa.lab.tlv.redhat.com:8080/view/Core+Tools/view/3.1/job/3.1-automation_restapi_reg_vms_rhevh/66/ http://jenkins.qa.lab.tlv.redhat.com:8080/view/Core+Tools/view/3.1/job/3.1-automation_restapi_reg_vms_rhevh/68/ Since this issue has major impact on our automation tests results I suggest to fix it for 3.2.
Michael - is there a reason not to make the job-polling as the default monitoring for all commands now that we see it's working well for two commands?
(In reply to comment #11) > Michael - is there a reason not to make the job-polling as the default > monitoring for all commands now that we see it's working well for two > commands? yes, the noise it will produce ..., this is obviously not 3.2 stuff.
Verified on SF10.1 This error wasn't since for 3 weeks since SF9 was built
3.2 has been released