Description of problem:
I have automation which schedules an operation via the REST API and, right after that, starts polling its status (this is how I implement waiting for it to finish). Sometimes the server returns 404 instead of the operation history.

Version-Release number of selected component (if applicable):
Version: 4.8.0-SNAPSHOT
Build Number: bf44b14

How reproducible:
Sometimes

Steps to Reproduce:
1. Schedule an operation via REST by setting readyToSubmit to true.
2. Read the operation history URL from the response body.
3. Try retrieving the history.

Actual results:
The request fails with 404.

Expected results:
The server always returns the history object right after the operation was scheduled. The REST API must make sure that the operation history exists when it returns from the scheduling request.

Additional info:
The underlying issue seems to be that the history item is not created by the org.rhq.enterprise.server.operation.OperationManagerBean#scheduleResourceOperation method but by the Quartz-triggered org.rhq.enterprise.server.operation.ResourceOperationJob#execute method. That leaves a time gap between scheduling and job execution, and your fast calls are hitting it. We never had issues in the UI, as humans are just slow enough. We can work around that limitation inside the REST API, but we should probably just create the history item with state "in progress" directly at job creation time. The Quartz job can then update it at will.
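To illustrate the idea of creating the history item at scheduling time, here is a minimal in-memory sketch. It does not use the real RHQ classes: HistorySketch, History, Status, schedule() and jobFinished() are all hypothetical stand-ins, and the map stands in for the database. The point is only the ordering: the item exists as "in progress" before the scheduling call returns, and the job updates it later.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in, not the real OperationManagerBean or history entity.
class HistorySketch {
    enum Status { IN_PROGRESS, SUCCESS, FAILURE }

    static final class History {
        final int id;
        volatile Status status = Status.IN_PROGRESS;

        History(int id) {
            this.id = id;
        }
    }

    // Stands in for the persistent store of history items.
    private final Map<Integer, History> store = new ConcurrentHashMap<>();
    private final AtomicInteger ids = new AtomicInteger();

    // Scheduling stores the history item before returning, so a caller
    // that polls immediately afterwards always finds it.
    int schedule() {
        History history = new History(ids.incrementAndGet());
        store.put(history.id, history);
        // Assumption: the real code would hand the job to Quartz here.
        return history.id;
    }

    // What the triggered job would call once it has a result.
    void jobFinished(int id, Status result) {
        History history = store.get(id);
        if (history == null) {
            throw new IllegalStateException("No history item for id " + id);
        }
        history.status = result;
    }

    // Lookup used by the polling client; empty corresponds to a 404.
    Optional<History> find(int id) {
        return Optional.ofNullable(store.get(id));
    }
}
```

With this ordering, a lookup right after schedule() returns an item in state IN_PROGRESS instead of nothing, which is what the reporter's polling loop expects.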
Created attachment 786576 [details] Proposed patch
Mazz, could you please review that proposed patch?
(In reply to Heiko W. Rupp from comment #5)
> Mazz, could you please review that proposed patch?

The only question I have is the following:

    public void setStartedTime(long startedTime) {
        if (this.startedTime != 0) {
            throw new IllegalArgumentException("Can only start an operation once");
        }
        this.startedTime = startedTime;
    }

Is the exception thrown here on purpose, to abort what the OperationManagerBean was doing? I was wondering whether you should just let the operation manager continue doing what it's doing: don't set the started time and leave it as-is. However, I think you may have done this on purpose because you really don't want the history item touched when this happens, so you really do want to abort (??)

Other than that question, it looks OK to me, especially if you can confirm that all unit tests and itests pass with this change.
Throwing the exception mirrors the setStartedTime() method, which does that as well. I basically need to set the started time once the Quartz job has started working and is calling into OperationManager.updateHistory, which is why I need the method with a parameter.
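For reference, the behavior under discussion can be sketched as follows. StartedTimeSketch is a hypothetical stand-in for the history entity, and the sketch assumes that a startedTime of 0 means "not started yet", as the guard in the quoted patch suggests. A second call throws and leaves the original value untouched, so a caller that does not catch the exception is aborted.

```java
// Hypothetical stand-in for the history entity, reduced to the started-time guard.
class StartedTimeSketch {
    // Assumption: 0 means the operation has not been started yet.
    private long startedTime;

    public void setStartedTime(long startedTime) {
        if (this.startedTime != 0) {
            // Abort instead of silently overwriting the first start time.
            throw new IllegalArgumentException("Can only start an operation once");
        }
        this.startedTime = startedTime;
    }

    public long getStartedTime() {
        return startedTime;
    }
}
```

The alternative raised in the review would be to return silently when the time is already set; that keeps the caller running but hides the fact that something tried to start the operation twice.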
master 1cd277b34e9
Bulk closing now that 4.10 is out. If you think an issue is not resolved, please open a new BZ and link to the existing one.