Bug 1369431 - /distribution/reservesys should wait if Updating status
Summary: /distribution/reservesys should wait if Updating status
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Beaker
Classification: Retired
Component: tests
Version: 23
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: beaker-dev-list
QA Contact: tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-23 12:11 UTC by Marian Ganisin
Modified: 2020-10-21 14:13 UTC
CC: 1 user

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-21 14:12:42 UTC
Embargoed:



Description Marian Ganisin 2016-08-23 12:11:27 UTC
Description of problem:

/distribution/reservesys has a RESERVE_IF_FAIL feature which seems to be badly affected by the 'Updating' status, despite the 'sleep 40' it uses.

Users have recently been reporting that the 'Updating' status is seen for a very long time (e.g. more than 60 s), and also that a reservation can be seen triggered by /distribution/reservesys when RESERVE_IF_FAIL is used, even though all previous jobs Passed.

This really looks like an issue with 'Updating' status (or any other "unexpected" value).

It might be worth considering looping while the status is 'Updating', until something real is read, rather than using a static sleep 40.
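
As an illustration of the suggested approach (not the actual reservesys code), here is a minimal Python sketch of such a wait loop; fetch_result, the timeout, the interval and the set of "known" values are hypothetical placeholders for however the task reads the previous result from the recipe XML:

import time

KNOWN_RESULTS = {'Pass', 'Warn', 'Fail', 'Panic'}  # assumed set of "real" values

def wait_for_known_result(fetch_result, timeout=600, interval=20):
    # Keep re-reading instead of sleeping a fixed 40 seconds once:
    # loop until something "real" is read or the timeout expires.
    # fetch_result is a hypothetical callable that returns whatever
    # status/result string the task currently sees.
    deadline = time.time() + timeout
    while time.time() < deadline:
        value = fetch_result()
        if value in KNOWN_RESULTS:
            return value
        time.sleep(interval)
    raise RuntimeError('no known result after %d seconds' % timeout)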


Actual results:
False alarms are reported (among other things, /distribution/reservesys fails after the reservation period if left untouched),
unwanted reservations of machines

Expected results:
The opposite of the actual results (the method used to achieve this does not matter)

Comment 1 Dan Callaghan 2016-08-23 23:06:45 UTC
I guess you would have been hitting this problem last week when the data migration was slowing down the scheduler, so that update_dirty_jobs was taking several minutes to run, right?

Since Tuesday last week the scheduler has been back to normal, processing status updates in ~20 seconds, so you should only hit this extremely rarely now.

Comment 2 Roman Joost 2016-09-13 00:08:20 UTC
Dear Marian,

thanks for your report. Based on Dan's reply I'm thinking of closing this bug, since it was due to the load of the data migration. I know false alarms can be very frustrating. Would this be acceptable?

Comment 3 Marian Ganisin 2016-09-13 06:44:09 UTC
(In reply to Roman Joost from comment #2)
> Dear Marian,
> 
> thanks for your report. Based on Dan's reply I'm thinking of closing this
> bug, since it was due to the load of the data migration. I know false
> alarms can be very frustrating. Would this be acceptable?

An alternative approach is to implement a kind of loop which waits until a "known" state is available, to avoid faulty behavior under any condition.

Do as you wish.

Comment 4 Dan Callaghan 2016-09-14 02:27:28 UTC
So the problem is that "Updating..." is not a status; it's a hack in the web UI to avoid showing the current status from the database when we know it's wrong because the job is "dirty". ("Dirty" means that a status update is pending in beakerd.)

However, in the recipe XML (which is what /distribution/reservesys looks at to determine whether the previous task passed or not) we don't expose the "dirty" flag on the job, nor the "Updating..." status. Instead the task just appears with the old values status="Running" result="New" until beakerd updates them.

We could probably make it loop until the result is something other than New. In theory an alternative harness can produce tasks with a New result, but I think none intentionally do that.
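
For illustration only (not Beaker code), a rough Python sketch of that check, using a made-up recipe XML fragment that mirrors the status="Running" result="New" case described above; the exact recipe XML layout and the fetching step are assumptions here:

import xml.etree.ElementTree as ET

SAMPLE_RECIPE_XML = """\
<recipe status="Running" result="New">
  <task name="/distribution/install" status="Completed" result="New"/>
  <task name="/distribution/reservesys" status="Running" result="New"/>
</recipe>
"""

def previous_task_result(recipe_xml, current_task='/distribution/reservesys'):
    # Return the result of the task that runs just before current_task,
    # or None while it is still 'New' (i.e. beakerd has not updated it yet).
    previous = None
    for task in ET.fromstring(recipe_xml).findall('task'):
        if task.get('name') == current_task:
            break
        previous = task
    if previous is None or previous.get('result') == 'New':
        return None
    return previous.get('result')

# With the sample above this returns None, so a caller would keep polling.
print(previous_task_result(SAMPLE_RECIPE_XML))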

Comment 5 Roman Joost 2016-09-15 04:47:58 UTC
Dear Marian,

we had another look at this. Dan pointed me to a discussion about the reservesys element, which currently lacks RESERVE_IF_FAIL functionality. We think the better way out of this would be to equip Beaker to handle reservation on failure via <reservesys /> instead of adding more functionality around this task. Until we have a backlog item for this, I'll keep this report open.

Comment 6 Roman Joost 2016-09-19 00:04:47 UTC
Dear Marian,

we'd like to proceed with implementing the RFE from Bug 1100593 (Conditional reservation support for harness independent reservation) in favour of this bug. I've bumped its priority, and I think time spent on that support would benefit everyone more than adding more hacks to /distribution/reservesys. Personally I'd like to close this bug as WONTFIX with a reference to Bug 1100593, but I'm also happy to keep it open and close it once Bug 1100593 is resolved, if you feel it should be kept.

Let me know what you think. Cheers!

