| Summary: | Sync the start of multihost tasks to avoid termination by watchdog | | |
|---|---|---|---|
| Product: | [Retired] Beaker | Reporter: | Karel Srot <ksrot> |
| Component: | scheduler | Assignee: | beaker-dev-list |
| Status: | CLOSED WONTFIX | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | 0.5 | CC: | azelinka, bpeck, ltoscano, mcsontos, ohudlick, stl, tools-bugs |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | MultiHost | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-10-21 14:16:09 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
Karel Srot
2012-03-23 08:30:43 UTC
The second option is quite straightforward (and I'd be willing to add the support to the workflow-tcms). However, resolving this in Beaker would probably be a cleaner solution. Bill, what do you think? Would it be better (and easy) to fix this in Beaker, or rather leave it to the workflow together with helper tasks?

I could update the /distribution/install task to do the following:

    rhts-sync-set READY
    rhts-sync-block -s READY $STANDALONE

This will work fine for both multi-host and single-host jobs.

Nice. Would it be necessary to extend the test time of /distribution/install? I am not sure about the current maximal duration, but extending it to hours might cause unnecessary delays before a broken install is terminated by the watchdog.

I don't think we need to extend the watchdog of install; it already has a long timeout for the install to finish. And remember that the watchdog won't kill the recipeSet until *all* the recipes have expired.

OK, then it seems like the best solution. Thank you.

(In reply to comment #2)
> I could update the /distribution/install task to do the following:
>
> rhts-sync-set READY
> rhts-sync-block -s READY $STANDALONE

What does the variable STANDALONE hold? Hostnames of other recipes? I don't see any description in the Deployment Guide.

...which will break as soon as anyone gets the idea to use the task with a different role.

The RECIPE_MEMBERS variable would be more appropriate here.

(In reply to comment #6)
> (In reply to comment #2)
> > I could update the /distribution/install task to do the following:
> >
> > rhts-sync-set READY
> > rhts-sync-block -s READY $STANDALONE
>
> What does the variable STANDALONE hold? Hostnames of other
> recipes? I don't see any description in the Deployment Guide.

- an environment variable is filled in for every role found in the corresponding tasks (every Nth task in all recipes)
- other variables are filled in for roles in recipes

That's the rhts way.
(In reply to comment #7)
> ...which will break as soon as anyone got an idea to use the task with
> different role.
>
> RECIPE_MEMBERS variable would be more appropriate here.

The recipe role can be changed as well. In fact, I think this would be a worse option, since the recipe role is more likely to be changed than the role for the task "/distribution/install". If you change the role for that particular task, I think we can assume you know what you're doing.

Bulk reassignment of issues as Bill has moved to another team.

I would update the requirement from #c0. After getting more experience with the execution of multihost tasks in Beaker, I believe that the harness should sync the start of EVERY multihost task (not just once after the system provisioning). The reason is that people use various "singlehost" tasks within the job, e.g. for errata package update or /distribution/reservesys. Every "singlehost" task in a recipe brings back the problem of unsynced starts of multihost tasks. Since we don't want to inflate the test time of multihost jobs just to avoid a timeout, I believe that the best solution is to sync the start of every multihost task.

Also bumping the priority to get some attention.
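
For illustration only, the per-task synchronisation requested in the last comment could be sketched as a small shell helper that a multihost task calls before its real work starts. This is not Beaker code: the helper name `sync_start` and the `SYNC_SET`/`SYNC_BLOCK` override variables are hypothetical, and the `-s` flag on `rhts-sync-set` is an assumption made by analogy with `rhts-sync-block -s` (the thread writes `rhts-sync-set READY` without it).

```shell
#!/bin/sh
# Sketch under stated assumptions, not the harness implementation.
# Command names come from the thread; they can be overridden through
# these hypothetical variables, e.g. for a dry run.
SYNC_SET="${SYNC_SET:-rhts-sync-set}"
SYNC_BLOCK="${SYNC_BLOCK:-rhts-sync-block}"

# sync_start STATE [HOST...]
# Announce STATE on this host, then wait until every listed host has
# announced the same state.
sync_start() {
    state="$1"
    shift
    # Announce that this host has reached the state
    # (the -s flag here is an assumption, see the note above).
    "$SYNC_SET" -s "$state" || return 1
    # With no peers (single-host job) there is nothing to wait for.
    [ "$#" -eq 0 ] && return 0
    # Block until all peers have announced the state.
    "$SYNC_BLOCK" -s "$state" "$@"
}
```

A task would then call something like `sync_start READY $STANDALONE`, leaving the variable unquoted so that the host list splits into separate arguments. Whether the role variable or `RECIPE_MEMBERS` is the right host list is exactly the question debated above.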