Description of problem: When scheduling a multihost test, one machine is usually provisioned much earlier than the other. That results in early test execution on the first system: the test stops on the first rhts-sync-block waiting for the second server and is usually killed by the watchdog. From my point of view there are the following solutions: 1. Extend the test time in the Makefile to hours. This is not good when scheduling several multihost tasks, because once a test gets stuck it takes hours until it is (correctly) killed by the watchdog. 2. Create one special task that ensures the sync after the installation and use this task in every multihost job (could be done by the workflow). 3. Something similar to 2, done automatically by Beaker. Version-Release number of selected component (if applicable): Version - 0.8.1 ???
The second option is quite straightforward (and I'd be willing to add the support to workflow-tcms). However, resolving this in Beaker would probably be a cleaner solution. Bill, what do you think? Would this be better (and easy) to fix in Beaker, or should we rather leave it to the workflow together with helper tasks?
I could update the /distribution/install task to do the following:

rhts-sync-set READY
rhts-sync-block -s READY $STANDALONE

This will work fine for both multi-host and single-host jobs.
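A minimal sketch of that proposal, with the rhts-sync-* harness commands stubbed out as shell functions so the control flow can be tried anywhere (in a real task the commands talk to the lab controller, and STANDALONE is filled in by the harness; the stubs and the hostname fallback below are invented for illustration):

```shell
#!/bin/bash
# Stubs standing in for the real RHTS harness commands; remove these
# in an actual task, where rhts-sync-set/rhts-sync-block are on PATH.
rhts-sync-set()   { echo "set:$*"; }
rhts-sync-block() { echo "block:$*"; }

# STANDALONE normally holds the hostnames for the task's role;
# default to the local hostname for a single-host dry run.
STANDALONE=${STANDALONE:-$(hostname)}

# Announce that installation finished on this host, then wait until
# every host listed in $STANDALONE has announced the same state.
rhts-sync-set READY
rhts-sync-block -s READY $STANDALONE
```

In a single-host recipe $STANDALONE names only the local machine, so the block returns immediately, which is why the same snippet works for both job types.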
Nice. Would it be necessary to extend the test time of /distribution/install? I am not sure about the current maximum duration, but extending it to hours might cause unnecessary delays before a broken install is terminated by the watchdog.
I don't think we need to extend the watchdog of install; it already has a long timeout for the install to finish. And remember that the watchdog won't kill the recipeSet until *all* the recipes have expired.
OK, then it seems like the best solution. Thank you.
(In reply to comment #2) > I could update the /distribution/install task to do the following: > > rhts-sync-set READY > rhts-sync-block -s READY $STANDALONE What does the variable STANDALONE hold? Hostnames of other recipes? I don't see any description in the Deployment Guide.
...which will break as soon as anyone gets the idea to use the task with a different role. The RECIPE_MEMBERS variable would be more appropriate here.
(In reply to comment #6) > (In reply to comment #2) > > I could update the /distribution/install task to do the following: > > > > rhts-sync-set READY > > rhts-sync-block -s READY $STANDALONE > > What does the variable STANDALONE hold? Hostnames of other > recipes? I don't see any description in the Deployment Guide. - an environment variable is filled in for every role found in the corresponding tasks (every Nth task across all recipes) - other variables are filled in for the roles defined on recipes That's the RHTS way.
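As an illustration of those role variables (the role names SERVERS/CLIENTS and all hostnames below are invented): the harness exports each role name as an environment variable holding the space-separated hostnames of its members, so a task can word-split the variable to find its peers:

```shell
#!/bin/bash
# Invented placeholder values; in a real job the harness sets these
# from the roles declared in the job XML.
SERVERS=${SERVERS:-server1.example.com}
CLIENTS=${CLIENTS:-"client1.example.com client2.example.com"}

# Iterate over the peers named by a role variable.
for host in $CLIENTS; do
    echo "peer: $host"
done
```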
(In reply to comment #7) > ...which will break as soon as anyone got an idea to use the task with > different role. > > RECIPE_MEMBERS variable would be more appropriate here. The recipe role can be changed as well. In fact I think this would be a worse option, since it's more likely to be changed than someone changing the role for the task "/distribution/install". If you change the role for that particular task, I think we can assume you know what you're doing.
Bulk reassignment of issues as Bill has moved to another team.
I would update the requirement from #c0. After getting more experience with the execution of multihost tasks in Beaker, I believe the harness should sync the start of EVERY multihost task (not just once after system provisioning). The reason is that people use various "single-host" tasks within the job, e.g. for errata package updates or /distribution/reservesys. Every single-host task in a recipe brings back the problem of unsynced starts of the following multihost task. Since we don't want to inflate the test time of multihost jobs just to avoid timeouts, I believe the best solution is to sync execution for every multihost task. Also bumping the priority to get some attention.
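One way to picture the requested behaviour is a per-task barrier built from the same sync primitives (a sketch only: the function name, the START_<taskid> state naming, and the stubbed commands are all invented, not an existing Beaker feature):

```shell
#!/bin/bash
# Stubs for the harness commands so the barrier can be exercised
# outside a lab; a real implementation would use the actual binaries.
rhts-sync-set()   { echo "set:$*"; }
rhts-sync-block() { echo "block:$*"; }

# Barrier the harness could run before each multihost task: every
# member announces a per-task state, then blocks until all members
# given as arguments have reached it, so a preceding single-host
# task on one machine cannot let the others run ahead.
sync_task_start() {
    local taskid=$1; shift
    rhts-sync-set "START_${taskid}"
    rhts-sync-block -s "START_${taskid}" "$@"
}
```

For example, running `sync_task_start 42 server1.example.com client1.example.com` before task 42 on each member would hold all machines at the same point regardless of how long the preceding single-host tasks took.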