Description of problem: Harness does not restart the task after reboot.
Version-Release number of selected component (if applicable): beah-0.6.32-1
How reproducible: Likely low [1][2]
Steps to Reproduce: 1. Run rhts-reboot in a loop
Actual results: Task ends with Fail result "harness/run" with message:
> Harness could not run the task: Task 'XXX' finished, exit code 0. rc=2
Expected results: Task should cycle forever (until killed by EWD)
Additional info:
[1] I ran a 24-hour test which rebooted the machine (virtual, of course!) in a cycle and did not see the issue in 400+ runs.
[2] Even a 1/1000 ratio would be large enough: running 100 tasks in a job would mean about 10% of recipes would be broken, and that is just too high a failure rate. I consider this a high priority.
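The arithmetic behind footnote [2] can be sketched as follows. This assumes each task fails independently with the same probability, which is a simplification and not something measured in this report; the function name is made up for illustration.

```python
def recipe_failure_probability(p_task: float, n_tasks: int) -> float:
    """Probability that at least one of n_tasks independent tasks fails.

    Assumption: failures are independent and equally likely per task.
    """
    return 1.0 - (1.0 - p_task) ** n_tasks


if __name__ == "__main__":
    # 1/1000 failure ratio per task, 100 tasks per recipe (figures from footnote [2]).
    p = recipe_failure_probability(1 / 1000, 100)
    print(f"{p:.1%}")  # prints 9.5%
```

Under these assumptions the result is roughly 9.5%, which matches the "10% of recipes" estimate.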
The fix for Bug 683184 unmasked a problem introduced by the fix for Bug 711270. Because non-standard lock files are used, services are not stopped when changing runlevels: /etc/rc requires a subsystem's lock file to be present, and none is found. The services are instead stopped in random order by S01reboot calling killall5, and when the task gets killed earlier than the server, this event is caught and the task is considered finished. I plan to revert the fix for 711270, as it is just a workaround for a seriously broken system. We shall not be held responsible for that.
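To illustrate the lock-file check described above, here is a rough Python model of how the rc script decides which K* entries to stop. It is a simplified sketch, not the actual /etc/rc code: the function name and the directory arguments are hypothetical, and the assumption is that a service is only stopped when a lock file named after the subsystem (or with a ".init" suffix) exists in the lock directory, normally /var/lock/subsys.

```python
import os
import re
from typing import List

# K links look like K<two digits><subsystem name>, e.g. K01some-service.
K_LINK = re.compile(r"^K(\d\d)(.+)$")


def services_stopped_by_rc(rc_dir: str, lock_dir: str) -> List[str]:
    """Return subsystems whose K script would be run with 'stop'.

    Simplified model: a K entry is skipped unless lock_dir contains
    a lock file called <subsys> or <subsys>.init.
    """
    stopped = []
    for entry in sorted(os.listdir(rc_dir)):
        match = K_LINK.match(entry)
        if not match:
            continue  # not a kill link (e.g. an S* start link)
        subsys = match.group(2)
        candidates = (subsys, subsys + ".init")
        if any(os.path.isfile(os.path.join(lock_dir, c)) for c in candidates):
            stopped.append(subsys)
    return stopped
```

In this model a service that writes its lock file under a different name never appears in the result, so it is left running until killall5 terminates it.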
Pushed to gerrit.
Tested in a VM on RHEL{4,5,6}. Updated package is now on beaker-stage.
All, The following issue still exists: ./harness/run Harness could not run the task: Task u'ccd30709-7ecc-427f-a3ad-7057088be08d' finished, exit code 0. rc=2 See here: https://beaker.engineering.redhat.com/tasks/executed?arch_id=7&task=%2Fkernel%2Fdrivers%2F3rd-party&result_id=4&job_id=134006&whiteboard=2.6.9-102.EL Best, -pbunyan
Looks like a RHEL4-specific issue: services are still not stopped properly on reboot. I will look into it. If you see this on other releases, please let me know.
ping - any news here? Why is this happening on rhel4?
Pushed to gerrit for review.
> chkconfig on RHEL4 works in a slightly different manner than newer releases: 'chkconfig --levels 345 service on' creates only SNNservice links and no KNNservice links.
'chkconfig service reset' before 'chkconfig ... service on' seems to do the job.
Beah already has chkconfig --add in its %post scriptlet; does that not create the necessary K* symlinks? The (current) man page suggests that it should, although maybe on RHEL4 it was different...
> --add name
> This option adds a new service for management by chkconfig. When a new service is added, chkconfig ensures that the service has either a start or a kill entry in every runlevel. If any runlevel is missing such an entry, chkconfig creates the appropriate entry as specified by the default values in the init script.
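One way to check whether a service really has a start or kill entry in every runlevel is a small script like the one below. It is only a sketch: the function name is hypothetical, and it assumes the usual layout of rc0.d through rc6.d directories under /etc/rc.d containing SNNname and KNNname links.

```python
import os
import re
from typing import List


def runlevels_missing_entry(service: str, rc_root: str = "/etc/rc.d") -> List[int]:
    """Return runlevels (0-6) with neither an S nor a K link for service.

    Assumes links are named S<NN><service> or K<NN><service> inside
    <rc_root>/rc<level>.d; a missing directory counts as a missing entry.
    """
    pattern = re.compile(r"^[SK]\d\d" + re.escape(service) + r"$")
    missing = []
    for level in range(7):
        rc_dir = os.path.join(rc_root, f"rc{level}.d")
        try:
            entries = os.listdir(rc_dir)
        except FileNotFoundError:
            entries = []
        if not any(pattern.match(entry) for entry in entries):
            missing.append(level)
    return missing
```

A service with only S links in runlevels 3, 4 and 5 would yield [0, 1, 2, 6], which means nothing stops it cleanly on halt or reboot.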
The cause was an error in the spec file, which used $ instead of %.
Thanks, Dan, for pointing me to the error! After you gave me the hint it was easy. I apologise to the chkconfig component, which is completely innocent.
Hello, Your ticket is ready for testing and is currently running on https://beaker-stage.app.eng.bos.redhat.com Please ensure your request for beaker has been adequately addressed by testing it on the above machine. Testing will be available up until COB on the 5th October. Thank you Beaker development team
Only el4 suffers from the issue. As a workaround, please insert this task into your jobs: /distribution/beaker/beah/misc/chk-services I will test and submit a proper patch next week.
Marian, Thank you for the workaround task. I added task /distribution/beaker/beah/misc/chk-services to the jobs, and we have results for RHEL4u9 KT1 Testing :) https://beaker.engineering.redhat.com/jobs/157186 The issue was still hit on the following recipes, in case you're interested: https://beaker.engineering.redhat.com/recipes/328255 https://beaker.engineering.redhat.com/recipes/328279 https://beaker.engineering.redhat.com/recipes/328280 https://beaker.engineering.redhat.com/recipes/328263 https://beaker.engineering.redhat.com/recipes/328267 (I have cloned the recipes, in hopes of getting past the issue) Thank you for the workaround, Marian :) Best, -pbunyan
I need more time: the patch does the same thing as the workaround, just at install time, and that apparently does not fix the issue. The patch was submitted to gerrit for review anyway and will be pushed with the next or next+1 release.
Marian, In regard to Comment#22... Is there a fix expected for this issue with RHEL4 testing? Best, -pbunyan
The patch mentioned in comment 22 is included in beah-0.6.36. beah-0.6.38 is deployed.