Bug 865351

Summary: Cannot return the machine because of active recipe, which is already completed.
Product: [Retired] Beaker
Reporter: Gabriel Szasz <gszasz>
Component: scheduler
Assignee: Dan Callaghan <dcallagh>
Status: CLOSED INSUFFICIENT_DATA
Severity: medium
Priority: medium
Version: 0.9
CC: asaha, dcallagh, llim, rglasz, rmancy
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: Scheduler
Doc Type: Bug Fix
Last Closed: 2013-05-13 07:05:38 UTC
Type: Bug

Description Gabriel Szasz 2012-10-11 09:39:19 UTC
Description of problem:
I am not able to return the machine to Beaker.
https://beaker.engineering.redhat.com/view/apollo.idm.lab.bos.redhat.com

Version-Release number of selected component (if applicable):
Version - 0.9.4 

How reproducible:
It occurs once in a while and is hard to reproduce.

Steps to Reproduce:
1. Provision the machine
2. Once the machine is installed and the /distribution/reservesys task is running,
log into the machine and run the return2beaker.sh command
  
Actual results:
The job is marked as completed, but the machine is not returned. When I try to return the system manually via the web UI, I receive the following error message:

Failed to return apollo.idm.lab.bos.redhat.com: u'System has active recipe 666596'

Nevertheless, the list of active recipes does not contain recipe 666596. 

I tried deleting the completed job, but I still get the same error message.

Expected results:
The machine should be returned to Beaker normally once the job is marked as completed.

Note:
The same bug was already filed against version 0.6.6. [1]
	
[1] https://bugzilla.redhat.com/show_bug.cgi?id=684788

Comment 1 Dan Callaghan 2012-10-11 21:40:59 UTC
If you see this again can you please report it a little sooner? There is very little I can do to investigate when the job has been deleted.

Comment 2 Gabriel Szasz 2012-10-12 09:17:41 UTC
At the time I filed the bug report, it was still possible to ssh into apollo.idm.lab.bos.redhat.com.

The return2beaker.sh command gave me the following result:

[root@apollo]# return2beaker.sh
Unable to connect to server, sleeping 5 seconds...
Unable to connect to server, sleeping 5 seconds...
Unable to connect to server, sleeping 5 seconds...
...

This suggested that some beah-* services on apollo.idm.lab.bos.redhat.com were not running properly. Restarting the beah-* services solved the issue.

Comment 3 Dan Callaghan 2013-05-13 07:05:38 UTC
We fixed a lot of race conditions and inconsistency problems in Beaker 0.12, which could have led to these kinds of problems (system in use for a completed recipe). Please re-open this bug if you see the problem again in future.