
Bug 1240809

Summary: beaker-transfer gets stuck because recipe.files() XMLRPC call times out for recipes with a large number of results
Product: [Retired] Beaker
Reporter: drohwer
Component: lab controller
Assignee: matt jia <mjia>
Status: CLOSED CURRENTRELEASE
QA Contact: tools-bugs <tools-bugs>
Severity: unspecified
Priority: urgent
Version: 20
CC: alemay, dcallagh, dowang, dpaz, ebaak, fkolacek, jwalters, mboswell, mjia, rjoost
Target Milestone: 22.1
Keywords: Patch
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2016-02-01 04:30:40 UTC
Type: Bug

Comment 6 Dan Callaghan 2015-07-08 00:35:00 UTC
In the case which triggered this bug, we had a single task with 7422 results, each of which had a number of log files. Because recipe.files() currently issues one query per recipe, per task, and per result, this means that the response takes too long to generate (going back to the DB so many times) and therefore it hits the 120 second request timeout in beaker-transfer.
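
To illustrate the query pattern described above, here is a minimal sketch using Python's built-in sqlite3 module. It is not Beaker's code or schema: the `result` and `log` tables, the function names, and the row counts are all assumptions made up for this example. It contrasts issuing one query per result with fetching every log path for a task in a single joined query.

```python
import sqlite3


def build_db(n_results=50, logs_per_result=3):
    """Create an in-memory database with one task, many results, and a few logs per result.

    The schema is hypothetical and only loosely modelled on the bug description.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE result (id INTEGER PRIMARY KEY, task_id INTEGER);
        CREATE TABLE log (id INTEGER PRIMARY KEY, result_id INTEGER, path TEXT);
        """
    )
    for result_id in range(1, n_results + 1):
        conn.execute("INSERT INTO result (id, task_id) VALUES (?, 1)", (result_id,))
        for i in range(logs_per_result):
            conn.execute(
                "INSERT INTO log (result_id, path) VALUES (?, ?)",
                (result_id, "result%d/log%d.txt" % (result_id, i)),
            )
    return conn


def files_per_result(conn, task_id):
    """One query to list results, then one more query per result."""
    queries = 1
    result_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM result WHERE task_id = ? ORDER BY id", (task_id,)
        )
    ]
    paths = []
    for result_id in result_ids:
        queries += 1
        paths.extend(
            row[0]
            for row in conn.execute(
                "SELECT path FROM log WHERE result_id = ? ORDER BY id", (result_id,)
            )
        )
    return paths, queries


def files_batched(conn, task_id):
    """A single joined query that returns the same paths in the same order."""
    rows = conn.execute(
        "SELECT log.path FROM log "
        "JOIN result ON log.result_id = result.id "
        "WHERE result.task_id = ? "
        "ORDER BY result.id, log.id",
        (task_id,),
    )
    return [row[0] for row in rows], 1
```

With 7422 results, the first approach would issue 7423 queries for that one task, while the second issues one. Each extra query adds a database round trip, which is the kind of cost that can push a response past a fixed request timeout.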

Comment 7 Dan Callaghan 2015-12-19 02:30:00 UTC
Note to self, fix this immediately so nobody ever has to spend their Saturday repeating all of this debugging and recovery again!

Comment 8 matt jia 2016-01-08 05:37:58 UTC
(In reply to Dan Callaghan from comment #6)
> In the case which triggered this bug, we had a single task with 7422
> results, each of which had a number of log files. Because recipe.files()
> currently issues one query per recipe, per task, and per result, this means
> that the response takes too long to generate (going back to the DB so many
> times) and therefore it hits the 120 second request timeout in
> beaker-transfer.

With this patch:

   http://gerrit.beaker-project.org/#/c/4575/

The response takes only 7 seconds instead of 33 seconds on a Beaker box with no traffic. I have also simulated a heavily loaded Beaker box, and the call does not time out there either.

Comment 9 matt jia 2016-01-08 06:34:47 UTC
(In reply to Dan Callaghan from comment #4)
> We should have some better debug logs in beaker-transfer, specifically so that we
> can see each recipe as it tries to list and then transfer the files. That
> would have made it quite easy to spot which recipe it was getting stuck on
> in this case.

This patch adds a log message to address this comment:

  http://gerrit.beaker-project.org/#/c/4576/
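
As a rough illustration of the per-recipe logging idea, here is a sketch using Python's standard logging module. It is not the content of the patch above: the logger name, the `transfer_recipes` function, and the `list_files` and `transfer` callables are hypothetical stand-ins for whatever beaker-transfer actually does.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("transfer_sketch")


def transfer_recipes(recipe_ids, list_files, transfer):
    """Log each recipe before listing and before transferring its files.

    list_files(recipe_id) and transfer(recipe_id, files) are caller-supplied
    callables; they stand in for the real XMLRPC call and the real file copy.
    """
    for recipe_id in recipe_ids:
        logger.debug("Listing files for recipe %s", recipe_id)
        files = list_files(recipe_id)
        logger.debug("Transferring %d files for recipe %s", len(files), recipe_id)
        transfer(recipe_id, files)
        logger.debug("Finished recipe %s", recipe_id)
```

If the listing step hangs or fails for one recipe, the last debug message identifies that recipe, which is what makes a stuck transfer easy to trace.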

Comment 10 matt jia 2016-01-14 06:53:05 UTC
> With this patch:
> 
>    http://gerrit.beaker-project.org/#/c/4575/
> 

This fix should go to release-22:

   http://gerrit.beaker-project.org/#/c/4586/

Comment 12 Dan Callaghan 2016-01-19 06:42:46 UTC
(In reply to matt jia from comment #9)

This patch is a more general overhaul of all the log messages in beaker-transfer, to tidy up some messes. This should give us a better idea of exactly what it's doing, plus better messages if rsync fails.

http://gerrit.beaker-project.org/4601

It will also help with verifying the beaker-transfer bugs we are working on.

Comment 15 Dan Callaghan 2016-02-01 04:30:40 UTC
Beaker 22.1 has been released.