Bug 1240809 - beaker-transfer gets stuck because recipe.files() XMLRPC call times out for recipes with large number of results
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: lab controller
Version: 20
Hardware: Unspecified
OS: Unspecified
urgent
unspecified
Target Milestone: 22.1
Assignee: matt jia
QA Contact: tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-07-07 19:57 UTC by drohwer
Modified: 2016-02-01 04:30 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-02-01 04:30:40 UTC
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1293007 0 unspecified CLOSED [RFE] enforce a server-side limit on number of results in a recipe 2021-02-22 00:41:40 UTC

Internal Links: 1293007

Comment 6 Dan Callaghan 2015-07-08 00:35:00 UTC
In the case which triggered this bug, we had a single task with 7422 results, each of which had a number of log files. Because recipe.files() currently issues one query per recipe, per task, and per result, this means that the response takes too long to generate (going back to the DB so many times) and therefore it hits the 120 second request timeout in beaker-transfer.
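To illustrate why one query per result scales so badly, here is a minimal sketch that uses an in-memory SQLite database. The `result` and `log` tables and all function names are hypothetical; this is not Beaker's actual schema, nor the change made in the eventual patch. It only contrasts issuing one query per result with fetching the same paths in a single joined query.

```python
import sqlite3


def build_db(num_results=200, logs_per_result=3):
    """Create a hypothetical schema: one task with many results, each with logs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE result (id INTEGER PRIMARY KEY, task_id INTEGER);
        CREATE TABLE log (id INTEGER PRIMARY KEY, result_id INTEGER, path TEXT);
    """)
    for result_id in range(1, num_results + 1):
        conn.execute("INSERT INTO result (id, task_id) VALUES (?, 1)", (result_id,))
        for n in range(logs_per_result):
            conn.execute(
                "INSERT INTO log (result_id, path) VALUES (?, ?)",
                (result_id, "result%d/log%d.txt" % (result_id, n)),
            )
    return conn


def files_per_result(conn, task_id):
    """One query for the result ids, then one more query for every result."""
    queries = 1
    result_ids = [row[0] for row in conn.execute(
        "SELECT id FROM result WHERE task_id = ? ORDER BY id", (task_id,))]
    paths = []
    for result_id in result_ids:
        queries += 1
        paths.extend(row[0] for row in conn.execute(
            "SELECT path FROM log WHERE result_id = ? ORDER BY id", (result_id,)))
    return paths, queries


def files_joined(conn, task_id):
    """Fetch the same paths with a single joined query."""
    rows = conn.execute(
        "SELECT log.path FROM log JOIN result ON log.result_id = result.id "
        "WHERE result.task_id = ? ORDER BY result.id, log.id", (task_id,))
    return [row[0] for row in rows], 1
```

With 7422 results, the first approach would issue 7423 queries for that one task, while the joined form issues one.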

Comment 7 Dan Callaghan 2015-12-19 02:30:00 UTC
Note to self, fix this immediately so nobody ever has to spend their Saturday repeating all of this debugging and recovery again!

Comment 8 matt jia 2016-01-08 05:37:58 UTC
(In reply to Dan Callaghan from comment #6)
> In the case which triggered this bug, we had a single task with 7422
> results, each of which had a number of log files. Because recipe.files()
> currently issues one query per recipe, per task, and per result, this means
> that the response takes too long to generate (going back to the DB so many
> times) and therefore it hits the 120 second request timeout in
> beaker-transfer.

With this patch:

   http://gerrit.beaker-project.org/#/c/4575/

The response takes only 7 seconds instead of 33 seconds on a Beaker box with no traffic. I have also simulated a heavily loaded Beaker box, and the call does not time out there either.

Comment 9 matt jia 2016-01-08 06:34:47 UTC
(In reply to Dan Callaghan from comment #4)
> We should add some better debug logs in beaker-transfer, specifically so that we
> can see each recipe as it tries to list and then transfer the files. That
> would have made it quite easy to spot which recipe it was getting stuck on
> in this case.

Added a log message to address this comment:

  http://gerrit.beaker-project.org/#/c/4576/
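As a rough illustration of the kind of per-recipe logging being asked for, the sketch below logs each recipe before listing its files and again before transferring them. The function `transfer_recipes` and its `list_files` and `transfer` callables are hypothetical stand-ins, not beaker-transfer's real code or the content of the patch above.

```python
import logging

logger = logging.getLogger("transfer_sketch")


def transfer_recipes(recipe_ids, list_files, transfer):
    """Log each recipe before listing and before transferring its files.

    list_files and transfer are hypothetical callables supplied by the caller.
    """
    transferred = []
    for recipe_id in recipe_ids:
        # If the listing call hangs, this is the last message in the log,
        # which identifies the recipe that is stuck.
        logger.debug("Listing files for recipe %s", recipe_id)
        files = list_files(recipe_id)
        logger.debug("Transferring %d files for recipe %s", len(files), recipe_id)
        transfer(recipe_id, files)
        transferred.append(recipe_id)
    return transferred
```

Logging before the slow call rather than after it is the point: the stuck recipe is then the last one named in the log.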

Comment 10 matt jia 2016-01-14 06:53:05 UTC
> With this patch:
> 
>    http://gerrit.beaker-project.org/#/c/4575/
> 

This fix should go to release-22:

   http://gerrit.beaker-project.org/#/c/4586/

Comment 12 Dan Callaghan 2016-01-19 06:42:46 UTC
(In reply to matt jia from comment #9)

This patch is a more general overhaul of all the log messages in beaker-transfer, to tidy up some messes. This should give us a better idea of exactly what it's doing, plus better messages if rsync fails.

http://gerrit.beaker-project.org/4601

It will also help with verifying the beaker-transfer bugs we are working on.
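For the point about better messages when rsync fails, a minimal sketch of the general idea follows, assuming the transfer command is run as a subprocess. The helper `run_command` is hypothetical and does not reflect the actual patch. It records the exit status and stderr output when the command fails.

```python
import logging
import subprocess

logger = logging.getLogger("rsync_sketch")


def run_command(argv):
    """Run a command; on failure, log its exit status and stderr output."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if proc.returncode != 0:
        logger.error(
            "Command %r failed with exit status %d: %s",
            argv, proc.returncode, proc.stderr.strip(),
        )
        return False
    logger.debug("Command %r succeeded", argv)
    return True
```

A caller would pass the full argument list for the transfer command, for example an rsync invocation, and decide whether to retry based on the boolean result.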

Comment 15 Dan Callaghan 2016-02-01 04:30:40 UTC
Beaker 22.1 has been released.
