Bug 717500 - reserved guest doesn't return after timeout
Summary: reserved guest doesn't return after timeout
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: lab controller
Version: 0.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dan Callaghan
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-06-29 02:33 UTC by Han Pingtian
Modified: 2019-05-22 13:39 UTC
CC List: 5 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2011-07-14 02:07:17 UTC
Embargoed:


Comment 1 Dan Callaghan 2011-06-29 03:33:03 UTC
It looks like there is a valid watchdog record for the system in question, with the correct kill time (which is now well past), but it was never triggered. I am currently investigating why.

Comment 2 Dan Callaghan 2011-06-29 04:01:32 UTC
The beaker-watchdog daemon on the lab controller in question was stuck reading from a dead HTTP connection. Apparently the system-wide default TCP timeout for established connections is 5 days(!), at least on that box, and we never set any stricter timeouts in the beaker-watchdog daemon itself. I think that is probably the real bug we should be fixing...
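
For illustration only, here is a minimal sketch of the failure mode described above and of how a per-socket timeout bounds it. It does not use Beaker or kobo code: the function name `read_with_timeout` and the local listening socket standing in for the dead peer are assumptions made for this example.

```python
import socket


def read_with_timeout(timeout_seconds):
    """Try to read from a local peer that never replies.

    Hypothetical helper: the listening socket below stands in for the
    dead HTTP connection and never sends any data.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Without this call recv() could block indefinitely, because
        # nothing in the process bounds the wait on a silent peer.
        client.settimeout(timeout_seconds)
        client.connect(server.getsockname())
        try:
            client.recv(1024)
        except socket.timeout:
            return "timed out"
        return "got data"
    finally:
        client.close()
        server.close()
```

Calling `read_with_timeout(0.2)` gives up after roughly 0.2 seconds instead of hanging. A daemon that never sets such a timeout has no equivalent upper bound on a blocked read.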

Comment 3 Dan Callaghan 2011-06-29 05:19:25 UTC
I was wrong: it seems we *do* set a timeout on the kobo hub transport for all the lab controller processes.

So the question is, why in this case did the timeout not kick in and prevent beaker-watchdog from getting stuck for 19 hours?

Comment 5 Dan Callaghan 2011-07-01 04:09:59 UTC
Hmm, okay, I thought I wrote another comment about this yesterday, but perhaps I never hit save...

I think the problem is that although the Watchdog object itself has a timeout set, it creates Monitor objects which do not have the timeout set. I think it was one of those which was stuck in a read yesterday. (That explains why there were two connections open to the server, and it was the second one which was stuck.)

I think the best fix is to move the timeout setting into ProxyHelper, which is a parent class for all the objects which talk to the server. That way it will apply to Monitor as well as any other classes we have missed (or add in the future).
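
The idea can be sketched roughly as follows. This is not the actual Beaker code: the class bodies, the `DEFAULT_TIMEOUT` value, and the `open_connection` and `make_monitor` methods are hypothetical, and only the class names ProxyHelper, Watchdog and Monitor come from the discussion above.

```python
import socket


class ProxyHelper:
    """Hypothetical parent class for every object that talks to the server."""

    DEFAULT_TIMEOUT = 120  # seconds; an assumed value, not Beaker's setting

    def __init__(self, timeout=None):
        # The timeout is set once here, so every subclass inherits it.
        self.timeout = self.DEFAULT_TIMEOUT if timeout is None else timeout

    def open_connection(self, address):
        # Every connection opened through the helper carries the timeout.
        return socket.create_connection(address, timeout=self.timeout)


class Monitor(ProxyHelper):
    """Gets the timeout from ProxyHelper without any extra code."""


class Watchdog(ProxyHelper):
    def make_monitor(self):
        # Monitors created by the watchdog share its timeout setting.
        return Monitor(timeout=self.timeout)
```

With this layout a new subclass cannot accidentally skip the timeout, because the only place it is configured is the shared parent constructor.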