Bug 848090 - thin server takes 100% cpu
Summary: thin server takes 100% cpu
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: CloudForms Cloud Engine
Classification: Retired
Component: aeolus-conductor
Version: 1.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Assignee: Tzu-Mainn Chen
QA Contact: Dave Johnson
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-08-14 14:53 UTC by Dave Johnson
Modified: 2012-08-22 15:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-22 15:35:55 UTC
Embargoed:



Description Dave Johnson 2012-08-14 14:53:46 UTC
Description of problem:
============================
Conductor test automation around quotas continues to hit an issue where the underlying thin server takes 100% of the CPU.  I pointed eck at the issue; he has acknowledged it and identified the cause (I'll let him fill in the details), but I believe there is still some question about how to resolve it properly.

Comment 1 John Eckersberg 2012-08-14 19:32:54 UTC
This seems to have something to do with eager loading in activerecord.

At pools_controller.rb:80:

Pool.includes(:deployments, :instances, :quota, :catalogs).
          list_for_user(current_session, current_user, Privilege::VIEW).
          list(sort_column(Pool), sort_direction)

Somewhere inside of this chain of calls, the process goes cpu-bound.  It's kinda hard to debug because the stack is like 100+ frames deep.  From what I can tell, it's trying to build objects for all the eager loaded records, and wire up all the inter-object relationships for them.
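
One way to make a stack that deep easier to read is to keep only the frames that come from the application itself. The following is a minimal sketch, not something taken from conductor: `app_frames` is a hypothetical helper, the "/app/" marker is an assumption about where application code lives, and the sample paths are made up for illustration.

```ruby
# Hypothetical helper: keep only the backtrace frames whose path contains
# the given marker, dropping gem and standard-library frames.
def app_frames(backtrace, marker = "/app/")
  backtrace.select { |frame| frame.include?(marker) }
end

# Hand-written sample backtrace; the paths are illustrative only.
sample = [
  "/gems/activerecord/lib/active_record/relation.rb:1:in `to_a'",
  "/srv/conductor/app/models/pool.rb:1:in `list'",
  "/gems/actionpack/lib/action_controller/base.rb:1:in `process'",
  "/srv/conductor/app/controllers/pools_controller.rb:80:in `index'"
]

filtered = app_frames(sample)
puts filtered
```

In a live process the same filter could be applied to `caller` or to `Thread#backtrace` for each thread in `Thread.list`.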

If the eager loading is removed instead, so that the code looks like this:

Pool.list_for_user(current_session, current_user, Privilege::VIEW).
     list(sort_column(Pool), sort_direction)

then the requests return in a reasonable amount of time.
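
To put numbers on the difference, the two variants could be timed with Ruby's standard Benchmark module. This is only a sketch: `time_variants` is a hypothetical helper, and the lambdas are placeholders, since the real Pool queries need a running conductor with data loaded.

```ruby
require 'benchmark'

# Hypothetical helper: run each labeled block once and return the elapsed
# wall-clock seconds per label.
def time_variants(variants)
  variants.each_with_object({}) do |(label, block), results|
    results[label] = Benchmark.realtime { block.call }
  end
end

# Placeholder work only. In a Rails console, each lambda would instead wrap
# one of the two Pool queries, forcing the records to load if the call
# returns a lazy relation.
timings = time_variants(
  "with includes"    => lambda { (1..1_000).map { |i| i.to_s } },
  "without includes" => lambda { (1..1_000).map { |i| i.to_s } }
)

timings.each { |label, secs| puts format("%-17s %.6fs", label, secs) }
```

A single run is noisy, so repeating each variant a few times and comparing the spread would give a fairer picture.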

This is happening on some test systems that dajo is running automation against.

Comment 2 Scott Seago 2012-08-16 05:42:10 UTC
This was added with the following commit:

commit bc9ef2f278a7d96ce3b7c02e072f141a04c89d87
Author: Tzu-Mainn Chen <tzumainn>
Date:   Thu Mar 22 11:27:24 2012 -0400

    BZ 802571 added eager loading and other minor efficiency fixes

It seems we have dueling scalability concerns here. The eager loading was added to address certain scalability concerns, and it is causing others.

It may be worth revisiting the original use case that led to the eager loading -- with recent changes in permissions queries, etc, the eager loading may not be needed anyway.

I'm reassigning to Tzumainn, since he added the original eager loading bits here. Tzumainn -- if you're not the right one to resolve this, we can work w/ Angus to find the right person.

Comment 3 Dave Johnson 2012-08-22 15:35:55 UTC
I have not been able to reproduce this here lately, so I am closing this out until it shows itself again.

