Bug 871660 - Scaled application has "dud" gears when overlapping scale-up requests occur
Keywords:
Status: CLOSED DUPLICATE of bug 855307
Alias: None
Product: OKD
Classification: Red Hat
Component: Pod
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dan McPherson
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-10-31 01:29 UTC by Ram Ranganathan
Modified: 2015-05-15 02:07 UTC
CC: 1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-10-31 13:57:00 UTC
Target Upstream Version:
Embargoed:


Attachments
Double configure error logs (856.67 KB, application/x-gzip)
2012-10-31 01:29 UTC, Ram Ranganathan

Description Ram Ranganathan 2012-10-31 01:29:02 UTC
Created attachment 635887 [details]
Double configure error logs

Description of problem:
I was running Apache Bench against a scaled application and noticed that there was a "dud" gear in the haproxy configuration. On investigating, I found that configure had been called twice on the same gear, and that the gear's DNS entry was never removed.

Version-Release number of selected component (if applicable):
Current release (2.0.19)

How reproducible:
Sometimes

Steps to Reproduce:
1. Create a scaled application.
2. Add mysql to the app.
3. Use the rails quickstart: https://github.com/openshift/rails-example
4. Add a junk folder with 2K files (201 MB worth). This simulates a customer
   issue on prod where there were a lot of files and sync was taking a while:
       for i in `seq 2048`; do dd if=/dev/zero of=junk-$i count=100 bs=1024; done
5. Add, commit and push to the app.
6. On the devenv, run Apache Bench:
       ab -n 100000 -c 23 http://$app-$namespace.dev.rhcloud.com/
7. Optionally, scale up manually while an automatic scale-up event occurs
   (the bug seems to happen when scale-ups occur in parallel).
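Step 4's one-liner can be expanded into a small script sketch. The `junk` directory name and the NUM_FILES/BLOCKS variables are ours, not part of the original steps; the defaults match the reproduction above (2048 files of 100 KiB each, about 200 MiB total):

```shell
#!/bin/sh
# Generate many small files so that gear sync takes long enough to
# overlap with a second scale-up request. Override NUM_FILES/BLOCKS
# for a quicker dry run.
NUM_FILES=${NUM_FILES:-2048}
BLOCKS=${BLOCKS:-100}
mkdir -p junk
i=1
while [ "$i" -le "$NUM_FILES" ]; do
    dd if=/dev/zero of="junk/junk-$i" bs=1024 count="$BLOCKS" 2>/dev/null
    i=$((i + 1))
done
# Report the total size of the generated junk folder.
du -sh junk
```

Commit the `junk` directory along with the rest of the repo in step 5 so the files are included in the push.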

  
Actual results:
Mismatch between the haproxy configs (running vs. gear registry); the
configuration contains a "dud" gear.

Expected results:
A valid configuration, with haproxy running correctly.

Additional info:

This also leaves a lot of defunct shell processes spawned by mcollective. The
reason is that gear sync takes a while because of the number and size of the files, and because ab is hammering the haproxy gear with requests.
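A quick way to confirm the leaked children is to list zombie processes together with their parent PIDs so they can be traced back to the mcollective daemon. This is a generic ps/awk sketch, not an OpenShift tool:

```shell
#!/bin/sh
# Print PPID, PID and command name of every defunct (zombie) process.
# Uses procps ps; the trailing "=" in each -o field suppresses headers.
list_defunct() {
    ps -eo stat=,ppid=,pid=,comm= | awk '$1 ~ /^Z/ {print $2, $3, $4}'
}
list_defunct
```

On an affected node the parent PID column should point at the mcollective daemon.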

At the very least we should clean up the gear's DNS entry -- we do delete the gear itself.

In the attached logs, look for the gear b1db43744f6a4fd7934c165ba53373c0 and you'll see configure called twice,
at lines 10391 and 10802 of the mcollective logs:
10391   {:cartridge=>"ruby-1.9",
10392    :process_results=>true,
10393    :args=>"'b1db43744f' 'rr50' 'b1db43744f6a4fd7934c165ba53373c0'",
10394    :action=>"configure"},

10802   {:cartridge=>"ruby-1.9",
10803    :process_results=>true,
10804    :args=>"'b1db43744f' 'rr50' 'b1db43744f6a4fd7934c165ba53373c0'",
10805    :action=>"configure"},

on the same gear.
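The double call can also be confirmed mechanically. A small grep/awk sketch (the function name and log path are ours, not an OpenShift tool) that prints any gear UUID appearing in more than one configure call -- `grep -B1` pulls in the `:args` line that carries the 32-hex UUID:

```shell
#!/bin/sh
# count_double_configures LOGFILE
# Print each gear UUID that occurs in more than one :action=>"configure"
# entry of an mcollective log.
count_double_configures() {
    grep -B1 ':action=>"configure"' "$1" \
      | grep -oE '[0-9a-f]{32}' \
      | sort | uniq -c \
      | awk '$1 > 1 {print $2}'
}
# Example usage (log path is an assumption):
# count_double_configures mcollective.log
```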

Comment 1 Dan McPherson 2012-10-31 13:57:00 UTC
This should be fixed by the model refactor.

*** This bug has been marked as a duplicate of bug 855307 ***

