Bug 871660 - Scaled application has "dud" gears when overlapping scale-up requests occur
Keywords:
Status: CLOSED DUPLICATE of bug 855307
Alias: None
Product: OKD
Classification: Red Hat
Component: Pod
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dan McPherson
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-10-31 01:29 UTC by Ram Ranganathan
Modified: 2015-05-15 02:07 UTC
CC: 1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-10-31 13:57:00 UTC
Target Upstream Version:
Embargoed:


Attachments
Double configure error logs (856.67 KB, application/x-gzip)
2012-10-31 01:29 UTC, Ram Ranganathan

Description Ram Ranganathan 2012-10-31 01:29:02 UTC
Created attachment 635887 [details]
Double configure error logs

Description of problem:
I was running Apache Bench against a scaled application and noticed that there was a "dud" gear in the haproxy configuration. On investigating, I found that configure had been called twice on the same gear, and that the gear's DNS entry was never removed.

Version-Release number of selected component (if applicable):
Current release (2.0.19)

How reproducible:
Sometimes

Steps to Reproduce:
1. Create a scaled application.
2. Add mysql to the app.
3. Use the rails quickstart: https://github.com/openshift/rails-example
4. Add a junk folder with 2K files (201 MB worth). This simulates a customer
   issue on prod where there were a lot of files and sync was taking a while:
       for i in `seq 2048`; do dd if=/dev/zero of=junk-$i count=100 bs=1024; done
5. Add, commit and push to the app.
6. On the devenv, run Apache Bench:
       ab -n 100000 -c 23 http://$app-$namespace.dev.rhcloud.com/
7. Optionally, scale up manually while an automatic scale-up event occurs
   (the bug seems to happen when scale-ups occur in parallel).
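Step 4's one-liner can be expanded into a small script sketch. The `junk` directory name and the NUM_FILES/BLOCKS variables are ours, not part of the original steps; the defaults match the reproduction above (2048 files of 100 KiB each, about 200 MiB total):

```shell
#!/bin/sh
# Generate many small files so that gear sync takes long enough to
# overlap with a second scale-up request. Override NUM_FILES/BLOCKS
# for a quicker dry run.
NUM_FILES=${NUM_FILES:-2048}
BLOCKS=${BLOCKS:-100}
mkdir -p junk
i=1
while [ "$i" -le "$NUM_FILES" ]; do
    dd if=/dev/zero of="junk/junk-$i" bs=1024 count="$BLOCKS" 2>/dev/null
    i=$((i + 1))
done
# Report the total size of the generated junk folder.
du -sh junk
```

Commit the `junk` directory along with the rest of the repo in step 5 so the files are included in the push.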

  
Actual results:
Mismatch between the haproxy configs (running vs. gear registry); the
configuration contains a "dud" gear.

Expected results:
A valid configuration, with haproxy running correctly.

Additional info:

This also leaves a lot of defunct shell processes spawned by mcollective. The
reason is that gear sync takes a while because of the number and size of the files, and because ab is hammering the haproxy gear with requests.
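A quick way to confirm the leaked children is to list zombie processes together with their parent PIDs so they can be traced back to the mcollective daemon. This is a generic ps/awk sketch, not an OpenShift tool:

```shell
#!/bin/sh
# Print PPID, PID and command name of every defunct (zombie) process.
# Uses procps ps; the trailing "=" in each -o field suppresses headers.
list_defunct() {
    ps -eo stat=,ppid=,pid=,comm= | awk '$1 ~ /^Z/ {print $2, $3, $4}'
}
list_defunct
```

On an affected node the parent PID column should point at the mcollective daemon.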

At the very least we should clean up the gear's DNS entry -- we do delete the gear itself.

In the attached logs, look for the gear b1db43744f6a4fd7934c165ba53373c0 and you'll see configure called twice,
at lines 10391 and 10802 of the mcollective logs:
10391   {:cartridge=>"ruby-1.9",
10392    :process_results=>true,
10393    :args=>"'b1db43744f' 'rr50' 'b1db43744f6a4fd7934c165ba53373c0'",
10394    :action=>"configure"},

10802   {:cartridge=>"ruby-1.9",
10803    :process_results=>true,
10804    :args=>"'b1db43744f' 'rr50' 'b1db43744f6a4fd7934c165ba53373c0'",
10805    :action=>"configure"},

on the same gear.
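The double call can also be confirmed mechanically. A small grep/awk sketch (the function name and log path are ours, not an OpenShift tool) that prints any gear UUID appearing in more than one configure call -- `grep -B1` pulls in the `:args` line that carries the 32-hex UUID:

```shell
#!/bin/sh
# count_double_configures LOGFILE
# Print each gear UUID that occurs in more than one :action=>"configure"
# entry of an mcollective log.
count_double_configures() {
    grep -B1 ':action=>"configure"' "$1" \
      | grep -oE '[0-9a-f]{32}' \
      | sort | uniq -c \
      | awk '$1 > 1 {print $2}'
}
# Example usage (log path is an assumption):
# count_double_configures mcollective.log
```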

Comment 1 Dan McPherson 2012-10-31 13:57:00 UTC
This should be fixed by the model refactor.

*** This bug has been marked as a duplicate of bug 855307 ***

