Description of problem:

Background requests made to the broker from the console are performed under a hardcoded timeout. A customer recently hit this timeout when requesting the /console/applications page. The user hitting the timeout was a member of many domains, each with many gears. This high number of gears/domains appears to be causing the background requests to take over 10 seconds. Below is the error received:

-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-
2014-09-08 15:55:09.410 [INFO ] Completed 500 Internal Server Error in 10002ms (pid:5886)
2014-09-08 15:55:09.414 [FATAL] AsyncAware::ThreadTimedOut (The thread #<Thread:0x007f605c2997f8 dead> (index=1) did not complete within 10 seconds.):
openshift-origin-console (1.23.4.4) app/controllers/async_aware.rb:38:in `block in join'
openshift-origin-console (1.23.4.4) app/controllers/async_aware.rb:34:in `map'
openshift-origin-console (1.23.4.4) app/controllers/async_aware.rb:34:in `join'
openshift-origin-console (1.23.4.4) app/controllers/async_aware.rb:47:in `join!'
openshift-origin-console (1.23.4.4) app/controllers/applications_controller.rb:85:in `index'
-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-

The code that was hit:

-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-
def index
  if params[:test]
    @applications = Fixtures::Applications.list
    @domains = Fixtures::Applications.list_domains
    (@applications + @domains).each{ |d| d.send(:api_identity_id=, '2') }
  else
    async{ @applications = Application.find :all, :as => current_user, :params => {:include => :cartridges} }
    async{ @domains = user_domains }
--> join!(10)
  end
-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-

The 10 second timeout is hardcoded. Usually, hitting this timeout indicates an issue with the broker, which navigating to the broker endpoints (such as /broker/rest/domain//applications?include=cartridges) would have revealed. In this particular case, there were no errors on the broker side: the logs are clean and all endpoints (eventually) return successfully.
The call simply takes longer because of the high number of domains/gears that must be presented. Another factor we believe contributes is splitting an OpenShift environment across multiple datacenters, which is now supported. This particular customer splits their environment across two datacenters in the US. If the broker makes an mcollective request to one of several ActiveMQ instances located in another datacenter, these broker calls can take even longer.

Version-Release number of selected component (if applicable): 2.1
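For context, the async/join! pattern in AsyncAware boils down to joining worker threads against a shared deadline and raising ThreadTimedOut when one is still running. A minimal self-contained sketch of that pattern (names simplified and illustrative; this is not the actual console code):

```ruby
# Illustrative sketch of the join-with-deadline pattern behind AsyncAware#join!
# (class and method names are simplified; not the actual console implementation).
class ThreadTimedOut < StandardError; end

# Joins every thread against a single shared deadline, so the total wait is
# bounded by +timeout+ seconds no matter how many threads are in flight.
# Raises ThreadTimedOut if any thread is still alive when the deadline passes;
# otherwise returns the thread return values.
def join_all!(threads, timeout)
  deadline = Time.now + timeout
  threads.each_with_index do |thread, index|
    remaining = deadline - Time.now
    # Thread#join(limit) returns the thread on completion, nil on timeout.
    unless remaining > 0 && thread.join(remaining)
      raise ThreadTimedOut,
            "The thread #{thread.inspect} (index=#{index}) did not complete within #{timeout} seconds."
    end
  end
  threads.map(&:value)
end
```

Because the deadline is shared across all threads, a slow broker response on any one background request is enough to push the whole page over the limit, which matches the 500 seen at 10002ms above.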
Investigate introducing a config value at the broker for this. PRs welcome.
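As a sketch of what such a config value could look like, the snippet below reads a KEY=VALUE style setting from a conf file and falls back to the old hardcoded value of 10 seconds. The helper name and parsing here are hypothetical; the real console uses its own config loader, and the actual change landed in the PR linked below.

```ruby
# Hypothetical sketch: read BACKGROUND_REQUEST_TIMEOUT from a KEY=VALUE
# conf file (e.g. /etc/openshift/console.conf), falling back to the
# previous hardcoded default of 10 seconds. Illustration only.
def background_request_timeout(conf_path, default: 10)
  return default unless File.exist?(conf_path)
  File.foreach(conf_path) do |raw|
    line = raw.strip
    next if line.empty? || line.start_with?('#')  # skip blanks and comments
    key, value = line.split('=', 2)
    return Integer(value.strip) if key == 'BACKGROUND_REQUEST_TIMEOUT' && value
  end
  default
end
```

The controller could then call join!(background_request_timeout(conf_path)) instead of the hardcoded join!(10).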
https://github.com/openshift/origin-server/pull/5823
http://etherpad.corp.redhat.com/puddle-2-2-2014-10-07
Check on puddle [2.2/2014-10-07.2]

1. Create some apps
# for i in {1..20}; do rhc app create testapp$i jbossews-1 -s --no-git; rhc cartridge scale jbossews-1 -a testapp$i --min 3; done

2. Set BACKGROUND_REQUEST_TIMEOUT to 1
# vim /etc/openshift/console.conf
<--snip-->
#RED_HAT_ACCOUNT_URL=https://www.redhat.com/wapps/ugc
#CONTACT_MAIL=openshift
BACKGROUND_REQUEST_TIMEOUT=1
<--snip-->

3. Request /console/applications
<--snip-->
The thread #<Thread:0x00000004d67000 dead> (index=0) did not complete within 1 seconds.
<--snip-->

4. Set BACKGROUND_REQUEST_TIMEOUT to 20

5. Request /console/applications

In step 5, all applications are listed successfully.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2014-1796.html
*** Bug 1108246 has been marked as a duplicate of this bug. ***