Bug 971500 - Sporadic web request failure under high load for Rails applications
Alias: None
Product: JBoss Enterprise WFK Platform 2
Classification: Retired
Component: TorqueBox
Version: 2.3.0
Hardware: Unspecified Unspecified
Target Milestone: CR1
: 2.3.0
Assignee: Ben Browning
QA Contact: Marek Schmidt
Reported: 2013-06-06 16:30 UTC by Ben Browning
Modified: 2013-07-16 10:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: A race condition in the TorqueBox page caching implementation can cause sporadic web request failures under high load.
Consequence: Rails applications may return unexpected 500 responses under high load, with stack traces that reference org.projectodd.polyglot.web.servlet.StaticResourceServlet and org.apache.naming.resources.ResourceCache. Rack applications are not affected.
Fix: The race condition was resolved by synchronizing access to a shared resource.
Result: Rails applications no longer throw errors from StaticResourceServlet under high load.
Last Closed: 2013-07-16 10:57:52 UTC
Type: Bug


Description Ben Browning 2013-06-06 16:30:24 UTC
This is a product clone of the TorqueBox community bug https://issues.jboss.org/browse/TORQUE-1099

We have a race condition in Rails applications that manifests under high web request load. I've only been able to reproduce it for Rails applications using page caching, but it could potentially affect all Rails applications. There is no known workaround right now.

The fix, in jboss-polyglot, is https://github.com/projectodd/jboss-polyglot/commit/775b1da2f1090391b5c754cba3183be6f433ccb2
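
The linked commit is not reproduced here. As a rough illustration of the general class of problem (a check-then-act race on a shared cache, closed by a lock), here is a minimal Python sketch. The `TinyCache` class, its methods, and the `stress` helper are hypothetical names invented for this example; they are not the jboss-polyglot or ResourceCache code, which is Java.

```python
import threading


class TinyCache:
    """Hypothetical shared cache; NOT the TorqueBox or Tomcat implementation."""

    def __init__(self):
        self._entries = {}
        self._lock = threading.Lock()

    def lookup_unsafe(self, key):
        # Check-then-act without a lock: another thread may evict the entry
        # between the membership test and the read, raising KeyError.
        if key in self._entries:
            return self._entries[key]
        return None

    def lookup(self, key):
        # Holding the lock makes the check and the read a single atomic step.
        with self._lock:
            if key in self._entries:
                return self._entries[key]
            return None

    def store(self, key, value):
        with self._lock:
            self._entries[key] = value

    def evict(self, key):
        with self._lock:
            self._entries.pop(key, None)


def stress(cache, rounds=20000):
    """Run one reader against one writer; return the number of reader errors."""
    errors = []

    def writer():
        # Mimics a file that keeps appearing and disappearing.
        for _ in range(rounds):
            cache.store("foo.html", b"")
            cache.evict("foo.html")

    def reader():
        for _ in range(rounds):
            try:
                cache.lookup("foo.html")
            except KeyError as exc:
                errors.append(exc)

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(errors)
```

With the locked `lookup`, `stress(TinyCache())` should report no reader errors. Swapping in `lookup_unsafe` may produce occasional `KeyError`s, although, like the bug itself, that depends on timing.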

Steps to reproduce copied from the upstream bug:

* Deploy a Rails application in production mode at the root context to TorqueBox
* Start TorqueBox
* Generate and sustain a high request load against an invalid URL for that Rails application (something that will 404), e.g. http://localhost:8080/foo
* After the server is under load, create an empty foo.html file under $RAILS_ROOT/public/
* Remove the foo.html file shortly after creating it, e.g. touch public/foo.html && sleep 0.5 && rm public/foo.html
* You should see errors logged to the TorqueBox console or server.log with org.apache.naming.resources.ResourceCache near the top of the stack traces.
* You may need to stop TorqueBox, start it again, and do the whole touching / removing foo.html process again to display the problem. With a high enough load it seems to happen every time for me.
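
The load and touch/remove steps above could be scripted roughly as follows. This is only a sketch: the `hammer` and `reproduce` helpers, the worker count, and the timings are assumptions for illustration, and it presumes TorqueBox is already running with the Rails application deployed at the root context.

```python
import http.client
import threading
import time
import urllib.error
import urllib.request
from pathlib import Path


def hammer(url, stop, counts, lock):
    """Request url in a loop until stop is set, tallying HTTP status codes."""
    while not stop.is_set():
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                status = resp.status
        except urllib.error.HTTPError as err:
            status = err.code  # 404 is expected; a 500 suggests the bug
        except (OSError, http.client.HTTPException):
            status = -1  # connection-level failure
        with lock:
            counts[status] = counts.get(status, 0) + 1


def reproduce(url, public_dir, workers=50, warmup=5.0, gap=0.5, cooldown=5.0):
    """Sustain load on url while briefly creating public_dir/foo.html.

    Returns a dict mapping HTTP status code to the number of responses seen.
    """
    stop, lock, counts = threading.Event(), threading.Lock(), {}
    threads = [
        threading.Thread(target=hammer, args=(url, stop, counts, lock), daemon=True)
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    page = Path(public_dir) / "foo.html"
    try:
        time.sleep(warmup)    # let the server get under load first
        page.touch()          # same as: touch public/foo.html
        time.sleep(gap)       # same as: sleep 0.5
        page.unlink()         # same as: rm public/foo.html
        time.sleep(cooldown)  # keep the load going after the file is gone
    finally:
        stop.set()
        for t in threads:
            t.join()
    return counts
```

Calling `reproduce("http://localhost:8080/foo", ...)` with the application's public directory as the second argument returns the status tally. Any 500 entries there, together with ResourceCache stack traces in server.log, would point to the race.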

Comment 3 Matous Jobanek 2013-07-02 08:20:37 UTC
I cannot reproduce this bug (maybe due to some differences between my application or settings and yours), but it seems that it should be fixed.
