
Bug 1098564

Summary:	worker-timeout can cause httpd thread stalls
Product:	[JBoss] JBoss Enterprise Application Platform 6	Reporter:	Aaron Ogburn <aogburn>
Component:	mod_cluster	Assignee:	Jean-frederic Clere <jclere>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Michal Karm Babacek <mbabacek>
Severity:	high	Docs Contact:	Russell Dickenson <rdickens>
Priority:	unspecified
Version:	6.3.0	CC:	mbabacek
Target Milestone:	ER9
Target Release:	EAP 6.3.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-08-06 14:37:03 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1100066

Description Aaron Ogburn 2014-05-16 14:50:11 UTC

Description of problem:

Setting a worker-timeout in the modcluster subsystem can stall requests and threads on the httpd side when requests arrive while the workers are in a down state. The stack of a stalled thread looks like the following (frames #2 through #160 are recursive calls within mod_proxy_cluster):

#0 0x00007ff8eb547533 in select () from /lib64/libc.so.6
#1 0x00007ff8eba39185 in apr_sleep () from /usr/lib64/libapr-1.so.0
#2 0x00007ff8e84be0d1 in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
...
#160 0x00007ff8e84beb9f in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
#161 0x00007ff8e88d2116 in proxy_run_pre_request () from /etc/httpd/modules/mod_proxy.so
#162 0x00007ff8e88d9186 in ap_proxy_pre_request () from /etc/httpd/modules/mod_proxy.so
#163 0x00007ff8e88d63c2 in ?? () from /etc/httpd/modules/mod_proxy.so


Version-Release number of selected component (if applicable):

1.2.8.Final


How reproducible:

Very


Steps to Reproduce:

1) Configure JBoss with worker-timeout="1" in the modcluster subsystem
2) Start httpd and JBoss
3) Confirm JBoss is reachable through httpd/mod_cluster, then kill JBoss so that the mod_cluster worker-timeout retry logic is used
4) Load httpd with requests for JBoss (even holding refresh in a browser for a couple of seconds will do the trick)
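
If holding refresh in a browser is inconvenient, step 4 can be approximated with a small script. The following is only a sketch and was not part of the original reproduction: the URL, request count, and concurrency are hypothetical placeholders, and the 2-second cutoff in slow() is an arbitrary margin above the ~1 second expected with worker-timeout="1".

```python
import http.client
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_fetch(url, timeout=300.0):
    """Issue one GET and return (status, seconds); status is None on a network-level failure."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except urllib.error.HTTPError as exc:
        # httpd answered with an error status (likely while JBoss is down).
        status = exc.code
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, reset, or a malformed response.
        status = None
    return status, time.monotonic() - start


def run_load(url, total=200, workers=20, fetch=timed_fetch):
    """Send `total` requests to `url` using `workers` concurrent threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, [url] * total))


def slow(results, threshold=2.0):
    """Return the (status, seconds) pairs that took longer than `threshold` seconds."""
    return [r for r in results if r[1] > threshold]


# Example use (hypothetical URL; point it at an application proxied by mod_cluster):
#   results = run_load("http://localhost/myapp/")
#   print(len(slow(results)), "of", len(results), "requests exceeded 2 seconds")
```

A non-empty slow() result after JBoss has been killed would be consistent with the stall described here, but confirm it with the checks below.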

Then check for stalled requests/threads. Each request should finish within ~1 second, but once threads stall it can take minutes. You can check response times in the access log with %T once the requests complete, inspect threads with pstack, or watch the mod_status page (it will show many threads in W state whose seconds-since-request-start count keeps growing).
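
The access-log check can be scripted. This is a minimal sketch that assumes %T has been appended as the last field of the LogFormat (for example, LogFormat "%h %l %u %t \"%r\" %>s %b %T"). The log path in the usage comment and the 2-second threshold are illustrative assumptions, not values from this report.

```python
def stalled_requests(lines, threshold=2):
    """Yield (seconds, line) for log lines whose trailing %T value exceeds `threshold`.

    Assumes %T (whole seconds taken to serve the request) is the last
    whitespace-separated field on each line.
    """
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            seconds = int(fields[-1])
        except ValueError:
            # Last field is not an integer, so this line does not end in %T.
            continue
        if seconds > threshold:
            yield seconds, line.rstrip("\n")


# Example use (hypothetical log path):
#   with open("/var/log/httpd/access_log") as log:
#       for seconds, line in stalled_requests(log):
#           print(seconds, line)
```

Because %T is only written once a request completes, threads that are still stalled will not appear in the log yet; pstack or mod_status covers those.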


Actual results:

httpd threads can stall with a worker-timeout configured when JBoss nodes are down.

Expected results:

httpd threads don't stall.

Comment 1 Aaron Ogburn 2014-05-16 14:52:29 UTC

Fixed per MODCLUSTER-407, so this will need a component upgrade.

Comment 2 Michal Karm Babacek 2014-07-02 10:51:21 UTC

Will be verified with ER9.

Comment 3 Michal Karm Babacek 2014-07-11 19:33:05 UTC

The fix is present. For more info on reproducibility see: BZ 1100066.

Comment 4 JBoss JIRA Server 2014-08-07 16:03:15 UTC

Michal Babacek <mbabacek> updated the status of jira MODCLUSTER-407 to Closed