
Bug 1098564

Summary: worker-timeout can cause httpd thread stalls
Product: [JBoss] JBoss Enterprise Application Platform 6
Component: mod_cluster
Version: 6.3.0
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Target Milestone: ER9
Target Release: EAP 6.3.0
Hardware: Unspecified
OS: Unspecified
Reporter: Aaron Ogburn <aogburn>
Assignee: Jean-frederic Clere <jclere>
QA Contact: Michal Karm Babacek <mbabacek>
Docs Contact: Russell Dickenson <rdickens>
CC: mbabacek
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-08-06 14:37:03 UTC
Bug Blocks: 1100066

Description Aaron Ogburn 2014-05-16 14:50:11 UTC
Description of problem:

Setting a mod_cluster worker-timeout can stall requests and threads on the httpd side when requests arrive while the workers are in a down state. The stack of a stalled thread looks like the following (frames #2 through #160 are recursive calls within mod_proxy_cluster):

#0 0x00007ff8eb547533 in select () from /lib64/libc.so.6
#1 0x00007ff8eba39185 in apr_sleep () from /usr/lib64/libapr-1.so.0
#2 0x00007ff8e84be0d1 in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
...
#160 0x00007ff8e84beb9f in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
#161 0x00007ff8e88d2116 in proxy_run_pre_request () from /etc/httpd/modules/mod_proxy.so
#162 0x00007ff8e88d9186 in ap_proxy_pre_request () from /etc/httpd/modules/mod_proxy.so
#163 0x00007ff8e88d63c2 in ?? () from /etc/httpd/modules/mod_proxy.so
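
To spot this pattern in a larger pstack dump, the frame lines can be parsed and the range of frame numbers per library compared. The following is only a sketch: the helper names `parse_frames` and `frame_span` are made up for illustration, and the regular expression assumes frame lines shaped exactly like the ones above.

```python
import re

# Matches frame lines such as:
#   #2 0x00007ff8e84be0d1 in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
FRAME_RE = re.compile(
    r"^#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+\(.*\)\s+from\s+(\S+)$"
)


def parse_frames(text):
    """Return (frame_number, function, library_path) for every frame line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:  # lines such as "..." are skipped
            frames.append((int(match.group(1)), match.group(2), match.group(3)))
    return frames


def frame_span(frames, library_suffix):
    """Return (lowest, highest) frame number seen in a library, or None."""
    numbers = [num for num, _func, lib in frames if lib.endswith(library_suffix)]
    return (min(numbers), max(numbers)) if numbers else None
```

A wide span for `/mod_proxy_cluster.so` in a thread whose top frame is `apr_sleep` suggests the thread is deep in the retry recursion described here.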


Version-Release number of selected component (if applicable):

1.2.8.Final


How reproducible:

Very


Steps to Reproduce:

1) Configure JBoss with worker-timeout="1" in the modcluster subsystem
2) Start httpd and JBoss
3) Confirm JBoss is reachable through httpd/mod_cluster, then kill JBoss so that the mod_cluster worker-timeout retry logic is used
4) Load httpd with requests for JBoss (even holding refresh in a browser for a couple of seconds will do the trick)
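
Step 4 can also be scripted instead of holding refresh in a browser. The following is a minimal sketch, not part of the original report: `TARGET_URL`, the request count, and the concurrency are placeholder assumptions to adjust for the environment under test.

```python
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical URL: replace with an application path that httpd proxies to JBoss.
TARGET_URL = "http://localhost/app/"


def fetch(url=TARGET_URL, timeout=300):
    """Issue one GET; return the HTTP status, or None on a connection-level failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # error statuses from httpd still count as a finished request
    except (urllib.error.URLError, OSError):
        return None


def timed(call):
    """Run call() and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = call()
    return result, time.monotonic() - start


def run_load(call, total=200, concurrency=20):
    """Run `call` `total` times on a thread pool; return a list of (result, seconds)."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: timed(call), range(total)))


def slow_requests(results, threshold=2.0):
    """Keep only the results that took longer than `threshold` seconds."""
    return [item for item in results if item[1] > threshold]
```

With JBoss killed, `slow_requests(run_load(fetch))` should come back empty on a fixed build, while an affected build is expected to return entries with large elapsed times.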

Then check for stalled requests/threads. Each request should finish within ~1 second, but once threads stall a request can take minutes. You can check the access logs with %T to see response times once the requests are done, use pstack to check the threads, or look at the mod_status page (it will show many threads in W state with many seconds since their requests started, and that number keeps growing).
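
For the access log check, a short script can pick out the slow entries. This is a sketch under one assumption that is not from the report: that %T (time taken to serve the request, in seconds) has been appended as the last field of the LogFormat. The function name `stalled_entries` is illustrative.

```python
def stalled_entries(lines, threshold_seconds=1):
    """Return (seconds, line) pairs whose trailing %T value exceeds the threshold.

    Assumes %T is the last whitespace-separated field of each log line.
    """
    stalled = []
    for line in lines:
        fields = line.split()
        if not fields or not fields[-1].isdigit():
            continue  # blank line, or a line that does not end in a %T value
        seconds = int(fields[-1])
        if seconds > threshold_seconds:
            stalled.append((seconds, line.rstrip("\n")))
    return stalled
```

It can be fed an open log file, for example `stalled_entries(open(path))`, where `path` is wherever the access log lives on the system being tested.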


Actual results:

httpd threads can stall with a worker-timeout configured when JBoss nodes are down.

Expected results:

httpd threads don't stall.

Comment 1 Aaron Ogburn 2014-05-16 14:52:29 UTC
Fixed per MODCLUSTER-407, so this will need a component upgrade.

Comment 2 Michal Karm Babacek 2014-07-02 10:51:21 UTC
Will be verified with ER9.

Comment 3 Michal Karm Babacek 2014-07-11 19:33:05 UTC
The fix is present. For more info on reproducibility see: BZ 1100066.

Comment 4 JBoss JIRA Server 2014-08-07 16:03:15 UTC
Michal Babacek <mbabacek> updated the status of jira MODCLUSTER-407 to Closed