Red Hat Bugzilla – Full Text Bug Listing
|Summary:||PHP process hung in D state with flock|
|Product:||Red Hat Cluster Suite||Reporter:||Wendy Cheng <wcheng>|
|Component:||gfs||Assignee:||Chris Feist <cfeist>|
|Status:||CLOSED ERRATA||QA Contact:||GFS Bugs <gfs-bugs>|
|Version:||3||CC:||cfeist, rkenna, tao|
|Fixed In Version:||RHBA-2006-0593||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2006-07-20 09:52:21 EDT||Type:||---|
Description Wendy Cheng 2006-03-10 07:14:17 EST
Description of problem:
Field reports say that running Apache sometimes leaves PHP processes hung in D state inside the flock() system call and/or causes lock performance problems.

Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
I have a rough idea of what could be going wrong. In the RHEL 3 (Linux 2.4 based) kernel, flock has the following logic:
1. lock_kernel (Big Kernel Lock - BKL)
2. call the filesystem-specific supplemental lock
3. handle the Linux VFS flock
4. unlock_kernel
The BKL could be the culprit.
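The four-step order described above can be written down as a toy model. This is only a sketch of the ordering, not kernel code: `big_kernel_lock`, `flock_rhel3_model`, and the two callback parameters are hypothetical names introduced for illustration, and an ordinary Python lock stands in for the BKL without modelling any kernel scheduling behaviour.

```python
import threading

# Toy stand-in for the Big Kernel Lock (assumption: a plain mutex is enough
# to illustrate ordering; the real BKL behaves differently in the scheduler).
big_kernel_lock = threading.Lock()


def flock_rhel3_model(fs_lock_hook, vfs_flock, trace):
    """Run the four steps from the description in order, recording each one."""
    big_kernel_lock.acquire()            # 1. lock_kernel
    trace.append("lock_kernel")
    try:
        fs_lock_hook()                   # 2. filesystem-specific supplemental lock
        trace.append("fs_lock")
        vfs_flock()                      # 3. generic VFS flock handling
        trace.append("vfs_flock")
    finally:
        big_kernel_lock.release()        # 4. unlock_kernel
        trace.append("unlock_kernel")


def run_model():
    """Call the model with no-op hooks and return the recorded step order."""
    trace = []
    flock_rhel3_model(lambda: None, lambda: None, trace)
    return trace
```

In this model the filesystem hook runs entirely inside the outer lock, which is the reason the BKL was the first suspect: any delay in step 2 happens between steps 1 and 4.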
Comment 5 Wendy Cheng 2006-03-10 19:15:59 EST
Date: Fri, 10 Mar 2006 19:11:21 -0500
From: Wendy Cheng <firstname.lastname@example.org>
To: "Treece, Britt" <Britt.Treece@savvis.net>
CC: "Stanley, Jon" <email@example.com>
Subject: Re: [Linux-cluster] GFS load average and locking

Treece, Britt wrote:
>Wendy,
>
>Did the sysrq-t's that I sent illustrate this problem further? I'm
>hoping that they corroborate the situation that you described below.
>

(This reply will be logged into our ticket to get everyone in the loop.)

There are three layers of code we're examining:

1. PHP layer
Google searches found reports saying that PHP sessions were not handled properly (https://www.redhat.com/archives/linux-cluster/2006-March/msg00069.html). Also, by checking the RHEL src rpms, I find that flock() is invoked as a blocking call. This can lead to the following two issues:
Comment 6 Wendy Cheng 2006-03-10 19:24:47 EST
(Sorry, hit the wrong key ... continuing.)

2. RHEL3 kernel flock implementation
flock() is implemented as:
step1: lock_kernel (BKL) - this has been removed from RHEL 4.
step2: get filesystem lock (ext3 is a no-op, while GFS calls gfs_lock)
step3: get vfs layer lock (local memory fetch logic, relatively fast)
step4: unlock_kernel

The BKL is certainly a performance hit. However, since it is dropped when the process is not scheduled (sleeps), it will not serialize the flock call as significantly as I previously expected. Also, ext3 shows no sign of a performance hit, so it looks to me that the issue, if we really have one, is step 2, GFS's glock.

3. GFS layer
The GFS lock is obtained over the network (to/from the lock server), so it is subject to network congestion and a certain level of overhead must be expected. It looks to me that the (customer's) concern is that while a process is waiting for the lock, it goes into D state (uninterruptible) and pumps up the system "load". The reason is that flock() is invoked (by PHP) as a "blocking" call. By design, gfs_lock loops while waiting for the lock to arrive, and that loop accumulates the CPU consumption that leads to the high "load".

We can certainly make cosmetic changes to reduce this artificial "load" count (with some restrictions), but be aware that the average wait time will still be largely unchanged (or maybe even longer). Another solution would be for PHP to use non-blocking lock calls. I will discuss this issue with Joe Orton, our PHP maintainer. (Joe, this could be a nice white paper titled "fine-tune php on a cluster filesystem" :) ).

However, I have a fundamental question: does this "artificial" load number really affect overall system throughput? I can send out a test kernel for you to try out (to quantify it) if you're interested.
In general, I don't think we have really captured the required info while the system is in a sluggish state - that is, the PHP session thread traces from all GFS nodes, which we need to understand what the threads are waiting for. What we need to do is:
1. Check out PHP session handling - is it really bad?
2. Quantify the BKL impact - however, the base kernel team has indicated that removing it from RHEL3 would be too risky.
3. Make changes to the gfs_lock busy-wait logic (maybe).
4. "Upgrade" PHP to use non-blocking flock calls (to work better with a cluster filesystem) (maybe).
5. The current set of sysrq output does not show the real problem (it shows the artificial load, as I explained above). What we really need is the thread trace from all GFS nodes when the system is in a sluggish state (use crash to do the job instead of sysrq-t).

Also, as common sense, separating lock traffic from other traffic is a good thing, and playing around with the GFS tunables would be another good thing.

-- Wendy
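As an illustration of the "non-blocking flock" idea raised in this comment, here is a minimal sketch in Python rather than PHP. The helper `flock_with_timeout`, its parameters, and `demo` are hypothetical names for this example and are not taken from PHP or GFS. Between attempts the caller sleeps in an ordinary interruptible sleep instead of waiting inside flock(); whether a non-blocking attempt is itself cheap on GFS is not something this sketch shows.

```python
import fcntl
import tempfile
import time


def flock_with_timeout(fileobj, timeout=5.0, poll_interval=0.05):
    """Poll for an exclusive flock; return True if acquired, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # LOCK_NB makes flock() fail immediately instead of waiting.
            fcntl.flock(fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except BlockingIOError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)    # ordinary sleep between attempts


def demo():
    """Hold the lock on one descriptor and try to take it through another."""
    with tempfile.NamedTemporaryFile() as tmp:
        with open(tmp.name) as holder, open(tmp.name) as waiter:
            fcntl.flock(holder, fcntl.LOCK_EX)
            timed_out = not flock_with_timeout(waiter, timeout=0.2)
            fcntl.flock(holder, fcntl.LOCK_UN)
            acquired = flock_with_timeout(waiter, timeout=0.2)
    return timed_out, acquired
```

The trade-off matches the caveat in the comment: polling changes how the wait is accounted for, but the average wait time is not expected to shrink and may grow by up to one poll interval.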
Comment 11 Wendy Cheng 2006-03-21 15:45:31 EST
RHEL4 works as expected (the 2nd thread/process blocks), but RHEL3 does not. Looks like bug #1.
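The expected behaviour (a second locker blocks while the first holds the lock) can be checked with a small script along these lines. This is a sketch under assumptions, not the test that was actually run for this bug: `second_locker_blocks` and `run_check` are hypothetical names, a thread with its own open() of the file stands in for the second process, and the check runs against a local temporary file rather than a GFS mount.

```python
import fcntl
import tempfile
import threading


def second_locker_blocks(path, wait=0.3):
    """Return True if a second blocking flock() waits while the first is held."""
    acquired = threading.Event()

    def contender():
        with open(path, "r") as f2:          # separate open file description
            fcntl.flock(f2, fcntl.LOCK_EX)   # blocking call
            acquired.set()
            fcntl.flock(f2, fcntl.LOCK_UN)

    with open(path, "r") as f1:
        fcntl.flock(f1, fcntl.LOCK_EX)
        t = threading.Thread(target=contender)
        t.start()
        # If the contender has not acquired the lock yet, exclusion works.
        blocked = not acquired.wait(wait)
        fcntl.flock(f1, fcntl.LOCK_UN)
        t.join()
    # The contender must have waited, then succeeded after the unlock.
    return blocked and acquired.is_set()


def run_check():
    """Run the check against a throwaway local file."""
    with tempfile.NamedTemporaryFile() as tmp:
        return second_locker_blocks(tmp.name)
```

To exercise a cluster filesystem, `second_locker_blocks` could be pointed at a path on the mount in question instead of a temporary file.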
Comment 17 Chris Feist 2006-06-27 17:31:24 EDT
Closing this bug, as the customer has verified the fix. It should appear in RHEL3U8.
Comment 20 Red Hat Bugzilla 2006-07-20 09:52:21 EDT
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2006-0593.html
Comment 21 John Newbigin 2006-08-20 19:23:53 EDT
Any chance of getting the SRPM for this errata on ftp.redhat.com?