Bug 146860 - directory lookup contention for dcache_lock
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686 Linux
Priority: medium
Severity: low
Assigned To: Alexander Viro
QA Contact: Brian Brock
Reported: 2005-02-01 23:57 EST by Kurtis Rader
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2007-10-19 15:07:56 EDT
Attachments
source and makefile to recreate problem (10.00 KB, application/octet-stream)
2005-02-01 23:58 EST, Kurtis Rader

Description Kurtis Rader 2005-02-01 23:57:25 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Gecko/20041111 Firefox/1.0

Description of problem:
On an SMP system, a high rate of pathname lookups by two tasks for
objects in the same directory can leave one or more CPUs spinning in
kernel mode for extended periods.  I found that running a single
instance of the attached program would sporadically induce high CPU
load, and that I could induce the problem fairly reliably by forcing
each process to run on a different CPU.  For example,

    taskset 01 ./testd_c ; taskset 02 ./testd_c

This will often leave one CPU idle and the other 100% in system mode.
Occasionally both CPUs will be 100% busy in system mode.

A profiled kernel shows the following (columns: address, sample count,
percent of all samples, symbol):

c029fa10 6767     0.955674    direct_strncpy_from_user
c017ee1a 9521     1.34461     .text.lock.dcache
c017d710 10661    1.50561     dput
c0172750 11062    1.56224     path_release
c029fc95 12025    1.69824     .text.lock.dec_and_lock
c0172620 13250    1.87124     permission
c0172ac0 13842    1.95484     link_path_walk
c0173470 39061    5.51641     path_init
c029fc50 75117    10.6084     atomic_dec_and_lock
c017e290 96373    13.6103     d_lookup

The atomic_dec_and_lock() samples apparently come from this statement
at the top of the dput() function:

        if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
                return;
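
To make the locking behaviour of that statement concrete, here is a minimal userspace sketch of the dec-and-lock pattern as I understand its contract: the count is decremented without the lock while it stays above zero, and the lock is taken (and left held) only when the count drops to zero.  This is not the kernel's code.  The names toy_spinlock, toy_spin_lock, toy_spin_unlock and toy_dec_and_lock are hypothetical, and C11 atomics stand in for atomic_t and spinlock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kernel spinlock such as dcache_lock. */
typedef struct {
    atomic_flag held;
} toy_spinlock;

void toy_spin_lock(toy_spinlock *l)
{
    /* Busy-wait until the flag was previously clear; this is where a CPU spins. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

void toy_spin_unlock(toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Decrement *count.  Returns true, with the lock held, only if the count
 * reached zero; otherwise returns false and the lock is not held.
 */
bool toy_dec_and_lock(atomic_int *count, toy_spinlock *lock)
{
    int old = atomic_load(count);

    assert(old > 0);

    /* Fast path: the result stays above zero, so no lock is needed. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return false;
        /* On failure, old has been reloaded with the current value. */
    }

    /* Slow path: the count may drop to zero, so take the lock first. */
    toy_spin_lock(lock);
    if (atomic_fetch_sub(count, 1) == 1)
        return true;            /* caller releases the lock later */
    toy_spin_unlock(lock);
    return false;
}
```

If the looked-up dentries are not referenced by anything else, each dput() presumably takes the count from one to zero, so every call would go through the slow path and contend for the one global lock.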

The problem also occurs on the older RHEL 2.1 kernels, but much less
frequently.


Version-Release number of selected component (if applicable):
kernel-2.4.21-27.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
Run the attached program on a system with two CPUs. Force each task to
run on a different CPU using the taskset(1) command.    
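
The attachment is not reproduced here.  As a rough idea of the kind of load involved, the following is a minimal sketch that assumes the test program does little more than call stat() repeatedly on names in one directory.  The function stat_names and its parameters are hypothetical and are not taken from the attached source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-in for the attached test program (not its actual
 * source): call stat() on every name in one directory, `rounds` times
 * over.  Returns the number of stat() calls that succeeded.
 */
long stat_names(const char *dir, const char *const names[], size_t n,
                long rounds)
{
    char path[4096];
    struct stat sb;
    long ok = 0;

    assert(dir != NULL && names != NULL);

    for (long r = 0; r < rounds; r++) {
        for (size_t i = 0; i < n; i++) {
            int len = snprintf(path, sizeof(path), "%s/%s", dir, names[i]);

            if (len < 0 || (size_t)len >= sizeof(path))
                continue;           /* skip names that do not fit */
            if (stat(path, &sb) == 0)   /* one pathname lookup per call */
                ok++;
        }
    }
    return ok;
}
```

A small main() that calls stat_names() with a large round count, started twice under taskset with a different CPU mask for each copy, would approximate the steps above.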

Actual Results:  Occasionally one task will spend 100% of its time in
kernel mode for extended periods.

Expected Results:  Accumulated CPU time per task will increase at a
steady rate in direct proportion to the number of stat() calls.
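
One way to check the expected behaviour from inside a test program is to sample the system-mode CPU time around a batch of stat() calls and divide by the number of calls.  The sketch below assumes getrusage() is an acceptable source for that figure; the helper names system_cpu_seconds and sys_seconds_per_call are hypothetical.

```c
#include <assert.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Seconds of CPU time this process has spent in system (kernel) mode so far. */
double system_cpu_seconds(void)
{
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);

    assert(rc == 0);
    (void)rc;
    return (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec / 1e6;
}

/*
 * Hypothetical helper: system-mode seconds per call, given samples taken
 * before and after a batch of `calls` system calls.
 */
double sys_seconds_per_call(double before, double after, long calls)
{
    return calls > 0 ? (after - before) / (double)calls : 0.0;
}
```

If the cost per call stays roughly constant from batch to batch, CPU time is growing in proportion to the number of calls.  A sudden jump in the per-call figure would correspond to the spinning described under Actual Results.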

Additional info:
Comment 1 Kurtis Rader 2005-02-01 23:58:51 EST
Created attachment 110543 [details]
source and makefile to recreate problem
Comment 2 Kurtis Rader 2005-02-03 18:14:57 EST
Please note that the contention, while sporadic, is severe when it
does occur: a task can spin in the kernel (apparently trying to
acquire dcache_lock) for upwards of ten seconds.
Comment 3 RHEL Product and Program Management 2007-10-19 15:07:56 EDT
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.
