From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041111 Firefox/1.0

Description of problem:
A high pathname lookup rate by two tasks for objects in the same directory, on an SMP system, can result in one or more CPUs spinning in kernel mode for extended periods.

I found that running a single instance of the attached program would sporadically induce high CPU load. I could induce the problem fairly reliably by forcing each process to run on a different CPU. For example:

    taskset 01 ./testd_c ; taskset 02 ./testd_c

This will often leave one CPU idle and the other 100% in system mode. Occasionally both CPUs will be 100% busy in system mode.

A profiled kernel shows the following:

    c029fa10   6767   0.955674  direct_strncpy_from_user
    c017ee1a   9521   1.34461   .text.lock.dcache
    c017d710  10661   1.50561   dput
    c0172750  11062   1.56224   path_release
    c029fc95  12025   1.69824   .text.lock.dec_and_lock
    c0172620  13250   1.87124   permission
    c0172ac0  13842   1.95484   link_path_walk
    c0173470  39061   5.51641   path_init
    c029fc50  75117  10.6084    atomic_dec_and_lock
    c017e290  96373  13.6103    d_lookup

The atomic_dec_and_lock() is apparently from this statement at the top of the dput() function:

    if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
        return;

The problem does occur on the older RHEL 2.1 kernels, but much less frequently.

Version-Release number of selected component (if applicable):
kernel-2.4.21-27.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
1. Run the attached program on a system with two CPUs.
2. Force each task to run on a different CPU using the taskset(1) command.

Actual Results:
Occasionally one task will spend 100% of its time in kernel mode for extended periods.

Expected Results:
Accumulated CPU time per task will increase at a steady rate in direct proportion to the number of stat() calls.

Additional info:
Created attachment 110543: source and makefile to recreate problem
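For readers without access to the attachment, the workload described in the report (a high rate of stat() calls on names in one directory) can be approximated with the sketch below. This is not the attached testd_c source. The function name stat_loop, the file names f0, f1, ..., the default directory /tmp, and the iteration counts are all assumptions made for illustration.

```c
/* Hypothetical reproducer sketch; NOT the attached testd_c source. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Issue `iterations` stat() calls on names inside `dir`, cycling through
 * `nfiles` made-up file names (f0, f1, ...). The files need not exist:
 * the kernel still has to walk the directory components of the path.
 * Returns the number of stat() calls issued, or -1 on bad arguments.
 */
long stat_loop(const char *dir, int nfiles, long iterations)
{
    char path[4096];
    struct stat sb;
    long i;
    long calls = 0;

    if (dir == NULL || nfiles <= 0 || iterations < 0)
        return -1;

    for (i = 0; i < iterations; i++) {
        int n = snprintf(path, sizeof path, "%s/f%d", dir, (int)(i % nfiles));
        if (n < 0 || (size_t)n >= sizeof path)
            return -1;              /* path did not fit in the buffer */
        (void)stat(path, &sb);      /* result ignored; only the lookup matters */
        calls++;
    }
    return calls;
}

#ifdef STAT_LOOP_MAIN
/* Optional driver: build with -DSTAT_LOOP_MAIN to get a standalone program. */
int main(int argc, char **argv)
{
    const char *dir = argc > 1 ? argv[1] : "/tmp";   /* assumed default */
    long calls = stat_loop(dir, 16, 10000000L);      /* assumed counts */

    printf("%ld stat() calls issued\n", calls);
    return calls < 0;
}
#endif
```

Built with -DSTAT_LOOP_MAIN, two copies can be started in the background and pinned to different CPUs with taskset(1), as in the report. Whether this sketch triggers the same contention as the original program is untested.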
Please note that the contention, while sporadic, is severe when it does occur: a task can spin in the kernel (apparently trying to acquire dcache_lock) for upwards of ten seconds.
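To make the suspected path concrete, here is a userspace analogue of a "decrement and lock" primitive like the atomic_dec_and_lock() call quoted from dput(). It is a sketch only, not the kernel's implementation: it uses C11 atomics and a pthread mutex in place of atomic_t and a spinlock, and the name dec_and_lock is made up. The assumed contract is that the function returns true with the lock held only when the count drops to zero, so every caller that may be dropping the last reference has to serialize on the one shared lock.

```c
/* Userspace analogue for illustration; NOT the kernel's atomic_dec_and_lock(). */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Decrement *count. Return true with `lock` held if the count reached
 * zero; otherwise return false with `lock` not held.
 */
bool dec_and_lock(atomic_int *count, pthread_mutex_t *lock)
{
    int old = atomic_load(count);

    /* Fast path: while the count is above 1, drop a reference lock-free. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return false;
        /* CAS failed: `old` was reloaded with the current value; retry. */
    }

    /* Slow path: this may be the last reference, so take the shared lock. */
    pthread_mutex_lock(lock);
    if (atomic_fetch_sub(count, 1) == 1)
        return true;                /* reached zero; caller must unlock */
    pthread_mutex_unlock(lock);
    return false;
}
```

In this model, any task that reaches the slow path waits on the single lock. If other code paths also hold that lock for long stretches, waiters stall, which would be consistent with the time attributed to .text.lock.dec_and_lock in the profile. That reading is an interpretation, not something the profile proves.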
This bug is filed against RHEL 3, which is in maintenance phase. During the maintenance phase, only security errata and select mission-critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed.

For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/

If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.