From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1

Description of problem:
Under very high I/O load (with 40-odd disks connected) on a 4-processor IA-64 machine (Intel's Lion), the machine hangs after about 16-20 hours of running.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Run the I/O and memory hog test suite for about 16-20 hours; the kernel hangs.

Actual Results:
Debugging showed a deadlock condition in the kernel that occurs with 2.4.9-18 (the released RH 7.2 kernel):

Processor 1: in truncate_inode_pages(), holds mapping->page_lock and tries to get the page cache lock (pg_lock) inside truncate_list_pages().
Processor 2: in reclaim_page(), gets the page cache lock (pg_lock) and tries to get mapping->page_lock.

The problem is in truncate_inode_pages(): it tries to get pg_lock, and if it cannot get it, it does not do any recovery (i.e. release mapping->page_lock and retry).

We have applied the following patch to correct the problem and would like to know if you see any problem with it.

--- mm/filemap.c.org  Tue Apr 2 08:25:37 2002
+++ mm/filemap.c  Tue Apr 2 08:38:58 2002
@@ -366,7 +366,10 @@
     pg_lock = PAGECACHE_LOCK(page);
-    if (!spin_trylock(pg_lock)) {
+    if (!spin_trylock(pg_lock)) {
+        spin_unlock(&mapping->page_lock);
+        barrier();
+        spin_lock(&mapping->page_lock);
         return 1;
     }

Additional info:
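The lock-order inversion described above, and the back-off that the patch adds, can be illustrated outside the kernel. The following is only a user-space sketch using POSIX threads, not the kernel code: the mutexes mapping_lock and pg_lock are hypothetical stand-ins for mapping->page_lock and the page cache lock, and truncate_side(), reclaim_side() and run_demo() are names invented for this illustration. One thread takes the locks in the truncate order but only tries the second lock, releasing the first when the attempt fails; the other thread takes them in the opposite order with blocking calls.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical user-space stand-ins for the two kernel locks in the report. */
static pthread_mutex_t mapping_lock = PTHREAD_MUTEX_INITIALIZER; /* plays mapping->page_lock */
static pthread_mutex_t pg_lock = PTHREAD_MUTEX_INITIALIZER;      /* plays the page cache lock */

/* Both counters are only modified while both locks are held. */
static long truncated, reclaimed;

/* Truncate-like path: takes mapping_lock first, then only TRIES pg_lock. */
static void *truncate_side(void *arg)
{
    long rounds = *(long *)arg;
    long done = 0;

    while (done < rounds) {
        pthread_mutex_lock(&mapping_lock);
        if (pthread_mutex_trylock(&pg_lock) != 0) {
            /* Back off: release the lock we hold so the other side can
             * finish, then retry from the top. */
            pthread_mutex_unlock(&mapping_lock);
            sched_yield();
            continue;
        }
        truncated++;
        pthread_mutex_unlock(&pg_lock);
        pthread_mutex_unlock(&mapping_lock);
        done++;
    }
    return NULL;
}

/* Reclaim-like path: takes the same locks in the opposite order, blocking. */
static void *reclaim_side(void *arg)
{
    long rounds = *(long *)arg;

    for (long i = 0; i < rounds; i++) {
        pthread_mutex_lock(&pg_lock);
        pthread_mutex_lock(&mapping_lock);
        reclaimed++;
        pthread_mutex_unlock(&mapping_lock);
        pthread_mutex_unlock(&pg_lock);
    }
    return NULL;
}

/* Runs both paths concurrently; returns the total number of completed
 * rounds, or -1 if a thread could not be created. */
long run_demo(long rounds)
{
    pthread_t a, b;

    truncated = reclaimed = 0;
    if (pthread_create(&a, NULL, truncate_side, &rounds) != 0)
        return -1;
    if (pthread_create(&b, NULL, reclaim_side, &rounds) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return truncated + reclaimed;
}
```

Calling run_demo(1000) from a main() and building with -pthread should return once both threads have completed their rounds. If the back-off branch is changed to keep mapping_lock held while retrying the trylock, the two threads can end up waiting on each other indefinitely, which is the same shape as the hang reported here.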
The patch is correct and will be included in the next errata kernel for 7.x/ia64. Note that we recommend using at least version -31 of the kernel because of the security problems found in -18 (and a few other minor bugs that have since been fixed).
I did not see this Bugzilla number in the errata kernel "fixed" list at http://rhn.redhat.com/errata/RHBA-2002-104.html. Please confirm that this is fixed in the Itanium errata kernel posted on 6/4/02.
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/