Bug 461871

Summary:	processes holds mmap_sem
Product:	Red Hat Enterprise Linux 4	Reporter:	Wade Mealing <wmealing>
Component:	kernel	Assignee:	Josef Bacik <jbacik>
Status:	CLOSED DUPLICATE	QA Contact:	Martin Jenner <mjenner>
Severity:	urgent	Docs Contact:
Priority:	urgent
Version:	4.9	CC:	esandeen, tao, vgoyal
Target Milestone:	rc
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Last Closed:	2008-09-12 01:28:00 UTC	Type:	---

Description Wade Mealing 2008-09-11 02:11:22 UTC

Description of problem:

One thread of a multi-threaded process writing to an ext3 file system takes a semaphore (i_sem) and then blocks in dio_refill_pages() waiting for the process's mmap_sem.

Another thread takes a page fault, acquires mmap_sem for read, and ends up sleeping indefinitely in start_this_handle().

A third thread doing a sys_mmap or sys_munmap is blocked waiting for write access to the process's mmap_sem, which completes the deadlock.

The nesting order of paired locks must always be preserved, or deadlock becomes possible.  For i_sem and mmap_sem, i_sem must always be taken first.
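
To make the ordering rule concrete, here is a minimal userspace sketch in Python. It is not kernel code: LOCK_RANK, acquire(), release(), ordered_path() and inverted_path() are hypothetical names made up for illustration, and the locks are plain threading.Lock objects that only borrow the names i_sem and mmap_sem. The sketch refuses any acquisition that goes against the fixed rank order instead of letting it block.

```python
import threading

# Hypothetical ranks for illustration: a lower rank must be taken first.
# The names mirror the report (i_sem before mmap_sem); these are plain
# Python locks, not kernel semaphores.
LOCK_RANK = {"i_sem": 0, "mmap_sem": 1}
LOCKS = {name: threading.Lock() for name in LOCK_RANK}

# Per-thread record of the locks currently held, innermost last.
_held = threading.local()


def acquire(name):
    """Take a lock, refusing any acquisition that violates the rank order."""
    held = getattr(_held, "names", [])
    if held and LOCK_RANK[name] <= LOCK_RANK[held[-1]]:
        raise RuntimeError(f"lock order violation: {name} after {held[-1]}")
    LOCKS[name].acquire()
    _held.names = held + [name]


def release(name):
    """Release the innermost held lock."""
    held = getattr(_held, "names", [])
    if not held or held[-1] != name:
        raise RuntimeError(f"{name} is not the innermost held lock")
    LOCKS[name].release()
    _held.names = held[:-1]


def ordered_path():
    """Correct nesting: i_sem first, then mmap_sem."""
    acquire("i_sem")
    acquire("mmap_sem")
    release("mmap_sem")
    release("i_sem")
    return "ok"


def inverted_path():
    """Inverted nesting: the checker rejects it before it can block."""
    acquire("mmap_sem")
    try:
        acquire("i_sem")
    except RuntimeError as exc:
        return str(exc)
    finally:
        release("mmap_sem")
```

Calling ordered_path() succeeds, while inverted_path() returns the violation message rather than deadlocking. Rejecting the bad order up front is a design choice for the sketch only; it says nothing about how the kernel itself enforces ordering.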

My explanation is probably not the best, so below is an example trace of a process in the stuck state.

===========================
      Thread 1 [pid: 31776]
      --------
      sys_io_submit
      io_submit_one
      aio_run_iocb
      aio_pwrite
      ext3_file_write
      generic_file_aio_write             [SUCCESS: down(i_sem);]
      generic_file_aio_write_nolock
      __generic_file_aio_write_nolock
      generic_file_direct_write
      generic_file_direct_IO
      ext3_direct_IO
      __blockdev_direct_IO
      dio_get_page
      dio_refill_pages
      down_read(mmap_sem)                [BLOCKED]

      Thread 2 [pid: 11951]
      --------
      do_page_fault
      handle_mm_fault                    [SUCCESS: down_read(mmap_sem);]
      do_no_page
      alloc_page_vma
      __alloc_pages
      try_to_free_pages
      shrink_caches
      shrink_zone
      shrink_cache
      shrink_list
      pageout
      ext3_ordered_writepage
      ext3_journal_start
      ext3_journal_start_sb
      journal_start
      start_this_handle                  [wait for j_wait_transaction_locked]
      schedule

      Thread 3 [pid: 11885]
      --------
      sys_mmap
      __down_write(mmap_sem)             [BLOCKED]

      Thread 4 [pid: 31790]
      --------
      sys_fdatasync
      ext3_sync_file
      sync_inode
      __writeback_single_inode
      ext3_force_commit
      journal_force_commit
      journal_stop
      log_wait_commit                    [wait for j_wait_done_commit]
      schedule

      Thread 5 [pid: 2783]
      --------
      kjournald
      journal_commit_transaction         [wait for j_wait_updates]
      schedule
      ===========================
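
The trace above can be read as a wait-for graph. The following Python sketch encodes one possible reading of it and walks the edges to find the cycle. Two edges are assumptions that the trace itself does not show: that Thread 1's down_read() is queued behind Thread 3's pending down_write(), and that Thread 1 still holds an open journal handle that kjournald is waiting on (guessed from the title of the upstream commit referenced in this report). WAITS_FOR and find_cycle() are names invented for this illustration.

```python
# Wait-for edges taken from the trace: waiter -> (thread waited on, reason).
# Edges marked "assumed" are interpretation, not something the trace shows.
WAITS_FOR = {
    "T1 aio_pwrite": (
        "T3 sys_mmap",
        "down_read(mmap_sem) queued behind a waiting writer (assumed)",
    ),
    "T3 sys_mmap": (
        "T2 page fault",
        "down_write(mmap_sem) while T2 holds mmap_sem for read",
    ),
    "T2 page fault": (
        "T5 kjournald",
        "start_this_handle waits on j_wait_transaction_locked",
    ),
    "T4 sys_fdatasync": (
        "T5 kjournald",
        "log_wait_commit waits on j_wait_done_commit",
    ),
    "T5 kjournald": (
        "T1 aio_pwrite",
        "j_wait_updates; T1 is assumed to hold an open handle",
    ),
}


def find_cycle(graph, start):
    """Follow wait-for edges from start and return the cycle reached, if any."""
    path = []
    node = start
    while node in graph:
        if node in path:
            # Everything from the first visit of this node onward is the cycle.
            return path[path.index(node):]
        path.append(node)
        node = graph[node][0]
    return []


if __name__ == "__main__":
    for waiter in find_cycle(WAITS_FOR, "T1 aio_pwrite"):
        target, reason = WAITS_FOR[waiter]
        print(f"{waiter} -> {target}: {reason}")
```

Under these assumptions Threads 1, 3, 2 and 5 form the cycle, and Thread 4 is a bystander stuck behind kjournald. If either assumed edge is wrong, the graph has no cycle and this reading of the trace does not hold.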

Version-Release number of selected component (if applicable):

kernel-2.6.9-67.0.20.EL

How reproducible:

Very, very rare.

Steps to Reproduce:
1. Start a multi-threaded process that writes to disk using direct IO.
2. Have one thread write to disk while another reads from /proc/<pid>/cmdline.
3. Wait about 30,000 hours; it should happen a few times.
  
Actual results:

The process blocks on a read from the file in /proc.

Additional info:

The patch to be attached is an attempted backport of the 2.6.25 change "ext3: fix lock inversion in direct IO" (commit bd1939de9061dbc5cac44ffb4425aaf4c9b894f1).

I don't have the machine hours to test this the way our customer has, but they believe this modification solves the issue at hand.

Comment 5 Josef Bacik 2008-09-12 01:22:52 UTC

hrm, i could have sworn i fixed this already, let me dig up the bz.

Comment 6 Josef Bacik 2008-09-12 01:28:00 UTC


*** This bug has been marked as a duplicate of bug 381221 ***