Bug 461871

Summary: processes holds mmap_sem
Product: Red Hat Enterprise Linux 4 Reporter: Wade Mealing <wmealing>
Component: kernel Assignee: Josef Bacik <jbacik>
Status: CLOSED DUPLICATE QA Contact: Martin Jenner <mjenner>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 4.9 CC: esandeen, tao, vgoyal
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-09-12 01:28:00 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Wade Mealing 2008-09-11 02:11:22 UTC
Description of problem:

One thread of a multi-threaded process writing to an ext3 file system takes the inode semaphore (i_sem) and then sits blocked in dio_refill_pages(), waiting to take the process's mmap_sem for read.

Another thread takes a page fault, acquires mmap_sem for read, and ends up sleeping indefinitely in start_this_handle().

A third thread is in sys_mmap or sys_munmap, blocked waiting for write access to the process's mmap_sem, which completes the deadlock.

The nesting order of paired locks must always be preserved, or deadlock becomes possible.  For i_sem and mmap_sem, i_sem must always be taken first.
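The ordering rule can be illustrated with a very small order checker, loosely in the spirit of a lock validator. This is only a sketch: `struct order_checker`, `checker_acquire()` and `checker_release()` are hypothetical names made up for this example and are not kernel interfaces. It records which lock was held when another was taken and flags any later acquisition that reverses a recorded pair.

```c
/* Hypothetical illustration only; none of these names exist in the kernel. */
enum lock_id { I_SEM, MMAP_SEM, NLOCKS };

struct order_checker {
    int held[NLOCKS];            /* 1 while the lock is held */
    int before[NLOCKS][NLOCKS];  /* before[a][b]: a was held when b was taken */
};

/*
 * Record that lock l is being taken.
 * Returns 0 if the acquisition is consistent with every order seen so far,
 * -1 if it reverses a previously recorded order.
 */
int checker_acquire(struct order_checker *c, enum lock_id l)
{
    int result = 0;

    for (int h = 0; h < NLOCKS; h++) {
        if (!c->held[h] || h == (int)l)
            continue;
        if (c->before[l][h])     /* earlier: l then h; now: h then l */
            result = -1;
        c->before[h][l] = 1;     /* remember "h held while taking l" */
    }
    c->held[l] = 1;
    return result;
}

void checker_release(struct order_checker *c, enum lock_id l)
{
    c->held[l] = 0;
}
```

With a zero-initialised checker, taking I_SEM and then MMAP_SEM is accepted. If a later path takes MMAP_SEM first and then I_SEM, the second call returns -1, which is the kind of inversion described above.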

My explanation is probably not the best, so here is an example trace of a process in the stuck state.

===========================
      Thread 1 [pid: 31776]
      --------
      sys_io_submit
      io_submit_one
      aio_run_iocb
      aio_pwrite
      ext3_file_write
      generic_file_aio_write             [SUCCESS: down(i_sem);]
      generic_file_aio_write_nolock
      __generic_file_aio_write_nolock
      generic_file_direct_write
      generic_file_direct_IO
      ext3_direct_IO
      __blockdev_direct_IO
      dio_get_page
      dio_refill_pages
      down_read(mmap_sem)                [BLOCKED]

      Thread 2 [pid: 11951]
      --------
      do_page_fault
      handle_mm_fault                    [SUCCESS: down_read(mmap_sem);]
      do_no_page
      alloc_page_vma
      __alloc_pages
      try_to_free_pages
      shrink_caches
      shrink_zone
      shrink_cache
      shrink_list
      pageout
      ext3_ordered_writepage
      ext3_journal_start
      ext3_journal_start_sb
      journal_start
      start_this_handle                  [wait for j_wait_transaction_locked]
      schedule

      Thread 3 [pid: 11885]
      --------
      sys_mmap
      __down_write(mmap_sem)             [BLOCKED]

      Thread 4 [pid: 31790]
      --------
      sys_fdatasync
      ext3_sync_file
      sync_inode
      __writeback_single_inode
      ext3_force_commit
      journal_force_commit
      journal_stop
      log_wait_commit                    [wait for j_wait_done_commit]
      schedule

      Thread 5 [pid: 2783]
      --------
      kjournald
      journal_commit_transaction         [wait for j_wait_updates]
      schedule
      ===========================
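One way to read the trace is as a wait-for graph between the five threads. The sketch below encodes that reading and checks it for a cycle with a depth-first search. Two edges are interpretation rather than something the trace states directly: that Thread 1's down_read queues behind Thread 3's pending down_write, and (marked ASSUMPTION in the code) that kjournald is waiting for a journal handle still held by Thread 1. All function names here are hypothetical.

```c
enum { T1, T2, T3, T4, T5, NTHREADS };

/* waits_on[a][b] != 0 means thread a cannot make progress until b does. */
static int dfs(int waits_on[NTHREADS][NTHREADS], int node, int *state)
{
    state[node] = 1;                       /* 1 = on the current DFS path */
    for (int next = 0; next < NTHREADS; next++) {
        if (!waits_on[node][next])
            continue;
        if (state[next] == 1)              /* back edge: cycle found */
            return 1;
        if (state[next] == 0 && dfs(waits_on, next, state))
            return 1;
    }
    state[node] = 2;                       /* 2 = fully explored */
    return 0;
}

/* Returns 1 if the wait-for graph contains a cycle, 0 otherwise. */
int has_deadlock(int waits_on[NTHREADS][NTHREADS])
{
    int state[NTHREADS] = {0};

    for (int start = 0; start < NTHREADS; start++)
        if (state[start] == 0 && dfs(waits_on, start, state))
            return 1;
    return 0;
}

/* Fill in the edges as read from the trace above. */
void build_trace_graph(int g[NTHREADS][NTHREADS])
{
    for (int a = 0; a < NTHREADS; a++)
        for (int b = 0; b < NTHREADS; b++)
            g[a][b] = 0;

    g[T1][T3] = 1;  /* down_read(mmap_sem) queued behind the pending writer */
    g[T3][T2] = 1;  /* down_write(mmap_sem) waits for the reader in the fault */
    g[T2][T5] = 1;  /* start_this_handle waits for the commit to proceed */
    g[T4][T5] = 1;  /* log_wait_commit waits for kjournald */
    g[T5][T1] = 1;  /* ASSUMPTION: j_wait_updates waits on a handle held by T1 */
}
```

Under these assumptions the graph contains the cycle T1 -> T3 -> T2 -> T5 -> T1, and clearing the assumed T5 -> T1 edge removes it. Thread 4 is stuck as a consequence of the cycle but is not part of it.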

Version-Release number of selected component (if applicable):

kernel-2.6.9-67.0.20.EL

How reproducible:

Very, very rare.

Steps to Reproduce:
1. Start a multi-threaded process that writes to disk using direct IO.
2. Have one thread write to disk while another reads from /proc/<pid>/cmdline.
3. Wait about 30,000 hours; it should happen a few times.
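The steps above could be sketched in user space roughly as follows. This is not the customer's workload: `struct repro_args` and `run_repro()` are hypothetical names, /proc/self/cmdline stands in for /proc/<pid>/cmdline, and the target path is whatever the caller supplies (it would need to be on ext3 to be relevant). Given how rare the hang is, a short run of this is not expected to trigger it.

```c
#define _GNU_SOURCE   /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0    /* fallback: buffered IO, which skips the dio path */
#endif

enum { BLOCK = 4096 };  /* assumed alignment and transfer size */

struct repro_args {
    const char *path;   /* file to write; caller's choice */
    int iterations;     /* loop count for each thread */
    int writes_ok;      /* successful direct writes (writer thread only) */
    int reads_ok;       /* successful /proc reads (reader thread only) */
};

/* Step 1: direct IO writes from an aligned user buffer. */
static void *writer(void *p)
{
    struct repro_args *a = p;
    void *buf = NULL;

    if (posix_memalign(&buf, BLOCK, BLOCK) != 0)
        return NULL;
    memset(buf, 'x', BLOCK);

    int fd = open(a->path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
    if (fd >= 0) {
        for (int i = 0; i < a->iterations; i++)
            if (pwrite(fd, buf, BLOCK, 0) == BLOCK)
                a->writes_ok++;
        close(fd);
    }
    free(buf);
    return NULL;
}

/* Step 2: repeatedly read the process command line from /proc. */
static void *reader(void *p)
{
    struct repro_args *a = p;
    char buf[256];

    for (int i = 0; i < a->iterations; i++) {
        int fd = open("/proc/self/cmdline", O_RDONLY);
        if (fd < 0)
            continue;
        if (read(fd, buf, sizeof buf) > 0)
            a->reads_ok++;
        close(fd);
    }
    return NULL;
}

/* Runs both threads to completion. Returns 0 on success, -1 on failure. */
int run_repro(struct repro_args *a)
{
    pthread_t w, r;

    if (pthread_create(&w, NULL, writer, a) != 0)
        return -1;
    if (pthread_create(&r, NULL, reader, a) != 0) {
        pthread_join(w, NULL);
        return -1;
    }
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return 0;
}
```

A caller would fill in a `struct repro_args` with a path on the file system under test and a large iteration count, then call `run_repro()` in a loop. Some file systems reject O_DIRECT, in which case the writer performs no writes and `writes_ok` stays at zero.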
  
Actual results:

The process blocks on a read from a file in /proc.

Additional info:

The patch to be attached is an attempted backport of the 2.6.25 change "ext3: fix lock inversion in direct IO" (commit bd1939de9061dbc5cac44ffb4425aaf4c9b894f1).

I don't have the machine hours to test this in the same way that our customer has, although they believe that this modification solves the issue at hand.

Comment 5 Josef Bacik 2008-09-12 01:22:52 UTC
hrm, i could have sworn i fixed this already, let me dig up the bz.

Comment 6 Josef Bacik 2008-09-12 01:28:00 UTC

*** This bug has been marked as a duplicate of bug 381221 ***