Bug 721044 - jbd2: Improve scalability by not taking j_state_lock in jbd2_journal_stop() fix missing from RHEL6 kernel.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Larry Woodman
QA Contact: Eryu Guan
URL:
Whiteboard:
Depends On: 741979
Blocks:
 
Reported: 2011-07-13 15:18 UTC by Larry Woodman
Modified: 2011-12-06 13:49 UTC (History)

Fixed In Version: kernel-2.6.32-183.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 13:49:48 UTC
Target Upstream Version:




Links
System: Red Hat Product Errata
ID: RHSA-2011:1530
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise Linux 6 kernel security, bug fix and enhancement update
Last Updated: 2011-12-06 01:45:35 UTC

Description Larry Woodman 2011-07-13 15:18:14 UTC
Description of problem:

The RHEL6.2 kernel is missing upstream commit c35a56a090eacefca07afeb994029b57d8dd8025:


commit c35a56a090eacefca07afeb994029b57d8dd8025
Author: Theodore Ts'o <tytso>
Date:   Sun May 16 05:00:00 2010 -0400

    jbd2: Improve scalability by not taking j_state_lock in jbd2_journal_stop()
    
    One of the most contended locks in the jbd2 layer is j_state_lock when
    running dbench.  This is especially true if using the real-time kernel
    with its "sleeping spinlocks" patch that replaces spinlocks with
    priority inheriting mutexes --- but it also shows up on large SMP
    benchmarks.

Version-Release number of selected component (if applicable):

6.2

How reproducible:

Bug 713953 ("AIM7 runs 2X faster on 2.6.32.y than RHEL6.1") includes an NMI
traceback indicating that this patch is missing from RHEL6:

NMI stack dump cpu 292:
Pid: 284673, comm: multitask Not tainted 2.6.32-131.0.15.el6.x86_64 #1
Call Trace:
 <NMI>  [<ffffffff81030743>] ? uv_handle_nmi+0x53/0x70
 [<ffffffff814e0cf5>] ? notifier_call_chain+0x55/0x80
 [<ffffffff814e0d5a>] ? atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff810940fe>] ? notify_die+0x2e/0x30
 [<ffffffff814de9bf>] ? do_nmi+0x1cf/0x2b0
 [<ffffffff814de270>] ? nmi+0x20/0x30
 [<ffffffff814ddae1>] ? _spin_lock+0x21/0x30
 <<EOE>>  [<ffffffffa00c1970>] ? jbd2_journal_stop+0x160/0x2e0 [jbd2]
 [<ffffffffa00f3913>] ? ext4_mark_inode_dirty+0x83/0x1d0 [ext4]
 [<ffffffffa010b7d8>] ? __ext4_journal_stop+0x68/0xa0 [ext4]
 [<ffffffffa00f3bdf>] ? ext4_dirty_inode+0x4f/0x60 [ext4]
 [<ffffffff8119b5ab>] ? __mark_inode_dirty+0x3b/0x160
 [<ffffffff8118be92>] ? file_update_time+0xf2/0x170
 [<ffffffff8110f5f0>] ? __generic_file_aio_write+0x220/0x480
 [<ffffffff8110f8bf>] ? generic_file_aio_write+0x6f/0xe0
 [<ffffffffa00ed2a1>] ? ext4_file_write+0x61/0x1e0 [ext4]
 [<ffffffff81050be3>] ? enqueue_task_fair+0x43/0x90
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140
 [<ffffffff810573ce>] ? activate_task+0x2e/0x40
 [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8117efb5>] ? putname+0x35/0x50
 [<ffffffff812051a6>] ? security_file_permission+0x16/0x20
 [<ffffffff81172718>] ? vfs_write+0xb8/0x1a0
 [<ffffffff810d1b62>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff81173151>] ? sys_write+0x51/0x90
 [<ffffffff8100b172>] ? system_call_fastpath+0x16/0x1b


Steps to Reproduce:
1.
2.
3.
  
Actual results:

AIM7 runs at half the expected speed on RHEL6.

Expected results:

2x speedup of AIM7

Additional info:

Comment 1 Larry Woodman 2011-07-13 15:20:39 UTC
This upstream patch seems to be missing from RHEL6:

commit c35a56a090eacefca07afeb994029b57d8dd8025
Author: Theodore Ts'o <tytso>
Date:   Sun May 16 05:00:00 2010 -0400

    jbd2: Improve scalability by not taking j_state_lock in jbd2_journal_stop()

    One of the most contended locks in the jbd2 layer is j_state_lock when
    running dbench.  This is especially true if using the real-time kernel
    with its "sleeping spinlocks" patch that replaces spinlocks with
    priority inheriting mutexes --- but it also shows up on large SMP
    benchmarks.

    Thanks to John Stultz for pointing this out.

    Reviewed by Mingming Cao and Jan Kara.

    Signed-off-by: "Theodore Ts'o" <tytso>

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index bfc70f5..e214d68 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1311,7 +1311,6 @@ int jbd2_journal_stop(handle_t *handle)
        if (handle->h_sync)
                transaction->t_synchronous_commit = 1;
        current->journal_info = NULL;
-       spin_lock(&journal->j_state_lock);
        spin_lock(&transaction->t_handle_lock);
        transaction->t_outstanding_credits -= handle->h_buffer_credits;
        transaction->t_updates--;
@@ -1340,8 +1339,7 @@ int jbd2_journal_stop(handle_t *handle)
                jbd_debug(2, "transaction too old, requesting commit for "
                                        "handle %p\n", handle);
                /* This is non-blocking */
-               __jbd2_log_start_commit(journal, transaction->t_tid);
-               spin_unlock(&journal->j_state_lock);
+               jbd2_log_start_commit(journal, transaction->t_tid);

                /*
                 * Special case: JBD2_SYNC synchronous updates require us
@@ -1351,7 +1349,6 @@ int jbd2_journal_stop(handle_t *handle)
                        err = jbd2_log_wait_commit(journal, tid);
        } else {
                spin_unlock(&transaction->t_handle_lock);
-               spin_unlock(&journal->j_state_lock);
        }

        lock_map_release(&handle->h_lockdep_map);
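
The locking change in the diff above can be illustrated with a small userspace sketch. This is not the kernel code: the struct names, the atomic_flag spinlock, and the simplified "already requested" check are all assumptions made for illustration. It only shows the shape of the patch, where the common path takes the per-transaction lock alone, and the journal-wide lock is acquired inside a wrapper (the role jbd2_log_start_commit() plays around __jbd2_log_start_commit()) only when a commit actually has to be requested.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for journal_t and transaction_t, not the kernel structs. */
struct journal_sketch {
    atomic_flag state_lock;       /* plays the role of j_state_lock */
    unsigned int commit_request;  /* newest tid for which a commit was requested */
};

struct transaction_sketch {
    atomic_flag handle_lock;      /* plays the role of t_handle_lock */
    unsigned int tid;
    int updates;                  /* plays the role of t_updates */
};

/* Minimal test-and-set spinlock standing in for spin_lock()/spin_unlock(). */
static void lock_sketch(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ; /* spin until the current holder clears the flag */
}

static void unlock_sketch(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Caller must already hold j->state_lock (the role of __jbd2_log_start_commit). */
static bool start_commit_locked(struct journal_sketch *j, unsigned int tid)
{
    if (j->commit_request == tid)
        return false;             /* a commit for this tid was already requested */
    j->commit_request = tid;
    return true;
}

/* Takes and releases the journal-wide lock itself (the role of jbd2_log_start_commit). */
static bool start_commit(struct journal_sketch *j, unsigned int tid)
{
    bool started;

    lock_sketch(&j->state_lock);
    started = start_commit_locked(j, tid);
    unlock_sketch(&j->state_lock);
    return started;
}

/*
 * Patched shape of the stop path: the update count is adjusted under the
 * per-transaction lock only, and the journal-wide lock is touched solely
 * on the (rarer) branch that requests a commit.
 */
static bool journal_stop_sketch(struct journal_sketch *j,
                                struct transaction_sketch *t,
                                bool need_commit)
{
    assert(j != NULL && t != NULL);

    lock_sketch(&t->handle_lock);
    t->updates--;
    unlock_sketch(&t->handle_lock);

    return need_commit ? start_commit(j, t->tid) : false;
}
```

In this sketch, callers that pass need_commit as false never touch state_lock at all, which is the effect the patch aims for on the hot path seen in the NMI traceback.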

Comment 4 Kyle McMartin 2011-08-09 12:18:47 UTC
Patch(es) available on kernel-2.6.32-183.el6

Comment 8 errata-xmlrpc 2011-12-06 13:49:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html

