Bug 1309838

Summary: GFS2 lockdep deadlock warning (sd_log_flush_lock vs gl->gl_work.work)
Product: [Fedora] Fedora
Reporter: Andrew Price <anprice>
Component: kernel
Assignee: gfs2-maint
Status: CLOSED WORKSFORME
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab, rpeterso
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2019-10-16 16:21:46 UTC
Type: Bug

Description Andrew Price 2016-02-18 19:46:01 UTC
Description of problem:

Lockdep splat (possible irq lock inversion between sd_log_flush_lock and gl->gl_work.work) when creating a large number of files in one directory on a single-node, 5G lock_nolock gfs2 filesystem with 512-byte blocks.

Version-Release number of selected component (if applicable):

4.5.0-0.rc4.git1.1.fc24.x86_64

How reproducible:

Not sure yet.

Steps to Reproduce:
1. mkfs.gfs2 -b 512 -p lock_nolock /dev/vda
2. mount /dev/vda /mnt/test
3. mkdir /mnt/test/foo
4. seq -w 1000000000 | while read i; do touch /mnt/test/foo/file$i; done
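
A minimal sketch of steps 3 and 4 as reusable shell helpers, assuming a POSIX shell with coreutils `seq`. The function names `create_files` and `has_lockdep_splat` are hypothetical and not part of the original report; the file count is a parameter so a run can be bounded.

```shell
# Hypothetical helpers; the names create_files and has_lockdep_splat are
# illustrative and do not come from the original report.

# Create COUNT empty files named file<N> in DIR, zero-padded as with `seq -w`.
create_files() {
    dir=$1
    count=$2
    mkdir -p "$dir" || return 1
    seq -w "$count" | while read -r i; do
        touch "$dir/file$i"
    done
}

# Read kernel log text on stdin; succeed if it contains the header line of
# the lockdep warning quoted under "Additional info".
has_lockdep_splat() {
    grep -q 'possible irq lock inversion dependency detected'
}
```

After steps 1 and 2, this could be used as `create_files /mnt/test/foo 1000000` followed by `dmesg | has_lockdep_splat && echo "lockdep splat seen"`. The count of 1000000 is an arbitrary bound chosen for illustration.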

Actual results:

Lockdep output below.

Expected results:

Nothing untoward.

Additional info:

[  373.800997] =========================================================
[  373.802133] [ INFO: possible irq lock inversion dependency detected ]
[  373.803463] 4.5.0-0.rc4.git1.1.fc24.x86_64 #1 Tainted: G        W      
[  373.804872] ---------------------------------------------------------
[  373.805815] touch/17847 just changed the state of lock:
[  373.806583]  (&sdp->sd_log_flush_lock){++++.+}, at: [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.808102] but this lock was taken by another, RECLAIM_FS-safe lock in the past:
[  373.809204]  ((&(&gl->gl_work)->work)){+.+.-.}

and interrupts could create inverse lock ordering between them.

[  373.810781] 
[  373.810781] other info that might help us debug this:
[  373.811737]  Possible interrupt unsafe locking scenario:
[  373.811737] 
[  373.812524]        CPU0                    CPU1
[  373.813050]        ----                    ----
[  373.813570]   lock(&sdp->sd_log_flush_lock);
[  373.814093]                                local_irq_disable();
[  373.814780]                                lock((&(&gl->gl_work)->work));
[  373.815581]                                lock(&sdp->sd_log_flush_lock);
[  373.816392]   <Interrupt>
[  373.816696]     lock((&(&gl->gl_work)->work));
[  373.817237] 
[  373.817237]  *** DEADLOCK ***
[  373.817237] 
[  373.817920] 4 locks held by touch/17847:
[  373.818367]  #0:  (sb_writers#10){.+.+.+}, at: [<ffffffff81292e74>] __sb_start_write+0xb4/0xf0
[  373.819434]  #1:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffff812a040f>] path_openat+0xe7f/0x1f30
[  373.820617]  #2:  (sb_internal#2){.+.+.+}, at: [<ffffffff81292e38>] __sb_start_write+0x78/0xf0
[  373.821676]  #3:  (&sdp->sd_log_flush_lock){++++.+}, at: [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.822929] 
[  373.822929] the shortest dependencies between 2nd lock and 1st lock:
[  373.823839]  -> ((&(&gl->gl_work)->work)){+.+.-.} ops: 103990 {
[  373.824577]     HARDIRQ-ON-W at:
[  373.824971]                       [<ffffffff8110d8c8>] __lock_acquire+0x8f8/0x17e0
[  373.825851]                       [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.826688]                       [<ffffffff810cefa9>] process_one_work+0x219/0x690
[  373.827585]                       [<ffffffff810cf46e>] worker_thread+0x4e/0x490
[  373.828446]                       [<ffffffff810d6891>] kthread+0x101/0x120
[  373.829248]                       [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.830091]     SOFTIRQ-ON-W at:
[  373.830476]                       [<ffffffff8110d8f5>] __lock_acquire+0x925/0x17e0
[  373.831363]                       [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.832216]                       [<ffffffff810cefa9>] process_one_work+0x219/0x690
[  373.833106]                       [<ffffffff810cf46e>] worker_thread+0x4e/0x490
[  373.833958]                       [<ffffffff810d6891>] kthread+0x101/0x120
[  373.834764]                       [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.835599]     IN-RECLAIM_FS-W at:
[  373.836028]                          [<ffffffff8110d917>] __lock_acquire+0x947/0x17e0
[  373.836933]                          [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.837798]                          [<ffffffff810cdb2e>] flush_work+0x4e/0x320
[  373.838644]                          [<ffffffff810cde4f>] flush_delayed_work+0x4f/0x60
[  373.839571]                          [<ffffffffa0227a30>] gfs2_evict_inode+0xa0/0x430 [gfs2]
[  373.840558]                          [<ffffffff812b0308>] evict+0xb8/0x180
[  373.841356]                          [<ffffffff812b0414>] dispose_list+0x44/0x70
[  373.842216]                          [<ffffffff812b164a>] prune_icache_sb+0x5a/0x80
[  373.843125]                          [<ffffffff812937be>] super_cache_scan+0x14e/0x1a0
[  373.844051]                          [<ffffffff81205bf6>] shrink_slab.part.42+0x216/0x540
[  373.845004]                          [<ffffffff8120b255>] shrink_zone+0x2f5/0x300
[  373.845887]                          [<ffffffff8120c744>] kswapd+0x564/0xb60
[  373.846706]                          [<ffffffff810d6891>] kthread+0x101/0x120
[  373.847552]                          [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.848474]     INITIAL USE at:
[  373.848858]                      [<ffffffff8110d59d>] __lock_acquire+0x5cd/0x17e0
[  373.849734]                      [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.850561]                      [<ffffffff810cefa9>] process_one_work+0x219/0x690
[  373.851448]                      [<ffffffff810cf46e>] worker_thread+0x4e/0x490
[  373.852332]                      [<ffffffff810d6891>] kthread+0x101/0x120
[  373.853127]                      [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.853959]   }
[  373.854168]   ... key      at: [<ffffffffa023ced8>] __key.40559+0x0/0xffffffffffff2128 [gfs2]
[  373.855164]   ... acquired at:
[  373.855518]    [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.856545]    [<ffffffff818b58fa>] down_write+0x5a/0xc0
[  373.857339]    [<ffffffffa020f77e>] gfs2_log_flush+0x4e/0x8d0 [gfs2]
[  373.858108]    [<ffffffffa020d75c>] inode_go_sync+0x8c/0x130 [gfs2]
[  373.858853]    [<ffffffffa020b946>] do_xmote+0x176/0x340 [gfs2]
[  373.859552]    [<ffffffffa020bc00>] run_queue+0xf0/0x390 [gfs2]
[  373.860262]    [<ffffffffa020bef9>] glock_work_func+0x59/0x120 [gfs2]
[  373.861021]    [<ffffffff810cefe2>] process_one_work+0x252/0x690
[  373.861733]    [<ffffffff810cf46e>] worker_thread+0x4e/0x490
[  373.862396]    [<ffffffff810d6891>] kthread+0x101/0x120
[  373.863014]    [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.863661] 
[  373.863856] -> (&sdp->sd_log_flush_lock){++++.+} ops: 297943 {
[  373.864584]    HARDIRQ-ON-W at:
[  373.864973]                     [<ffffffff8110d8c8>] __lock_acquire+0x8f8/0x17e0
[  373.865832]                     [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.866646]                     [<ffffffff818b58fa>] down_write+0x5a/0xc0
[  373.867434]                     [<ffffffffa020f77e>] gfs2_log_flush+0x4e/0x8d0 [gfs2]
[  373.868862]                     [<ffffffffa0210242>] gfs2_logd+0x242/0x2a0 [gfs2]
[  373.869820]                     [<ffffffff810d6891>] kthread+0x101/0x120
[  373.870600]                     [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.871432]    HARDIRQ-ON-R at:
[  373.871821]                     [<ffffffff8110d526>] __lock_acquire+0x556/0x17e0
[  373.872667]                     [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.873486]                     [<ffffffff818b5851>] down_read+0x51/0xa0
[  373.874272]                     [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.875217]                     [<ffffffffa022b05c>] gfs2_trans_begin+0xcc/0x140 [gfs2]
[  373.876157]                     [<ffffffffa021b1ac>] gfs2_create_inode+0x76c/0xfa0 [gfs2]
[  373.877116]                     [<ffffffffa021ba67>] gfs2_mkdir+0x47/0x50 [gfs2]
[  373.877989]                     [<ffffffff8129c025>] vfs_mkdir+0xc5/0x150
[  373.878792]                     [<ffffffff812a317a>] SyS_mkdir+0x7a/0xf0
[  373.879564]                     [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.880498]    SOFTIRQ-ON-W at:
[  373.880878]                     [<ffffffff8110d8f5>] __lock_acquire+0x925/0x17e0
[  373.881744]                     [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.882557]                     [<ffffffff818b58fa>] down_write+0x5a/0xc0
[  373.883351]                     [<ffffffffa020f77e>] gfs2_log_flush+0x4e/0x8d0 [gfs2]
[  373.884261]                     [<ffffffffa0210242>] gfs2_logd+0x242/0x2a0 [gfs2]
[  373.885130]                     [<ffffffff810d6891>] kthread+0x101/0x120
[  373.885916]                     [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.886737]    SOFTIRQ-ON-R at:
[  373.887113]                     [<ffffffff8110d8f5>] __lock_acquire+0x925/0x17e0
[  373.887972]                     [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.888789]                     [<ffffffff818b5851>] down_read+0x51/0xa0
[  373.889561]                     [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.890516]                     [<ffffffffa022b05c>] gfs2_trans_begin+0xcc/0x140 [gfs2]
[  373.891452]                     [<ffffffffa021b1ac>] gfs2_create_inode+0x76c/0xfa0 [gfs2]
[  373.892419]                     [<ffffffffa021ba67>] gfs2_mkdir+0x47/0x50 [gfs2]
[  373.893292]                     [<ffffffff8129c025>] vfs_mkdir+0xc5/0x150
[  373.894081]                     [<ffffffff812a317a>] SyS_mkdir+0x7a/0xf0
[  373.894860]                     [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.895806]    RECLAIM_FS-ON-R at:
[  373.896210]                        [<ffffffff8110ca96>] mark_held_locks+0x76/0xa0
[  373.897088]                        [<ffffffff8110fb1d>] lockdep_trace_alloc+0x7d/0xe0
[  373.897992]                        [<ffffffff811f8fe3>] __alloc_pages_nodemask+0xb3/0xe50
[  373.898938]                        [<ffffffff81252eeb>] alloc_pages_current+0x9b/0x1c0
[  373.899854]                        [<ffffffff811f3444>] __get_free_pages+0x14/0x50
[  373.901123]                        [<ffffffff810775b5>] pte_alloc_one_kernel+0x15/0x20
[  373.902240]                        [<ffffffff81228a8d>] __pte_alloc_kernel+0x1d/0x100
[  373.903157]                        [<ffffffff8123b68a>] vmap_page_range_noflush+0x2ea/0x340
[  373.904137]                        [<ffffffff8123b716>] map_vm_area+0x36/0x50
[  373.904972]                        [<ffffffff8123dcb0>] __vmalloc_node_range+0x1b0/0x270
[  373.905921]                        [<ffffffff8123ddba>] __vmalloc+0x4a/0x50
[  373.906735]                        [<ffffffffa01ffdd9>] gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.907790]                        [<ffffffffa01ffe56>] get_leaf_nr+0x16/0x30 [gfs2]
[  373.908690]                        [<ffffffffa01fffe0>] gfs2_dirent_search+0xb0/0x1b0 [gfs2]
[  373.909678]                        [<ffffffffa02018c8>] gfs2_dir_add+0x468/0x820 [gfs2]
[  373.910611]                        [<ffffffffa021b8f9>] gfs2_create_inode+0xeb9/0xfa0 [gfs2]
[  373.911595]                        [<ffffffffa021c2fd>] gfs2_atomic_open+0x6d/0xe0 [gfs2]
[  373.912789]                        [<ffffffff812a0dc2>] path_openat+0x1832/0x1f30
[  373.913653]                        [<ffffffff812a2a61>] do_filp_open+0x91/0x100
[  373.914513]                        [<ffffffff8128e530>] do_sys_open+0x130/0x220
[  373.915368]                        [<ffffffff8128e63e>] SyS_open+0x1e/0x20
[  373.916172]                        [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.917154]    INITIAL USE at:
[  373.917523]                    [<ffffffff8110d59d>] __lock_acquire+0x5cd/0x17e0
[  373.918387]                    [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.919213]                    [<ffffffff818b5851>] down_read+0x51/0xa0
[  373.919989]                    [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.920938]                    [<ffffffffa022b05c>] gfs2_trans_begin+0xcc/0x140 [gfs2]
[  373.921869]                    [<ffffffffa021b1ac>] gfs2_create_inode+0x76c/0xfa0 [gfs2]
[  373.922813]                    [<ffffffffa021ba67>] gfs2_mkdir+0x47/0x50 [gfs2]
[  373.923663]                    [<ffffffff8129c025>] vfs_mkdir+0xc5/0x150
[  373.924445]                    [<ffffffff812a317a>] SyS_mkdir+0x7a/0xf0
[  373.925230]                    [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.926172]  }
[  373.926371]  ... key      at: [<ffffffffa023d138>] __key.38384+0x0/0xffffffffffff1ec8 [gfs2]
[  373.927360]  ... acquired at:
[  373.927704]    [<ffffffff8110bbd9>] check_usage_backwards+0x149/0x160
[  373.928459]    [<ffffffff8110c810>] mark_lock+0x3d0/0x5e0
[  373.929096]    [<ffffffff8110ca96>] mark_held_locks+0x76/0xa0
[  373.929771]    [<ffffffff8110fb1d>] lockdep_trace_alloc+0x7d/0xe0
[  373.930478]    [<ffffffff811f8fe3>] __alloc_pages_nodemask+0xb3/0xe50
[  373.931231]    [<ffffffff81252eeb>] alloc_pages_current+0x9b/0x1c0
[  373.931955]    [<ffffffff811f3444>] __get_free_pages+0x14/0x50
[  373.932637]    [<ffffffff810775b5>] pte_alloc_one_kernel+0x15/0x20
[  373.933365]    [<ffffffff81228a8d>] __pte_alloc_kernel+0x1d/0x100
[  373.934077]    [<ffffffff8123b68a>] vmap_page_range_noflush+0x2ea/0x340
[  373.934847]    [<ffffffff8123b716>] map_vm_area+0x36/0x50
[  373.935480]    [<ffffffff8123dcb0>] __vmalloc_node_range+0x1b0/0x270
[  373.936223]    [<ffffffff8123ddba>] __vmalloc+0x4a/0x50
[  373.936839]    [<ffffffffa01ffdd9>] gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.937677]    [<ffffffffa01ffe56>] get_leaf_nr+0x16/0x30 [gfs2]
[  373.938382]    [<ffffffffa01fffe0>] gfs2_dirent_search+0xb0/0x1b0 [gfs2]
[  373.939164]    [<ffffffffa02018c8>] gfs2_dir_add+0x468/0x820 [gfs2]
[  373.939899]    [<ffffffffa021b8f9>] gfs2_create_inode+0xeb9/0xfa0 [gfs2]
[  373.940679]    [<ffffffffa021c2fd>] gfs2_atomic_open+0x6d/0xe0 [gfs2]
[  373.941435]    [<ffffffff812a0dc2>] path_openat+0x1832/0x1f30
[  373.942108]    [<ffffffff812a2a61>] do_filp_open+0x91/0x100
[  373.942760]    [<ffffffff8128e530>] do_sys_open+0x130/0x220
[  373.943405]    [<ffffffff8128e63e>] SyS_open+0x1e/0x20
[  373.944009]    [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.944776] 
[  373.944956] 
[  373.944956] stack backtrace:
[  373.945458] CPU: 0 PID: 17847 Comm: touch Tainted: G        W       4.5.0-0.rc4.git1.1.fc24.x86_64 #1
[  373.946507] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  373.947541]  0000000000000086 00000000f4065c31 ffff880025fc3560 ffffffff8144daf5
[  373.949146]  ffffffff82f7b540 ffff880025fc35c0 ffff880025fc35a0 ffffffff811eac2a
[  373.950733]  ffff88003b908d50 ffff88003b908d50 ffff88003b908000 ffffffff81ca402f
[  373.952330] Call Trace:
[  373.952862]  [<ffffffff8144daf5>] dump_stack+0x86/0xc1
[  373.953955]  [<ffffffff811eac2a>] print_irq_inversion_bug.part.41+0x1a5/0x1b1
[  373.955444]  [<ffffffff8110bbd9>] check_usage_backwards+0x149/0x160
[  373.956766]  [<ffffffff81026b59>] ? sched_clock+0x9/0x10
[  373.957874]  [<ffffffff8110c810>] mark_lock+0x3d0/0x5e0
[  373.958973]  [<ffffffff8110ba90>] ? check_usage_forwards+0x160/0x160
[  373.960297]  [<ffffffff8110ca96>] mark_held_locks+0x76/0xa0
[  373.961459]  [<ffffffff8110fb1d>] lockdep_trace_alloc+0x7d/0xe0
[  373.962705]  [<ffffffff811f8fe3>] __alloc_pages_nodemask+0xb3/0xe50
[  373.964027]  [<ffffffff8112cb05>] ? rcu_read_lock_sched_held+0x45/0x80
[  373.965394]  [<ffffffff811f92f7>] ? __alloc_pages_nodemask+0x3c7/0xe50
[  373.966758]  [<ffffffff81252eeb>] alloc_pages_current+0x9b/0x1c0
[  373.968014]  [<ffffffff811f3444>] __get_free_pages+0x14/0x50
[  373.969192]  [<ffffffff810775b5>] pte_alloc_one_kernel+0x15/0x20
[  373.970458]  [<ffffffff81228a8d>] __pte_alloc_kernel+0x1d/0x100
[  373.971704]  [<ffffffff8123b68a>] vmap_page_range_noflush+0x2ea/0x340
[  373.972919]  [<ffffffff8123b716>] map_vm_area+0x36/0x50
[  373.973516]  [<ffffffff8123dcb0>] __vmalloc_node_range+0x1b0/0x270
[  373.974229]  [<ffffffffa01ffdd9>] ? gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.975056]  [<ffffffff8121bee1>] ? kmalloc_order_trace+0xd1/0x140
[  373.975763]  [<ffffffffa01ff000>] ? gfs2_check_dirent+0xf0/0xf0 [gfs2]
[  373.976509]  [<ffffffff8123ddba>] __vmalloc+0x4a/0x50
[  373.977095]  [<ffffffffa01ffdd9>] ? gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.977928]  [<ffffffffa01ffdd9>] gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.978741]  [<ffffffffa022b551>] ? gfs2_trans_add_meta+0xa1/0x270 [gfs2]
[  373.979517]  [<ffffffffa01ff000>] ? gfs2_check_dirent+0xf0/0xf0 [gfs2]
[  373.980269]  [<ffffffffa01ffe56>] get_leaf_nr+0x16/0x30 [gfs2]
[  373.980944]  [<ffffffffa01fffe0>] gfs2_dirent_search+0xb0/0x1b0 [gfs2]
[  373.981691]  [<ffffffffa02018c8>] gfs2_dir_add+0x468/0x820 [gfs2]
[  373.982403]  [<ffffffffa021b8f9>] gfs2_create_inode+0xeb9/0xfa0 [gfs2]
[  373.983160]  [<ffffffffa021ab48>] ? gfs2_create_inode+0x108/0xfa0 [gfs2]
[  373.983936]  [<ffffffffa021b2e9>] ? gfs2_create_inode+0x8a9/0xfa0 [gfs2]
[  373.984703]  [<ffffffffa021c2fd>] gfs2_atomic_open+0x6d/0xe0 [gfs2]
[  373.985421]  [<ffffffff812a0dc2>] path_openat+0x1832/0x1f30
[  373.986061]  [<ffffffff812b3380>] ? __alloc_fd+0x100/0x200
[  373.986685]  [<ffffffff812a2a61>] do_filp_open+0x91/0x100
[  373.987320]  [<ffffffff818b7637>] ? _raw_spin_unlock+0x27/0x40
[  373.987996]  [<ffffffff812b3380>] ? __alloc_fd+0x100/0x200
[  373.988623]  [<ffffffff8128e530>] do_sys_open+0x130/0x220
[  373.989247]  [<ffffffff8128e63e>] SyS_open+0x1e/0x20
[  373.989823]  [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72

Comment 1 Robert Peterson 2016-03-11 19:51:53 UTC
Hi Andy,

I noticed in your call trace above that the shrinker is calling
evict. Excerpt:

[  373.840558]                          [<ffffffff812b0308>] evict+0xb8/0x180
[  373.841356]                          [<ffffffff812b0414>] dispose_list+0x44/0x70
[  373.842216]                          [<ffffffff812b164a>] prune_icache_sb+0x5a/0x80
[  373.843125]                          [<ffffffff812937be>] super_cache_scan+0x14e/0x1a0
[  373.844051]                          [<ffffffff81205bf6>] shrink_slab.part.42+0x216/0x540
[  373.845004]                          [<ffffffff8120b255>] shrink_zone+0x2f5/0x300

The following comment and associated patch might be relevant to
this problem:

https://bugzilla.redhat.com/show_bug.cgi?id=1255872#c30

Perhaps you can try that patch and see if the problem still
reproduces?
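
To check whether a saved trace goes through the shrinker and evict path excerpted above, the symbol names can be pulled out of the stack lines. This is a rough sketch, assuming a `sed` that supports `-E` (GNU and BSD sed both do); the function name `trace_symbols` is hypothetical.

```shell
# Hypothetical helper: read kernel log text on stdin and print one symbol
# name per stack-trace line of the form "[<address>] symbol+0xOFF/0xLEN",
# including frames marked with "?". Other lines are dropped.
trace_symbols() {
    sed -nE 's/.*<[0-9a-f]+>\] (\? )?([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+\/0x[0-9a-f]+.*/\2/p'
}
```

For example, `dmesg | trace_symbols | grep -x -e evict -e super_cache_scan` prints those two names if they appear as frames in the log.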

Comment 2 Andrew Price 2016-03-11 22:35:40 UTC
I managed to reproduce the bug with a fresh 4.5-rc7 kernel on the first attempt (it took approx. 100,000 files). I then tried again with the patch from https://bugzilla.redhat.com/show_bug.cgi?id=1255872#c30 and I haven't seen the lockdep splat yet. However, at around file 300,000 I saw a couple of hung task warnings for gfs2_quotad:

[ 2340.901808] INFO: task gfs2_quotad:742 blocked for more than 90 seconds.
[ 2340.902919]       Tainted: G        W       4.5.0-rc7-00230-g20698c9-dirty #56
[ 2340.903860] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2340.905698] gfs2_quotad     D ffff880027093b30     0   742      2 0x00000000
[ 2340.906640]  ffff880027093b30 ffff88003ff7b6b0 ffff88003db00000 ffff880039610000
[ 2340.907628]  ffff880027094000 ffff88003ff7b6b0 ffff880027093bc8 ffffffff81808b00
[ 2340.908595]  ffff880027093bb0 ffff880027093b48 ffffffff8180819c 0000000000000002
[ 2340.909558] Call Trace:
[ 2340.909897]  [<ffffffff81808b00>] ? out_of_line_wait_on_atomic_t+0xf0/0xf0
[ 2340.910728]  [<ffffffff8180819c>] schedule+0x3c/0x90
[ 2340.911362]  [<ffffffff81808b11>] bit_wait+0x11/0x60
[ 2340.912005]  [<ffffffff8180870d>] __wait_on_bit+0x5d/0x90
[ 2340.912666]  [<ffffffff810b4e7a>] ? finish_task_switch+0x6a/0x210
[ 2340.913407]  [<ffffffff81808b00>] ? out_of_line_wait_on_atomic_t+0xf0/0xf0
[ 2340.914276]  [<ffffffff81808872>] out_of_line_wait_on_bit+0x82/0xb0
[ 2340.916050]  [<ffffffff810d4eb0>] ? autoremove_wake_function+0x40/0x40
[ 2340.917510]  [<ffffffff8140b9f5>] gfs2_glock_dq_wait+0x65/0x70
[ 2340.918251]  [<ffffffff81426c61>] gfs2_evict_inode+0x111/0x470
[ 2340.919936]  [<ffffffff8180dc67>] ? _raw_spin_unlock+0x27/0x40
[ 2340.921679]  [<ffffffff81252368>] evict+0xb8/0x180
[ 2340.923073]  [<ffffffff8125246b>] dispose_list+0x3b/0x70
[ 2340.924652]  [<ffffffff8125281a>] prune_icache_sb+0x5a/0x80
[ 2340.926292]  [<ffffffff81236caf>] super_cache_scan+0x14f/0x1a0
[ 2340.928031]  [<ffffffff81420283>] gfs2_quotad+0x113/0x420
[ 2340.929662]  [<ffffffff810d4e70>] ? wake_atomic_t_function+0x70/0x70
[ 2340.931265]  [<ffffffff81420170>] ? gfs2_wake_up_statfs+0x40/0x40
[ 2340.932724]  [<ffffffff810abe1e>] kthread+0xfe/0x120
[ 2340.933691]  [<ffffffff810abd20>] ? __kthread_parkme+0x90/0x90
[ 2340.934443]  [<ffffffff8180e89f>] ret_from_fork+0x3f/0x70
[ 2340.935127]  [<ffffffff810abd20>] ? __kthread_parkme+0x90/0x90
[ 2340.935828] 1 lock held by gfs2_quotad/742:
[ 2340.936340]  #0:  (&type->s_umount_key#36){.+.+..}, at: [<ffffffff81236b2b>] trylock_super+0x1b/0x50
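
Hung task warnings like the one above can be listed from a saved log with a short filter. This is a sketch only, assuming the message format shown in the log above and a `sed` that supports `-E`; the function name `hung_tasks` is hypothetical.

```shell
# Hypothetical helper: read kernel log text on stdin and print
# "<task name> <pid> <seconds blocked>" for each hung task warning.
hung_tasks() {
    sed -nE 's/.*INFO: task (.+):([0-9]+) blocked for more than ([0-9]+) seconds\..*/\1 \2 \3/p'
}
```

For example, `dmesg | hung_tasks | sort | uniq -c` counts the warnings per task.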

Comment 3 Andrew Price 2019-10-16 16:21:46 UTC
Three years later, this bug no longer reproduces, so I'm closing it.