Bug 1309838 - GFS2 lockdep deadlock warning (sd_log_flush_lock vs gl->gl_work.work)
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: gfs2-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-02-18 19:46 UTC by Andrew Price
Modified: 2019-10-16 16:21 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 16:21:46 UTC
Type: Bug
Embargoed:



Description Andrew Price 2016-02-18 19:46:01 UTC
Description of problem:

Lockdep reports a possible irq lock inversion while creating a large number of files in a single directory on a single-node, 5G lock_nolock GFS2 filesystem with 512-byte blocks.

Version-Release number of selected component (if applicable):

4.5.0-0.rc4.git1.1.fc24.x86_64

How reproducible:

Not sure yet.

Steps to Reproduce:
1. mkfs.gfs2 -b 512 -p lock_nolock /dev/vda
2. mount /dev/vda /mnt/test
3. mkdir /mnt/test/foo
4. seq -w 1000000000 | while read i; do touch /mnt/test/foo/file$i; done
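
Step 4 can also be wrapped in a small helper so that the file count is a parameter rather than a fixed 1000000000. This is only a sketch: the function name create_files and its arguments are hypothetical and not part of the original reproducer, and it assumes the filesystem from steps 1-3 is already mounted.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): create COUNT empty
# files named file<N> in DIR, using the same zero-padded naming as step 4.
create_files() {
    dir=$1      # target directory, e.g. /mnt/test/foo
    count=$2    # number of files to create
    seq -w "$count" | while read -r i; do
        touch "$dir/file$i"
    done
}
```

It would be called as, for example, create_files /mnt/test/foo 200000, where the count is an arbitrary choice.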

Actual results:

Lockdep output below.

Expected results:

Nothing untoward.

Additional info:

[  373.800997] =========================================================
[  373.802133] [ INFO: possible irq lock inversion dependency detected ]
[  373.803463] 4.5.0-0.rc4.git1.1.fc24.x86_64 #1 Tainted: G        W      
[  373.804872] ---------------------------------------------------------
[  373.805815] touch/17847 just changed the state of lock:
[  373.806583]  (&sdp->sd_log_flush_lock){++++.+}, at: [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.808102] but this lock was taken by another, RECLAIM_FS-safe lock in the past:
[  373.809204]  ((&(&gl->gl_work)->work)){+.+.-.}

and interrupts could create inverse lock ordering between them.

[  373.810781] 
[  373.810781] other info that might help us debug this:
[  373.811737]  Possible interrupt unsafe locking scenario:
[  373.811737] 
[  373.812524]        CPU0                    CPU1
[  373.813050]        ----                    ----
[  373.813570]   lock(&sdp->sd_log_flush_lock);
[  373.814093]                                local_irq_disable();
[  373.814780]                                lock((&(&gl->gl_work)->work));
[  373.815581]                                lock(&sdp->sd_log_flush_lock);
[  373.816392]   <Interrupt>
[  373.816696]     lock((&(&gl->gl_work)->work));
[  373.817237] 
[  373.817237]  *** DEADLOCK ***
[  373.817237] 
[  373.817920] 4 locks held by touch/17847:
[  373.818367]  #0:  (sb_writers#10){.+.+.+}, at: [<ffffffff81292e74>] __sb_start_write+0xb4/0xf0
[  373.819434]  #1:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffff812a040f>] path_openat+0xe7f/0x1f30
[  373.820617]  #2:  (sb_internal#2){.+.+.+}, at: [<ffffffff81292e38>] __sb_start_write+0x78/0xf0
[  373.821676]  #3:  (&sdp->sd_log_flush_lock){++++.+}, at: [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.822929] 
[  373.822929] the shortest dependencies between 2nd lock and 1st lock:
[  373.823839]  -> ((&(&gl->gl_work)->work)){+.+.-.} ops: 103990 {
[  373.824577]     HARDIRQ-ON-W at:
[  373.824971]                       [<ffffffff8110d8c8>] __lock_acquire+0x8f8/0x17e0
[  373.825851]                       [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.826688]                       [<ffffffff810cefa9>] process_one_work+0x219/0x690
[  373.827585]                       [<ffffffff810cf46e>] worker_thread+0x4e/0x490
[  373.828446]                       [<ffffffff810d6891>] kthread+0x101/0x120
[  373.829248]                       [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.830091]     SOFTIRQ-ON-W at:
[  373.830476]                       [<ffffffff8110d8f5>] __lock_acquire+0x925/0x17e0
[  373.831363]                       [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.832216]                       [<ffffffff810cefa9>] process_one_work+0x219/0x690
[  373.833106]                       [<ffffffff810cf46e>] worker_thread+0x4e/0x490
[  373.833958]                       [<ffffffff810d6891>] kthread+0x101/0x120
[  373.834764]                       [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.835599]     IN-RECLAIM_FS-W at:
[  373.836028]                          [<ffffffff8110d917>] __lock_acquire+0x947/0x17e0
[  373.836933]                          [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.837798]                          [<ffffffff810cdb2e>] flush_work+0x4e/0x320
[  373.838644]                          [<ffffffff810cde4f>] flush_delayed_work+0x4f/0x60
[  373.839571]                          [<ffffffffa0227a30>] gfs2_evict_inode+0xa0/0x430 [gfs2]
[  373.840558]                          [<ffffffff812b0308>] evict+0xb8/0x180
[  373.841356]                          [<ffffffff812b0414>] dispose_list+0x44/0x70
[  373.842216]                          [<ffffffff812b164a>] prune_icache_sb+0x5a/0x80
[  373.843125]                          [<ffffffff812937be>] super_cache_scan+0x14e/0x1a0
[  373.844051]                          [<ffffffff81205bf6>] shrink_slab.part.42+0x216/0x540
[  373.845004]                          [<ffffffff8120b255>] shrink_zone+0x2f5/0x300
[  373.845887]                          [<ffffffff8120c744>] kswapd+0x564/0xb60
[  373.846706]                          [<ffffffff810d6891>] kthread+0x101/0x120
[  373.847552]                          [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.848474]     INITIAL USE at:
[  373.848858]                      [<ffffffff8110d59d>] __lock_acquire+0x5cd/0x17e0
[  373.849734]                      [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.850561]                      [<ffffffff810cefa9>] process_one_work+0x219/0x690
[  373.851448]                      [<ffffffff810cf46e>] worker_thread+0x4e/0x490
[  373.852332]                      [<ffffffff810d6891>] kthread+0x101/0x120
[  373.853127]                      [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.853959]   }
[  373.854168]   ... key      at: [<ffffffffa023ced8>] __key.40559+0x0/0xffffffffffff2128 [gfs2]
[  373.855164]   ... acquired at:
[  373.855518]    [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.856545]    [<ffffffff818b58fa>] down_write+0x5a/0xc0
[  373.857339]    [<ffffffffa020f77e>] gfs2_log_flush+0x4e/0x8d0 [gfs2]
[  373.858108]    [<ffffffffa020d75c>] inode_go_sync+0x8c/0x130 [gfs2]
[  373.858853]    [<ffffffffa020b946>] do_xmote+0x176/0x340 [gfs2]
[  373.859552]    [<ffffffffa020bc00>] run_queue+0xf0/0x390 [gfs2]
[  373.860262]    [<ffffffffa020bef9>] glock_work_func+0x59/0x120 [gfs2]
[  373.861021]    [<ffffffff810cefe2>] process_one_work+0x252/0x690
[  373.861733]    [<ffffffff810cf46e>] worker_thread+0x4e/0x490
[  373.862396]    [<ffffffff810d6891>] kthread+0x101/0x120
[  373.863014]    [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.863661] 
[  373.863856] -> (&sdp->sd_log_flush_lock){++++.+} ops: 297943 {
[  373.864584]    HARDIRQ-ON-W at:
[  373.864973]                     [<ffffffff8110d8c8>] __lock_acquire+0x8f8/0x17e0
[  373.865832]                     [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.866646]                     [<ffffffff818b58fa>] down_write+0x5a/0xc0
[  373.867434]                     [<ffffffffa020f77e>] gfs2_log_flush+0x4e/0x8d0 [gfs2]
[  373.868862]                     [<ffffffffa0210242>] gfs2_logd+0x242/0x2a0 [gfs2]
[  373.869820]                     [<ffffffff810d6891>] kthread+0x101/0x120
[  373.870600]                     [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.871432]    HARDIRQ-ON-R at:
[  373.871821]                     [<ffffffff8110d526>] __lock_acquire+0x556/0x17e0
[  373.872667]                     [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.873486]                     [<ffffffff818b5851>] down_read+0x51/0xa0
[  373.874272]                     [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.875217]                     [<ffffffffa022b05c>] gfs2_trans_begin+0xcc/0x140 [gfs2]
[  373.876157]                     [<ffffffffa021b1ac>] gfs2_create_inode+0x76c/0xfa0 [gfs2]
[  373.877116]                     [<ffffffffa021ba67>] gfs2_mkdir+0x47/0x50 [gfs2]
[  373.877989]                     [<ffffffff8129c025>] vfs_mkdir+0xc5/0x150
[  373.878792]                     [<ffffffff812a317a>] SyS_mkdir+0x7a/0xf0
[  373.879564]                     [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.880498]    SOFTIRQ-ON-W at:
[  373.880878]                     [<ffffffff8110d8f5>] __lock_acquire+0x925/0x17e0
[  373.881744]                     [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.882557]                     [<ffffffff818b58fa>] down_write+0x5a/0xc0
[  373.883351]                     [<ffffffffa020f77e>] gfs2_log_flush+0x4e/0x8d0 [gfs2]
[  373.884261]                     [<ffffffffa0210242>] gfs2_logd+0x242/0x2a0 [gfs2]
[  373.885130]                     [<ffffffff810d6891>] kthread+0x101/0x120
[  373.885916]                     [<ffffffff818b845f>] ret_from_fork+0x3f/0x70
[  373.886737]    SOFTIRQ-ON-R at:
[  373.887113]                     [<ffffffff8110d8f5>] __lock_acquire+0x925/0x17e0
[  373.887972]                     [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.888789]                     [<ffffffff818b5851>] down_read+0x51/0xa0
[  373.889561]                     [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.890516]                     [<ffffffffa022b05c>] gfs2_trans_begin+0xcc/0x140 [gfs2]
[  373.891452]                     [<ffffffffa021b1ac>] gfs2_create_inode+0x76c/0xfa0 [gfs2]
[  373.892419]                     [<ffffffffa021ba67>] gfs2_mkdir+0x47/0x50 [gfs2]
[  373.893292]                     [<ffffffff8129c025>] vfs_mkdir+0xc5/0x150
[  373.894081]                     [<ffffffff812a317a>] SyS_mkdir+0x7a/0xf0
[  373.894860]                     [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.895806]    RECLAIM_FS-ON-R at:
[  373.896210]                        [<ffffffff8110ca96>] mark_held_locks+0x76/0xa0
[  373.897088]                        [<ffffffff8110fb1d>] lockdep_trace_alloc+0x7d/0xe0
[  373.897992]                        [<ffffffff811f8fe3>] __alloc_pages_nodemask+0xb3/0xe50
[  373.898938]                        [<ffffffff81252eeb>] alloc_pages_current+0x9b/0x1c0
[  373.899854]                        [<ffffffff811f3444>] __get_free_pages+0x14/0x50
[  373.901123]                        [<ffffffff810775b5>] pte_alloc_one_kernel+0x15/0x20
[  373.902240]                        [<ffffffff81228a8d>] __pte_alloc_kernel+0x1d/0x100
[  373.903157]                        [<ffffffff8123b68a>] vmap_page_range_noflush+0x2ea/0x340
[  373.904137]                        [<ffffffff8123b716>] map_vm_area+0x36/0x50
[  373.904972]                        [<ffffffff8123dcb0>] __vmalloc_node_range+0x1b0/0x270
[  373.905921]                        [<ffffffff8123ddba>] __vmalloc+0x4a/0x50
[  373.906735]                        [<ffffffffa01ffdd9>] gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.907790]                        [<ffffffffa01ffe56>] get_leaf_nr+0x16/0x30 [gfs2]
[  373.908690]                        [<ffffffffa01fffe0>] gfs2_dirent_search+0xb0/0x1b0 [gfs2]
[  373.909678]                        [<ffffffffa02018c8>] gfs2_dir_add+0x468/0x820 [gfs2]
[  373.910611]                        [<ffffffffa021b8f9>] gfs2_create_inode+0xeb9/0xfa0 [gfs2]
[  373.911595]                        [<ffffffffa021c2fd>] gfs2_atomic_open+0x6d/0xe0 [gfs2]
[  373.912789]                        [<ffffffff812a0dc2>] path_openat+0x1832/0x1f30
[  373.913653]                        [<ffffffff812a2a61>] do_filp_open+0x91/0x100
[  373.914513]                        [<ffffffff8128e530>] do_sys_open+0x130/0x220
[  373.915368]                        [<ffffffff8128e63e>] SyS_open+0x1e/0x20
[  373.916172]                        [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.917154]    INITIAL USE at:
[  373.917523]                    [<ffffffff8110d59d>] __lock_acquire+0x5cd/0x17e0
[  373.918387]                    [<ffffffff8110f17e>] lock_acquire+0xce/0x1c0
[  373.919213]                    [<ffffffff818b5851>] down_read+0x51/0xa0
[  373.919989]                    [<ffffffffa020ed03>] gfs2_log_reserve+0x1d3/0x380 [gfs2]
[  373.920938]                    [<ffffffffa022b05c>] gfs2_trans_begin+0xcc/0x140 [gfs2]
[  373.921869]                    [<ffffffffa021b1ac>] gfs2_create_inode+0x76c/0xfa0 [gfs2]
[  373.922813]                    [<ffffffffa021ba67>] gfs2_mkdir+0x47/0x50 [gfs2]
[  373.923663]                    [<ffffffff8129c025>] vfs_mkdir+0xc5/0x150
[  373.924445]                    [<ffffffff812a317a>] SyS_mkdir+0x7a/0xf0
[  373.925230]                    [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.926172]  }
[  373.926371]  ... key      at: [<ffffffffa023d138>] __key.38384+0x0/0xffffffffffff1ec8 [gfs2]
[  373.927360]  ... acquired at:
[  373.927704]    [<ffffffff8110bbd9>] check_usage_backwards+0x149/0x160
[  373.928459]    [<ffffffff8110c810>] mark_lock+0x3d0/0x5e0
[  373.929096]    [<ffffffff8110ca96>] mark_held_locks+0x76/0xa0
[  373.929771]    [<ffffffff8110fb1d>] lockdep_trace_alloc+0x7d/0xe0
[  373.930478]    [<ffffffff811f8fe3>] __alloc_pages_nodemask+0xb3/0xe50
[  373.931231]    [<ffffffff81252eeb>] alloc_pages_current+0x9b/0x1c0
[  373.931955]    [<ffffffff811f3444>] __get_free_pages+0x14/0x50
[  373.932637]    [<ffffffff810775b5>] pte_alloc_one_kernel+0x15/0x20
[  373.933365]    [<ffffffff81228a8d>] __pte_alloc_kernel+0x1d/0x100
[  373.934077]    [<ffffffff8123b68a>] vmap_page_range_noflush+0x2ea/0x340
[  373.934847]    [<ffffffff8123b716>] map_vm_area+0x36/0x50
[  373.935480]    [<ffffffff8123dcb0>] __vmalloc_node_range+0x1b0/0x270
[  373.936223]    [<ffffffff8123ddba>] __vmalloc+0x4a/0x50
[  373.936839]    [<ffffffffa01ffdd9>] gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.937677]    [<ffffffffa01ffe56>] get_leaf_nr+0x16/0x30 [gfs2]
[  373.938382]    [<ffffffffa01fffe0>] gfs2_dirent_search+0xb0/0x1b0 [gfs2]
[  373.939164]    [<ffffffffa02018c8>] gfs2_dir_add+0x468/0x820 [gfs2]
[  373.939899]    [<ffffffffa021b8f9>] gfs2_create_inode+0xeb9/0xfa0 [gfs2]
[  373.940679]    [<ffffffffa021c2fd>] gfs2_atomic_open+0x6d/0xe0 [gfs2]
[  373.941435]    [<ffffffff812a0dc2>] path_openat+0x1832/0x1f30
[  373.942108]    [<ffffffff812a2a61>] do_filp_open+0x91/0x100
[  373.942760]    [<ffffffff8128e530>] do_sys_open+0x130/0x220
[  373.943405]    [<ffffffff8128e63e>] SyS_open+0x1e/0x20
[  373.944009]    [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
[  373.944776] 
[  373.944956] 
[  373.944956] stack backtrace:
[  373.945458] CPU: 0 PID: 17847 Comm: touch Tainted: G        W       4.5.0-0.rc4.git1.1.fc24.x86_64 #1
[  373.946507] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  373.947541]  0000000000000086 00000000f4065c31 ffff880025fc3560 ffffffff8144daf5
[  373.949146]  ffffffff82f7b540 ffff880025fc35c0 ffff880025fc35a0 ffffffff811eac2a
[  373.950733]  ffff88003b908d50 ffff88003b908d50 ffff88003b908000 ffffffff81ca402f
[  373.952330] Call Trace:
[  373.952862]  [<ffffffff8144daf5>] dump_stack+0x86/0xc1
[  373.953955]  [<ffffffff811eac2a>] print_irq_inversion_bug.part.41+0x1a5/0x1b1
[  373.955444]  [<ffffffff8110bbd9>] check_usage_backwards+0x149/0x160
[  373.956766]  [<ffffffff81026b59>] ? sched_clock+0x9/0x10
[  373.957874]  [<ffffffff8110c810>] mark_lock+0x3d0/0x5e0
[  373.958973]  [<ffffffff8110ba90>] ? check_usage_forwards+0x160/0x160
[  373.960297]  [<ffffffff8110ca96>] mark_held_locks+0x76/0xa0
[  373.961459]  [<ffffffff8110fb1d>] lockdep_trace_alloc+0x7d/0xe0
[  373.962705]  [<ffffffff811f8fe3>] __alloc_pages_nodemask+0xb3/0xe50
[  373.964027]  [<ffffffff8112cb05>] ? rcu_read_lock_sched_held+0x45/0x80
[  373.965394]  [<ffffffff811f92f7>] ? __alloc_pages_nodemask+0x3c7/0xe50
[  373.966758]  [<ffffffff81252eeb>] alloc_pages_current+0x9b/0x1c0
[  373.968014]  [<ffffffff811f3444>] __get_free_pages+0x14/0x50
[  373.969192]  [<ffffffff810775b5>] pte_alloc_one_kernel+0x15/0x20
[  373.970458]  [<ffffffff81228a8d>] __pte_alloc_kernel+0x1d/0x100
[  373.971704]  [<ffffffff8123b68a>] vmap_page_range_noflush+0x2ea/0x340
[  373.972919]  [<ffffffff8123b716>] map_vm_area+0x36/0x50
[  373.973516]  [<ffffffff8123dcb0>] __vmalloc_node_range+0x1b0/0x270
[  373.974229]  [<ffffffffa01ffdd9>] ? gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.975056]  [<ffffffff8121bee1>] ? kmalloc_order_trace+0xd1/0x140
[  373.975763]  [<ffffffffa01ff000>] ? gfs2_check_dirent+0xf0/0xf0 [gfs2]
[  373.976509]  [<ffffffff8123ddba>] __vmalloc+0x4a/0x50
[  373.977095]  [<ffffffffa01ffdd9>] ? gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.977928]  [<ffffffffa01ffdd9>] gfs2_dir_get_hash_table+0x379/0x3e0 [gfs2]
[  373.978741]  [<ffffffffa022b551>] ? gfs2_trans_add_meta+0xa1/0x270 [gfs2]
[  373.979517]  [<ffffffffa01ff000>] ? gfs2_check_dirent+0xf0/0xf0 [gfs2]
[  373.980269]  [<ffffffffa01ffe56>] get_leaf_nr+0x16/0x30 [gfs2]
[  373.980944]  [<ffffffffa01fffe0>] gfs2_dirent_search+0xb0/0x1b0 [gfs2]
[  373.981691]  [<ffffffffa02018c8>] gfs2_dir_add+0x468/0x820 [gfs2]
[  373.982403]  [<ffffffffa021b8f9>] gfs2_create_inode+0xeb9/0xfa0 [gfs2]
[  373.983160]  [<ffffffffa021ab48>] ? gfs2_create_inode+0x108/0xfa0 [gfs2]
[  373.983936]  [<ffffffffa021b2e9>] ? gfs2_create_inode+0x8a9/0xfa0 [gfs2]
[  373.984703]  [<ffffffffa021c2fd>] gfs2_atomic_open+0x6d/0xe0 [gfs2]
[  373.985421]  [<ffffffff812a0dc2>] path_openat+0x1832/0x1f30
[  373.986061]  [<ffffffff812b3380>] ? __alloc_fd+0x100/0x200
[  373.986685]  [<ffffffff812a2a61>] do_filp_open+0x91/0x100
[  373.987320]  [<ffffffff818b7637>] ? _raw_spin_unlock+0x27/0x40
[  373.987996]  [<ffffffff812b3380>] ? __alloc_fd+0x100/0x200
[  373.988623]  [<ffffffff8128e530>] do_sys_open+0x130/0x220
[  373.989247]  [<ffffffff8128e63e>] SyS_open+0x1e/0x20
[  373.989823]  [<ffffffff818b80f2>] entry_SYSCALL_64_fastpath+0x12/0x72
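
The six-character masks printed in braces in the lockdep output, {++++.+} and {+.+.-.}, can be expanded into the same labels that head the usage sections of the trace (HARDIRQ-ON-W, IN-RECLAIM_FS-W, and so on). The sketch below assumes the usual lockdep convention for kernels of this era: three states (HARDIRQ, SOFTIRQ, RECLAIM_FS), each shown as a write/read pair, where '+' means acquired with the context enabled, '-' means acquired in that context, '?' means both, and '.' means neither was recorded. The function name decode_usage is hypothetical.

```shell
#!/bin/sh
# Sketch: expand a lockdep usage mask (without the braces) into the
# section labels used in the report. Assumes the state order
# HARDIRQ, SOFTIRQ, RECLAIM_FS with a write/read character pair for each.
decode_usage() {
    mask=$1
    for state in HARDIRQ SOFTIRQ RECLAIM_FS; do
        for rw in W R; do
            rest=${mask#?}          # mask without its first character
            c=${mask%"$rest"}       # the first character itself
            mask=$rest
            case $c in
                '+') echo "$state-ON-$rw" ;;
                '-') echo "IN-$state-$rw" ;;
                '?') echo "IN-$state-$rw"; echo "$state-ON-$rw" ;;
                # '.' or anything else: nothing recorded, print nothing
            esac
        done
    done
}
```

Under those assumptions, decode_usage '+.+.-.' lists IN-RECLAIM_FS-W for the glock work item, and decode_usage '++++.+' lists RECLAIM_FS-ON-R for sd_log_flush_lock, which is the pairing lockdep flags as an inversion.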

Comment 1 Robert Peterson 2016-03-11 19:51:53 UTC
Hi Andy,

I noticed in your call trace above that the shrinker is calling
evict. Excerpt:

[  373.840558]                          [<ffffffff812b0308>] evict+0xb8/0x180
[  373.841356]                          [<ffffffff812b0414>] dispose_list+0x44/0x70
[  373.842216]                          [<ffffffff812b164a>] prune_icache_sb+0x5a/0x80
[  373.843125]                          [<ffffffff812937be>] super_cache_scan+0x14e/0x1a0
[  373.844051]                          [<ffffffff81205bf6>] shrink_slab.part.42+0x216/0x540
[  373.845004]                          [<ffffffff8120b255>] shrink_zone+0x2f5/0x300

The following comment and associated patch might be relevant to
this problem:

https://bugzilla.redhat.com/show_bug.cgi?id=1255872#c30

Perhaps you can try that patch and see whether the problem still
reproduces?

Comment 2 Andrew Price 2016-03-11 22:35:40 UTC
I managed to reproduce the bug with a fresh 4.5-rc7 kernel on the first attempt (it took approx. 100,000 files). I then tried again with the patch from https://bugzilla.redhat.com/show_bug.cgi?id=1255872#c30 and haven't seen the lockdep splat yet. However, at around file 300,000 I saw a couple of hung task warnings for gfs2_quotad:

[ 2340.901808] INFO: task gfs2_quotad:742 blocked for more than 90 seconds.
[ 2340.902919]       Tainted: G        W       4.5.0-rc7-00230-g20698c9-dirty #56
[ 2340.903860] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2340.905698] gfs2_quotad     D ffff880027093b30     0   742      2 0x00000000
[ 2340.906640]  ffff880027093b30 ffff88003ff7b6b0 ffff88003db00000 ffff880039610000
[ 2340.907628]  ffff880027094000 ffff88003ff7b6b0 ffff880027093bc8 ffffffff81808b00
[ 2340.908595]  ffff880027093bb0 ffff880027093b48 ffffffff8180819c 0000000000000002
[ 2340.909558] Call Trace:
[ 2340.909897]  [<ffffffff81808b00>] ? out_of_line_wait_on_atomic_t+0xf0/0xf0
[ 2340.910728]  [<ffffffff8180819c>] schedule+0x3c/0x90
[ 2340.911362]  [<ffffffff81808b11>] bit_wait+0x11/0x60
[ 2340.912005]  [<ffffffff8180870d>] __wait_on_bit+0x5d/0x90
[ 2340.912666]  [<ffffffff810b4e7a>] ? finish_task_switch+0x6a/0x210
[ 2340.913407]  [<ffffffff81808b00>] ? out_of_line_wait_on_atomic_t+0xf0/0xf0
[ 2340.914276]  [<ffffffff81808872>] out_of_line_wait_on_bit+0x82/0xb0
[ 2340.916050]  [<ffffffff810d4eb0>] ? autoremove_wake_function+0x40/0x40
[ 2340.917510]  [<ffffffff8140b9f5>] gfs2_glock_dq_wait+0x65/0x70
[ 2340.918251]  [<ffffffff81426c61>] gfs2_evict_inode+0x111/0x470
[ 2340.919936]  [<ffffffff8180dc67>] ? _raw_spin_unlock+0x27/0x40
[ 2340.921679]  [<ffffffff81252368>] evict+0xb8/0x180
[ 2340.923073]  [<ffffffff8125246b>] dispose_list+0x3b/0x70
[ 2340.924652]  [<ffffffff8125281a>] prune_icache_sb+0x5a/0x80
[ 2340.926292]  [<ffffffff81236caf>] super_cache_scan+0x14f/0x1a0
[ 2340.928031]  [<ffffffff81420283>] gfs2_quotad+0x113/0x420
[ 2340.929662]  [<ffffffff810d4e70>] ? wake_atomic_t_function+0x70/0x70
[ 2340.931265]  [<ffffffff81420170>] ? gfs2_wake_up_statfs+0x40/0x40
[ 2340.932724]  [<ffffffff810abe1e>] kthread+0xfe/0x120
[ 2340.933691]  [<ffffffff810abd20>] ? __kthread_parkme+0x90/0x90
[ 2340.934443]  [<ffffffff8180e89f>] ret_from_fork+0x3f/0x70
[ 2340.935127]  [<ffffffff810abd20>] ? __kthread_parkme+0x90/0x90
[ 2340.935828] 1 lock held by gfs2_quotad/742:
[ 2340.936340]  #0:  (&type->s_umount_key#36){.+.+..}, at: [<ffffffff81236b2b>] trylock_super+0x1b/0x50
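
Since a long run can end in either the lockdep splat or a hung task warning, it can help to count both signatures in the kernel log. The following is a minimal sketch that matches only the two message strings quoted in this report; the function name scan_klog is hypothetical, and the wording of these messages may differ on other kernel versions.

```shell
#!/bin/sh
# Sketch: read kernel log text on stdin and count the two warning
# signatures that appear in this report.
scan_klog() {
    awk '
        /possible irq lock inversion dependency detected/ { lockdep++ }
        /blocked for more than [0-9]+ seconds/            { hung++ }
        END { printf "lockdep=%d hung_task=%d\n", lockdep, hung }
    '
}
```

A typical invocation would be dmesg | scan_klog, run while or after the file creation loop executes.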

Comment 3 Andrew Price 2019-10-16 16:21:46 UTC
Three years later, this bug no longer reproduces, so I'm closing it.

