2351794 – 6.14.0-rc6 lockdep warning kswapd

Keywords:
Status:	NEW
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	42
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2025-03-13 03:51 UTC by Chris Murphy
Modified:	2025-03-21 23:26 UTC
CC List:	14 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type:	Bug
Embargoed:
Dependent Products:

Attachments
dmesg.log (197.32 KB, text/plain) 2025-03-13 03:51 UTC, Chris Murphy
dmesg-6.14.0-rc7 (186.46 KB, text/plain) 2025-03-20 04:32 UTC, Chris Murphy

Description Chris Murphy 2025-03-13 03:51:52 UTC

Created attachment 2079954 [details]
dmesg.log

Kernel: 6.14.0-0.rc6.49.fc42.x86_64+debug
Swap is enabled using a swapfile on Btrfs on dm-crypt (same file system as /).
zswap is enabled using zsmalloc/zstd.

The warning isn't reproducible on demand.
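
For anyone comparing setups, the zswap configuration described above can be read back from sysfs. This is a minimal sketch, assuming the module parameters are exposed under /sys/module/zswap/parameters; the helper name read_zswap_params is made up for illustration and is not part of any tool mentioned in this report.

```python
from pathlib import Path


def read_zswap_params(base="/sys/module/zswap/parameters"):
    """Return {parameter name: value} for every readable file under base.

    Returns an empty dict if the directory does not exist.
    """
    params = {}
    base_path = Path(base)
    if not base_path.is_dir():
        return params
    for entry in sorted(base_path.iterdir()):
        if not entry.is_file():
            continue
        try:
            params[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters may not be readable by the current user; skip them.
            continue
    return params


if __name__ == "__main__":
    for name, value in read_zswap_params().items():
        print(f"{name}: {value}")
```

The function lists whatever parameter files the running kernel provides rather than naming them, since the set of zswap parameters differs between kernel versions.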


[ 4898.604852] perf: interrupt took too long (6155 > 6142), lowering kernel.perf_event_max_sample_rate to 32000
[ 5009.879938] ======================================================
[ 5009.879948] WARNING: possible circular locking dependency detected
[ 5009.879958] 6.14.0-0.rc6.49.fc42.x86_64+debug #1 Not tainted
[ 5009.879971] ------------------------------------------------------
[ 5009.879980] kswapd0/97 is trying to acquire lock:
[ 5009.879991] ffffe8fffea2bf00 (&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex){+.+.}-{4:4}, at: zswap_compress+0x123/0x630
[ 5009.880036] 
               but task is already holding lock:
[ 5009.880046] ffffffff91a52ec0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x19d/0x1040
[ 5009.880083] 
               which lock already depends on the new lock.

[ 5009.880092] 
               the existing dependency chain (in reverse order) is:
[ 5009.880101] 
               -> #2 (fs_reclaim){+.+.}-{0:0}:
[ 5009.880130]        lock_acquire.part.0+0x125/0x360
[ 5009.880147]        fs_reclaim_acquire+0xc9/0x110
[ 5009.880163]        __kmalloc_cache_node_noprof+0x61/0x4f0
[ 5009.880178]        __get_vm_area_node+0xf6/0x2a0
[ 5009.880194]        __vmalloc_node_range_noprof+0x1fe/0x4c0
[ 5009.880206]        __vmalloc_node_noprof+0xb1/0x180
[ 5009.880218]        crypto_scomp_init_tfm+0x113/0x340
[ 5009.880229]        crypto_create_tfm_node+0xe9/0x2d0
[ 5009.880239]        crypto_init_scomp_ops_async+0x5a/0x1c0
[ 5009.880252]        crypto_create_tfm_node+0xe9/0x2d0
[ 5009.880265]        crypto_alloc_tfm_node+0xd7/0x1e0
[ 5009.880280]        alg_test_comp+0x10e/0x2c0
[ 5009.880294]        alg_test+0x365/0xff0
[ 5009.880306]        cryptomgr_test+0x54/0x80
[ 5009.880320]        kthread+0x39d/0x760
[ 5009.880332]        ret_from_fork+0x31/0x70
[ 5009.880344]        ret_from_fork_asm+0x1a/0x30
[ 5009.880357] 
               -> #1 (scomp_lock){+.+.}-{4:4}:
[ 5009.880376]        lock_acquire.part.0+0x125/0x360
[ 5009.880386]        __mutex_lock+0x1b3/0x1430
[ 5009.880395]        crypto_exit_scomp_ops_async+0x42/0x80
[ 5009.880405]        crypto_destroy_tfm+0xd8/0x250
[ 5009.880413]        zswap_cpu_comp_dead+0x11d/0x1c0
[ 5009.880420]        cpuhp_invoke_callback+0x190/0xa70
[ 5009.880431]        cpuhp_issue_call+0x13a/0x8a0
[ 5009.880439]        __cpuhp_state_remove_instance+0x214/0x510
[ 5009.880448]        __zswap_pool_release+0x48/0x110
[ 5009.880455]        process_one_work+0x896/0x14b0
[ 5009.880465]        worker_thread+0x5e5/0xfb0
[ 5009.880473]        kthread+0x39d/0x760
[ 5009.880481]        ret_from_fork+0x31/0x70
[ 5009.880488]        ret_from_fork_asm+0x1a/0x30
[ 5009.880495] 
               -> #0 (&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex){+.+.}-{4:4}:
[ 5009.880511]        check_prev_add+0x1ab/0x23c0
[ 5009.880519]        __lock_acquire+0x22d6/0x2e30
[ 5009.880527]        lock_acquire.part.0+0x125/0x360
[ 5009.880535]        __mutex_lock+0x1b3/0x1430
[ 5009.880543]        zswap_compress+0x123/0x630
[ 5009.880550]        zswap_store_page+0xf0/0xb50
[ 5009.880562]        zswap_store+0x72f/0xb90
[ 5009.880575]        swap_writepage+0x384/0x790
[ 5009.880588]        shmem_writepage+0xd14/0x14b0
[ 5009.880602]        pageout+0x372/0xa60
[ 5009.880615]        shrink_folio_list+0x26da/0x3880
[ 5009.880628]        evict_folios+0x670/0x1c40
[ 5009.880640]        try_to_shrink_lruvec+0x422/0x9d0
[ 5009.880654]        shrink_one+0x36d/0x820
[ 5009.880667]        shrink_many+0x337/0xc90
[ 5009.880680]        shrink_node+0x2f5/0x1460
[ 5009.880694]        balance_pgdat+0x544/0x1040
[ 5009.880708]        kswapd+0x2f9/0x510
[ 5009.880722]        kthread+0x39d/0x760
[ 5009.880736]        ret_from_fork+0x31/0x70
[ 5009.880751]        ret_from_fork_asm+0x1a/0x30
[ 5009.880764] 
               other info that might help us debug this:

[ 5009.880774] Chain exists of:
                 &per_cpu_ptr(pool->acomp_ctx, cpu)->mutex --> scomp_lock --> fs_reclaim

[ 5009.880811]  Possible unsafe locking scenario:

[ 5009.880819]        CPU0                    CPU1
[ 5009.880825]        ----                    ----
[ 5009.880831]   lock(fs_reclaim);
[ 5009.880848]                                lock(scomp_lock);
[ 5009.880866]                                lock(fs_reclaim);
[ 5009.880883]   lock(&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex);
[ 5009.880900] 
                *** DEADLOCK ***

[ 5009.880906] 1 lock held by kswapd0/97:
[ 5009.880918]  #0: ffffffff91a52ec0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x19d/0x1040
[ 5009.880960] 
               stack backtrace:
[ 5009.880971] CPU: 3 UID: 0 PID: 97 Comm: kswapd0 Not tainted 6.14.0-0.rc6.49.fc42.x86_64+debug #1
[ 5009.880984] Hardware name: LENOVO 20QDS3E200/20QDS3E200, BIOS N2HET77W (1.60 ) 02/06/2024
[ 5009.880989] Call Trace:
[ 5009.880994]  <TASK>
[ 5009.881002]  dump_stack_lvl+0x84/0xd0
[ 5009.881015]  print_circular_bug.cold+0x38/0x48
[ 5009.881031]  check_noncircular+0x309/0x3e0
[ 5009.881044]  ? __pfx_check_noncircular+0x10/0x10
[ 5009.881061]  ? mark_lock+0x75/0x890
[ 5009.881074]  ? alloc_chain_hlocks+0x4c2/0x6f0
[ 5009.881086]  check_prev_add+0x1ab/0x23c0
[ 5009.881104]  __lock_acquire+0x22d6/0x2e30
[ 5009.881125]  ? __pfx___lock_acquire+0x10/0x10
[ 5009.881135]  ? __lock_release.isra.0+0x4ab/0xa30
[ 5009.881144]  ? __lock_acquired+0x22b/0x880
[ 5009.881157]  lock_acquire.part.0+0x125/0x360
[ 5009.881167]  ? zswap_compress+0x123/0x630
[ 5009.881180]  ? __pfx_lock_acquire.part.0+0x10/0x10
[ 5009.881195]  ? rcu_is_watching+0x15/0xe0
[ 5009.881206]  ? lock_acquire+0x1a6/0x210
[ 5009.881220]  __mutex_lock+0x1b3/0x1430
[ 5009.881230]  ? zswap_compress+0x123/0x630
[ 5009.881237]  ? kmem_cache_alloc_node_noprof+0x153/0x4e0
[ 5009.881250]  ? swap_writepage+0x384/0x790
[ 5009.881257]  ? zswap_compress+0x123/0x630
[ 5009.881265]  ? pageout+0x372/0xa60
[ 5009.881271]  ? shrink_folio_list+0x26da/0x3880
[ 5009.881279]  ? evict_folios+0x670/0x1c40
[ 5009.881286]  ? try_to_shrink_lruvec+0x422/0x9d0
[ 5009.881295]  ? shrink_one+0x36d/0x820
[ 5009.881302]  ? shrink_many+0x337/0xc90
[ 5009.881313]  ? __pfx___mutex_lock+0x10/0x10
[ 5009.881321]  ? ret_from_fork+0x31/0x70
[ 5009.881330]  ? ret_from_fork_asm+0x1a/0x30
[ 5009.881354]  ? zswap_compress+0x123/0x630
[ 5009.881362]  zswap_compress+0x123/0x630
[ 5009.881374]  ? __pfx_zswap_compress+0x10/0x10
[ 5009.881389]  ? rcu_is_watching+0x15/0xe0
[ 5009.881401]  ? zswap_store_page+0xd6/0xb50
[ 5009.881417]  zswap_store_page+0xf0/0xb50
[ 5009.881430]  zswap_store+0x72f/0xb90
[ 5009.881442]  ? __pfx_zswap_store+0x10/0x10
[ 5009.881450]  ? folio_free_swap+0x169/0x470
[ 5009.881466]  swap_writepage+0x384/0x790
[ 5009.881479]  shmem_writepage+0xd14/0x14b0
[ 5009.881495]  ? __pfx_shmem_writepage+0x10/0x10
[ 5009.881504]  ? mark_usage+0x11e/0x330
[ 5009.881521]  ? folio_clear_dirty_for_io+0x115/0x6a0
[ 5009.881537]  pageout+0x372/0xa60
[ 5009.881547]  ? __pfx_pageout+0x10/0x10
[ 5009.881586]  ? folio_check_references.isra.0+0x79/0x2f0
[ 5009.881596]  ? __pfx_folio_check_references.isra.0+0x10/0x10
[ 5009.881610]  ? folio_evictable+0xa5/0x200
[ 5009.881627]  shrink_folio_list+0x26da/0x3880
[ 5009.881645]  ? __pfx_shrink_folio_list+0x10/0x10
[ 5009.881661]  ? __pfx_scan_folios+0x10/0x10
[ 5009.881692]  ? mark_held_locks+0x96/0xe0
[ 5009.881704]  ? _raw_spin_unlock_irq+0x28/0x60
[ 5009.881717]  evict_folios+0x670/0x1c40
[ 5009.881739]  ? mark_usage+0x11e/0x330
[ 5009.881749]  ? __pfx_evict_folios+0x10/0x10
[ 5009.881760]  ? mark_lock+0x75/0x890
[ 5009.881782]  ? __pfx___might_resched+0x10/0x10
[ 5009.881800]  try_to_shrink_lruvec+0x422/0x9d0
[ 5009.881821]  ? __lock_release.isra.0+0x4ab/0xa30
[ 5009.881833]  ? __pfx_try_to_shrink_lruvec+0x10/0x10
[ 5009.881845]  ? mark_lock+0x75/0x890
[ 5009.881859]  shrink_one+0x36d/0x820
[ 5009.881870]  ? shrink_many+0x312/0xc90
[ 5009.881882]  shrink_many+0x337/0xc90
[ 5009.881891]  ? shrink_many+0x312/0xc90
[ 5009.881909]  shrink_node+0x2f5/0x1460
[ 5009.881932]  ? __pfx_shrink_node+0x10/0x10
[ 5009.881951]  ? pgdat_balanced+0xb3/0x1a0
[ 5009.881965]  balance_pgdat+0x544/0x1040
[ 5009.881978]  ? __pfx_balance_pgdat+0x10/0x10
[ 5009.881986]  ? set_pgdat_percpu_threshold+0x1bd/0x300
[ 5009.882000]  ? _raw_spin_unlock_irq+0x38/0x60
[ 5009.882005]  ? __refrigerator+0x110/0x260
[ 5009.882015]  kswapd+0x2f9/0x510
[ 5009.882023]  ? __pfx_kswapd+0x10/0x10
[ 5009.882029]  ? __kthread_parkme+0xb0/0x1e0
[ 5009.882037]  ? __pfx_kswapd+0x10/0x10
[ 5009.882042]  kthread+0x39d/0x760
[ 5009.882048]  ? __pfx_kthread+0x10/0x10
[ 5009.882056]  ? _raw_spin_unlock_irq+0x28/0x60
[ 5009.882060]  ? __pfx_kthread+0x10/0x10
[ 5009.882067]  ret_from_fork+0x31/0x70
[ 5009.882072]  ? __pfx_kthread+0x10/0x10
[ 5009.882077]  ret_from_fork_asm+0x1a/0x30
[ 5009.882089]  </TASK>
[ 5380.636730] show_signal_msg: 6 callbacks suppressed
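
To make the "Chain exists of" line in the log above easier to follow, here is a small sketch of the idea behind the warning: each recorded dependency is an edge "lock B was taken while lock A was held", and the new acquisition in zswap_compress would close a loop. This is only an illustration of that reasoning, not lockdep's actual implementation; the function names are invented for the example.

```python
# Edge (A, B) means "B was acquired while A was held", as in the chain
# lockdep prints in the report.
ACOMP_MUTEX = "&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex"

existing_edges = [
    (ACOMP_MUTEX, "scomp_lock"),   # entry #1: zswap_cpu_comp_dead path
    ("scomp_lock", "fs_reclaim"),  # entry #2: crypto_scomp_init_tfm -> vmalloc
]

# Entry #0: kswapd holds fs_reclaim and wants the per-CPU acomp_ctx mutex.
new_edge = ("fs_reclaim", ACOMP_MUTEX)


def find_path(edges, start, goal):
    """Depth-first search; return the list of locks from start to goal, or None."""
    graph = {}
    for held, acquired in edges:
        graph.setdefault(held, []).append(acquired)
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


def closes_cycle(edges, candidate):
    """Return the cycle that adding candidate would close, or None."""
    held, acquired = candidate
    # A cycle exists if the lock being acquired already leads back to the held lock.
    path = find_path(edges, acquired, held)
    return path + [acquired] if path else None


if __name__ == "__main__":
    cycle = closes_cycle(existing_edges, new_edge)
    if cycle:
        print(" --> ".join(cycle))
```

With the two existing edges taken from the report, the candidate edge from fs_reclaim to the acomp_ctx mutex leads back to its own starting point, which is the circular dependency the warning describes.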



$ sudo zswap-cli --stats
ZSWAP KERNEL MODULE SETTINGS:
ZSwap enabled: Y.
Same filled pages enabled: N/A.
Maximum pool percentage: 15.
Compression algorithm: zstd.
Kernel's zpool type: zsmalloc.
Accept threshold percentage: 90.
Non same filled pages enabled: N/A.
Exclusive loads: N/A.
Shrinker enabled: Y.

ZSWAP KERNEL MODULE USAGE SUMMARY:
Pool: 31.61 MiB (0.2% of MemTotal).
Stored: 135.56 MiB (87.0% of SwapUsed).
Compression ratio: 4.29.

ZSWAP KERNEL MODULE DEBUG INFO:
Pool limit hit: 0.
Pool total size: 33148928.
Reject allocation failures: 0.
Reject compression poor: 0.
Reject Kmemcache failures: 0.
Reject reclaim failures: 0.
Reject compression failures: 272.
Same filled pages count: 0.
Stored pages count: 34704.
Written back pages count: 0.
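
As a sanity check on the stats above, the usage summary can be recomputed from the debug counters. This sketch assumes 4 KiB pages (the x86_64 default) and assumes the compression ratio is stored bytes divided by pool bytes, which is consistent with the numbers shown but is not taken from the zswap-cli source.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, the x86_64 default
MIB = 1024 * 1024

stored_pages = 34704        # "Stored pages count"
pool_total_size = 33148928  # "Pool total size", in bytes

stored_bytes = stored_pages * PAGE_SIZE
stored_mib = stored_bytes / MIB
pool_mib = pool_total_size / MIB
ratio = stored_bytes / pool_total_size  # assumed definition of the ratio

print(f"Stored: {stored_mib:.2f} MiB")
print(f"Pool: {pool_mib:.2f} MiB")
print(f"Compression ratio: {ratio:.2f}")
```

The results line up with the reported 135.56 MiB stored, 31.61 MiB pool and 4.29 ratio, so the summary and debug sections are internally consistent.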

Comment 1 Chris Murphy 2025-03-13 03:56:46 UTC

Reported to the linux-mm list
https://marc.info/?l=linux-mm&m=174183768913672&w=2

Comment 2 Chris Murphy 2025-03-20 04:32:22 UTC

Created attachment 2080959 [details]
dmesg-6.14.0-rc7
