Bug 2298136 (CVE-2022-48800)

Summary: CVE-2022-48800 kernel: mm: vmscan: remove deadlock due to throttling failing to make progress
Product: [Other] Security Response Reporter: OSIDB Bzimport <bzimport>
Component: vulnerabilityAssignee: Product Security DevOps Team <prodsec-dev>
Status: NEW --- QA Contact:
Severity: low Docs Contact:
Priority: low    
Version: unspecifiedCC: dfreiber, drow, jburrell, vkumar
Target Milestone: ---Keywords: Security
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: kernel 5.16.10, kernel 5.17 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description OSIDB Bzimport 2024-07-16 12:28:12 UTC
In the Linux kernel, the following vulnerability has been resolved:

mm: vmscan: remove deadlock due to throttling failing to make progress

A soft lockup bug in kcompactd was reported in a private bugzilla with
the following visible in dmesg;

  watchdog: BUG: soft lockup - CPU#33 stuck for 26s! [kcompactd0:479]
  watchdog: BUG: soft lockup - CPU#33 stuck for 52s! [kcompactd0:479]
  watchdog: BUG: soft lockup - CPU#33 stuck for 78s! [kcompactd0:479]
  watchdog: BUG: soft lockup - CPU#33 stuck for 104s! [kcompactd0:479]

The machine had 256G of RAM with no swap and an earlier failed
allocation indicated that node 0 where kcompactd was run was potentially
unreclaimable;

  Node 0 active_anon:29355112kB inactive_anon:2913528kB active_file:0kB
    inactive_file:0kB unevictable:64kB isolated(anon):0kB isolated(file):0kB
    mapped:8kB dirty:0kB writeback:0kB shmem:26780kB shmem_thp:
    0kB shmem_pmdmapped: 0kB anon_thp: 23480320kB writeback_tmp:0kB
    kernel_stack:2272kB pagetables:24500kB all_unreclaimable? yes

Vlastimil Babka investigated a crash dump and found that a task
migrating pages was trying to drain PCP lists;

  PID: 52922  TASK: ffff969f820e5000  CPU: 19  COMMAND: "kworker/u128:3"
  Call Trace:
     __schedule
     schedule
     schedule_timeout
     wait_for_completion
     __flush_work
     __drain_all_pages
     __alloc_pages_slowpath.constprop.114
     __alloc_pages
     alloc_migration_target
     migrate_pages
     migrate_to_node
     do_migrate_pages
     cpuset_migrate_mm_workfn
     process_one_work
     worker_thread
     kthread
     ret_from_fork

This failure is specific to CONFIG_PREEMPT=n builds.  The root of the
problem is that kcompact0 is not rescheduling on a CPU while a task that
has isolated a large number of the pages from the LRU is waiting on
kcompact0 to reschedule so the pages can be released.  While
shrink_inactive_list() only loops once around too_many_isolated, reclaim
can continue without rescheduling if sc->skipped_deactivate == 1 which
could happen if there was no file LRU and the inactive anon list was not
low.