Bug 2297560 (CVE-2024-40976) - CVE-2024-40976 kernel: drm/lima: mask irqs in timeout path before hard reset
Summary: CVE-2024-40976 kernel: drm/lima: mask irqs in timeout path before hard reset
Keywords:
Status: NEW
Alias: CVE-2024-40976
Product: Security Response
Classification: Other
Component: vulnerability
Version: unspecified
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Product Security DevOps Team
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-07-12 13:44 UTC by OSIDB Bzimport
Modified: 2024-09-25 07:26 UTC
CC List: 4 users

Fixed In Version: kernel 5.10.221, kernel 5.15.162, kernel 6.1.96, kernel 6.6.36, kernel 6.9.7, kernel 6.10-rc1
Doc Type: If docs needed, set a value
Doc Text:
A race condition in the `drm/lima` driver could cause a refcount imbalance on `lima_pm_idle`, leading to warning stack dumps. This occurs when a rendering job completes just after triggering the timeout handler but before a hard reset.
Clone Of:
Environment:
Last Closed:
Embargoed:



Description OSIDB Bzimport 2024-07-12 13:44:31 UTC
In the Linux kernel, the following vulnerability has been resolved:

drm/lima: mask irqs in timeout path before hard reset

There is a race condition in which a rendering job might take just long
enough to trigger the drm sched job timeout handler but still complete
before the timeout handler performs the hard reset. The timeout handler
does not expect this interleaving. In some very specific cases it may
result in a refcount imbalance on lima_pm_idle, with a stack dump such
as:

[10136.669170] WARNING: CPU: 0 PID: 0 at drivers/gpu/drm/lima/lima_devfreq.c:205 lima_devfreq_record_idle+0xa0/0xb0
...
[10136.669459] pc : lima_devfreq_record_idle+0xa0/0xb0
...
[10136.669628] Call trace:
[10136.669634]  lima_devfreq_record_idle+0xa0/0xb0
[10136.669646]  lima_sched_pipe_task_done+0x5c/0xb0
[10136.669656]  lima_gp_irq_handler+0xa8/0x120
[10136.669666]  __handle_irq_event_percpu+0x48/0x160
[10136.669679]  handle_irq_event+0x4c/0xc0
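
To make the interleaving concrete, here is one plausible timeline of the
race (function names are taken from the stack trace above; the exact call
paths and accounting steps are an illustration, not a verbatim trace):

scheduler timeout work                   irq context
----------------------                   -----------
lima_sched_timedout_job() starts
  (gives up on the job; its recovery
   path records the core as idle:
   busy count 1 -> 0)
                                         job completes anyway:
                                         lima_gp_irq_handler()
                                           lima_sched_pipe_task_done()
                                             lima_devfreq_record_idle()
                                             (busy count already 0 ->
                                              WARN at lima_devfreq.c:205)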

We can prevent this race entirely by masking the irqs at the beginning
of the timeout handler, the point at which we give up on waiting for
that job. The irqs will be enabled again by the next hard reset, which
the timeout handler already performs as recovery.
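
A minimal sketch of that approach follows, for illustration only: the
task_mask_irq callback, the register writes, and the helper names below
are assumptions in the style of drivers/gpu/drm/lima, not the verbatim
upstream patch.

/* Mask all GP interrupt sources so a late "task done" irq can no
 * longer fire; a PP counterpart would do the same with its own
 * registers. (Illustrative sketch, not the actual patch.) */
static void lima_gp_mask_irq(struct lima_ip *ip)
{
	gp_write(LIMA_GP_INT_MASK, 0);
	gp_write(LIMA_GP_INT_CLEAR, LIMA_GP_IRQ_MASK_ALL);
}

static enum drm_gpu_sched_stat lima_sched_timedout_job(struct drm_sched_job *job)
{
	struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);

	/*
	 * The job might still finish while this handler runs; mask the
	 * core's irqs first so a late completion can no longer race
	 * with the recovery below. The hard reset performed as part of
	 * that recovery re-enables the irqs.
	 */
	pipe->task_mask_irq(pipe);

	/* ... existing recovery: stop the scheduler, hard reset the
	 * core, resubmit the remaining jobs ... */
	return DRM_GPU_SCHED_STAT_NOMINAL;
}

Because the hard reset reinitializes the interrupt masks anyway, no
separate unmask step is needed in this path.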

