Bug 2265269 (CVE-2023-52437)

Summary: CVE-2023-52437 kernel: Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"
Product: [Other] Security Response Reporter: Avinash Hanwate <ahanwate>
Component: vulnerability Assignee: Product Security <prodsec-ir-bot>
Status: CLOSED NOTABUG QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: unspecified CC: acaringi, allarkin, aquini, bhu, chwhite, cye, cyin, dbohanno, debarbos, dfreiber, drow, dvlasenk, ezulian, hkrzesin, jarod, jburrell, jdenham, jfaracco, jforbes, jlelli, joe.lawrence, jshortt, jstancek, jwyatt, kcarcia, ldoskova, lgoncalv, lzampier, mleitner, mmilgram, mstowell, nmurray, pbonzini, ptalbert, rparrazo, rrobaina, rvrbovsk, rysulliv, scweaver, sidakwo, sukulkar, tglozar, tyberry, vkumar, wcosta, williams, wmealing, ycote, ykopkova, zhijwang
Target Milestone: --- Keywords: Security
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
A flaw was found in the Linux kernel's md/raid5 driver, specifically introduced by commit 5e2cf333b7bd. This commit triggers a race condition wherein the system hangs due to improper handling of MD_SB_CHANGE_PENDING flags. During the execution of md_write_start, if MD_SB_CHANGE_PENDING is set and concurrently cleared by raid5d, it can lead to a deadlock situation. This results in system unresponsiveness, potentially causing a denial of service (DoS).
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-02-25 10:51:23 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2265270    
Bug Blocks: 2265182    

Description Avinash Hanwate 2024-02-21 08:33:28 UTC
In the Linux kernel, the following vulnerability has been resolved:

Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"

This reverts commit 5e2cf333b7bd5d3e62595a44d598a254c697cd74.

That commit introduced the following race, which can cause the system to hang.

 md_write_start:             raid5d:
 // mddev->in_sync == 1
 set "MD_SB_CHANGE_PENDING"
                             // runs before md_write_start wakes it up
                             waiting "MD_SB_CHANGE_PENDING" cleared
                             >>>>>>>>> hung
 wakeup mddev->thread
 ...
 waiting "MD_SB_CHANGE_PENDING" cleared
 >>>> hung: raid5d should clear this flag,
 but it is itself hung on the same flag.

The issue that the reverted commit was fixing is fixed by the last patch in a new way.
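
The interleaving in the diagram above can be replayed step by step in a small toy model. This is only an illustrative sketch, not kernel code: the `Mddev` class, the `simulate` function, and the `raid5d_waits_for_flag` switch are hypothetical names, and the model assumes, as the commit message states, that raid5d is the party expected to clear MD_SB_CHANGE_PENDING.

```python
from dataclasses import dataclass


@dataclass
class Mddev:
    """Hypothetical stand-in for the two pieces of state named in the diagram."""
    in_sync: bool = True
    sb_change_pending: bool = False  # models MD_SB_CHANGE_PENDING


def simulate(raid5d_waits_for_flag: bool) -> dict:
    """Replay the interleaving from the commit message in a fixed order.

    Returns which actors end up blocked. Toy model under the assumptions
    stated above, not the real md/raid5 logic.
    """
    mddev = Mddev()
    blocked = {"md_write_start": False, "raid5d": False}

    # md_write_start: the array is in sync, so mark a superblock change pending.
    if mddev.in_sync:
        mddev.in_sync = False
        mddev.sb_change_pending = True

    # raid5d runs before md_write_start has woken it up.
    if raid5d_waits_for_flag and mddev.sb_change_pending:
        # Behaviour attributed to commit 5e2cf333b7bd: wait for the flag to clear.
        blocked["raid5d"] = True
    else:
        # Assumed behaviour without the wait: raid5d goes on to clear the flag.
        mddev.sb_change_pending = False

    # md_write_start wakes mddev->thread, then waits for the flag to clear.
    # If raid5d is already stuck waiting on it, nobody is left to clear it.
    blocked["md_write_start"] = mddev.sb_change_pending

    return blocked


if __name__ == "__main__":
    print("with wait in raid5d:   ", simulate(True))
    print("without wait in raid5d:", simulate(False))
```

With the wait enabled, both actors end up blocked on the same flag, which is the hang described above; with the wait removed, neither blocks in this simplified model.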

https://git.kernel.org/stable/c/84c39986fe6dd77aa15f08712339f5d4eb7dbe27
https://git.kernel.org/stable/c/bed0acf330b2c50c688f6d9cfbcac2aa57a8e613
https://git.kernel.org/stable/c/cfa46838285814c3a27faacf7357f0a65bb5d152
https://git.kernel.org/stable/c/e16a0bbdb7e590a6607b0d82915add738c03c069
https://git.kernel.org/stable/c/aab69ef769707ad987ff905d79e0bd6591812580
https://git.kernel.org/stable/c/0de40f76d567133b871cd6ad46bb87afbce46983
https://git.kernel.org/stable/c/87165c64fe1a98bbab7280c58df3c83be2c98478
https://git.kernel.org/stable/c/bed9e27baf52a09b7ba2a3714f1e24e17ced386d

Comment 1 Avinash Hanwate 2024-02-21 08:35:17 UTC
Created kernel tracking bugs for this issue:

Affects: fedora-all [bug 2265270]

Comment 3 Paolo Bonzini 2024-02-21 11:40:20 UTC
This CVE is bogus.  The revert was later reverted again on the stable branch because it itself caused hangs.

I'm preparing the popcorn.

Comment 4 Wade Mealing 2024-02-21 12:11:25 UTC
If it was an attempt at a fix, this means the CVE is not bogus, ie, it should not be disputed.

Instead this is an incomplete fix, and another CVE needs to be generated.

Comment 6 Paolo Bonzini 2024-02-21 13:36:23 UTC
> If it was an attempt at a fix, this means the CVE is not bogus, ie, it should not be disputed.

According to https://lore.kernel.org/all/626f3f93-7085-7bd4-2172-3f97fcf197c9@huaweicloud.com/T/, this commit is just an "attempt to undo a suboptimal fix" that, as it turned out, had to be reverted.

The actual fix is in commit d6e035aad6c0 ("md: bypass block throttle for superblock update").

So there is a bug that is fixed by commit d6e035aad6c0.  I am not sure if it should be considered an instance of this CVE, or a CVE at all.

Comment 7 Justin M. Forbes 2024-02-21 22:52:24 UTC
The fix is queued for 6.7.6, but yes, the revert landed in 6.6.13, and the revert was reverted in 6.6.14.  I will mark 6.7.6 as fixing this for Fedora when it lands.