Bug 1916891
Summary: | lvm2 test suite crashed in raid5 control code | |||
---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Zdenek Kabelac <zkabelac> | |
Component: | lvm2 | Assignee: | Heinz Mauelshagen <heinzm> | |
Status: | CLOSED UPSTREAM | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | |
Severity: | unspecified | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 34 | CC: | agk, anprice, bmarzins, bmr, cfeist, heinzm, jonathan, kzak, lvm-team, mcsontos, msnitzer, prajnoha, prockai, zkabelac | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1929675 | Environment: | |
Last Closed: | 2021-07-15 14:51:54 UTC | Type: | Bug | |
Bug Blocks: | 1929675 |
Description
Zdenek Kabelac
2021-01-15 18:13:38 UTC
Actually this issue was already opened as bug 1859336 for kernel 5.8, so it is still present.

This bug appears to have been reported against 'rawhide' during the Fedora 34 development cycle. Changing version to 34.

The BUG_ON is triggered easily on single-core systems and much less often on multi-core systems (see below).

Rationale:

1. The md sync thread calls end_reshape() from raid5_sync_request() when it is done reshaping. end_reshape() _only_ updates the reshape position to MaxSector but keeps the changed layout configuration, i.e. any delta disks, chunk sector or raid algorithm changes. That inconsistent configuration is stored in the superblock.
2. dm-raid constructs a mapping, loading the inconsistent superblock from step 1 before step 3 was able to finish, and calls md_run(), which leads to the bug in raid5.c given in the description.
3. The MD RAID personality's finish_reshape() is called, which resets the reshape information about chunk sectors, delta disks etc.

This explains why the BUG is rarely seen on multi-core machines: finish_reshape() races with the dm-raid constructor of step 2 and thus may finish before the superblock gets loaded in step 2.

Also, dm-raid postsuspend may even prevent the MD sync thread from calling finish_reshape() and storing superblocks completely.

Upstream patch submitted -> https://listman.redhat.com/archives/dm-devel/2021-April/msg00182.html

(In reply to Heinz Mauelshagen from comment #6)
> Upstream patch submitted ->
> https://listman.redhat.com/archives/dm-devel/2021-April/msg00182.html

This patch is now upstream (and was marked for stable@), see: http://git.kernel.org/linus/f99a8e4373eeacb279bc9696937a55adbff7a28a
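The ordering problem described in the rationale (steps 1 to 3) can be illustrated with a small toy model. This is only a sketch and not kernel code: the `Superblock` class, its field values, and the helper names `constructor_would_bug()` and `run()` are hypothetical, and the consistency check is an assumption about the general shape of the condition ("no reshape in progress, yet layout changes still pending"), not a copy of the actual BUG_ON in raid5.c.

```python
from dataclasses import dataclass

MAX_SECTOR = 2**64 - 1  # stand-in for the kernel's MaxSector sentinel


@dataclass
class Superblock:
    """Hypothetical, heavily reduced model of the reshape state in the superblock."""
    reshape_position: int = 0      # MAX_SECTOR means "no reshape in progress"
    delta_disks: int = 1           # pending change in the number of disks
    chunk_sectors: int = 128       # current chunk size (illustrative value)
    new_chunk_sectors: int = 256   # chunk size requested by the reshape


def end_reshape(sb: Superblock) -> None:
    # Step 1: only the reshape position is updated; the pending changes stay.
    sb.reshape_position = MAX_SECTOR


def finish_reshape(sb: Superblock) -> None:
    # Step 3: the pending layout changes are applied and reset.
    sb.chunk_sectors = sb.new_chunk_sectors
    sb.delta_disks = 0


def constructor_would_bug(sb: Superblock) -> bool:
    # Step 2: assumed form of the check hit when the constructor runs the array.
    no_reshape = sb.reshape_position == MAX_SECTOR
    pending = sb.delta_disks != 0 or sb.chunk_sectors != sb.new_chunk_sectors
    return no_reshape and pending


def run(order):
    """Run end_reshape(), then the given steps ("load" / "finish") in order."""
    sb = Superblock()
    end_reshape(sb)
    result = None
    for step in order:
        if step == "finish":
            finish_reshape(sb)
        elif step == "load":
            result = constructor_would_bug(sb)
    return result


# Constructor loads the superblock before finish_reshape() ran: inconsistent.
print(run(["load", "finish"]))
# finish_reshape() wins the race: the superblock is consistent when loaded.
print(run(["finish", "load"]))
```

In this model the first ordering corresponds to the single-core case, where the constructor commonly sees the state left behind by step 1, and the second to the case where finish_reshape() completes first.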