Bug 1608070
| Summary: | NULL pointer dereference while deleting VDO volumes that had been used for stacked raid testing | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | kmod-kvdo | Assignee: | Matthew Sakai <msakai> |
| Status: | CLOSED ERRATA | QA Contact: | Corey Marthaler <cmarthal> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | 7.6 | CC: | awalsh, cmarthal, jkrysl, limershe, msakai, rhandlin |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | 6.1.1.120 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1619605 (view as bug list) | Environment: | |
| Last Closed: | 2018-10-30 09:40:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1619605 | | |
|
Description
Corey Marthaler 2018-07-24 22:22:48 UTC
I think I see the issue. First, I'm assuming that the storage under the VDO device does not accept flushes. Is this correct? It looks like if the storage does not accept flushes, and VDO's other components release their atomic locks with the right timing, then the recovery journal reaping code can grow the stack to an arbitrary depth. This case occurred during device shutdown, which may have helped us produce the problematic lock-release timing, since everything will be trying to release its locks at that point. The fix is not obvious, but we are considering options. Given the reliance on timing between different threads, however, I'm not sure how easy it will be to reproduce this issue reliably in order to test whichever fix we settle on.

It appears they do not:

```
Jul 24 20:41:28 hayes-02 kernel: kvdo31:dmsetup: underlying device, REQ_FLUSH: not supported, REQ_FUA: not supported
Jul 24 20:41:29 hayes-02 kernel: kvdo32:dmsetup: underlying device, REQ_FLUSH: not supported, REQ_FUA: not supported
Jul 24 20:41:31 hayes-02 kernel: kvdo33:dmsetup: underlying device, REQ_FLUSH: not supported, REQ_FUA: not supported
Jul 24 20:41:33 hayes-02 kernel: kvdo34:dmsetup: underlying device, REQ_FLUSH: not supported, REQ_FUA: not supported
```

We're no longer seeing this issue after stacked raid10 VDO tests are run and cleaned up for additional testing. Marking verified in the latest rpms.
[lvm_vdo_raid] lvm_vdo_raid_sanity_raid10 PASS

```
3.10.0-951.el7.x86_64
vdo-6.1.1.125-3.el7                         BUILT: Sun Sep 16 21:51:14 CDT 2018
kmod-kvdo-6.1.1.125-5.el7                   BUILT: Tue Sep 18 09:32:02 CDT 2018
lvm2-2.02.180-8.el7                         BUILT: Mon Sep 10 04:45:22 CDT 2018
lvm2-libs-2.02.180-8.el7                    BUILT: Mon Sep 10 04:45:22 CDT 2018
device-mapper-1.02.149-8.el7                BUILT: Mon Sep 10 04:45:22 CDT 2018
device-mapper-libs-1.02.149-8.el7           BUILT: Mon Sep 10 04:45:22 CDT 2018
device-mapper-event-1.02.149-8.el7          BUILT: Mon Sep 10 04:45:22 CDT 2018
device-mapper-event-libs-1.02.149-8.el7     BUILT: Mon Sep 10 04:45:22 CDT 2018
device-mapper-persistent-data-0.7.3-3.el7   BUILT: Tue Nov 14 05:07:18 CST 2017
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3094