Bug 1589249
Summary: VDO can go read-only, lose sparsely-written data when parts are discarded.

| Field | Value |
|---|---|
| Product | Red Hat Enterprise Linux 7 |
| Component | kmod-kvdo |
| Version | 7.6 |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | high |
| Target Milestone | rc |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | 6.1.1.99 |
| Reporter | Sweet Tea Dorminy <sweettea> |
| Assignee | bjohnsto |
| QA Contact | Jakub Krysl <jkrysl> |
| CC | awalsh, corwin, jerome, jkrysl, knappch, limershe, rhandlin, ryan.p.norwood |
| Keywords | ZStream |
| Doc Type | If docs needed, set a value |
| Clones | 1600058 (view as bug list) |
| Bug Blocks | 1600058 |
| Type | Bug |
| Last Closed | 2018-10-30 09:39:49 UTC |

Doc Text:

Previously, VDO did not correctly reinitialize certain structures when a discard spanned logical addresses on two different block map trees. As a consequence, a discard operation sometimes switched the affected VDO volume to read-only mode or corrupted data on it in rare cases. With this update, VDO now reinitializes the structures correctly, and the described problem no longer occurs.
Description (Sweet Tea Dorminy, 2018-06-08 14:10:44 UTC)
In order to set the max discard sectors to 1G, run the following command after the VDO kernel module is loaded but before running `vdo create`:

```
echo 2097152 | sudo tee -a /sys/kvdo/max_discard_sectors
```

2097152 should be the number of sectors in 1G if I've done my math right. Note that this command only works on RHEL 7.x.

Thanks Bruce, managed to reproduce on kmod-kvdo-6.1.1.91. It took 8 iterations for my reproducer to hit it:

```
blkdiscard: /dev/mapper/vdo: BLKDISCARD ioctl failed: Input/output error
operating mode: read-only
```

From /var/log/messages (there are six more identical call traces right before this one):

```
[ 9124.370923] INFO: task blkdiscard:24572 blocked for more than 120 seconds.
[ 9124.405063] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9124.444681] Call Trace:
[ 9124.456537] [<ffffffffc0351829>] ? __split_and_process_bio+0x2e9/0x520 [dm_mod]
[ 9124.491228] [<ffffffff939580f9>] schedule+0x29/0x70
[ 9124.514617] [<ffffffff93955a41>] schedule_timeout+0x221/0x2d0
[ 9124.542323] [<ffffffffc0351d88>] ? dm_make_request+0x128/0x1a0 [dm_mod]
[ 9124.573314] [<ffffffff932ff122>] ? ktime_get_ts64+0x52/0xf0
[ 9124.600013] [<ffffffff9395760d>] io_schedule_timeout+0xad/0x130
[ 9124.627927] [<ffffffff9395872d>] wait_for_completion_io+0xfd/0x140
[ 9124.657410] [<ffffffff932d4db0>] ? wake_up_state+0x20/0x20
[ 9124.683209] [<ffffffff93547f1c>] blkdev_issue_discard+0x2ac/0x2d0
[ 9124.711212] [<ffffffff93550fa1>] blk_ioctl_discard+0xd1/0x120
[ 9124.737403] [<ffffffff93551a72>] blkdev_ioctl+0x5e2/0x9b0
[ 9124.763231] [<ffffffff9347c3c1>] block_ioctl+0x41/0x50
[ 9124.787680] [<ffffffff93452410>] do_vfs_ioctl+0x360/0x550
[ 9124.813394] [<ffffffff934526a1>] SyS_ioctl+0xa1/0xc0
[ 9124.837119] [<ffffffff939648af>] system_call_fastpath+0x1c/0x21
[-- MARK -- Tue Jul 24 10:25:00 2018]
[ 9418.702380] kvdo3:physQ0: Decrementing free block at offset 411747 in slab 0: kvdo: Reference count would become invalid (2050)
[ 9418.761458] kvdo3:logQ0: Preparing to enter read-only mode: DataVIO for LBN 51181 (becoming mapped to 1095105, previously mapped to 1090936, allocated 1095105) is completing with a fatal error after operation journalDecrementForWrite: kvdo: Reference count would become invalid (2050)
[ 9418.874824] kvdo3:logQ0:
[ 9418.874829] kvdo3:journalQ: Unrecoverable error, entering read-only mode: kvdo: Reference count would become invalid (2050)
[ 9418.935780] Completing write VIO for LBN 51181 with error after journalDecrementForWrite: kvdo: Reference count would become invalid (2050)
[ 9418.992415] kvdo3:physQ0: VDO not read-only when cleaning DataVIO with RJ lock
[ 9419.905606] kvdo3:dmsetup: suspending device 'vdo'
[ 9419.930690] kvdo3:packerQ: compression is disabled
[ 9419.956022] kvdo3:packerQ: compression is enabled
[ 9419.977157] kvdo3:dmsetup: device 'vdo' suspended
[ 9419.998325] kvdo3:dmsetup: stopping device 'vdo'
[ 9420.019251] kvdo3:journalQ: Error closing VDO: kvdo: The device is in read-only mode (2069)
[ 9420.056344] kvdo3:journalQ: Error closing VDO: kvdo: The device is in read-only mode (2069)
[ 9420.095247] kvdo3:journalQ: Error closing VDO: kvdo: The device is in read-only mode (2069)
[ 9420.132796] kvdo3:dmsetup: uds: kvdo3:dedupeQ: index_0: beginning save (vcn 21)
[ 9420.165673] Setting UDS index target state to closed
```

Tested kmod-kvdo-6.1.1.99 and could not reproduce it after 50 iterations.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3094
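The 2097152 figure written to `max_discard_sectors` above can be double-checked with a little shell arithmetic. This is only a sanity check of the comment's math, assuming the standard 512-byte block-layer sector and 1G meaning 2^30 bytes:

```shell
# Number of 512-byte sectors in 1 GiB, i.e. the value the comment above
# writes to /sys/kvdo/max_discard_sectors.
bytes_per_gib=$((1024 * 1024 * 1024))    # 2^30 bytes
sector_size=512                          # block-layer sector size in bytes
sectors=$((bytes_per_gib / sector_size))
echo "$sectors"                          # prints 2097152
```

So 2^30 / 2^9 = 2^21 = 2097152 sectors, matching the value in the comment.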