Bug 1589249

Summary: VDO can go read-only, lose sparsely-written data when parts are discarded.
Product: Red Hat Enterprise Linux 7
Component: kmod-kvdo
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Status: CLOSED ERRATA
Target Milestone: rc
Keywords: ZStream
Reporter: Sweet Tea Dorminy <sweettea>
Assignee: bjohnsto
QA Contact: Jakub Krysl <jkrysl>
CC: awalsh, corwin, jerome, jkrysl, knappch, limershe, rhandlin, ryan.p.norwood
Fixed In Version: 6.1.1.99
Doc Type: If docs needed, set a value
Doc Text:
Previously, VDO did not correctly reinitialize certain structures when a discard spanned logical addresses on two different block map trees. As a consequence, a discard operation sometimes switched the affected VDO volume to read-only mode or corrupted data on it in rare cases. With this update, VDO now reinitializes the structures correctly, and the described problem no longer occurs.
Clones: 1600058
Bug Blocks: 1600058
Type: Bug
Last Closed: 2018-10-30 09:39:49 UTC

Description Sweet Tea Dorminy 2018-06-08 14:10:44 UTC
Description of problem:

When a discard spans logical addresses on two different block map trees, VDO does not correctly reinitialize some of its structures between the two trees. As a result, if the logical address on the first tree was mapped to a physical address, while the second tree was never fully allocated because nothing in its range of logical addresses was ever mapped, VDO may incorrectly reuse the block map page from the first tree when it attempts to process the block map page for the second tree. This can result in any of the following:

1. Random data in unwritten sectors, in the case of a non-4k-aligned 4k discard, which requires VDO's 512e mode (see the example below).
2. Ruined reference counts for the mapped blocks on the block map page, which can lead to underflow and read-only mode.
3. Reuse of the physical address for other data, leading to corruption.
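For reference, a non-4k-aligned 4k discard is one whose offset is 512-byte aligned but not 4k-aligned; something like the following would issue one (the device name is illustrative):

blkdiscard --offset 2048 --length 4096 /dev/mapper/vdo0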


Version-Release number of selected component (if applicable):
From 6.0.0.85 (or earlier) through 6.2.0.84.

How reproducible:
Depends on the discard size and how much data is written. With the right setup, 90%+.

Steps to Reproduce:
1. Set VDO's max discard size to 1G.
2. Create a VDO with 200G logical space.
3. Randomly write a very small number of blocks, say less than a quarter gigabyte in total, with unique data.
4. Blkdiscard the entire VDO logical space in 1G chunks (see the sketch below).
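
A minimal shell sketch of these steps, assuming /dev/sdb as the backing device and vdo0 as the volume name (both illustrative, as are the fio job parameters; this is a sketch, not a verified reproducer):

#!/bin/sh
# reproduce.sh (name illustrative): exits non-zero if any step fails.
set -e

# 1. Raise the max discard size to 1G (2097152 512-byte sectors).
#    Per comment 4 below, do this after the kvdo module is loaded
#    but before "vdo create".
modprobe kvdo
echo 2097152 > /sys/kvdo/max_discard_sectors

# 2. Create a VDO volume with 200G of logical space.
vdo create --name=vdo0 --device=/dev/sdb --vdoLogicalSize=200G

# 3. Randomly write a small amount of unique data (~256M of 4k blocks).
fio --name=sparse --filename=/dev/mapper/vdo0 --rw=randwrite --bs=4k \
    --io_size=256M --direct=1 --randrepeat=0 --refill_buffers

# 4. Discard the entire logical space in 1G chunks.
for off in $(seq 0 199); do
    blkdiscard --offset $((off * 1073741824)) --length 1073741824 /dev/mapper/vdo0
done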

Actual results:
The VDO goes read-only.

Expected results:
No corruption or read-only mode.

Additional info:

Comment 4 bjohnsto 2018-07-18 20:59:49 UTC
In order to set the max discard sectors to 1G, run the following command:

echo 2097152 | sudo tee -a /sys/kvdo/max_discard_sectors

2097152 should be the number of sectors in 1G if I've done my math right. 
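The math is right: 1G is 1073741824 bytes, and 1073741824 / 512 = 2097152 512-byte sectors. A quick shell check:

echo $((1024 * 1024 * 1024 / 512))    # prints 2097152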

Do this after the VDO kernel module is loaded but before you have run vdo create.

Comment 5 bjohnsto 2018-07-18 21:03:27 UTC
Oh, and with regard to running this command: this will only work on RHEL 7.x.

Comment 6 Jakub Krysl 2018-07-24 12:55:55 UTC
Thanks Bruce, I managed to reproduce this on kmod-kvdo-6.1.1.91. It took 8 iterations of my reproducer to hit it:

blkdiscard: /dev/mapper/vdo: BLKDISCARD ioctl failed: Input/output error
        operating mode: read-only
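
For reference, the operating mode shown above can be queried at any time; assuming the volume is named vdo, something like this should work:

vdostats --verbose /dev/mapper/vdo | grep -i "operating mode"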

/var/log/messages (there are six more identical call traces right before this one):
[ 9124.370923] INFO: task blkdiscard:24572 blocked for more than 120 seconds. 
[ 9124.405063] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9124.444681] Call Trace: 
[ 9124.456537]  [<ffffffffc0351829>] ? __split_and_process_bio+0x2e9/0x520 [dm_mod] 
[ 9124.491228]  [<ffffffff939580f9>] schedule+0x29/0x70 
[ 9124.514617]  [<ffffffff93955a41>] schedule_timeout+0x221/0x2d0 
[ 9124.542323]  [<ffffffffc0351d88>] ? dm_make_request+0x128/0x1a0 [dm_mod] 
[ 9124.573314]  [<ffffffff932ff122>] ? ktime_get_ts64+0x52/0xf0 
[ 9124.600013]  [<ffffffff9395760d>] io_schedule_timeout+0xad/0x130 
[ 9124.627927]  [<ffffffff9395872d>] wait_for_completion_io+0xfd/0x140 
[ 9124.657410]  [<ffffffff932d4db0>] ? wake_up_state+0x20/0x20 
[ 9124.683209]  [<ffffffff93547f1c>] blkdev_issue_discard+0x2ac/0x2d0 
[ 9124.711212]  [<ffffffff93550fa1>] blk_ioctl_discard+0xd1/0x120 
[ 9124.737403]  [<ffffffff93551a72>] blkdev_ioctl+0x5e2/0x9b0 
[ 9124.763231]  [<ffffffff9347c3c1>] block_ioctl+0x41/0x50 
[ 9124.787680]  [<ffffffff93452410>] do_vfs_ioctl+0x360/0x550 
[ 9124.813394]  [<ffffffff934526a1>] SyS_ioctl+0xa1/0xc0 
[ 9124.837119]  [<ffffffff939648af>] system_call_fastpath+0x1c/0x21 
[-- MARK -- Tue Jul 24 10:25:00 2018] 
[ 9418.702380] kvdo3:physQ0: Decrementing free block at offset 411747 in slab 0: kvdo: Reference count would become invalid (2050) 
[ 9418.761458] kvdo3:logQ0: Preparing to enter read-only mode: DataVIO for LBN 51181 (becoming mapped to 1095105, previously mapped to 1090936, allocated 1095105) is completing with a fatal error after operation journalDecrementForWrite: kvdo: Reference count would become invalid (2050) 
[ 9418.874824] kvdo3:logQ0: [ 9418.874829] kvdo3:journalQ: Unrecoverable error, entering read-only mode: kvdo: Reference count would become invalid (2050) 
 
[ 9418.935780] Completing write VIO for LBN 51181 with error after journalDecrementForWrite: kvdo: Reference count would become invalid (2050) 
[ 9418.992415] kvdo3:physQ0: VDO not read-only when cleaning DataVIO with RJ lock 
[ 9419.905606] kvdo3:dmsetup: suspending device 'vdo' 
[ 9419.930690] kvdo3:packerQ: compression is disabled 
[ 9419.956022] kvdo3:packerQ: compression is enabled 
[ 9419.977157] kvdo3:dmsetup: device 'vdo' suspended 
[ 9419.998325] kvdo3:dmsetup: stopping device 'vdo' 
[ 9420.019251] kvdo3:journalQ: Error closing VDO: kvdo: The device is in read-only mode (2069) 
[ 9420.056344] kvdo3:journalQ: Error closing VDO: kvdo: The device is in read-only mode (2069) 
[ 9420.095247] kvdo3:journalQ: Error closing VDO: kvdo: The device is in read-only mode (2069) 
[ 9420.132796] kvdo3:dmsetup: uds: kvdo3:dedupeQ: index_0: beginning save (vcn 21) 
 
[ 9420.165673] Setting UDS index target state to closed 

Tested kmod-kvdo-6.1.1.99 and could not reproduce it after 50 iterations.
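
One plausible harness for iterating the reproducer, assuming the steps from the description were saved as reproduce.sh (hypothetical name) and that the volume is torn down between attempts:

for i in $(seq 1 50); do
    if ! sh reproduce.sh; then
        echo "hit read-only on iteration $i"
        break
    fi
    vdo remove --name=vdo0    # tear down before the next attempt
done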

Comment 8 errata-xmlrpc 2018-10-30 09:39:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3094