Bug 1752893

Summary: VDO reports assert when used over cached volume
Product: Red Hat Enterprise Linux 8 Reporter: Zdenek Kabelac <zkabelac>
Component: kmod-kvdoAssignee: corwin <corwin>
Status: CLOSED ERRATA QA Contact: Filip Suba <fsuba>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.1CC: agk, awalsh, corwin, heinzm, jbrassow, msnitzer, prajnoha, rhandlin, thornber, zkabelac
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 6.2.2.24 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-04-28 16:43:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Zdenek Kabelac 2019-09-17 13:57:35 UTC
Following example:

# create 5G vdopool + vdo LV
lvcreate --vdo -L5G --name LV  vg/VDOPOOL

# cache vdopool 
lvcreate --cache -L100  vg/VDOPOOL

# mkfs  vdo LV
mkfs.ext4 -E nodiscard  /dev/vg/LV

Generates following kernel assert report (making VDO device futher unusable and locked in kernel - requires reboot to get it dropped)

uds: kvdo68:journalQ: assertion "completed journal write is still active" ((block->sequenceNumber >= lastActiveBlock->sequenceNumber)) failed at /var/lib/dkms/kvdo/6.2.1.138-6.2.1.138/build/vdo/base/recoveryJournal.c:1017
uds: kvdo68:journalQ: [backtrace]
CPU: 0 PID: 91945 Comm: kvdo68:journalQ Tainted: G           OE     5.3.0-0.rc8.git0.1.fc32.x86_64 #1
Hardware name: LENOVO 6464CTO/6464CTO, BIOS 7LETC9WW (2.29 ) 03/18/2011
Call Trace:
 dump_stack+0x5c/0x80
 assertionFailedLogOnly+0x49/0x70 [uds]
 ? kvdoGetCurrentThreadID+0xa/0x20 [kvdo]
 completeWrite+0x220/0x230 [kvdo]
 workQueueRunner+0x1dc/0x650 [kvdo]
 ? finish_wait+0x80/0x80
 kthread+0xfb/0x130
 ? initKernelVDOOnce+0x20/0x20 [kvdo]
 ? kthread_park+0x80/0x80
 ret_from_fork+0x35/0x40



Version-Release number of selected component (if applicable):

lvm:    2.03.05
kernel: 5.3.0-0.rc8.git0.1.fc32.x86_64
vdo:    6.2.1.138-6.2.1.138

Comment 1 Andy Walsh 2019-09-17 14:22:27 UTC
I was able to reproduce on Fedora Rawhide as well.

Comment 2 Andy Walsh 2019-09-17 14:38:19 UTC
I have reproduced this on RHEL8 kernel 4.18.0-144.el8 as well.

Package versions:
  kernel-4.18.0-144.el8
  vdo-6.2.1.134-11.el8
  kmod-kvdo-6.2.1.138-57.el8
  lvm2-2.03.05-4.el8

Terminal Output:
>[vagrant@localhost ~]$ sudo dmesg -c > /dev/null
>[vagrant@localhost ~]$ sudo vgcreate base /dev/vdb
>  Physical volume "/dev/vdb" successfully created.
>  Volume group "base" successfully created
>[vagrant@localhost ~]$ sudo lvcreate --type vdo -L4G -V2G --name vdolvm base/pool
>  Logical volume "vdolvm" created.
>[vagrant@localhost ~]$ sudo lvcreate -H -L10 base/pool
>  Rounding up size to full physical extent 12.00 MiB
>  Logical volume base/pool_vdata is now cached.
>[vagrant@localhost ~]$ sudo mkfs.ext4 -E nodiscard /dev/base/vdolvm 
>mke2fs 1.44.3 (10-July-2018)
>Creating filesystem with 524288 4k blocks and 131072 inodes
>Filesystem UUID: 72144bd6-fc0e-4c3a-8f0e-623ef2da4da1
>Superblock backups stored on blocks: 
>        32768, 98304, 163840, 229376, 294912
>
>Allocating group tables: done                            
>Writing inode tables: done                            
>Creating journal (16384 blocks): 

Log output:
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: assertion "completed journal write is still active" ((block->sequenceNumber >= lastActiveBlock->sequenceNumber)) failed at /builddir/build/BUILD/kvdo-7806cbf0b693ea3ac82ca
>fe567b1a964bcd69924/obj/./vdo/base/recoveryJournal.c:1018
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: [backtrace]
>Sep 17 14:30:34 localhost.localdomain kernel: CPU: 0 PID: 6606 Comm: kvdo1:journalQ Kdump: loaded Tainted: G           O     --------- -  - 4.18.0-144.el8.x86_64 #1
>Sep 17 14:30:34 localhost.localdomain kernel: Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014
>Sep 17 14:30:34 localhost.localdomain kernel: Call Trace:
>Sep 17 14:30:34 localhost.localdomain kernel:  dump_stack+0x5c/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  assertionFailedLogOnly+0x49/0x70 [uds]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? __next_timer_interrupt+0xc0/0xc0
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kvdoGetCurrentThreadID+0xa/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  completeWrite+0x229/0x240 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  workQueueRunner+0x1b9/0x660 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? finish_wait+0x80/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  ? initKernelVDOOnce+0x20/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  kthread+0x112/0x130
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kthread_flush_work_fn+0x10/0x10
>Sep 17 14:30:34 localhost.localdomain kernel:  ret_from_fork+0x35/0x40
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: assertion "completed journal write is still active" ((block->sequenceNumber >= lastActiveBlock->sequenceNumber)) failed at /builddir/build/BUILD/kvdo-7806cbf0b693ea3ac82ca
>fe567b1a964bcd69924/obj/./vdo/base/recoveryJournal.c:1018
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: [backtrace]
>Sep 17 14:30:34 localhost.localdomain kernel: CPU: 0 PID: 6606 Comm: kvdo1:journalQ Kdump: loaded Tainted: G           O     --------- -  - 4.18.0-144.el8.x86_64 #1
>Sep 17 14:30:34 localhost.localdomain kernel: Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014
>Sep 17 14:30:34 localhost.localdomain kernel: Call Trace:
>Sep 17 14:30:34 localhost.localdomain kernel:  dump_stack+0x5c/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  assertionFailedLogOnly+0x49/0x70 [uds]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? __next_timer_interrupt+0xc0/0xc0
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kvdoGetCurrentThreadID+0xa/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  completeWrite+0x229/0x240 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  workQueueRunner+0x1b9/0x660 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? finish_wait+0x80/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  ? initKernelVDOOnce+0x20/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  kthread+0x112/0x130
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kthread_flush_work_fn+0x10/0x10
>Sep 17 14:30:34 localhost.localdomain kernel:  ret_from_fork+0x35/0x40
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: assertion "completed journal write is still active" ((block->sequenceNumber >= lastActiveBlock->sequenceNumber)) failed at /builddir/build/BUILD/kvdo-7806cbf0b693ea3ac82ca
>fe567b1a964bcd69924/obj/./vdo/base/recoveryJournal.c:1018
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: [backtrace]
>Sep 17 14:30:34 localhost.localdomain kernel: CPU: 0 PID: 6606 Comm: kvdo1:journalQ Kdump: loaded Tainted: G           O     --------- -  - 4.18.0-144.el8.x86_64 #1
>Sep 17 14:30:34 localhost.localdomain kernel: Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014
>Sep 17 14:30:34 localhost.localdomain kernel: Call Trace:
>Sep 17 14:30:34 localhost.localdomain kernel:  dump_stack+0x5c/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  assertionFailedLogOnly+0x49/0x70 [uds]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? writeBlock+0x50/0x50 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kvdoGetCurrentThreadID+0xa/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  completeWrite+0x229/0x240 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  workQueueRunner+0x1b9/0x660 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? finish_wait+0x80/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  ? initKernelVDOOnce+0x20/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  kthread+0x112/0x130
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kthread_flush_work_fn+0x10/0x10
>Sep 17 14:30:34 localhost.localdomain kernel:  ret_from_fork+0x35/0x40
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: assertion "completed journal write is still active" ((block->sequenceNumber >= lastActiveBlock->sequenceNumber)) failed at /builddir/build/BUILD/kvdo-7806cbf0b693ea3ac82ca
>fe567b1a964bcd69924/obj/./vdo/base/recoveryJournal.c:1018
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: [backtrace]
>Sep 17 14:30:34 localhost.localdomain kernel: CPU: 0 PID: 6606 Comm: kvdo1:journalQ Kdump: loaded Tainted: G           O     --------- -  - 4.18.0-144.el8.x86_64 #1
>Sep 17 14:30:34 localhost.localdomain kernel: Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014
>Sep 17 14:30:34 localhost.localdomain kernel: Call Trace:
>Sep 17 14:30:34 localhost.localdomain kernel:  dump_stack+0x5c/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  assertionFailedLogOnly+0x49/0x70 [uds]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? __next_timer_interrupt+0xc0/0xc0
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kvdoGetCurrentThreadID+0xa/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  completeWrite+0x229/0x240 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  workQueueRunner+0x1b9/0x660 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? finish_wait+0x80/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  ? initKernelVDOOnce+0x20/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  kthread+0x112/0x130
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kthread_flush_work_fn+0x10/0x10
>Sep 17 14:30:34 localhost.localdomain kernel:  ret_from_fork+0x35/0x40
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: assertion "completed journal write is still active" ((block->sequenceNumber >= lastActiveBlock->sequenceNumber)) failed at /builddir/build/BUILD/kvdo-7806cbf0b693ea3ac82ca
>fe567b1a964bcd69924/obj/./vdo/base/recoveryJournal.c:1018
>Sep 17 14:30:34 localhost.localdomain kernel: uds: kvdo1:journalQ: [backtrace]
>Sep 17 14:30:34 localhost.localdomain kernel: CPU: 0 PID: 6606 Comm: kvdo1:journalQ Kdump: loaded Tainted: G           O     --------- -  - 4.18.0-144.el8.x86_64 #1
>Sep 17 14:30:34 localhost.localdomain kernel: Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014
>Sep 17 14:30:34 localhost.localdomain kernel: Call Trace:
>Sep 17 14:30:34 localhost.localdomain kernel:  dump_stack+0x5c/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  assertionFailedLogOnly+0x49/0x70 [uds]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? __next_timer_interrupt+0xc0/0xc0
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kvdoGetCurrentThreadID+0xa/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  completeWrite+0x229/0x240 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  workQueueRunner+0x1b9/0x660 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  ? finish_wait+0x80/0x80
>Sep 17 14:30:34 localhost.localdomain kernel:  ? initKernelVDOOnce+0x20/0x20 [kvdo]
>Sep 17 14:30:34 localhost.localdomain kernel:  kthread+0x112/0x130
>Sep 17 14:30:34 localhost.localdomain kernel:  ? kthread_flush_work_fn+0x10/0x10
>Sep 17 14:30:34 localhost.localdomain kernel:  ret_from_fork+0x35/0x40

Comment 3 Jakub Krysl 2019-10-15 14:38:54 UTC
Mass migration to Filip.

Comment 6 Filip Suba 2020-03-17 14:09:55 UTC
Verified with vdo-6.2.2.117-13.el8.

Mar 17 15:02:49  kernel: kvdo4:lvcreate: Using write policy async automatically.
Mar 17 15:02:49  kernel: kvdo4:lvcreate: loading device 'base-pool-vpool'
Mar 17 15:02:49  kernel: kvdo4:lvcreate: zones: 1 logical, 1 physical, 1 hash; base threads: 5
Mar 17 15:02:49  kernel: kvdo4:lvcreate: starting device 'base-pool-vpool'
Mar 17 15:02:49  kernel: kvdo4:journalQ: VDO commencing normal operation
Mar 17 15:02:49  kernel: kvdo4:lvcreate: Setting UDS index target state to online
Mar 17 15:02:49  kernel: uds: kvdo4:dedupeQ: creating index: dev=/dev/dm-3 offset=4096 size=2781704192
Mar 17 15:02:49  kernel: kvdo4:lvcreate: device 'base-pool-vpool' started

Comment 8 errata-xmlrpc 2020-04-28 16:43:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1782