Bug 1958273 - snapshot-origin table trivially crashes the kernel
Status: POST
Alias: None
Product: LVM and device-mapper
Classification: Community
Component: device-mapper
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Assignee: Mikuláš Patočka
QA Contact: cluster-qe
Blocks: 1958298
Reported: 2021-05-07 14:19 UTC by Michael Tokarev
Modified: 2023-08-10 15:39 UTC

: 1958298
pm-rhel: lvm-technical-solution?



Description Michael Tokarev 2021-05-07 14:19:46 UTC
Just create a snapshot-origin device and do some I/O on it. The reproducer boils down to two lines (dmsetup create, then mkfs):

# first create a test device: it can be anything
# here we use a 100-MB loop device
truncate --size=100M base
losetup /dev/loop0 base
# now create the snapshot-origin on it
sz=$(blockdev --getsize /dev/loop0)
dmsetup create base --table "0 $sz snapshot-origin /dev/loop0"
# and now the crash
mkfs.ext4 /dev/mapper/base

This crashes instantly. Different kernels crash in slightly different ways
(I even tried some 3.x kernels), but the result is always the same - a crash.

Here's an example from a 5.10 kernel:

[   89.661594] ------------[ cut here ]------------
[   89.663789] kernel BUG at block/bio.c:1473!
[   89.665624] invalid opcode: 0000 [#1] SMP PTI
[   89.669108] CPU: 0 PID: 264 Comm: mkfs.ext4 Not tainted 5.10.0-6-amd64 #1 Debian 5.10.28-1
[   89.672551] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[   89.675987] RIP: 0010:bio_split+0x74/0x80
[   89.677744] Code: 89 c7 e8 ff 5e 03 00 41 8b 74 24 28 48 89 ef e8 e2 f5 ff ff f6 45 15 01 74 08 66 41 81 4c 24 14 00 01 4c 89 e0 5b 5d 41 5c c3 <0f> 0b 0f 0b 0f 0b 45 31 e4 eb ed 90 0f 1f 44 00 00 39 77 28 76 05
[   89.686181] RSP: 0018:ffffb248c026bb30 EFLAGS: 00010246
[   89.688347] RAX: 0000000000000008 RBX: 0000000000000000 RCX: ffff8bd5025f7d80
[   89.691201] RDX: 0000000000000c00 RSI: 0000000000000000 RDI: ffff8bd502031780
[   89.694089] RBP: 0000000000000000 R08: 00000019a1b717a8 R09: 0000000000000000
[   89.696975] R10: ffff8bd5341fc600 R11: ffff8bd5341fc658 R12: ffff8bd5024a0558
[   89.699935] R13: ffff8bd5024a0000 R14: ffff8bd502031780 R15: ffff8bd502383c80
[   89.704128] FS:  00007f0a33236780(0000) GS:ffff8bd53ea00000(0000) knlGS:0000000000000000
[   89.707641] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   89.710162] CR2: 00007f0a31db2000 CR3: 0000000002ab4000 CR4: 00000000000006f0
[   89.713071] Call Trace:
[   89.714247]  dm_submit_bio+0x35d/0x440 [dm_mod]
[   89.716230]  submit_bio_noacct+0xf8/0x420
[   89.719430]  ? bio_add_page+0x62/0x90
[   89.721073]  submit_bh_wbc+0x16a/0x190
[   89.722702]  __block_write_full_page+0x1fa/0x460
[   89.724767]  ? bdev_evict_inode+0xc0/0xc0
[   89.726523]  ? block_invalidatepage+0x150/0x150
[   89.728446]  __writepage+0x17/0x60
[   89.730010]  write_cache_pages+0x186/0x3d0
[   89.731789]  ? __wb_calc_thresh+0x120/0x120
[   89.734842]  generic_writepages+0x4c/0x80
[   89.736830]  do_writepages+0x34/0xc0
[   89.738390]  ? __fsnotify_parent+0xe7/0x2d0
[   89.740196]  __filemap_fdatawrite_range+0xc5/0x100
[   89.742245]  file_write_and_wait_range+0x61/0xb0
[   89.744219]  blkdev_fsync+0x17/0x40
[   89.745777]  __x64_sys_fsync+0x34/0x60
[   89.747446]  do_syscall_64+0x33/0x80
[   89.749341]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   89.753171] RIP: 0033:0x7f0a320b37a0
[   89.754777] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 69 cf 2b 00 00 75 10 b8 4a 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 4e 3d 01 00 48 89 04 24
[   89.762134] RSP: 002b:00007ffdb7a55158 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[   89.765245] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0a320b37a0
[   89.770073] RDX: 0000000000000400 RSI: 000055797fd3e620 RDI: 0000000000000003
[   89.773025] RBP: 000055797fd3e510 R08: 0000000004800800 R09: 00007f0a32e08c40
[   89.775864] R10: 0000000004800800 R11: 0000000000000246 R12: 00007ffdb7a551c0
[   89.778680] R13: 00007ffdb7a551c8 R14: 000055797fd3e2e0 R15: 0000000000000000
[   89.781984] Modules linked in: dm_snapshot dm_bufio dm_mod loop hid_generic usbhid hid uhci_hcd ehci_hcd virtio_net net_failover ppdev failover joydev usbcore parport_pc psmouse evdev pcspkr serio_raw parport ata_generic floppy i2c_piix4 virtio_pci sg usb_common ata_piix virtio_ring virtio button qemu_fw_cfg ip_tables x_tables autofs4 crc32c_generic ext4 crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_common ahci libahci libata scsi_mod
[   89.799301] ---[ end trace 1aa4a3cf509dc9b7 ]---

block/bio.c:1473 is this:

struct bio *bio_split(struct bio *bio, int sectors,
                     gfp_t gfp, struct bio_set *bs)
{
       struct bio *split;

       BUG_ON(sectors <= 0);                   /* <==== here */
       BUG_ON(sectors >= bio_sectors(bio));

Maybe this is an incorrect use of snapshot-origin, I don't know (I tried this after reading all the available docs on the topic), but even if it is, that should not be a reason for the kernel to crash.
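
For anyone re-running the reproducer above on a kernel that may or may not carry the fix, the following is a minimal sketch that adds a dry-run mode and a teardown step. The `DRY_RUN` variable and the `run` and `repro` helper names are illustrative and not part of the original report; the sector count is computed from the 100 MiB image size rather than read with blockdev, so the dry run needs no real device. Run it with DRY_RUN=0 only as root, and only inside a disposable VM.

```shell
#!/bin/sh
# Sketch only: the reproducer from this report plus teardown.
# Defaults to dry-run, which prints the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro() {
    img=base
    loop=/dev/loop0
    # 100 MiB expressed in 512-byte sectors, the unit used in dm tables
    sz=$((100 * 1024 * 1024 / 512))

    run truncate --size=100M "$img"
    run losetup "$loop" "$img"
    run dmsetup create base --table "0 $sz snapshot-origin $loop"
    run mkfs.ext4 /dev/mapper/base   # the step that crashed in this report

    # Teardown: only reached if the kernel survived the mkfs above.
    run dmsetup remove base
    run losetup -d "$loop"
    run rm -f "$img"
}

repro
```

In dry-run mode the script prints seven command lines and touches nothing, which makes it easy to review exactly what would be executed before trying it for real.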

Comment 1 Mikuláš Patočka 2021-05-07 16:02:28 UTC
A patch is here: https://listman.redhat.com/archives/dm-devel/2021-May/msg00018.html

Comment 2 Zdenek Kabelac 2021-05-08 17:50:04 UTC
It is worth commenting on the usage:

When a user intends to use a snapshot, the standard way is:

start with the origin mapped as, e.g., a striped target;

take a snapshot, which wraps the origin with snapshot-origin + snapshot.

When the snapshot is no longer needed, or is being merged,
the snapshot-origin table is reloaded with the respective original target.

So in the standard 'intended' use case there is never a standalone
'snapshot-origin' without any snapshot (such a setup is also slightly less efficient).
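
The stacked layout described in this comment can be sketched as a pair of device-mapper table lines. This is only an illustration: the helper names, the device paths (`/dev/mapper/base-real`, `/dev/mapper/snap-cow`) and the chunk size of 16 sectors are placeholders, not taken from this bug. The table syntax follows the snapshot target description in the kernel documentation (`snapshot-origin <origin>` and `snapshot <origin> <COW device> <persistent?> <chunksize>`).

```shell
#!/bin/sh
# Sketch only: print the table lines for an origin with one snapshot.
# All device paths and sizes below are placeholders.

# origin_table <sectors> <real origin device>
origin_table() {
    echo "0 $1 snapshot-origin $2"
}

# snapshot_table <sectors> <real origin device> <COW device> <chunk sectors>
# P selects a persistent exception store; N would select a transient one.
snapshot_table() {
    echo "0 $1 snapshot $2 $3 P $4"
}

sz=204800   # 100 MiB in 512-byte sectors

# The snapshot-origin device is what users write to; the snapshot device
# presents the frozen view, with changed chunks stored on the COW device.
origin_table   "$sz" /dev/mapper/base-real
snapshot_table "$sz" /dev/mapper/base-real /dev/mapper/snap-cow 16
```

Each printed line could be passed to `dmsetup create <name> --table "..."`. When the last snapshot goes away, the origin device would be reloaded with its plain underlying mapping instead of being left as a standalone snapshot-origin, which is the situation this bug exercises.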

