Bug 1798706

Summary: need a more graceful way to fail when attempting to writecache an origin using a RO pool device
Product: Red Hat Enterprise Linux 8 Reporter: Corey Marthaler <cmarthal>
Component: lvm2 Assignee: LVM and device-mapper development team <lvm-team>
lvm2 sub component: Cache Logical Volumes QA Contact: cluster-qe <cluster-qe>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: unspecified CC: agk, heinzm, jbrassow, mcsontos, msnitzer, pasik, prajnoha, teigland, zkabelac
Version: 8.2 Flags: pm-rhel: mirror+
Target Milestone: rc   
Target Release: 8.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: lvm2-2.03.08-2.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-04-28 16:59:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Corey Marthaler 2020-02-05 20:03:54 UTC
Description of problem:
Pretty asinine scenario, but...

SCENARIO - [create_ro_writecache_pool]
Create RO cache pool (fast) volumes

*** Cache info for this scenario ***
*  origin (slow):  /dev/sdd1
*  pool (fast):    /dev/sdh1
************************************

Adding "slow" and "fast" tags to corresponding pvs
Create origin (slow) volume
lvcreate --wipesignatures y  -L 4G -n corigin writecache_sanity @slow

lvcreate  -p r -L 1G -n ro_pool writecache_sanity /dev/sdh1
  WARNING: Logical volume writecache_sanity/ro_pool not zeroed.
Deactivate both fast and slow volumes before conversion to write cache
Create writecached volume by combining the cache pool (fast) and origin (slow) volumes
lvconvert --yes --type writecache --cachevol writecache_sanity/ro_pool writecache_sanity/corigin

Activating volume: corigin

Feb  5 13:57:17 hayes-02 qarshd[34388]: Running cmdline: lvchange -vvvv -ay writecache_sanity/corigin
Feb  5 13:57:17 hayes-02 kernel: ------------[ cut here ]------------
Feb  5 13:57:17 hayes-02 kernel: generic_make_request: Trying to write to read-only block-device dm-0 (partno 0)
Feb  5 13:57:17 hayes-02 kernel: WARNING: CPU: 20 PID: 34389 at block/blk-core.c:788 generic_make_request_checks+0x3d1/0x660
Feb  5 13:57:18 hayes-02 kernel: Modules linked in: dm_snapshot dm_crypt dm_mirror dm_region_hash dm_log nft_chain_route_ipv4 xt_CHECKSUM nft_chain_nat_ipv4 ipt_MASQUERADE nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 nft_counter nft_compat nf_tables nfnetlink tun bridge stp llc sunrpc dm_multipath dm_cache_smq dm_cache dm_writecache(t) kvdo(O) uds(O) dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio raid10 raid1 raid0 dm_raid raid456 async_raid6_recov async_memcpy async_pq intel_rapl_msr async_xor intel_rapl_common xor async_tx sb_edac raid6_pq dm_mod x86_pkg_temp_thermal intel_powerclamp iTCO_wdt mxm_wmi coretemp iTCO_vendor_support kvm_intel dcdbas kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf pcspkr qla2xxx nvme_fc nvme_fabrics nvme_core scsi_transport_fc mei_me mei ipmi_ssif lpc_ich ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter ip_tables xfs libcrc32c sd_mod sr_mod cdrom
Feb  5 13:57:18 hayes-02 kernel: sg mgag200 i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt ahci fb_sys_fops libahci drm crc32c_intel libata megaraid_sas tg3
Feb  5 13:57:18 hayes-02 kernel: CPU: 20 PID: 34389 Comm: lvchange Kdump: loaded Tainted: G           O     --------- -t - 4.18.0-173.el8.x86_64 #1
Feb  5 13:57:18 hayes-02 kernel: Hardware name: Dell Inc. PowerEdge R830/0VVT0H, BIOS 1.8.0 05/28/2018
Feb  5 13:57:18 hayes-02 kernel: RIP: 0010:generic_make_request_checks+0x3d1/0x660
Feb  5 13:57:18 hayes-02 kernel: Code: 4c 04 00 00 48 89 df 48 8d 74 24 08 c6 05 ea 2d fe 00 01 e8 41 5c 01 00 48 c7 c7 80 3f ad ba 44 89 e2 48 89 c6 e8 d9 d5 cc ff <0f> 0b 4c 8b 63 08 e9 be fe ff ff 80 3d c2 2d fe 00 00 0f 84 43 02
Feb  5 13:57:18 hayes-02 kernel: RSP: 0018:ffffb4640e077828 EFLAGS: 00010282
Feb  5 13:57:18 hayes-02 kernel: RAX: 0000000000000000 RBX: ffff8b382d71e100 RCX: 0000000000000000
Feb  5 13:57:18 hayes-02 kernel: RDX: ffff8b397f8a6480 RSI: ffff8b397f896a08 RDI: ffff8b397f896a08
Feb  5 13:57:18 hayes-02 kernel: RBP: ffff8b39464c09d8 R08: 0000000000000f35 R09: 000000000000005f
Feb  5 13:57:18 hayes-02 kernel: R10: 0000000000000000 R11: ffffb4640e0776d0 R12: 0000000000000000
Feb  5 13:57:18 hayes-02 kernel: R13: 0000000000200000 R14: 0000000000000000 R15: ffff8b38a6682fc0
Feb  5 13:57:18 hayes-02 kernel: FS:  00007ff835184980(0000) GS:ffff8b397f880000(0000) knlGS:0000000000000000
Feb  5 13:57:18 hayes-02 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  5 13:57:18 hayes-02 kernel: CR2: 000055a73a9af1e0 CR3: 0000001ef6806006 CR4: 00000000003606e0
Feb  5 13:57:18 hayes-02 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb  5 13:57:18 hayes-02 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Feb  5 13:57:18 hayes-02 kernel: Call Trace:
Feb  5 13:57:18 hayes-02 kernel: ? finish_wait+0x80/0x80
Feb  5 13:57:18 hayes-02 kernel: ? finish_wait+0x80/0x80
Feb  5 13:57:18 hayes-02 kernel: generic_make_request+0x2e/0x310
Feb  5 13:57:18 hayes-02 kernel: ? bvec_alloc+0x82/0xe0
Feb  5 13:57:18 hayes-02 kernel: submit_bio+0x45/0x140
Feb  5 13:57:18 hayes-02 kernel: ? bio_add_page+0x1b/0x50
Feb  5 13:57:18 hayes-02 kernel: dispatch_io+0x1b2/0x400 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: ? bio_next_page+0xa0/0xa0 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: ? bio_get_page+0x40/0x40 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: ? writecache_insert_entry+0xa0/0xa0 [dm_writecache]
Feb  5 13:57:18 hayes-02 kernel: dm_io+0x111/0x230 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: ? bio_next_page+0xa0/0xa0 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: ? bio_get_page+0x40/0x40 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: ssd_commit_flushed+0x13d/0x1d0 [dm_writecache]
Feb  5 13:57:18 hayes-02 kernel: ? writecache_insert_entry+0xa0/0xa0 [dm_writecache]
Feb  5 13:57:18 hayes-02 kernel: writecache_ctr+0x1042/0x1140 [dm_writecache]
Feb  5 13:57:18 hayes-02 kernel: dm_table_add_target+0x17d/0x360 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: table_load+0x122/0x2e0 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: ? dev_status+0x40/0x40 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: ctl_ioctl+0x1af/0x3f0 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: ? selinux_file_ioctl+0xe0/0x200
Feb  5 13:57:18 hayes-02 kernel: dm_ctl_ioctl+0xa/0x10 [dm_mod]
Feb  5 13:57:18 hayes-02 kernel: do_vfs_ioctl+0xa4/0x630
Feb  5 13:57:18 hayes-02 kernel: ksys_ioctl+0x60/0x90
Feb  5 13:57:18 hayes-02 kernel: __x64_sys_ioctl+0x16/0x20
Feb  5 13:57:18 hayes-02 kernel: do_syscall_64+0x5b/0x1a0
Feb  5 13:57:18 hayes-02 kernel: entry_SYSCALL_64_after_hwframe+0x65/0xca
Feb  5 13:57:18 hayes-02 kernel: RIP: 0033:0x7ff83337087b
Feb  5 13:57:18 hayes-02 kernel: Code: 0f 1e fa 48 8b 05 0d 96 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d dd 95 2c 00 f7 d8 64 89 01 48
Feb  5 13:57:18 hayes-02 kernel: RSP: 002b:00007ffeabd99988 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
Feb  5 13:57:18 hayes-02 kernel: RAX: ffffffffffffffda RBX: 000055a73a96acb0 RCX: 00007ff83337087b
Feb  5 13:57:18 hayes-02 kernel: RDX: 000055a73bb591d0 RSI: 00000000c138fd09 RDI: 000000000000000a
Feb  5 13:57:18 hayes-02 kernel: RBP: 000055a73aa18917 R08: 000000000000000a R09: 00007ffeabd95bc7
Feb  5 13:57:18 hayes-02 kernel: R10: 000000000000000a R11: 0000000000000206 R12: 0000000000000000
Feb  5 13:57:18 hayes-02 kernel: R13: 000055a73bb59200 R14: 000055a73bb591d0 R15: 000055a73bb4c370
Feb  5 13:57:18 hayes-02 kernel: ---[ end trace 7318713998360e58 ]---


Version-Release number of selected component (if applicable):
4.18.0-173.el8.x86_64

kernel-4.18.0-173.el8    BUILT: Fri Jan 24 06:02:03 CST 2020
lvm2-2.03.07-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
lvm2-libs-2.03.07-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-libs-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-event-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-event-libs-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019


How reproducible:
Every time

Comment 1 David Teigland 2020-02-11 18:50:10 UTC
The problem is that wipe_lv() does not return an error when writes to the LV fail, leaving the LV uninitialized.  Since wipe_lv has always ignored errors and is widely used elsewhere, we need to check what other effects might come from fixing it.
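
To illustrate the pattern only (this is not lvm2 code; wipe_lv() is a C function inside lvm2), here is a minimal shell sketch of a zeroing helper that reports a failed write to its caller instead of ignoring it. The name zero_start and the 4 KiB size are assumptions made for the example.

```shell
# Hypothetical helper, not part of lvm2: zero the first 4 KiB of a target
# and propagate any write failure through the return status.
zero_start() {
    target=$1
    # conv=notrunc keeps dd from truncating an existing file or device node.
    if ! dd if=/dev/zero of="$target" bs=4096 count=1 conv=notrunc 2>/dev/null; then
        echo "could not zero $target" >&2
        return 1
    fi
}
```

A caller can then stop before building anything on top of an uninitialized target, for example: zero_start "$dev" || exit 1.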

Comment 2 David Teigland 2020-02-11 19:02:57 UTC
Pushed a limited fix to master (it only checks whether the cachevol is writable; it does not check for failed wiping more generally):
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=cba06012acc589888ef88221f1a580b5b81b4100
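
For scripts that drive lvconvert, a similar check can be made up front. The sketch below is an assumption-laden illustration, not what the commit does internally: it relies on the second character of the lv_attr field reported by lvs being the permission flag ('w' writeable, 'r' read-only), and the helper names lv_attr_is_readonly and guard_cachevol are hypothetical.

```shell
# Hypothetical helper: succeed (return 0) if an lv_attr string marks the LV
# read-only. The second lv_attr character is the permission flag.
lv_attr_is_readonly() {
    case "$1" in
        ?r*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Hypothetical wrapper: refuse a read-only LV before calling lvconvert.
# Argument is a VG/LV name such as writecache_sanity/ro_pool.
guard_cachevol() {
    vg_lv=$1
    attr=$(lvs --noheadings -o lv_attr "$vg_lv") || return 1
    attr=$(printf '%s' "$attr" | tr -d ' ')
    if lv_attr_is_readonly "$attr"; then
        echo "Refusing to use read-only LV $vg_lv as a cachevol" >&2
        return 1
    fi
}
```

Usage would look like: guard_cachevol writecache_sanity/ro_pool && lvconvert --yes --type writecache --cachevol writecache_sanity/ro_pool writecache_sanity/corigin. The sketch does not treat the 'R' flag (read-only activation of a writeable LV) as read-only; whether it should is left open.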

Comment 5 Corey Marthaler 2020-02-24 20:50:08 UTC
Fix verified in the latest rpms.

kernel-4.18.0-179.el8    BUILT: Fri Feb 14 17:03:01 CST 2020
lvm2-2.03.08-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
lvm2-libs-2.03.08-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
device-mapper-1.02.169-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
device-mapper-libs-1.02.169-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
device-mapper-event-1.02.169-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
device-mapper-event-libs-1.02.169-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020


[root@hayes-02 ~]# lvcreate --wipesignatures y  -L 4G -n cworigin writecache_sanity @slow
  Logical volume "cworigin" created.
[root@hayes-02 ~]# lvcreate  -p r -L 1G -n ro_pool writecache_sanity /dev/sdm1
  WARNING: Logical volume writecache_sanity/ro_pool not zeroed.
  Logical volume "ro_pool" created.
[root@hayes-02 ~]# lvchange -an writecache_sanity

[root@hayes-02 ~]# lvconvert --yes --type writecache --cachevol writecache_sanity/ro_pool writecache_sanity/cworigin
  Cannot initialize readonly LV writecache_sanity/ro_pool
  LV writecache_sanity/ro_pool could not be zeroed.
[root@hayes-02 ~]# echo $?
5

Comment 7 errata-xmlrpc 2020-04-28 16:59:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1881