Description of problem:

UDS writes the volume geometry out to a header page on the volume section of the index. It writes it in native-endian format, so the header is little-endian on most platforms but big-endian on s390x. UDS does not use the volume geometry from the volume header page; it reads the UDSConfiguration and generates the proper volume geometry to use. But it does read the volume header page and compares the geometry that it read to the geometry that it generated. If they do not match, UDS refuses to load the index.

This check made some sense in the original multi-file Albireo setup, where it verified that the volume file and the config file were in agreement. But we are now in a single-file or single-device setup, where this redundant information is written to the same file or device. Changing UDS to always write little-endian could have been done as part of the multiplatform 8.0 work, but now we have customers who may have created VDO devices that are definitely in native-endian format. Those customers cannot transfer the VDO storage to a system with the other endianness.

The easy fix for this problem is to just ignore the volume header page: do not read the page, and do not compare its geometry to the one generated from the UDSConfiguration.
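To make the failure concrete, here is a minimal, self-contained sketch (assuming a hypothetical two-field geometry; this is not the actual UDS code) of what happens when the header page is written in one host's native byte order and then compared on a host of the opposite endianness:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the on-disk volume geometry. */
struct geometry {
    uint32_t chapters_per_volume;
    uint32_t pages_per_chapter;
};

/* Reading native-endian fields on the opposite-endian host is
 * equivalent to byte-swapping each 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

int main(void)
{
    /* Geometry generated from the UDSConfiguration on this host. */
    struct geometry generated = { 1024, 256 };

    /* The same geometry as it comes off the volume header page when
     * the index was written on a host of the other endianness. */
    struct geometry on_disk = {
        swap32(generated.chapters_per_volume),
        swap32(generated.pages_per_chapter),
    };

    /* The consistency check: the values are logically identical but
     * the bytes differ, so the comparison fails and the index load
     * is refused ("config and volume geometries are inconsistent").
     * The proposed fix skips this read-and-compare entirely. */
    if (memcmp(&generated, &on_disk, sizeof(generated)) != 0)
        printf("geometries are inconsistent: refusing to load index\n");
    return 0;
}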
Version-Release number of selected component (if applicable):
kmod-kvdo-6.2.*

How reproducible:
100%

Steps to Reproduce:
1. Create a VDO volume on an architecture of one endianness (aarch64, ppc64le, or x86_64) or s390x. This problem happens when moving from one endianness to the other, regardless of the starting point.
2. Stop the VDO volume.
3. Move the backing storage to an architecture of the other endianness from the one chosen in step 1.
4. Transfer /etc/vdoconf.yml from system 1 to system 2.
5. Start the VDO volume.
6. Observe that VDO detected a corrupt index in the logs and created a new index.

Actual results:
A new index is created over the existing index.

Expected results:
The existing index is used.

Additional info:
This was discovered during code review.

Messages in /var/log/messages on the second system:

Apr 4 21:01:19 s390x-test_host kernel: kvdo0:dmsetup: Setting UDS index target state to online
Apr 4 21:01:19 s390x-test_host kernel: kvdo0:dmsetup: device 'vdo0' started
Apr 4 21:01:19 s390x-test_host kernel: uds: kvdo0:dedupeQ: loading or rebuilding index: dev=/dev/disk/by-id/scsi-36001405c542b9af52194fe2bb1a43300 offset=4096 size=2781704192
Apr 4 21:01:19 s390x-test_host kernel: uds: kvdo0:dedupeQ: Using 2 indexing zones for concurrency.
Apr 4 21:01:19 s390x-test_host kernel: uds: kvdo0:dedupeQ: config and volume geometries are inconsistent: UDS Error: Corrupt saved component (1030)
Apr 4 21:01:19 s390x-test_host kernel: uds: kvdo0:dedupeQ: could not allocate index: UDS Error: Corrupt saved component (1030)
Apr 4 21:01:19 s390x-test_host kernel: uds: kvdo0:dedupeQ: failed to create index: UDS Error: Corrupt saved component (1030)
Apr 4 21:01:19 s390x-test_host kernel: uds: kvdo0:dedupeQ: Failed to make router: UDS Error: Corrupt saved component (1030)
Apr 4 21:01:19 s390x-test_host kernel: uds: kvdo0:dedupeQ: Failed loading or rebuilding index: UDS Error: Corrupt saved component (1030)
Apr 4 21:01:19 s390x-test_host kernel: kvdo0:dedupeQ: Error opening index dev=/dev/disk/by-id/scsi-36001405c542b9af52194fe2bb1a43300 offset=4096 size=2781704192: UDS Error: Corrupt file (1030)
Apr 4 21:01:19 s390x-test_host kernel: uds: kvdo0:dedupeQ: creating index: dev=/dev/disk/by-id/scsi-36001405c542b9af52194fe2bb1a43300 offset=4096 size=2781704192
Apr 4 21:01:19 s390x-test_host kernel: uds: kvdo0:dedupeQ: Using 2 indexing zones for concurrency.

I was able to reproduce this by using an iSCSI target/initiator setup.

s390x initiator IQN: iqn.1994-05.com.redhat:a8dba1df5742
aarch64 initiator IQN: iqn.1994-05.com.redhat:393fd68e864b

Target setup (in this case, I used System 1):
1. targetcli backstores/block create name=base dev=/dev/loop0
   Created block storage object base using /dev/loop0.
2. targetcli iscsi/ create iqn.2019-04.com.redhat.vdo:testtarget
   Created target iqn.2019-04.com.redhat.vdo:testtarget.
   Created TPG 1.
   Global pref auto_add_default_portal=true
   Created default portal listening on all IPs (0.0.0.0), port 3260.
3. targetcli iscsi/iqn.2019-04.com.redhat.vdo:testtarget/tpg1/acls create iqn.1994-05.com.redhat:a8dba1df5742
   Created Node ACL for iqn.1994-05.com.redhat:a8dba1df5742
4. targetcli iscsi/iqn.2019-04.com.redhat.vdo:testtarget/tpg1/luns create lun=100 /backstores/block/base
   Created LUN 100.
   Created LUN 100->100 mapping in node ACL iqn.1994-05.com.redhat:a8dba1df5742
5. targetcli iscsi/iqn.2019-04.com.redhat.vdo:testtarget/tpg1/acls/ create iqn.1994-05.com.redhat:393fd68e864b

Initiator setup:

System 1 (aarch64):
1. iscsiadm --mode discovery --type sendtargets --portal 192.168.121.2
   192.168.121.2:3260,1 iqn.2019-04.com.redhat.vdo:testtarget
2. iscsiadm --mode node --target iqn.2019-04.com.redhat.vdo:testtarget -l
   Logging in to [iface: default, target: iqn.2019-04.com.redhat.vdo:testtarget, portal: 192.168.121.2,3260] (multiple)
   Login to [iface: default, target: iqn.2019-04.com.redhat.vdo:testtarget, portal: 192.168.121.2,3260] successful.
3. ls -l /dev/disk/by-path/ip-192.168.121.2:3260-iscsi-iqn.2019-04.com.redhat.vdo:testtarget-lun-100
4. vdo create --name vdo0 --device /dev/disk/by-path/ip-192.168.121.2:3260-iscsi-iqn.2019-04.com.redhat.vdo:testtarget-lun-100
5. dd if=/dev/urandom of=/dev/mapper/vdo0 bs=1M count=10 oflag=direct
6. vdo stop --name vdo0
7. iscsiadm --mode node --target iqn.2019-04.com.redhat.vdo:testtarget -u

System 2 (s390x):
1. Copy contents of /etc/vdoconf.yml from System 1 to /etc/vdoconf.yml
2. iscsiadm --mode discovery --type sendtargets --portal 192.168.121.2
   192.168.121.2:3260,1 iqn.2019-04.com.redhat.vdo:testtarget
3. iscsiadm --mode node --target iqn.2019-04.com.redhat.vdo:testtarget -l
   Logging in to [iface: default, target: iqn.2019-04.com.redhat.vdo:testtarget, portal: 192.168.121.2,3260] (multiple)
   Login to [iface: default, target: iqn.2019-04.com.redhat.vdo:testtarget, portal: 192.168.121.2,3260] successful.
4. ls -l /dev/disk/by-path/ip-192.168.121.2:3260-iscsi-iqn.2019-04.com.redhat.vdo:testtarget-lun-100
5. vdo start --name vdo0
6. grep 'config and volume geometries are inconsistent' /var/log/messages
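For reference, the alternative mentioned in the description, always writing the header in a fixed byte order regardless of host endianness, would look roughly like the following sketch (hypothetical field names again; this is not the actual UDS serialization code):

#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Encode a 32-bit value as little-endian bytes, independent of the
 * host byte order. */
static void put_le32(uint8_t *buf, uint32_t v)
{
    buf[0] = (uint8_t)v;
    buf[1] = (uint8_t)(v >> 8);
    buf[2] = (uint8_t)(v >> 16);
    buf[3] = (uint8_t)(v >> 24);
}

/* Decode a little-endian 32-bit value, independent of the host byte order. */
static uint32_t get_le32(const uint8_t *buf)
{
    return (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
           ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
}

/* The same hypothetical geometry fields as in the earlier sketch. */
struct geometry {
    uint32_t chapters_per_volume;
    uint32_t pages_per_chapter;
};

/* With an explicit layout, the header bytes mean the same thing on
 * x86_64, aarch64, ppc64le, and s390x alike. */
static void encode_geometry(uint8_t *page, const struct geometry *g)
{
    put_le32(page + 0, g->chapters_per_volume);
    put_le32(page + 4, g->pages_per_chapter);
}

static void decode_geometry(struct geometry *g, const uint8_t *page)
{
    g->chapters_per_volume = get_le32(page + 0);
    g->pages_per_chapter = get_le32(page + 4);
}

int main(void)
{
    uint8_t page[8];
    struct geometry in = { 1024, 256 }, out;

    encode_geometry(page, &in);
    decode_geometry(&out, page);
    assert(memcmp(&in, &out, sizeof(in)) == 0);
    return 0;
}

As the description notes, this alone would not help volumes that were already written in native byte order, which is why the bug is fixed by ignoring the header-page comparison instead.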
Yes! This version of the doc text is accurate:

> As a consequence, any deduplication advice stored in the UDS index prior to being
> overwritten is lost. VDO is then unable to deduplicate newly written data against
> the data that was stored before you moved the volume, leading to lower space savings.
Yes, the updated doc text is accurate.
Thanks! Removing the needinfo.
RHEL-8.0.0 (kmod-kvdo-6.2.0.293-50.el8):

Aug 20 12:55:11 ibm-z-122 kernel: kvdo3:journalQ: VDO commencing normal operation
Aug 20 12:55:11 ibm-z-122 kernel: kvdo3:dmsetup: Setting UDS index target state to online
Aug 20 12:55:11 ibm-z-122 kernel: kvdo3:dmsetup: device 'vdo0' started
Aug 20 12:55:11 ibm-z-122 kernel: uds: kvdo3:dedupeQ: loading or rebuilding index: dev=/dev/disk/by-id/scsi-360a980003246694a412b456733453433 offset=4096 size=2781704192
Aug 20 12:55:11 ibm-z-122 kernel: uds: kvdo3:dedupeQ: Using 2 indexing zones for concurrency.
Aug 20 12:55:11 ibm-z-122 kernel: uds: kvdo3:dedupeQ: config and volume geometries are inconsistent: UDS Error: Corrupt saved component (1030)
Aug 20 12:55:11 ibm-z-122 kernel: uds: kvdo3:dedupeQ: could not allocate index: UDS Error: Corrupt saved component (1030)
Aug 20 12:55:11 ibm-z-122 kernel: uds: kvdo3:dedupeQ: failed to create index: UDS Error: Corrupt saved component (1030)
Aug 20 12:55:11 ibm-z-122 kernel: uds: kvdo3:dedupeQ: Failed to make router: UDS Error: Corrupt saved component (1030)
Aug 20 12:55:11 ibm-z-122 kernel: uds: kvdo3:dedupeQ: Failed loading or rebuilding index: UDS Error: Corrupt saved component (1030)
Aug 20 12:55:11 ibm-z-122 kernel: kvdo3:dedupeQ: Error opening index dev=/dev/disk/by-id/scsi-360a980003246694a412b456733453433 offset=4096 size=2781704192: UDS Error: Corrupt file (1030)
Aug 20 12:55:11 ibm-z-122 kernel: uds: kvdo3:dedupeQ: creating index: dev=/dev/disk/by-id/scsi-360a980003246694a412b456733453433 offset=4096 size=2781704192
Aug 20 12:55:11 ibm-z-122 kernel: uds: kvdo3:dedupeQ: Using 2 indexing zones for concurrency.
Aug 20 12:55:11 ibm-z-122 kernel: kvdo3:dmsetup: resuming device 'vdo0'
Aug 20 12:55:11 ibm-z-122 kernel: kvdo3:dmsetup: device 'vdo0' resumed

This also happens when going from RHEL-8.1.0 (kmod-kvdo-6.2.1.134-56.el8) to RHEL-8.0.0. Going the other way around does not try to rebuild the index. When both systems have this fix, there are no issues with index rebuild:

Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:dmsetup: underlying device, REQ_FLUSH: not supported, REQ_FUA: not supported
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:dmsetup: Using write policy sync automatically.
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:dmsetup: loading device 'vdo0'
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:dmsetup: zones: 1 logical, 1 physical, 1 hash; base threads: 5
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:dmsetup: starting device 'vdo0'
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:journalQ: VDO commencing normal operation
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:dmsetup: Setting UDS index target state to online
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:dmsetup: device 'vdo0' started
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:dmsetup: resuming device 'vdo0'
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:dmsetup: device 'vdo0' resumed
Aug 20 13:23:52 ibm-z-122 kernel: uds: kvdo1:dedupeQ: loading or rebuilding index: dev=/dev/sdc offset=4096 size=2781704192
Aug 20 13:23:52 ibm-z-122 kernel: uds: kvdo1:dedupeQ: Using 2 indexing zones for concurrency.
Aug 20 13:23:52 ibm-z-122 UDS/vdodmeventd[2272]: INFO (vdodmeventd/2272) VDO device vdo0 is now registered with dmeventd for monitoring
Aug 20 13:23:52 ibm-z-122 kernel: kvdo1:packerQ: compression is enabled
Aug 20 13:23:52 ibm-z-122 lvm[1833]: Monitoring VDO pool vdo0.
Aug 20 13:23:53 ibm-z-122 kernel: uds: kvdo1:dedupeQ: loaded index from chapter 0 through chapter 0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3548