Bug 1535476
| Summary: | VDO allows removal with backing device missing | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jakub Krysl <jkrysl> |
| Component: | vdo | Assignee: | Joseph Chapman <jochapma> |
| Status: | CLOSED ERRATA | QA Contact: | Jakub Krysl <jkrysl> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.5 | CC: | awalsh, bgurney, bjohnsto, emilne, jkrysl, limershe, mpatalan, pasik |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | 6.1.2.35 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1684248 | Environment: | |
| Last Closed: | 2019-08-06 13:08:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1577173, 1684248 | | |

Description
Jakub Krysl
2018-01-17 13:34:11 UTC
Versions used:
vdo-6.1.0.114-14
kmod-kvdo-6.1.0.114-11
iscsi-initiator-utils-6.2.0.874-6

Also, when said LUN is added later ('iscsiadm -m node -l') and removed again ('iscsiadm -m node -u'), the same call trace appears on removal. So the call trace itself might be an issue in iscsi-initiator-utils. These two steps can be repeated with the same result. I will hold off on filing a BZ against the iscsi component until it is clear how this state happened...

BZ 1534568 appears to have almost the same call trace: "sysfs group ffffffffad4e4100 not found for kobject '3:0:0:0'". I expect you will find this was caused by the sysfs changes introduced in kernel-3.10.0-828.el7. I suggest you re-test with -827; if the problem does not appear, this would be a duplicate of bug 1534568.

With kernel -827 this is not reproducible. I only occasionally get this:
[ 117.902936] blk_update_request: I/O error, dev dm-3, sector 4294967168
[ 117.909471] blk_update_request: I/O error, dev dm-3, sector 4294967168
[ 117.915992] Buffer I/O error on dev dm-3, logical block 536870896, async page read
So that part will be solved in BZ 1534568, as it is a duplicate of that one. What is missing here is the message "ERROR: Device under VDO not found. Overwrite with --force". Changing the name to reflect that.

The reasoning behind this is that vdo cannot clean up the superblock when it cannot reach the device itself. The server admin should be made aware of this remnant so they can clean it up themselves and avoid errors like: "vdo: ERROR - vdoformat: Cannot format device already containing a valid VDO!"

Tested on:
RHEL-7.6-20180626.0
kernel-3.10.0-915.el7
kmod-vdo-6.1.1.99-1.el7
vdo-6.1.1.99-2.el7
Now, when removing a VDO volume whose backing device is missing, the admin is notified in the terminal about the missing device and pointed to the solution using --force. But at that point the VDO device has already been removed; only the superblock is not cleared:
# vdo remove --name vdo --verbose
Removing VDO vdo
Stopping VDO vdo
dmsetup status vdo
mount
udevadm settle
dmsetup remove vdo
vdo: ERROR - Device /dev/disk/by-id/scsi-360fff19abdd9b56dfb9bf59625f4c9f7 not found. Remove VDO with --force.
[root@storageqe-74 vdo]# vdo remove --name vdo --verbose
Removing VDO vdo
Stopping VDO vdo
dmsetup status vdo
vdo: ERROR - Device /dev/disk/by-id/scsi-360fff19abdd9b56dfb9bf59625f4c9f7 not found. Remove VDO with --force.
# iscsiadm -m node -l
Logging in to [iface: default, target: iqn.2001-05.com.equallogic:0-af1ff6-6db5d9bd9-f7c9f42596f59bfb-vdo-general, portal: *****] (multiple)
Login to [iface: default, target: iqn.2001-05.com.equallogic:0-af1ff6-6db5d9bd9-f7c9f42596f59bfb-vdo-general, portal: *****] successful.
[root@storageqe-74 vdo]# vdo remove --name vdo --verbose
Removing VDO vdo
Stopping VDO vdo
dmsetup status vdo
dd if=/dev/zero of=/dev/disk/by-id/scsi-360fff19abdd9b56dfb9bf59625f4c9f7 oflag=direct bs=4096 count=1
/var/log/messages:
[162456.240454] kvdo2:dmsetup: suspending device 'vdo'
[162456.265472] kvdo2:dmsetup: synchronous flush failed: System error -5 (-5)
[162456.300191] kvdo2:dmsetup: suspend of device 'vdo' failed with error: -5
[162456.332379] kvdo2:dmsetup: stopping device 'vdo'
[162456.356679] kvdo2:journalQ: Completing write VIO of type 10 for physical block 679128 with error: System error -5 (-5)
[162456.407045] kvdo2:dmsetup: stopKernelLayer: Close device failed -5 (System error -5: System error -5)
[162456.450874] kvdo2:dmsetup: uds: kvdo2:dedupeQ: index_0: beginning save (vcn 4294967295)
[162456.451745] uds: kvdo2:dedupeQ: save: cannot prepare index ris_prepareSave: Input/output error (5)
uds: kvdo2:dedupeQ: index_0: save failed
[162456.451750] uds: kvdo2:dedupeQ: index router save state problem: Input/output error (5)
uds: kvdo2:dedupeQ: ignoring error from saveIndexSession: Input/output error (5)
[162456.473634] kvdo2:dedupeQ: Error closing index dev=/dev/disk/by-id/scsi-360fff19abdd9b56dfb9bf59625f4c9f7 offset=4096 size=2781704192: System error 5 (5)
[162456.685616] Setting UDS index target state to closed
[162456.753027] kvdo2:dmsetup: device 'vdo' stopped
All those messages in /var/log/messages appear when the command 'vdo remove' is run the first time. Although all changes requested in comment #6 are done, all these errors could be prevented if vdo stopped the removal at the first sign of trouble. As this BZ is about the removal itself, not just comment #6, and VDO still allows the removal, I am not sure this BZ should be closed. We should probably discuss this further to come up with a proper way to tackle those new errors.
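
The `dd` command in the verbose output above overwrites the first 4096-byte block of the backing device with zeros, which only works once the device is reachable again. A rough Python equivalent is sketched below for illustration. The helper name `zero_first_block` is hypothetical and not part of vdo, and unlike `dd oflag=direct` this sketch uses a buffered write followed by `fsync` rather than direct I/O.

```python
import os

BLOCK_SIZE = 4096  # matches bs=4096 in the dd invocation above


def zero_first_block(path, block_size=BLOCK_SIZE):
    """Overwrite the first block of `path` with zeros (like dd count=1).

    Hypothetical helper for illustration; not the vdo manager's code.
    Raises FileNotFoundError if the device node is missing.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        written = os.pwrite(fd, bytes(block_size), 0)
        if written != block_size:
            raise OSError(f"short write: {written} of {block_size} bytes")
        os.fsync(fd)  # flush to the device before reporting success
    finally:
        os.close(fd)
```

If the device path does not exist, `os.open` fails before anything is written, which is the situation the error message in this bug is meant to surface to the admin.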
vdo-6.1.2.41-4.el7.x86_64
VDO now reports the error before actually doing anything. Reconnecting the device starts the VDO again.
# vdo create --name vdo --device /dev/sdm --verbose
Creating VDO vdo
grep MemAvailable /proc/meminfo
pvcreate --config devices/scan_lvs=1 -qq --test /dev/sdm
blkid -p /dev/sdm
modprobe kvdo
vdoformat --uds-checkpoint-frequency=0 --uds-memory-size=0.25 /dev/disk/by-id/scsi-360fff19aad198f7e32a79578cd06903b
vdodumpconfig /dev/disk/by-id/scsi-360fff19aad198f7e32a79578cd06903b
Starting VDO vdo
dmsetup status --target vdo vdo
grep MemAvailable /proc/meminfo
modprobe kvdo
vdodumpconfig /dev/disk/by-id/scsi-360fff19aad198f7e32a79578cd06903b
dmsetup create vdo --uuid VDO-5e56e8a1-fd8c-426e-9ec1-a8dbc16e12c4 --table '0 12557488 vdo /dev/disk/by-id/scsi-360fff19aad198f7e32a79578cd06903b 4096 disabled 0 32768 16380 on auto vdo ack=1,bio=4,bioRotationInterval=64,cpu=2,hash=1,logical=1,physical=1'
dmsetup status --target vdo vdo
Starting compression on VDO vdo
dmsetup message vdo 0 compression on
vdodmeventd -r vdo
dmsetup status --target vdo vdo
VDO instance 2 volume is ready at /dev/mapper/vdo
# iscsiadm -m node -u
Logging out of session [sid: 1, target: iqn.2001-05.com.equallogic:0-af1ff6-7e8f19ad9-3b9006cd7895a732-vdo-small, portal: XXXXXX,3260]
Logout of [sid: 1, target: iqn.2001-05.com.equallogic:0-af1ff6-7e8f19ad9-3b9006cd7895a732-vdo-small, portal: XXXXXX,3260] successful.
# vdo remove --all --verbose
Removing VDO vdo
vdo: ERROR - Device /dev/disk/by-id/scsi-360fff19aad198f7e32a79578cd06903b not found. Remove VDO with --force.
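
The behaviour in the transcript above (fail on the missing backing device before any teardown step runs) can be sketched roughly as follows. This is a minimal illustration, not the vdo manager's actual implementation: `MissingDeviceError` and `plan_removal` are hypothetical names, and what `--force` really does internally is an assumption here.

```python
import os


class MissingDeviceError(Exception):
    """Hypothetical error: backing device absent and force not requested."""


def plan_removal(name, device, force=False):
    """Return the teardown commands for a VDO volume as a list of strings.

    The backing device is checked first, so a missing device raises
    before any command is planned or run.
    """
    device_present = os.path.exists(device)
    if not device_present and not force:
        raise MissingDeviceError(
            f"Device {device} not found. Remove VDO with --force."
        )

    steps = [f"dmsetup remove {name}"]
    if device_present:
        # Clear the first block only when the device can be reached.
        steps.append(
            f"dd if=/dev/zero of={device} oflag=direct bs=4096 count=1"
        )
    # Assumption: with force and no device, the on-disk metadata is left
    # behind and the admin has to clean it up once the device returns.
    return steps
```

Doing the check up front means none of the suspend, flush, or index save steps are attempted against an unreachable device, which is what avoids the -5 (I/O error) messages seen in the earlier log.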
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:2233