Bug 1519377
| Summary: | Filesystem gets corrupted when VDO is filled | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jakub Krysl <jkrysl> |
| Component: | kmod-kvdo | Assignee: | Bryan Gurney <bgurney> |
| Status: | CLOSED DUPLICATE | QA Contact: | Jakub Krysl <jkrysl> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.5 | CC: | awalsh, bgurney, chorn, fcami, jkrysl, ldelouw, limershe, madko, msakai, sfroemer, sudo, sweettea |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-04-11 14:15:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1517911 | | |
| Attachments: | | | |
**Description** (Jakub Krysl, 2017-11-30 16:25:39 UTC)
When you run into the "VDO full" condition, do you end up trying to "Grow Physical", or run a fsck against the volume before mounting it (or after the first failed attempt at mounting it)? I wonder if the issue isn't 'corruption', but more that we have no space to work, and therefore the system just can't operate the mount at that point. We've seen such behaviors regularly when one of our tests, with a filesystem atop VDO, accidentally runs out of space.

When VDO receives a write request to some logical address, it allocates a free block and will write the data at that location if it neither dedupes nor compresses. If that logical address was mapped to some physical address before this write happened, that old physical address might be freed after the logical address has been updated (if no other logical address maps there). XFS, and other filesystems, have a journal storing pending writes. The journal is in a fixed location, so from VDO's point of view the same logical addresses are being written over and over with unique data. In order for a filesystem to recover, it has to do some writes, possibly to the journal region.

I suspect (I have not confirmed) that dm-thin overwrites data in place: once a logical address is mapped to a particular physical address, later writes to that logical address will overwrite that particular physical address. This means a filesystem that ran out of space on dm-thin can recover as long as it's only overwriting data and not writing to new addresses. Filesystems on VDO can't write at all, even overwrites, when VDO is full, because we never overwrite old data before the new data is on disk.

I tested increasing the physical size and the filesystem is mountable again, so it is not corrupted. But this is just a workaround, as increasing the physical size might not be possible at the moment, or even at all. So the preferred fix is to duplicate thinp behaviour as closely as possible, with the goal of giving the user access to his data. This might even mean locking the VDO, disabling dedupe and setting it to read-only... as long as the user can access his data. If that is not possible at all, this behaviour should be documented really well along with the workaround. Also part of this solution are probably some early warnings (RFE BZ 1519307) saying the user should prepare more physical space asap. The reason for preferring a fix is that some filesystems might not handle this very gracefully; one example is ext4, which basically gets stuck writing at very slow speed (in B/s) while spamming error messages for every byte it cannot write.

We have considered a fix before for similar reasons: XFS's previous behavior (now tunable, and not the default) was to infinitely retry failed writes, which for obvious reasons caused problems on a VDO that ran out of space. Out of curiosity:

1. Does mounting read-only work?
2. Does relocating the XFS journal to a non-VDO block device allow us to mount the filesystem on a full VDO volume?

I note that it's not enough to mount read-only; you need to be able to mount as a writable filesystem with discards enabled to clean up space. Either we need to figure out how to combine optimized and non-optimized storage to allow us to get mounted and be able to do discards, or we need to suggest that users keep some storage in reserve when provisioning VDO, or maybe even do it by default so that support has a safety valve.

The problem with putting in a reserve is that you can only tap into it until you use it all. How many times is acceptable to run into that situation?
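To make the reserve idea concrete, here is a hedged sketch of the two-volume scheme spelled out in the next comment. The device name /dev/mapper/vdo1 and the volume group name vdo_vg are illustrative assumptions, not taken from this bug.

```
# Hypothetical sketch: split the VDO logical space into 'actual' (for data)
# and 'reserve' (sacrificial space that can be handed back to VDO later).
pvcreate /dev/mapper/vdo1
vgcreate vdo_vg /dev/mapper/vdo1
lvcreate -L 10G -n reserve vdo_vg
lvcreate -l 100%FREE -n actual vdo_vg

# Fill the reserve with incompressible, non-duplicate data so that VDO
# actually allocates physical blocks for it (dd stops with an out-of-space
# error at the end of the LV; that is expected):
dd if=/dev/urandom of=/dev/vdo_vg/reserve bs=1M oflag=direct

# Emergency: when VDO is physically full, overwrite the reserve with zeroes.
# VDO recognizes zero blocks and frees the physical blocks the reserve held,
# so the filesystem on 'actual' can be recovered and cleaned up.
dd if=/dev/zero of=/dev/vdo_vg/reserve bs=1M oflag=direct

# Afterwards, refill the reserve with random data to re-arm it.
dd if=/dev/urandom of=/dev/vdo_vg/reserve bs=1M oflag=direct
```

As noted above, each refill only buys one recovery; once the reserve is spent, the same problem returns.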
Once we've tapped out the reserves, the same problem applies. Documenting that we recommend using a storage medium that can be expanded, and telling the user to use a reserve of their own making, is the approach I've been trying to state at this point, but that's not a solution other than "You're doing it wrong" when they run into the issue.

One way to have a reserve from the user's perspective is to partition the VDO logical space into two logical volumes, say 'reserve' and 'actual', fill 'reserve' with /dev/urandom, and then only use 'actual'. If you run out of space on 'actual', you can overwrite 'reserve' with /dev/zero to free some VDO space, recover the filesystem on 'actual' and delete some stuff, then fill 'reserve' with random data again to get a reserve back.

growPhysical is failing as well.

    [root@rhel75beta ~]# vdo growPhysical -n vdo1
    vdo: ERROR - Cannot grow physical on VDO vdo1; device-mapper: message ioctl on vdo1 failed: Invalid argument
    vdo: ERROR - device-mapper: message ioctl on vdo1 failed: Invalid argument

In the end, it seems to be an unrecoverable error.

Hi; can you check in journalctl for messages from VDO about what the invalid argument was?

(In reply to Luc de Louw from comment #9)
> growPhysical is failing as well.
>
> [root@rhel75beta ~]# vdo growPhysical -n vdo1
> vdo: ERROR - Cannot grow physical on VDO vdo1; device-mapper: message ioctl on vdo1 failed: Invalid argument
> vdo: ERROR - device-mapper: message ioctl on vdo1 failed: Invalid argument
>
> In the end, it seems to be an unrecoverable error.

Hi there, journalctl does not provide any information:

    Jan 26 21:55:15 rhel75beta.example.com vdo[1450]: ERROR - device-mapper: message ioctl on vdo1 failed: Invalid argument
    Jan 26 22:00:00 rhel75beta.example.com kernel: kvdo0:dmsetup: Preparing to resize physical to 28835840
    Jan 26 22:00:00 rhel75beta.example.com kernel: kvdo0:dmsetup: Done preparing to resize physical
    Jan 26 22:00:00 rhel75beta.example.com kernel: kvdo0:dmsetup: suspending device 'vdo1'
    Jan 26 22:00:01 rhel75beta.example.com kernel: kvdo0:dmsetup: device 'vdo1' suspended
    Jan 26 22:00:01 rhel75beta.example.com kernel: kvdo0:dmsetup: Requested physical block count 28835840 not greater than 28835840
    Jan 26 22:00:01 rhel75beta.example.com vdo[1802]: ERROR - Cannot grow physical on VDO vdo1; device-mapper: message ioctl on vdo1 failed: Invalid argument
    Jan 26 22:00:01 rhel75beta.example.com kernel: kvdo0:dmsetup: resuming device 'vdo1'
    Jan 26 22:00:01 rhel75beta.example.com kernel: kvdo0:dmsetup: device 'vdo1' resumed
    Jan 26 22:00:01 rhel75beta.example.com vdo[1802]: ERROR - device-mapper: message ioctl on vdo1 failed: Invalid argument

Hi Luc;
> Jan 26 22:00:01 rhel75beta.example.com kernel: kvdo0:dmsetup: Requested physical block count 28835840 not greater than 28835840
This indicates the storage under VDO hasn't expanded, so VDO can't expand into new space. Without new space, VDO still doesn't have any more free blocks, so it cannot accept more writes.
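For reference, the recovery flow reported to work earlier in this bug ("I tested increasing the physical size and the filesystem is mountable again") looks roughly like the sketch below. It assumes, purely for illustration, that VDO sits on an LVM logical volume /dev/vg0/vdo_backing; in a VM, the first step corresponds to growing the virtual disk and making sure the guest actually sees the new size.

```
# 1. Grow the storage *underneath* VDO first.
lvextend -L +10G /dev/vg0/vdo_backing

# 2. Only then ask VDO to claim the new space. If step 1 never reached the
#    device VDO sits on, this fails with "Requested physical block count
#    ... not greater than ...", as in the log above.
vdo growPhysical --name=vdo1

# 3. With free physical blocks available again, the filesystem mounts, and
#    deleting files plus discarding returns space to VDO.
mount /dev/mapper/vdo1 /mnt
fstrim /mnt   # return freed filesystem blocks to VDO as discards
```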
(In reply to Sweet Tea Dorminy from comment #12)
> Hi Luc;
>
> > Jan 26 22:00:01 rhel75beta.example.com kernel: kvdo0:dmsetup: Requested physical block count 28835840 not greater than 28835840
>
> This indicates the storage under VDO hasn't expanded, so VDO can't expand into new space. Without new space, VDO still doesn't have any more free blocks, so it cannot accept more writes.

That is strange....

    hypervisor:/vm-images# qemu-img resize rhel75beta-vdo-disk.qcow2 +10G
    Image resized.

    [root@rhel75beta ~]# fdisk -l /dev/vdb
    Disk /dev/vdb: 118.1 GB, 118111600640 bytes, 230686720 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    [root@rhel75beta ~]#

There should be 10G available on the (virtual) physical disk.

(In reply to Sweet Tea Dorminy from comment #12)
> This indicates the storage under VDO hasn't expanded, so VDO can't expand into new space. Without new space, VDO still doesn't have any more free blocks, so it cannot accept more writes.

I have grown the disk by an additional 100GB, so it's more than double the original size, and tried again. Same result, different numbers:

    [ 14.958349] kvdo0:dmsetup: Preparing to resize physical to 57671680
    [ 14.961344] kvdo0:dmsetup: Done preparing to resize physical
    [ 14.963874] kvdo0:dmsetup: suspending device 'vdo1'
    [ 15.182655] kvdo0:dmsetup: device 'vdo1' suspended
    [ 15.186369] kvdo0:dmsetup: Requested physical block count 57671680 not greater than 57671680

This is so odd. There is nothing inherent about a grow physical operation which would cause it to fail, even if the existing storage is completely full. Could I suggest you move this issue to a new BZ, so that we can try to work it out independently of the out-of-space issue?
Some more information might help, as well. Can you get stats out of this VDO volume, to see how much space it thinks it's using? Also, a longer log of the recent operations related to this VDO might help us figure out what's happening here.
As an example, one way to get this error is to do a successful grow physical operation, and then immediately launch another. The second grow physical would fail in this way.
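For anyone gathering the requested information: the statistics can be pulled with the vdostats and vdo tools that ship with the vdo package (the volume name vdo1 follows the earlier comments):

```
# Physical and logical usage plus space savings, per VDO volume:
vdostats --human-readable

# Configuration and detailed kernel statistics for one volume:
vdo status --name=vdo1
```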
> [ 14.958349] kvdo0:dmsetup: Preparing to resize physical to 57671680
> [ 14.961344] kvdo0:dmsetup: Done preparing to resize physical
> [ 14.963874] kvdo0:dmsetup: suspending device 'vdo1'
> [ 15.182655] kvdo0:dmsetup: device 'vdo1' suspended
> [ 15.186369] kvdo0:dmsetup: Requested physical block count 57671680 not greater than 57671680
Based on these messages, it appears that the VDO volume believes it is already using all 220G of your device.
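That reading is consistent with the numbers in the logs, since VDO counts 4 KB blocks:

```
# block count * 4096 bytes, expressed in GiB
echo $(( 28835840 * 4096 / 1024 / 1024 / 1024 ))   # 110 (first attempt; exactly the 118111600640 bytes fdisk reports for /dev/vdb)
echo $(( 57671680 * 4096 / 1024 / 1024 / 1024 ))   # 220 (second attempt, after the further grow)
```

In both attempts the requested physical block count equaled what VDO already had, which is why growPhysical refused to proceed.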
Looking at this again after several months, the issue with the grow physical operation may well be an instance of bug 1582647.

Please note that I see this in Red Hat Enterprise Linux 8 Snapshot 1 (the latest at the time of writing).

The KCS article "Managing Thin Provisioning with Virtual Data Optimizer" has been published at https://access.redhat.com/articles/3966841

*** This bug has been marked as a duplicate of bug 1657152 ***

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.