Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
+++ This bug was initially created as a clone of Bug #1578353 +++
Description of problem:
-----------------------
According to Dennis Keefe:
Use these settings for XFS in RHHI, so that XFS can handle the condition where VDO runs out of physical space.
echo 4 | tee /sys/fs/xfs/dm-13/error/metadata/ENOSPC/max_retries
echo 4 | tee /sys/fs/xfs/dm-13/error/metadata/EIO/max_retries
A gdeploy bug (BZ 1523562) was fixed to address this request, but it turned out that these max_retries values are not persisted in the config files.
This bug is required to make these changes persistent.
As of now there is no framework to persist changes in sysfs, let alone apply them automatically when a filesystem is mounted.
The closest we have is tuned, and that's not really suited to the task. We've talked about the possibility of something like a udev hook to trigger a script which will set these parameters when a filesystem gets mounted. We'd need to do the configuration by uuid, presumably, since device names can change.
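To make the by-uuid idea concrete, here is a minimal sketch of the script such a hook might run. It is not an existing tool: the function name set_xfs_max_retries and the variables SYSFS_XFS_ROOT and BY_UUID_DIR are hypothetical, and how the script would be triggered at mount time is left open. It assumes the filesystem is already mounted and that /dev/disk/by-uuid/<uuid> resolves to the kernel device name used under /sys/fs/xfs (for example dm-13).

```shell
#!/bin/sh
# Sketch only: set the XFS metadata error retry limits for a filesystem
# identified by UUID instead of by device name.
# SYSFS_XFS_ROOT and BY_UUID_DIR are hypothetical overrides, present only so
# the logic can be exercised against a fake directory tree.
SYSFS_XFS_ROOT="${SYSFS_XFS_ROOT:-/sys/fs/xfs}"
BY_UUID_DIR="${BY_UUID_DIR:-/dev/disk/by-uuid}"

# Usage: set_xfs_max_retries UUID RETRIES
set_xfs_max_retries() {
    uuid=$1
    retries=$2

    # Resolve the by-uuid symlink to the current kernel device (e.g. /dev/dm-13).
    dev=$(readlink -f "$BY_UUID_DIR/$uuid") || return 1
    name=$(basename "$dev")

    for err in ENOSPC EIO; do
        knob="$SYSFS_XFS_ROOT/$name/error/metadata/$err/max_retries"
        # The sysfs entry only exists while the filesystem is mounted.
        if [ ! -e "$knob" ]; then
            echo "missing: $knob" >&2
            return 1
        fi
        echo "$retries" > "$knob" || return 1
    done
}
```

It would be called as "set_xfs_max_retries <filesystem-uuid> 4" after the filesystem is mounted. Because sysfs settings are not retained, the call would have to be repeated on every mount.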
The good news is that this may not be a problem with the default settings in RHEL 7.5, in which case RHEL 7.5 systems would not need any additional settings for an XFS filesystem on a VDO volume.
In previous tests on RHEL 7.2 (running kernel-3.10.0-327.36.1.el7.x86_64), when the VDO volume ran out of space, XFS would encounter an ENOSPC metadata error, and then retry forever:
(logs from a test system on October 14, 2016)
Oct 14 16:28:34 localhost kernel: [ 5411.311086] XFS (dm-2): metadata I/O error: block 0x1fe00008 ("xfs_buf_iodone_callbacks") error 28 numblks 8
Oct 14 16:28:34 localhost kernel: [ 5411.360685] XFS (dm-2): Failing async write on buffer block 0x1fe00028. Retrying async write.
Oct 14 16:28:34 localhost kernel: [ 5411.360691] XFS (dm-2): Failing async write on buffer block 0x1fe00020. Retrying async write.
Oct 14 16:28:34 localhost kernel: [ 5411.360694] XFS (dm-2): Failing async write on buffer block 0x1fe00008. Retrying async write.
At this point the system was hung, because XFS was endlessly retrying metadata I/O, which could not succeed until a "vdo growPhysical" operation was run on the VDO volume. But XFS would need to be remounted after a "vdo growPhysical" operation, and there was no way to make it stop retrying other than restarting the system.
However, on a RHEL 7.5 system (3.10.0-862.el7.x86_64), the same scenario results in the following events from XFS:
(logs from a test system on May 15, 2018)
May 15 13:56:38 localhost kernel: XFS (dm-3): metadata I/O error: block 0x5fd1180 ("xlog_iodone") error 28 numblks 64
May 15 13:56:38 localhost kernel: XFS (dm-3): xfs_do_force_shutdown(0x2) called from line 1222 of file fs/xfs/xfs_log.c. Return address = 0xffffffffc0847e20
May 15 13:56:38 localhost kernel: XFS (dm-3): Log I/O Error Detected. Shutting down filesystem
May 15 13:56:38 localhost kernel: XFS (dm-3): Please umount the filesystem and rectify the problem(s)
At this point, after unmounting the filesystem, and after "vdo growPhysical" is successfully executed to add physical space to the VDO volume, the XFS filesystem can be remounted, and the file writes that were interrupted by the out-of-space condition can be retried.