Bug 1115201
| Summary: | [xfs] can't create inodes in newly added space after xfs_growfs | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Boris Ranto <branto> |
| Component: | kernel | Assignee: | Eric Sandeen <esandeen> |
| Status: | CLOSED ERRATA | QA Contact: | Zorro Lang <zlang> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 7.1 | CC: | bfoster, dchinner, dmick, eguan, esandeen, g.fhnrunznrqeqf, hamiller, icolle, jkt, leif, pasteur, szhao, vikumar |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | kernel-3.10.0-210.el7 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-03-05 12:26:32 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description (Boris Ranto, 2014-07-01 20:59:23 UTC)
Thanks for the details, I'll take a look. I have a hunch this is related to the incore superblock counters. What is the "empty files" worker? If you change it from empty files to 1-byte files, does the behavior change? Another question: does "stat -f /mnt/point" also get it going again? Out of curiosity, how does one distinguish "ENOSPC inodes" from "ENOSPC blocks"?

```
fd = open(O_CREAT) = ENOSPC      -> no inodes
write(fd, buf, size) = ENOSPC    -> no blocks
```

Hi Eric, the empty files worker is just something like this:

```shell
(i=0; while touch /mnt/point/$i; do i=$(($i+1)); done) &
```

but it could also be reproduced with untarred kernel sources, so 1-byte files probably would not help. I'll have to retest with the stat -f call. By the way, the reproducer script is fairly simple, just something like this:

```shell
lvcreate -L 128M -n test some_vg
mkfs.xfs /dev/mapper/some_vg-test
mount /dev/mapper/some_vg-test /mnt/point/
(i=0; while touch /mnt/point/$i; do i=$(($i+1)); done) &
lvextend -L 1280M /dev/mapper/some_vg-test
xfs_growfs /dev/mapper/some_vg-test
# wait for the background worker to run out of inode space;
# then the following also fails
touch /mnt/point/file
# a remount will update the inode space
mount -o remount /mnt/point
# now we can continue to create new files
(i=0; while touch /mnt/point/$i; do i=$(($i+1)); done) &
```

I've retested with the stat -f call, and that did not get it going again.

Ok, different problem than one I've encountered before, then. Thanks, -Eric

When we saw similar allocation failures in our testing of ICE, I noticed that sometimes I could get a fail-to-touch (an inode allocation failure, I assume) and then, some time later, a success for the same path, with no reboot or remount. Of course root is not static, and some background process might have been freeing inodes with file ops, but... just another data point.

I understand Eric has a bead on this issue and is working on a fix.

Yes, it looks like it's a problem in RHEL6 as well as in RHEL7.
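Eric's open()/write() distinction above can be approximated from the shell as well; here is a minimal sketch, assuming GNU coreutils `df` (the mount-point argument is a hypothetical illustration, not from this report):

```shell
#!/bin/sh
# Rough ENOSPC triage: when touch fails with "No space left on device" but
# free data blocks remain, suspect inode allocation rather than block
# allocation. Caveat from this very bug: df -i still showed IUse% of 1%,
# so a low free-inode count is not required to hit the growfs problem.
mnt="${1:-/}"                                    # mount point to inspect
free_blocks=$(df -P  "$mnt" | awk 'NR==2 {print $4}')
free_inodes=$(df -Pi "$mnt" | awk 'NR==2 {print $4}')
echo "$mnt: free blocks=$free_blocks, free inodes=$free_inodes"
```
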
FWIW, closing a loop: "mount -o remount /" also solves the problem on my VM testcase. Sorry, incorrect; it looked like it helped, but no. I bet if you do:

```shell
# mount -o remount,inode64
```

it'll fix it. Sent a patch to the upstream list.

Finally managed to confirm that "remount" does not fix it but "remount,inode64" does, in my test scenario. (I take it back; it's not an issue in RHEL6.)

Dan, is this something that needs attention and fixing prior to RHEL7.1's release, or should we just include the fix along with all the other RHEL7.1 xfs updates?

Well, our particular test case is "provision a VM from a cloud image using cloud-init", which is not going to be a core use case; however, I don't know how many other "provision from the minimal image" situations might matter more. I'm pretty sure the original root image is distributed by Red Hat for cloud-provisioning purposes, and, given that its root is 6 GB, it would not surprise me if most people who use it are growfs'ing root (for simplicity; who wants an evanescent cloud image with multiple disks?). In short, I guess I don't *know* of the various business segments that might get hit by this, but I feel like there are probably more than my sorta-weird case (the only thing really weird being the use of the cloud-init package itself, cloud-init being an Ubuntu-created utility). Asking around here, I gather cloud-init is fairly integral to OpenStack and thus RHEL-OSP, so that is probably a vote for this being more important.

It's not just a matter of growing the filesystem, but growing it *and* then creating many more inodes, right? Is the original image so tight that you can barely create any new inodes right out of the gate? Anyway, the question comes down to whether you can document this away for now ("after growing this image, unmount and remount it to ensure that all space is available; this will be fixed in RHEL7.1") or whether we need to ship a z-stream kernel with the fix before RHEL7.1 GA...
-Eric

Yes, it involves some inode creation, and I don't know how much; in my use case it was "three Ceph OSDs", which do create quite a deep filesystem hierarchy even by default, so it could be that it's more intense than some other uses. But the failing number of inodes in use was reported as "1%" (presumably of the new size) by df -i, so I was assuming it wasn't all that many. As for shipping urgency, I can't really speak to that.

Ok, well, barring information to the contrary, we'll just ship it with RHEL7.1. -Eric

This is a failing state:

```
$ df -i
Filesystem       Inodes IUsed     IFree IUse% Mounted on
/dev/vda1     104856400 61504 104794896    1% /
```

and after remount,inode64 and retrying the failed operation, this is how many inodes the failing operation had been requesting:

```
$ df -i
Filesystem       Inodes IUsed     IFree IUse% Mounted on
/dev/vda1     104856400 61551 104794849    1% /
```

*** Bug 1149912 has been marked as a duplicate of this bug. ***

I can't tell whether this is the same bug as described in the XFS FAQ, here, and was wondering if anyone could comment: http://xfs.org/index.php/XFS_FAQ#Q:_Why_do_I_receive_No_space_left_on_device_after_xfs_growfs.3F

Excerpt: "Unfortunately, v3.7 also added a bug, present from kernel v3.7 to v3.17, which caused new allocation groups added by growfs to be unavailable for inode allocation. This was fixed by commit 9de67c3b ("xfs: allow inode allocations in post-growfs disk space") in kernel v3.17. Without that commit, the problem can be worked around by doing a "mount -o remount,inode64" after the growfs operation."

Since we are on kernel 3.10 in RHEL7, we fall into the range where the above bug occurs. The filesystem in question was /usr, and I'm not aware that it was using an abnormally large number of inodes _prior_ to the growfs, which makes me question whether or not that is required to reproduce, or whether the FAQ growfs bug simply differs from the bug being discussed here.
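The two df -i listings above differ only in IFree; a trivial arithmetic check of the delta shows how few inodes the previously failing operation actually needed once the grown space became allocatable:

```shell
# IFree values copied from the two df -i snapshots in this report
before=104794896   # IFree while creates were failing with ENOSPC
after=104794849    # IFree after remount,inode64 and the successful retry
echo "inodes consumed: $((before - after))"   # prints: inodes consumed: 47
```

This supports the point above that the failure is not about running out of inodes in the usual sense: only a few dozen inodes were needed, yet allocation failed until the new allocation groups were made available.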
What I do know is that after growing the filesystem, mkdir and touch would both fail in that filesystem with "No space left on device". I ran the remount as described in the FAQ, which appeared to successfully work around the problem. Unfortunately, I failed to make a note of the inode stats before and after running the remount.

It is the same bug, fixed with the same patch. This bug addresses the problem, and the next released kernel will have the fix. It's not just a matter of a large number of inodes being used prior to the grow: if all the space is used up for any reason (i.e. data in files!) and growfs doesn't present new space as available for new inodes, inode allocation will fail.

Thanks, Eric. Do you happen to know the ETA for the RHEL7 kernel release that will include the fix for this?

It's slated for RHEL7.1, but I don't know if I can divulge schedules. Do you have a support/partner contact? If it's critical, they could request a z-stream update.

Patch(es) available in kernel-3.10.0-210.el7.

Tested by running xfs/015: reproduced on kernel -200, test passed on kernel -220.

The duplicate bug (1149912) mentions commit 9de67c3ba9ea961ba420573d56479d09d33a7587 ("xfs: allow inode allocations in post-growfs disk space"), but it was pointed out to me on IRC that this might not be enough. Was commit 7a1df1561609c14ac457d65d9a4a2b6c0f4204ad ("xfs: fix premature enospc on inode allocation") planned for inclusion as well? (I've hit the same overall problem on 7.0, but honestly can't say whether the original fix is enough.)

The second commit is not yet included in RHEL7; at the time this bug was filed and fixed, that upstream commit didn't exist.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0290.html
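Pulling the version information in this report together, here is a rough sketch for checking whether a RHEL7 kernel release string already carries the fix (first fixed build: kernel-3.10.0-210.el7). It assumes `sort -V` is a good-enough approximation of RPM version ordering; rpmdev-vercmp would be more precise, and the sample release strings other than -210 are hypothetical illustrations.

```shell
#!/bin/sh
# Succeeds if the given kernel release sorts at or after the first fixed
# build. Only an approximation of full RPM NVR comparison.
fixed="3.10.0-210.el7"
have_fix() {
    running="$1"
    # the fix is present if the fixed version sorts first (or is equal)
    [ "$(printf '%s\n' "$fixed" "$running" | sort -V | head -n1)" = "$fixed" ]
}

have_fix "3.10.0-229.el7" && echo "3.10.0-229.el7: has the fix"
have_fix "3.10.0-123.el7" || echo "3.10.0-123.el7: affected"
```

Note that this only covers the first fix; per the closing comments above, the follow-up upstream commit 7a1df1561609 was not yet included in RHEL7 when this bug was closed.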