Bug 490649
Summary: | GFS2: gfs2_grow fails on a full file system
---|---
Product: | Red Hat Enterprise Linux 5
Reporter: | Christine Caulfield <ccaulfie>
Component: | gfs2-utils
Assignee: | Ben Marzinski <bmarzins>
Status: | CLOSED ERRATA
QA Contact: | Cluster QE <mspqa-list>
Severity: | low
Priority: | high
Version: | 5.3
CC: | adas, casmith, ccoffey, iannis, mjuricek, rpeterso, sbradley, swhiteho, tao
Target Milestone: | rc
Target Release: | 5.5
Hardware: | All
OS: | Linux
Fixed In Version: | gfs2-utils-0.1.62-29.el5
Doc Type: | Bug Fix
Doc Text: |
In order to grow a gfs2 filesystem, gfs2 needs to add additional resource groups to manage the new space. gfs2_grow does this by writing to the rindex file. If there are no free blocks available in the filesystem at its current size, and the last block of the rindex file is too full to add another resource group entry, gfs2_grow will be unable to write out the necessary information for gfs2 to be able to use the new space. When this happens, gfs2_grow is unable to grow the filesystem.
This problem can only happen on filesystems where the last block of the rindex file is too full to add another resource group entry. Whether or not this is the case depends on the filesystem size, the block size, and the resource group size.
If this problem occurs, gfs2_grow will report "Error writing new rindex entries;aborted." In this case, the user must remove or truncate a file to free up space for gfs2_grow to complete. Once the filesystem has been grown, the file can safely be added back to the gfs2 filesystem.
|
: | 659123 711451
Last Closed: | 2011-07-21 11:02:05 UTC
Bug Depends On: | 626585, 661904
Bug Blocks: | 711451
Description
Christine Caulfield
2009-03-17 14:05:26 UTC
Looks like we should preallocate an extra block for the rindex. We could do that at mkfs time easily, but we can't do that at grow time until we have support for fallocate(). The fallocate() mode in question needs checking to ensure that we don't fall foul of any of the unwritten rules of gfs2 by allowing files to grow beyond the height dictated by the file size. So I think this will have to remain a "feature" for now. It will only affect filesystems where the rindex entries fill a complete block, so that it's not possible to add even a single extra entry without further allocations. A potential workaround for production scenarios would be to use quotas to ensure that at least one spare block is kept on the filesystem at all times.

See my notes in bug #498469. If we implement my new statfs_fast patch in gfs2, we can perhaps unlink the old statfs sync file and reuse that block for writing to the rindex file. Then, once the new rg is in place, we can recreate said file for the next time this happens, or some such. There are concerns, however, which I noted in that bug record.

I may need to bump the priority on this one. I had a user who did a gfs2_grow on a full file system. The gfs2_grow program wrote the new RG info to the rindex file until it ran out of space. When it ran out of blocks, it couldn't write to the rindex any more, but the alarming thing is that it didn't write a multiple of 96 bytes because of block boundaries, and therefore the rindex file was left with an invalid dinode size. That confused both the kernel and the rindex repair function of fsck.gfs2. This was on gfs2-utils-0.1.62-1.el5. I hand-patched the dinode size with gfs2_edit to a multiple of 96 and the file system was usable again. Some of the new rgs were there, which meant the file system was usable and a subsequent gfs2_grow was able to fix the problem.
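The size check behind that hand-patch can be sketched in a few lines. This is only an illustration, assuming the 96-byte rindex entry size mentioned in the comment above; the function names are hypothetical, rounding down is just one possible choice of "a multiple of 96", and this is not how gfs2_edit or fsck.gfs2 implement the repair.

```python
RINDEX_ENTRY_SIZE = 96  # bytes per rindex entry, as stated in this report


def is_valid_rindex_size(size: int) -> bool:
    """Return True if size is a whole number of rindex entries."""
    return size >= 0 and size % RINDEX_ENTRY_SIZE == 0


def truncate_to_whole_entries(size: int) -> int:
    """Round a damaged size down to the nearest multiple of the entry size.

    Hypothetical helper: it drops the partially written trailing entry.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    return size - (size % RINDEX_ENTRY_SIZE)
```

For example, a write that stopped at a 4096-byte block boundary leaves a size that is not a multiple of 96; rounding down gives 4032, which is 42 whole entries.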
I think I changed gfs2_grow not that long ago (within the last year) so that it writes the first rindex entry first, then writes the rest in a big chunk. That may fix the problem for 99% of the users just by moving to the latest code. However, it doesn't solve the case where the rindex file is full AND its last block isn't big enough to hold another entry. We still need to figure out a way to squeeze out one more block. (Previously I had suggested deleting a system inode like statfs temporarily, but perhaps there's a better way.) I'd like to make gfs2_grow figure out if there are any free blocks to work with (and/or free space in the last block of rindex) and take extra measures if there is neither. The other possibility is that the initial 96-byte write to rindex failed to trigger more free blocks in the kernel code for the subsequent writes to rindex, which would be another bug to slay.

Hello, Ran into this bug yesterday. I was able to grow the filesystem after removing some data. However, even though df reports plenty of space, upon writing data to the filesystem, out of space errors are generated. After taking the filesystem offline and fsck'ing, I was able to successfully use the new space. Coincidence? Didn't test to see if it was the remount that was the fix. Is it possible to increase the priority of this bug? Thank you.

What kernel and gfs2-utils version were you using? Are you using a cluster? If so, how is it set up? How did you fill it up? I used kernel-2.6.18-197.el5, gfs2-utils-0.1.62-20.el5, and I filled it up by dd'ing one massive file to take up almost all of the space, and then I created a bunch of empty files until I got the number of available blocks reported by df to say 0. When I tried to grow it after this, it worked fine.

Sorry for the lack of info: 2.6.18-128.1.14.el5 gfs2-utils-0.1.62-1.el5. A user filled it up doing a copy over nfs to the gfs2 filesystem. This is a single node installation.
I just did another test as I was a bit foggy on this issue:
- created 3GB gfs2 filesystem
- filled it up with 10MB files
- filled it up more with smaller files
- at this point I'm out of space
- I extend the lv by 1GB
- gfs2_grow the lv ... it succeeds ... weird!

But at this point even though df reports space available, I cannot create new files as I am told there is no space left on the device. I unmounted the filesystem, then remounted, and I can now write files. So, this test case was a bit different as I didn't run into the issue with the gfs2_grow reporting inability to write rindex files. FYI .. details ....

===============================================================================
df -kl
/dev/mapper/PDS_VG-test_bug_lv 3145376 3144656 720 100% /test_bug
[root@pds test_bug]# lvextend -L+1G /dev/PDS_VG/test_bug_lv
Extending logical volume test_bug_lv to 4.00 GB
Logical volume test_bug_lv successfully resized
[root@pds test_bug]# gfs2_grow /dev/PDS_VG/test_bug_lv
FS: Mount Point: /test_bug
FS: Device: /dev/mapper/PDS_VG-test_bug_lv
FS: Size: 786431 (0xbffff)
FS: RG size: 65534 (0xfffe)
DEV: Size: 1048576 (0x100000)
The file system grew by 1024MB.
gfs2_grow complete.
df -kl
/dev/mapper/PDS_VG-test_bug_lv 3931712 3144656 787056 80% /test_bug
[root@pds test_bug]# for i in `seq 500 600`; do dd if=/dev/zero of=file_$i count=1 bs=1M;done
dd: opening `file_500': No space left on device
dd: opening `file_501': No space left on device
===============================================================================

Hope this helps, thanks.

The bug you describe in Comment #7 is bug #482756. It has been fixed in the 2.6.18-174.el5 kernel, so it looks like we're back to only the original problem.

Comment #11 indicates there may have been rgrp corruption that fsck.gfs2 was unable to fix. I've been working on a complex patch that is hopefully better at fixing this kind of damage.
I'm doing so in the name of bug #576640, which is a RHEL6 bug, and so I don't have a back-port to RHEL5.5 yet. The patch passes my brutal "gfs2_fsck_hellfire" test, but it still needs some testing. Once I get the fix tested, I'll back-port the patch to RHEL5.x. If the problem is really due to damage, I'm hoping the patch will correctly identify and fix the problem. If the problem is actually due to the file system being full, today's gfs2_grow has no way to extend it. There's no simple solution. Our team has discussed ways we can solve the problem but we haven't tried to implement any of those ideas yet. This work is somewhat backed up behind other fsck.gfs2 work I have pending.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
In order to grow a gfs2 filesystem, gfs2 needs to add additional resource groups to manage the new space. gfs2_grow does this by writing to the rindex file. If there are no free blocks available in the filesystem at its current size, and the last block of the rindex file is too full to add another resource group entry, gfs2_grow will be unable to write out the necessary information for gfs2 to be able to use the new space. When this happens, gfs2_grow is unable to grow the filesystem.
This problem can only happen on filesystems where the last block of the rindex file is too full to add another resource group entry. Whether or not this is the case depends on the filesystem size, the block size, and the resource group size.
If this problem occurs, gfs2_grow will report "Error writing new rindex entries;aborted." In this case, the user must remove or truncate a file to free up space for gfs2_grow to complete. Once the filesystem has been grown, the file can safely be added back to the gfs2 filesystem.

Created attachment 461327 [details]
information on how much space is necessary to grow the filesystem.
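The condition described in the technical note, whether the last rindex block still has room for one more entry, comes down to simple arithmetic. The sketch below is an illustration only and is not taken from the attachment. It assumes the 96-byte entry size quoted earlier in this report, ignores stuffed rindex files and any preallocated blocks, and takes the usable bytes per block as a caller-supplied assumption, since that may be smaller than the filesystem block size if each block carries a header. Both function names are hypothetical.

```python
RINDEX_ENTRY_SIZE = 96  # bytes per rindex entry, as stated in this report


def free_bytes_in_last_block(num_entries: int, usable_block_bytes: int) -> int:
    """Bytes still unused in the last allocated rindex block.

    usable_block_bytes is an assumption supplied by the caller; it may be
    smaller than the filesystem block size if blocks carry a header.
    """
    if num_entries < 0 or usable_block_bytes <= 0:
        raise ValueError("num_entries must be >= 0 and usable_block_bytes > 0")
    used_in_last = (num_entries * RINDEX_ENTRY_SIZE) % usable_block_bytes
    # A remainder of 0 means the last block is exactly full (or no block exists).
    return 0 if used_in_last == 0 else usable_block_bytes - used_in_last


def can_append_without_allocation(num_entries: int, usable_block_bytes: int) -> bool:
    """True if one more whole entry fits in space that is already allocated."""
    return free_bytes_in_last_block(num_entries, usable_block_bytes) >= RINDEX_ENTRY_SIZE
```

With 42 entries and 4096 usable bytes per block, only 64 bytes remain in the last block, so a 43rd entry would need a new block. With 12 entries there is plenty of room and no allocation is needed for the next entry.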
Ben, the RHEL57 branch is open, so you should be able to commit this now.

Ben's patch is in the RHEL57 branch of the git tree. I also built this into gfs2-utils-0.1.62-29.el5. Changing status to Modified.

Verified against kernel-2.6.18-261.el5 and gfs2-utils-0.1.62-30.el5. When gfs2_grow finishes with an error, it shouldn't return a zero exit code. Please fix that.

[root@a3:~]$ uname -a
Linux a3 2.6.18-261.el5 #1 SMP Thu May 12 16:47:19 EDT 2011 ia64 ia64 ia64 GNU/Linux
(10:16:41) [root@a3:~]$ rpm -q gfs2-utils
gfs2-utils-0.1.62-30.el5
[root@a3:/opt]$ gfs2_grow /mnt/test
FS: Mount Point: /mnt/test
FS: Device: /dev/mapper/vg1-lv1
FS: Size: 524288 (0x80000)
FS: RG size: 65533 (0xfffd)
DEV: Size: 53311488 (0x32d7800)
The file system grew by 206200MB.
Error writing new rindex entries;aborted.
gfs2_grow complete.
(10:08:53) [root@a3:/opt]$ echo $?
0

My reproducer now grows the filesystem correctly. Could you please provide some more information about your failing test? What would be most helpful is if you could give me:
1. The initial size of the logical volume, by running # lvs --units s before making the filesystem.
2. The command used to create the filesystem.
3. The size of the logical volume after growing, by running # lvs --units s after you resize the lv.

I'm not sure how we want to deal with this. This is a completely different bug than the one fixed here. It has nothing to do with fallocate. However, it still can cause gfs2_grow to fail to grow a completely full filesystem. Here's the issue. GFS2 is writing to the rindex file a page at a time. The fix for this bug (and the bugs it depends on) made sure that at the end of the last block, there was enough space for another resource group. For that to make any difference at all, block size must be equal to page size. If the page size is bigger than the block size, you may still need to do an allocation to write out the first page of data to the rindex file. That's the general problem.
What you are seeing is a corner case. When the rindex file is stuffed, there's not enough space for a page full of data, even if the page size and block size are equal. So you will still need to do an allocation to write the entire page. The end result is that if your rindex file is stuffed, or if your page size doesn't equal your block size (and your file doesn't have enough allocated blocks to be aligned on a page boundary anyway), it's still possible that you will need to do an allocation before you can write any resource group entries to the rindex file, which will cause a hang.

The easiest solution is to have gfs2_grow do what it did in RHEL5, which is to write out the first resource group entry by itself, and then write out the rest. The fallocate fix will guarantee that there is enough space for one more rindex entry, and once you write that one, there will be tons of space for the rest of the rindex file. Another possible solution is to make mkfs.gfs2 always create an unstuffed rindex file, and then to make fallocate always allocate enough space to make the file a multiple of the page size.

Created attachment 500962 [details]
Patch to make gfs2_grow write one resource group first, and then the rest
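The write ordering this patch describes (first entry alone, then everything else) can be illustrated with a short sketch. This is not the attached patch, which changes the gfs2_grow C source; it is a hypothetical Python model of the ordering only, assuming 96-byte entries as stated earlier in this report. The function names are made up for illustration.

```python
import os
from typing import Sequence

RINDEX_ENTRY_SIZE = 96  # bytes per rindex entry, as stated in this report


def _write_all(fd: int, data: bytes) -> int:
    """Write all of data to fd, looping over short writes; OSError propagates."""
    total = 0
    while total < len(data):
        total += os.write(fd, data[total:])
    return total


def write_first_then_rest(fd: int, entries: Sequence[bytes]) -> int:
    """Write the first entry on its own, then the remaining entries in one chunk.

    Returns the total number of bytes written.
    """
    if not entries:
        return 0
    for entry in entries:
        if len(entry) != RINDEX_ENTRY_SIZE:
            raise ValueError("every entry must be exactly RINDEX_ENTRY_SIZE bytes")
    # Step 1: a single small write that fits in the space reserved for one entry.
    written = _write_all(fd, entries[0])
    # Step 2: the remaining entries, written together as one larger chunk.
    rest = b"".join(entries[1:])
    if rest:
        written += _write_all(fd, rest)
    return written
```

The point of the split is that the first write is small enough to land in space that already exists; per the comment above, once that entry is in place the filesystem has new free space for the larger second write.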
Like I mentioned in comment #27, the general problem here is that gfs2_grow must be able to write the first page of data out without allocating, and this can also happen when the page size and the block size aren't equal. To test that, you can use the same LV sizes from Comment #25, but when you make the filesystem, make it with -b 1024. That will give you a non-page-aligned rindex file (but with the smaller block size, it won't be stuffed). The only other difference is that you won't be able to fill all of the blocks of the filesystem by simply touching files in the filesystem's root directory. The file creation will fail, but df will still show blocks left. You will need to create a couple of directories, and keep touching files in there, until the filesystem really is out of free blocks. Without this fix, that test fails. With the patch, the filesystem will grow correctly.

Posted.

Comment on attachment 500962 [details]
Patch to make gfs2_grow write one resource group first, and then the rest

This fix is being dealt with in Bug #711451.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2011-1042.html