Bug 660661
| Summary: | fsck.gfs2 reported statfs error after gfs2_grow | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Dominic Geevarghese <dgeevarg> |
| Component: | kernel | Assignee: | Robert Peterson <rpeterso> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 5.5 | CC: | adas, anton, bmarzins, cww, dhoward, edamato, jpirko, jwest, liko, mmahudha, pyaduvan, ssaha, swhiteho, syeghiay |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Due to an off-by-one error, gfs2_grow failed to take the very last "rgrp" parameter into account when adding up the new free space. With this update, the GFS2 kernel properly counts all the new resource groups and fixes the "statfs" file correctly. | Story Points: | --- |
| Clone Of: | | | |
| : | 661048 719762 | Environment: | |
| Last Closed: | 2011-07-21 09:57:09 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 661048, 666792, 719762 | | |
| Attachments: | | | |
Description
Dominic Geevarghese
2010-12-07 14:40:59 UTC
Looks like the new fs size is being correctly cached locally, but probably not written back to disk on umount like it ought to be. I recreated the problem on roth-01. The statfs file is definitely being updated by adjust_fs_space, from the dmesg:

GFS2: fsid=bobs_roth:roth_lv.0: File system extended by 244016 blocks.

Before:
[root@roth-01 ~]# gfs2_edit -p statfs /dev/roth_vg/roth_lv | tail -4
sc_total 732060 0xb2b9c
sc_free 665866 0xa290a
sc_dinodes 16 0x10

After:
[root@roth-01 ~]# gfs2_edit -p statfs /dev/roth_vg/roth_lv | tail -4
sc_total 976076 0xee4cc
sc_free 909882 0xde23a
sc_dinodes 16 0x10

The math is right: 665866 + 244016 = 909882

So it looks more like the file is getting written back, but the new value is wrong. The new value is calculated by function adjust_fs_space as:

fs_total = gfs2_ri_total(sdp);
...
new_free = fs_total - (m_sc->sc_total + l_sc->sc_total);

And that's the value printed in the "File system extended by" message.

During the gfs2_grow, five new rgrps were added to the rindex. Each of them was 61004 blocks. So the correct value for the amount of new free space is 5 * 61004 = 305020, which is what fsck.gfs2 calculated. On the other hand, 4 * 61004 = 244016, which is what the gfs2 kernel decided. Therefore, this is an off-by-one error. The kernel code did not take one of the new rgrps into account for some reason.

Created attachment 466435 [details]
Patch to fix the problem
Solved. Here is a patch to fix the problem.
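The arithmetic in the analysis above can be replayed in a few lines. This is only a sketch of the accounting, not the kernel's adjust_fs_space(): the helper `new_free` and its `rgrps_counted` parameter are hypothetical names introduced for illustration, and the model simply assumes that the buggy total skipped the last of the five new rgrps. All numbers come from the roth-01 reproduction.

```python
# Illustrative model only; this is NOT the kernel's adjust_fs_space().
# All figures are taken from the roth-01 reproduction in this report.

OLD_SC_TOTAL = 732060      # sc_total before gfs2_grow
RGRP_BLOCKS = 61004        # blocks in each newly added rgrp
NEW_RGRPS = 5              # rgrps appended to the rindex by gfs2_grow


def new_free(old_total, rgrp_sizes, rgrps_counted):
    """Hypothetical helper: mimic new_free = fs_total - old_total when only
    the first `rgrps_counted` new rgrps contribute to fs_total."""
    fs_total = old_total + sum(rgrp_sizes[:rgrps_counted])
    return fs_total - old_total


sizes = [RGRP_BLOCKS] * NEW_RGRPS
buggy = new_free(OLD_SC_TOTAL, sizes, NEW_RGRPS - 1)   # last rgrp skipped
fixed = new_free(OLD_SC_TOTAL, sizes, NEW_RGRPS)       # all rgrps counted

print(buggy)            # 244016, the value the kernel logged
print(fixed)            # 305020, the value fsck.gfs2 calculated
print(fixed - buggy)    # 61004, exactly one rgrp
```

The old total cancels out of the result, so the shortfall depends only on how many of the new rgrps are summed.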
Requesting ack flags.

Tested on roth-01 where I could recreate the problem.

[root@roth-01 ../gfs-kernel/src/gfs]# dmesg | tail -3
GFS2: fsid=bobs_roth:roth_lv.0: File system extended by 305020 blocks.
dlm: roth_lv: leaving the lockspace group...
dlm: roth_lv: group event done 0 0
[root@roth-01 ../gfs-kernel/src/gfs]# fsck.gfs2 /dev/roth_vg/roth_lv
Initializing fsck
Validating Resource Group index.
Level 1 RG check.
(level 1 passed)
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
Pass5 complete
gfs2_fsck complete

Cloned as bug #661048 for crosswriting to RHEL6.x.

The patch was accepted into the upstream -nmw git tree.

I posted the patch for inclusion into RHEL5.7. Changing status to POST.

The experimental version may temporarily be downloaded here:
http://people.redhat.com/rpeterso/Experimental/RHEL5.x/gfs2/gfs2.660661.ko
This module has not gone through the Red Hat Quality Engineering process, so as always, use at your own risk.

Changing component to kernel as it should be.

Hi,

I have used the hotfix kernel per bz 660661, comment # 16. Unfortunately my test environment reported the same "statfs" error while testing the patch.

[root@dhcp210-53 ~]# uname -a
Linux dhcp210-53.gsslab.pnq.redhat.com 2.6.18-238.1.1.el5 #1 SMP Tue Jan 4 13:32:19 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@dhcp210-53 ~]# lsmod | grep gfs
gfs2 524204 1 lock_dlm
configfs 62045 2 dlm

At first, a plain "modinfo gfs2" returned nothing, so I ran modinfo against the module file directly:

[root@dhcp210-53 ~]# modinfo gfs2
[root@dhcp210-53 ~]#
[root@dhcp210-53 ~]# modinfo /lib/modules/2.6.18-238.1.1.el5/kernel/fs/gfs2/gfs2.ko
filename: /lib/modules/2.6.18-238.1.1.el5/kernel/fs/gfs2/gfs2.ko
license: GPL
author: Red Hat, Inc.
description: Global File System
srcversion: 39378B8C32BD3F6A7DDDBBA
depends:
vermagic: 2.6.18-238.1.1.el5 SMP mod_unload gcc-4.1
module_sig: 883f3504d236b93286a51799a18fc80112986f09f5db76d36a58112b15063494c37c41de94e84ab3709f5c29552feab9d1aed9c0af898527bfb574de92b3
[root@dhcp210-53 ~]#
[root@dhcp210-53 ~]# mkfs -t gfs2 -p lock_dlm -t domxen:gfs2 -j 2 /dev/domvg/domlv
This will destroy any data on /dev/domvg/domlv.
Are you sure you want to proceed? [y/n] y
Device: /dev/domvg/domlv
Blocksize: 4096
Device Size 1.87 GB (489472 blocks)
Filesystem Size: 1.87 GB (489471 blocks)
Journals: 2
Resource Groups: 8
Locking Protocol: "lock_dlm"
Lock Table: "domxen:gfs2"
UUID: 8CF32642-97A5-B434-57F7-15580F846192
[root@dhcp210-53 ~]# gfs2_edit -p statfs /dev/domvg/domlv | tail -4
sc_total 489416 0x777c8
sc_free 423222 0x67536
sc_dinodes 16 0x10
------------------------------------------------------
[root@dhcp210-53 ~]# pvcreate /dev/sda2 ; vgextend domvg /dev/sda2 ; lvextend -L 3G /dev/domvg/domlv
Physical volume "/dev/sda2" successfully created
Volume group "domvg" successfully extended
Extending logical volume domlv to 3.00 GB
Logical volume domlv successfully resized
[root@dhcp210-53 ~]# gfs2_grow /gfs
FS: Mount Point: /gfs
FS: Device: /dev/mapper/domvg-domlv
FS: Size: 489471 (0x777ff)
FS: RG size: 61181 (0xeefd)
DEV: Size: 786432 (0xc0000)
The file system grew by 1160MB.
gfs2_grow complete.
[root@dhcp210-53 ~]# df -h
...
/dev/mapper/domvg-domlv 2.6G 259M 2.4G 10% /gfs
...
[root@dhcp210-53 ~]# umount /gfs ; fsck.gfs2 /dev/domvg/domlv
Initializing fsck
Validating Resource Group index.
Level 1 RG check.
(level 1 passed)
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
Pass5 complete
The statfs file is wrong:

Current statfs values:
blocks: 672944 (0xa44b0)
free: 606750 (0x9421e)
dinodes: 16 (0x10)

Calculated statfs values:
blocks: 734120 (0xb33a8)
free: 667926 (0xa3116)
dinodes: 16 (0x10)

Okay to fix the master statfs file? (y/n)n
The statfs file was not fixed.
gfs2_fsck complete
[root@dhcp210-53 ~]# gfs2_edit -p statfs /dev/domvg/domlv | tail -4
sc_total 672944 0xa44b0
sc_free 606750 0x9421e
sc_dinodes 16 0x10
------------------------------------------------------

Thanks,
Dominic

Dominic had some problems testing the hotfix, but it turns out that the gfs2 overlay module was still loaded. Before trying the patch, be sure to (1) remove the overlay from the disk, (2) remove the overlay from memory, and (3) make sure the new version is running in memory with dmesg | grep GFS2

[root@dhcp210-53 ~]# dmesg | grep built | grep GFS2
GFS2 Overlay (built May 29 2008 16:48:00) installed
[root@dhcp210-53 ~]# rpm -e kmod-gfs2
[root@dhcp210-53 ~]# rmmod lock_dlm gfs2
[root@dhcp210-53 ~]#

in kernel-2.6.18-241.el5

You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
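The failed retest on dhcp210-53 shows the same signature as the original reproduction, which fits the explanation that the old overlay module was still in use. A rough check, using only the sc_total values printed in that transcript: the helper name `growth_shortfall` is hypothetical, and reading the shortfall as "one rgrp" assumes the new rgrps are all the same size, which the transcript does not state.

```python
# Sketch only: compares what the kernel recorded in statfs after gfs2_grow
# with what fsck.gfs2 calculated. Numbers are from the dhcp210-53 transcript.

def growth_shortfall(before_total, current_total, calculated_total):
    """Hypothetical helper: return (counted, expected, shortfall) in blocks."""
    counted = current_total - before_total       # growth the kernel recorded
    expected = calculated_total - before_total   # growth fsck.gfs2 calculated
    return counted, expected, expected - counted


counted, expected, shortfall = growth_shortfall(
    before_total=489416,      # sc_total right after mkfs
    current_total=672944,     # "Current statfs values: blocks"
    calculated_total=734120,  # "Calculated statfs values: blocks"
)

print(counted, expected, shortfall)                  # 183528 244704 61176
print(expected // shortfall, expected % shortfall)   # 4 0
print(counted // shortfall, counted % shortfall)     # 3 0
```

Both growth figures are exact multiples of the shortfall (four versus three), so under the equal-size assumption the recorded total is short by exactly one resource group. The free-block counts differ by the same 61176 blocks (667926 - 606750).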
SCENARIO - [after_grow] Check that fsck is clean after growfs

Creating 2G LV aftergrow on dash-01
Creating file system on /dev/fsck/aftergrow with options '-p lock_dlm -j 1 -t dash:aftergrow' on dash-01
Device: /dev/fsck/aftergrow
Blocksize: 4096
Device Size 2.00 GB (524288 blocks)
Filesystem Size: 2.00 GB (524288 blocks)
Journals: 1
Resource Groups: 8
Locking Protocol: "lock_dlm"
Lock Table: "dash:aftergrow"
UUID: 8FAD9CAD-A522-3FCA-D80C-4166288756C4

Mounting gfs2 /dev/fsck/aftergrow on dash-01 with opts ''
Extending LV aftergrow by +2G on dash-01
Growing /dev/fsck/aftergrow on dash-01
FS: Mount Point: /mnt/fsck
FS: Device: /dev/mapper/fsck-aftergrow
FS: Size: 524288 (0x80000)
FS: RG size: 65533 (0xfffd)
DEV: Size: 1048576 (0x100000)
The file system grew by 2048MB.
gfs2_grow complete.
Unmounting /mnt/fsck on dash-01
Starting fsck.gfs2 of /dev/fsck/aftergrow on dash-01
fsck.gfs2 output in /tmp/gfs_fsck_stress.24262/3.after_grow/1.fsck-dash-01.log
Removing LV aftergrow on dash-01

$ cat /tmp/gfs_fsck_stress.24262/3.after_grow/1.fsck-dash-01.log
Initializing fsck
Validating Resource Group index.
Level 1 RG check.
(level 1 passed)
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
Pass5 complete
gfs2_fsck complete

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Due to an off-by-one error, gfs2_grow failed to take the very last "rgrp" parameter into account when adding up the new free space. With this update, the GFS2 kernel properly counts all the new resource groups and fixes the "statfs" file correctly.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA.
For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html