Bug 641461
Summary: | online resize of LV/ext4 corrupts data | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Benjamin Kahn <bkahn> |
Component: | lvm2 | Assignee: | Zdenek Kabelac <zkabelac> |
Status: | CLOSED ERRATA | QA Contact: | Corey Marthaler <cmarthal> |
Severity: | high | Docs Contact: | |
Priority: | urgent | ||
Version: | 6.0 | CC: | agk, antillon.maurizio, benl, coughlan, ddumas, dwysocha, esandeen, heinzm, jbrassow, joe.thornber, mbroz, plyons, pm-eus, prajnoha, prockai, rwheeler, syeghiay, tmichael, zkabelac |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | lvm2-2.02.72-8.el6_0.2 | Doc Type: | Bug Fix |
Doc Text: |
This update avoids data corruption caused by a failure to detect that a filesystem being resized with 'fsadm' (or lvresize/lvreduce --resizefs) is mounted. The update also fixes various other problems in 'fsadm' including incorrect handling of user's break action, inconsistent processing of the '--dry-run' option, missing support for correctly passing the '--yes' option, and incorrect handling of the 'LVM_BINARY' environment variable.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2010-11-10 19:00:31 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 638052 | ||
Bug Blocks: |
Description
Benjamin Kahn
2010-10-08 18:56:50 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
This update fixes various problems in 'fsadm' including incorrect handling of user's break action, inconsistent processing of the '--dry-run' option, missing support for correctly passing the '--yes' option, incorrect handling of the 'LVM_BINARY' environment variable, and failing detection of mounted filesystem when udev was used.

Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-This update fixes various problems in 'fsadm' including incorrect handling of user's break action, inconsistent processing of the '--dry-run' option, missing support for correctly passing the '--yes' option, incorrect handling of the 'LVM_BINARY' environment variable, and failing detection of mounted filesystem when udev was used.
+This update avoids data corruption caused by a failure to detect that a filesystem being resized with 'fsadm' (or lvresize/lvreduce --resizefs) is mounted. The update also fixes various other problems in 'fsadm' including incorrect handling of user's break action, inconsistent processing of the '--dry-run' option, missing support for correctly passing the '--yes' option, and incorrect handling of the 'LVM_BINARY' environment variable.

I ran the series of commands that Eric provided in bug 638052, and I'll add that test case to the lvm regression tests. The lvextend cmd no longer forces the resize, marking verified.
[root@hayes-01 tmp]# touch pvfile
[root@hayes-01 tmp]# truncate --size 12g pvfile
[root@hayes-01 tmp]# losetup /dev/loop0 pvfile
[root@hayes-01 tmp]# pvcreate /dev/loop0
Physical volume "/dev/loop0" successfully created
[root@hayes-01 tmp]# vgcreate testvg /dev/loop0
Volume group "testvg" successfully created
[root@hayes-01 tmp]# lvcreate -L 4g testvg
Logical volume "lvol0" created
[root@hayes-01 tmp]# mkfs.ext4 /dev/mapper/testvg-lvol0
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
262144 inodes, 1048576 blocks
52428 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[root@hayes-01 tmp]# mkdir -p /mnt/test
[root@hayes-01 tmp]# mount /dev/mapper/testvg-lvol0 /mnt/test
[root@hayes-01 tmp]# df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/vg_hayes01-lv_root   35G  1.3G   32G   4% /
tmpfs                           3.9G     0  3.9G   0% /dev/shm
/dev/sda1                       485M   36M  425M   8% /boot
/dev/mapper/vg_hayes01-lv_home   29G  172M   27G   1% /home
/dev/mapper/testvg-lvol0        4.0G  136M  3.7G   4% /mnt/test
[root@hayes-01 tmp]# lvextend -L +10G -r /dev/mapper/testvg-lvol0
fsadm: Can not fsck device "/dev/mapper/testvg-lvol0", filesystem mounted on /mnt/test
fsadm failed: 1

2.6.32-71.el6.x86_64

lvm2-2.02.72-8.el6_0.1 BUILT: Mon Oct 11 10:45:21 CDT 2010
lvm2-libs-2.02.72-8.el6_0.1 BUILT: Mon Oct 11 10:45:21 CDT 2010
lvm2-cluster-2.02.72-8.el6_0.1 BUILT: Mon Oct 11 10:45:21 CDT 2010
udev-147-2.29.el6 BUILT: Tue Aug 31 16:44:10 CDT 2010
device-mapper-1.02.53-8.el6_0.1 BUILT: Mon Oct 11 10:45:21 CDT 2010
device-mapper-libs-1.02.53-8.el6_0.1 BUILT: Mon Oct 11 10:45:21 CDT 2010
device-mapper-event-1.02.53-8.el6_0.1 BUILT: Mon Oct 11 10:45:21 CDT 2010
device-mapper-event-libs-1.02.53-8.el6_0.1 BUILT: Mon Oct 11 10:45:21 CDT 2010
cmirror-2.02.72-8.el6_0.1 BUILT: Mon Oct 11 10:45:21 CDT 2010

(In reply to comment #10)
> I ran the series of commands that Eric provided in bug 638052, and I'll add
> that test case to the lvm regression tests. The lvextend cmd no longer forces
> the resize, marking verified.

Hi Corey,
After your lvextend command, did /dev/mapper/testvg-lvol0 grow by 10GB or did the command just exit? Can you provide "df -h" after issuing lvextend?

It stayed at 4G.

[root@hayes-01 tmp]# df -h
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/testvg-lvol0  4.0G  136M  3.7G   4% /mnt/test
[root@hayes-01 tmp]# lvs
LV    VG     Attr   LSize Origin Snap% Move Log Copy% Convert
lvol0 testvg -wi-ao 4.00g

So that means no online resize? I don't think that is fixing the original issue but removing a feature.

So it's just ext4 that had corruption then?

[root@hayes-01 tmp]# lvextend -L +10G -r /dev/mapper/testvg-lvol0
fsadm: Can not fsck device "/dev/mapper/testvg-lvol0", filesystem mounted on /mnt/test
fsadm failed: 1

That looks wrong to me.
Why didn't it perform a resize? (Or at least give a message saying how to do it!) Or is this bug just about stopping the corruption, while the proper fix happens later?

The -only- thing fsadm needs to change, I think, is to simply not invoke fsck at all if it is going to do an online resize on extN. It's not required, and in fact it's not possible (at least not possible w/o corrupting said filesystem in the process).

Exactly - in the case of a mounted modern fs like ext4, the fsck step should be skipped automatically.

FAILS_QA ?

lvextend by default runs the sequence fsck/fsresize - so unless instructed otherwise by the user - it will first try to fsck the given LV - and fsadm detects the mounted partition - and fails the whole resize operation - as expected by the logic: "Do not resize unchecked fs". Also this logic has probably been heavily motivated by the ext2/3/4 request for fsck even for a cleanly unmounted fs (which could be skipped with the force option).

Of course the user can resize an LV online by adding the -n / --nofsck flag together with -r / --resizefs, and it should work on filesystems where online resize is possible. I haven't changed the original logic of lvresize - so it has kept this perhaps slightly confusing working model.

I think there is some potential for improvement - fsadm could return a different error status for fsck on a filesystem that in fact supports online resize and does not need fsck first - so lvresize could recognize this different error status and might continue trying to resize the fs online where it is possible.

(In reply to comment #18)
> lvextend by default runs the sequence fsck/fsresize - so unless instructed
> otherwise by the user - it will first try to fsck the given LV - and fsadm
> detects the mounted partition - and fails the whole resize operation - as
> expected by the logic: "Do not resize unchecked fs".
> Also this logic has probably been heavily motivated by the ext2/3/4 request
> for fsck even for a cleanly unmounted fs (which could be skipped with the
> force option).

You can use whatever logic you like, but as the extN maintainer for Red Hat, I am telling you that you can resize an online filesystem without first (unmounting and) running fsck. Online resize has never required it:

[root@neon tmp]# resize2fs /dev/loop0
resize2fs 1.41.10 (10-Feb-2009)
Please run 'e2fsck -f /dev/loop0' first.

[root@neon tmp]# mount /dev/loop0 mnt/
[root@neon tmp]# resize2fs /dev/loop0
resize2fs 1.41.10 (10-Feb-2009)
Filesystem at /dev/loop0 is mounted on /tmp/mnt; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 2
Performing an on-line resize of /dev/loop0 to 524288 (1k) blocks.
The filesystem on /dev/loop0 is now 524288 blocks long.

Failing the resize in fsadm by default is an interesting extension of your logic, but you can do what you like with the fsadm tool, I suppose. If you prefer that approach, I would just stop earlier and error out with "fsadm will not resize mounted filesystems."

-Eric

I didn't quite understand the bug when I "verified" it. After discussing it though, it's clear that this bug should be put back into assigned (FAILS_QA).

This looks like a minimal patch to change this behaviour. It could probably also be considered a bugfix for fsadm check mounted_fs. With this patch it now returns 0 - meaning mounted fs is OK.

diff --git a/scripts/fsadm.sh b/scripts/fsadm.sh
old mode 100644
new mode 100755
index 945089a..a981a6e
--- a/scripts/fsadm.sh
+++ b/scripts/fsadm.sh
@@ -399,7 +399,10 @@ resize() {
 ###################
 check() {
     detect_fs "$1"
-    detect_mounted && error "Cannot fsck device \"$VOLUME\", filesystem is mounted on $MOUNTED"
+    if detect_mounted ; then
+        verbose "Skipping fsck device \"$VOLUME\" as filesystem is mounted on $MOUNTED";
+        cleanup 0
+    fi
     case "$FSTYPE" in
       "xfs") dry $XFS_CHECK "$VOLUME" ;;
       *) # check if executed from intera

Better upstream patches:
http://www.redhat.com/archives/lvm-devel/2010-November/msg00002.html
http://www.redhat.com/archives/lvm-devel/2010-November/msg00004.html

This now appears to work fine. Marking verified.

[root@hayes-03 ~]# touch pvfile
[root@hayes-03 ~]# truncate --size 16g pvfile
[root@hayes-03 ~]# losetup /dev/loop0 pvfile
[root@hayes-03 ~]# pvcreate /dev/loop0
Physical volume "/dev/loop0" successfully created
[root@hayes-03 ~]# vgcreate testvg /dev/loop0
Volume group "testvg" successfully created
[root@hayes-03 ~]# lvcreate -L 4g testvg
Logical volume "lvol0" created
[root@hayes-03 ~]# mkfs.ext4 /dev/mapper/testvg-lvol0
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
262144 inodes, 1048576 blocks
52428 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 23 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[root@hayes-03 ~]# mkdir -p /mnt/test
[root@hayes-03 ~]# mount /dev/mapper/testvg-lvol0 /mnt/test
[root@hayes-03 ~]# df -h
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/testvg-lvol0  4.0G  136M  3.7G   4% /mnt/test
[root@hayes-03 ~]# lvextend -L +10G -r /dev/mapper/testvg-lvol0
Extending logical volume lvol0 to 14.00 GiB
Logical volume lvol0 successfully resized
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/mapper/testvg-lvol0 is mounted on /mnt/test; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 1
Performing an on-line resize of /dev/mapper/testvg-lvol0 to 3670016 (4k) blocks.
The filesystem on /dev/mapper/testvg-lvol0 is now 3670016 blocks long.

[root@hayes-03 ~]# df -h
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/testvg-lvol0   14G  138M   13G   2% /mnt/test

2.6.32-71.el6.x86_64

lvm2-2.02.72-8.el6_0.2 BUILT: Mon Nov 1 11:29:36 CDT 2010
lvm2-libs-2.02.72-8.el6_0.2 BUILT: Mon Nov 1 11:29:36 CDT 2010
lvm2-cluster-2.02.72-8.el6_0.2 BUILT: Mon Nov 1 11:29:36 CDT 2010
udev-147-2.29.el6 BUILT: Tue Aug 31 16:44:10 CDT 2010
device-mapper-1.02.53-8.el6_0.2 BUILT: Mon Nov 1 11:29:36 CDT 2010
device-mapper-libs-1.02.53-8.el6_0.2 BUILT: Mon Nov 1 11:29:36 CDT 2010
device-mapper-event-1.02.53-8.el6_0.2 BUILT: Mon Nov 1 11:29:36 CDT 2010
device-mapper-event-libs-1.02.53-8.el6_0.2 BUILT: Mon Nov 1 11:29:36 CDT 2010
cmirror-2.02.72-8.el6_0.2 BUILT: Mon Nov 1 11:29:36 CDT 2010

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0849.html
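
For readers skimming the thread, the behaviour the comments converge on (skip the fsck step when the filesystem is mounted, then let the resize proceed online) might be sketched roughly as below. This is only an illustration, not the code in fsadm.sh: the function names is_mounted and plan_resize are hypothetical, the mount lookup is a plain string match against a mounts table (assumed to be /proc/mounts by default), and nothing is actually checked or resized.

```shell
#!/bin/sh
# Illustrative sketch only; is_mounted and plan_resize are hypothetical names
# and are not taken from fsadm.sh.

# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE is the first field of some line in MOUNTS_FILE
# (default: /proc/mounts). Assumption: the device is spelled the same way
# in the mounts table; symlinked device names are not resolved here.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    while read -r src _rest; do
        [ "$src" = "$dev" ] && return 0
    done < "$mounts"
    return 1
}

# plan_resize DEVICE [MOUNTS_FILE]
# Prints the steps a wrapper would take, one per line, without running them.
plan_resize() {
    if is_mounted "$1" "$2"; then
        # A mounted filesystem cannot be fsck'ed safely, so skip that step.
        echo "skip fsck"
        echo "online resize"
    else
        echo "fsck"
        echo "offline resize"
    fi
}
```

Calling plan_resize /dev/mapper/testvg-lvol0 prints the planned steps for that device. Separately, comment #18 notes that a user can request this explicitly by passing -n / --nofsck together with -r / --resizefs to lvextend.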