Bug 454872

Summary: [NetApp 4.8 bug] online resize of filesystem does not work
Product: Red Hat Enterprise Linux 4
Reporter: Tanvi <tanvi>
Component: kernel
Assignee: Jeffrey Moyer <jmoyer>
Status: CLOSED ERRATA
QA Contact: Martin Jenner <mjenner>
Severity: high
Priority: medium
Version: 4.8
CC: ahecox, andriusb, bmarzins, coughlan, marting, mchristi, naveenr, rlerch, tanvi, tao, xdl-redhat-bugzilla
Target Milestone: rc
Keywords: OtherQA
Target Release: 4.8
Hardware: All
OS: Linux
Doc Type: Bug Fix
Doc Text:
Red Hat Enterprise Linux 4.8 can detect online growing or shrinking of an underlying block device. However, there is no method to automatically detect that a device has changed size, so manual steps are required to recognize this and resize any file systems which reside on the given device(s). When a resized block device is detected, a message like the following will appear in the system logs:

VFS: busy inodes on changed media or resized disk sdi

If the block device was grown, then this message can be safely ignored. However, if the block device was shrunk without shrinking any data set on the block device first, the data residing on the device may be corrupted.

It is only possible to do an online resize of a filesystem that was created on the entire LUN (or block device). If there is a partition table on the block device, then the file system will have to be unmounted to update the partition table.
Last Closed: 2009-05-18 15:31:51 EDT
Bug Depends On: 444964, 480338    
Bug Blocks: 450897, 458123, 458752, 461297, 479684    
Attachments:
Description                                                     Flags
wrapper for lower level revalidate_disk routines                none
adjust block device size after an online resize of a disk       none
check for device resize when rescanning partitions              none
scsi sd driver calls revalidate_disk wrapper                    none
add flush_disk to factor out common buffer cache flushing code  none
call flush_disk after detecting an online resize                none

Description Tanvi 2008-07-10 10:02:31 EDT
+++ This bug was initially created as a clone of Bug #444964 +++

Description of problem:

From the resize2fs manpage:
The resize2fs program will resize ext2 or ext3 file systems.  It can be
       used  to  enlarge or shrink an unmounted file system located on device.
       If the filesystem is mounted, it can be used to expand the size of  the
       mounted filesystem, assuming the kernel supports on-line resizing.  (As
       of this writing, the Linux  2.6  kernel  supports  on-line  resize  for
       filesystems mounted using ext3 only.).



Online resize of a filesystem does not work. The resize2fs tool is supposed to
resize ext2/ext3 filesystems while they are mounted and in use by the system.
But if the filesystem is mounted (i.e. the device is in use) and the underlying
device is resized on the target, the kernel does not detect the new size of the
device. To pick up the new size, we need to unmount and then remount the
filesystem.


Version-Release number of selected component (if applicable):
[root@lnx200-171 ~]# lsb_release -a
LSB Version:    :core-3.0-amd64:core-3.0-ia32:core-3.0-noarch:graphics-3.0-amd64:graphics-3.0-ia32:graphics-3.0-noarch
Distributor ID: RedHatEnterpriseAS
Description:    Red Hat Enterprise Linux AS release 4 (Nahant Update 7 Beta)
Release:        4
Codename:       NahantUpdate7Beta



How reproducible:
Always reproducible

Steps to Reproduce:
1. Mount an iSCSI LUN.
2. Increase the LUN size on the target.
3. Rescan the iSCSI sessions on the initiator.
4. Resize the filesystem using resize2fs (resize2fs reports that the filesystem
is already occupying all the blocks and that there is nothing to resize).
5. Unmount the file system.
6. Mount it again.
7. Resize the filesystem using resize2fs (now it works).

  
Actual results:
Device size change is not reflected to the filesystem utilities and dm-multipath

Expected results:
Device size change should be reflected to filesystem resize utilities so that
the filesystem can be grown/expanded to the new size.


Additional info:

Note that after executing step 3, the SCSI subsystem reflects the new size.
This can be verified with the value in /sys/block/DEVICE/size.
If we resize the LUN while it is mounted, SCSI reflects the change after the
rescan, but resize2fs does not. This defeats the whole idea of "Online Resize"
because we can't see the new size on the _filesystem_.
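
The sysfs check mentioned above can be sketched as follows. This is only a
minimal sketch: the helper name size_file_to_bytes is hypothetical, and it
relies on /sys/block/DEVICE/size being expressed in 512-byte sectors. It takes
the path of the size file as an argument so that it can be tried on an
ordinary file as well.

```sh
#!/bin/sh
# Hypothetical helper (not part of any tool mentioned in this bug):
# convert the sector count stored in a sysfs "size" file, which is
# expressed in 512-byte units, into a size in bytes.
size_file_to_bytes() {
    # $1 = path to the size file, e.g. /sys/block/sdi/size
    sectors=$(cat "$1") || return 1
    echo $((sectors * 512))
}

# Example usage, assuming the resized LUN shows up as sdi:
#   size_file_to_bytes /sys/block/sdi/size
```

Comparing this value before and after the rescan shows whether the SCSI layer
has picked up the new LUN size, independently of what resize2fs reports.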
 
The same applies when using multipath. To make multipath reflect the size
change of an expanded multipathed LUN, we need to flush (release) the LUN
(using `multipath -F`) and then re-discover it (using `multipath -v3`), which
is a disruptive operation for the LUN.

-- Additional comment from coughlan@redhat.com on 2008-05-19 11:06 EST --
I wonder whether this is related to the recent patch set:

http://lkml.org/lkml/2008/5/8/40

-- Additional comment from jmoyer@redhat.com on 2008-05-29 10:10 EST --
(In reply to comment #2)
> I wonder whether this is related to the recent patch set:
> 
> http://lkml.org/lkml/2008/5/8/40

I put together a test kernel with those patches applied.  It can be found at:

  http://people.redhat.com/jmoyer/dio/rhel5/

I have not yet run the kernel through the reproducer described here.  I'll do so
first chance I get.

-- Additional comment from andriusb@redhat.com on 2008-05-29 11:26 EST --
Ritesh, could you please try out the test kernel?

-- Additional comment from jmoyer@redhat.com on 2008-05-29 13:02 EST --
(In reply to comment #4)
> Ritesh, could you please try out the test kernel?

Ritesh, only do so if you have spare cycles.  I'd rather have you test after I
am convinced that the patches solve the problem.

Thanks.

-- Additional comment from jmoyer@redhat.com on 2008-06-27 11:35 EST --
OK, I finally got the required hardware setup to test this, and it works for me.
Please test out the kernels posted to my people page (see comment #3) and let
me know if they resolve the issue for you.

Thanks!

-- Additional comment from tanvi@netapp.com on 2008-07-09 03:39 EST --
We verified it and here is the result

In case of scsi devices the file system can be resized online with the new
kernel package.
But when using multipathed devices the size is still not reflected unless maps
are flushed and rediscovered ( which ultimately requires unmount of the
multipathed device)

-- Additional comment from jmoyer@redhat.com on 2008-07-09 03:44 EST --
(In reply to comment #8)
> We verified it and here is the result
> 
> In case of scsi devices the file system can be resized online with the new
> kernel package.
That was always the case, right?  I thought you were testing iSCSI.

> But when using multipathed devices the size is still not reflected unless maps
> are flushed and rediscovered ( which ultimately requires unmount of the
> multipathed device)

Can you provide your configuration, please?

-- Additional comment from tanvi@netapp.com on 2008-07-09 05:31 EST --
(In reply to comment #9)
> That was always the case, right?  I thought you were testing iSCSI.
>
Yes, it used to be. But that wasn't true in the actual test we did. Online resize
wouldn't work on the standard kernels shipped with RHEL 5.2; a umount was required.
 
> Can you provide your configuration, please?

iSCSI LUN was mapped with two paths with multipath enabled on top of it.

If you need the exact command outputs please let me know.
Comment 1 Andrius Benokraitis 2008-07-10 10:18:09 EDT
This is highly dependent on RHEL 5.3 inclusion, so we'll follow its lead on
this. Furthermore this could be something we may not have capacity for in 4.8
since it will be a small release.
Comment 2 RHEL Product and Program Management 2008-09-03 09:08:05 EDT
Updating PM score.
Comment 3 RHEL Product and Program Management 2008-09-22 13:23:38 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 4 Jeffrey Moyer 2008-09-22 13:36:19 EDT
Created attachment 317386 [details]
wrapper for lower level revalidate_disk routines
Comment 5 Jeffrey Moyer 2008-09-22 13:37:00 EDT
Created attachment 317387 [details]
adjust block device size after an online resize of a disk
Comment 6 Jeffrey Moyer 2008-09-22 13:37:31 EDT
Created attachment 317388 [details]
check for device resize when rescanning partitions
Comment 7 Jeffrey Moyer 2008-09-22 13:38:04 EDT
Created attachment 317389 [details]
scsi sd driver calls revalidate_disk wrapper
Comment 8 Jeffrey Moyer 2008-09-22 13:39:08 EDT
Created attachment 317390 [details]
add flush_disk to factor out common buffer cache flushing code
Comment 9 Jeffrey Moyer 2008-09-22 13:39:37 EDT
Created attachment 317391 [details]
call flush_disk after detecting an online resize
Comment 10 Jeffrey Moyer 2008-09-22 13:40:34 EDT
The above patches are backports of the upstream patch set from Andrew Patterson.  They have not yet been through any sort of testing.  I'll update the bug when I have testing results.
Comment 12 Vivek Goyal 2009-01-09 08:55:13 EST
Committed in 78.26.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/
Comment 13 Jeffrey Moyer 2009-01-09 10:21:35 EST
Can we get customer testing on this kernel, please?  Thanks!
Comment 14 Tanvi 2009-01-15 06:16:34 EST
I tested it with 78.28.EL kernel. Things are not working. I had to unmount and then remount the SCSI device before ext2online could resize the filesystem.
Comment 15 Jeffrey Moyer 2009-01-15 09:29:35 EST
(In reply to comment #14)
> I tested it with 78.28.EL kernel. Things are not working. I had to unmount and
> then remount the SCSI device before ext2online could resize the filesystem.

Thanks for the quick testing turn-around, Tanvi!  I'll look into this immediately.
Comment 16 Jeffrey Moyer 2009-01-15 11:30:32 EST
(In reply to comment #14)
> I tested it with 78.28.EL kernel. Things are not working. I had to unmount and
> then remount the SCSI device before ext2online could resize the filesystem.

Hi, Tanvi,

I just tried this with the 78.29.EL kernel, and it works for me.  Could you provide more information on your test procedure so that I can try to reproduce the problem?

These are the steps I took:

service iscsi start
mkfs -t ext3 /dev/sdi
mount /dev/sdi /mnt/equallogic/
cd /mnt/equallogic/
touch x
touch y
dd if=/dev/zero of=foo bs=1M count=100
sync
sync

# login to iscsi target and resize the lun

iscsi-rescan 
ext2online /dev/sdi
df -h .

Thanks!
Comment 17 Tanvi 2009-01-16 01:45:13 EST
Hi Jeffrey,

I did follow the same steps.
I retested with 78.29.EL kernel and was able to resize an online SCSI device. Thank you.
But to resize a file system which was created on top of a multipathed device, I had to follow these steps:

1. unmount the device
2. flush the map
3. create the map again
4. remount it
5. ext2online

Is it expected to be fixed in RHEL4.8?
Comment 18 Jeffrey Moyer 2009-01-16 09:49:34 EST
Hi, Tanvi,

Sorry I didn't test with a multipath device!  I just went ahead and did so, and I got it to work, but it's even worse than RHEL 5!

For the most part, the procedure is the same as the RHEL 5 procedure.  I'll spell it out here, though:

service iscsi start
service multipathd start
mkfs -t ext3 /dev/mpath/mpathX
mount /dev/mpath/mpathX /mnt/equallogic/
cd /mnt/equallogic/
touch x
touch y
dd if=/dev/zero of=foo bs=1M count=100

# resize the iscsi target

iscsi-rescan
dmsetup table mpathX > /root/newtab

# modify newtab to have the new end sector of the device in column 2

dmsetup suspend /dev/mpath/mpathX
dmsetup reload /dev/mpath/mpathX /root/newtab
dmsetup resume /dev/mpath/mpathX

# and now the stupid part.  ext2online sees that /dev/mpath/mpathX is a
# symbolic link to /dev/dm-X, and so looks in /etc/mtab for /dev/dm-X.
# Of course, that doesn't exist, so it fails.  So, I modified /etc/mtab to
# put the dm-X device in place of the mpath device, and:

ext2online /dev/mpath/mpath9

And that worked for me.  Strangely enough, I couldn't use --force, nor could I just pass in /dev/dm-X.  I'm quite puzzled by ext2online's reticence to actually do what you want.  I'll file a bug on that.
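
The "modify newtab" step above can be scripted instead of edited by hand. The
following is a minimal sketch, not a tested procedure: the helper name
set_table_length is hypothetical, and it assumes a single-line table that
starts at sector 0, so that column 2 is simply the new device size in 512-byte
sectors.

```sh
#!/bin/sh
# Hypothetical helper: print a device-mapper table with column 2 (the
# length in 512-byte sectors) replaced by a new value. Every other
# column is passed through unchanged.
set_table_length() {
    # $1 = file holding the output of "dmsetup table <map>"
    # $2 = new length in 512-byte sectors
    while read -r start _old_len rest; do
        printf '%s %s %s\n' "$start" "$2" "$rest"
    done < "$1"
}

# Example usage (assumption: sdX is one of the already rescanned paths
# behind mpathX, so its sysfs size is the new length):
#   dmsetup table mpathX > /root/oldtab
#   set_table_length /root/oldtab "$(cat /sys/block/sdX/size)" > /root/newtab
```

The resulting /root/newtab would then be loaded with the same dmsetup suspend,
reload, and resume sequence shown above.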

Now, I know a bug was filed to update the user-space tools to allow this online resizing to be less painful using the multipath utilities in RHEL 5.  Has the same bug been filed for RHEL 4?  If not, we should dup the RHEL 5 bug to RHEL 4.

I'll get to work on the ext2online bug.  Thanks again for your patience and your testing, Tanvi.  It is much appreciated!
Comment 19 Jeffrey Moyer 2009-01-16 10:04:00 EST
I filed bug 480338 to track the e2fsprogs (ext2online) issue.
Comment 20 Martin George 2009-01-16 10:09:18 EST
Yes, the user-space bug for multipath utilities has been cloned for RHEL 4.8. It is tracked at bugzilla #479684.
Comment 21 Ben Marzinski 2009-01-21 18:45:20 EST
Did you try just running
# multipath
after the underlying block device has been resized?  In RHEL4, it should already do the same thing as the manual method in Comment #18 did.

Strangely, when I try this, ext2online fails for me

[root@ask-06 mnt]# ext2online /dev/mapper/mpath7 
ext2online v1.1.18 - 2001/03/18 for EXT2FS 0.5b
error: Input/output error: read -1 of 16384 bytes at 4096

However, it fails just the same using the method in Comment #18, so I'm not sure if this is a completely unrelated problem.
Comment 22 Jeffrey Moyer 2009-01-22 11:39:54 EST
(In reply to comment #21)
> Did you try just running
> # multipath
> After the underlying block device has been resized.  In RHEL4, it should
> already do the same thing as the manual method in Comment #18 did.

Hi, Ben. I'll assume your question is addressed to me.  No, I didn't try that, and yes, it does work.

> Strangely, when I try this, ext2online fails for me
> 
> [root@ask-06 mnt]# ext2online /dev/mapper/mpath7 
> ext2online v1.1.18 - 2001/03/18 for EXT2FS 0.5b
> error: Input/output error: read -1 of 16384 bytes at 4096
> 
> However, it fails just the same using the method in Comment #18, so I'm not
> sure if this is a completely unrelated problem.

I've never seen that.  You'll need to provide a whole lot more information if we're to debug it, though.
Comment 23 Ben Marzinski 2009-01-22 17:02:38 EST
I'm using an X86_64 box with the RHEL 4.8 1/4/09 nightly build installed. It's connected to a Winchestor storage array via FC using the qla2400 driver. I'm running the 2.6.9-78.30.ELsmp kernel.

The multipath device I'm using looks like:
mpath7 (3600d0230000000000e13955cc3757806)
[size=48 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=1][active]
 \_ 5:0:0:6     sdh 8:112 [active][ready]

The commands I'm running are:
# multipath
# mkfs -t ext3 /dev/mapper/mpath7
# mount /dev/mapper/mpath7 /mnt/test
# echo 1 > /sys/block/sdh/device/rescan
# multipath
# ext2online

After more testing, I've found that it works just fine if I start with a 51200000-block device and resize to a 307200000-block device.  However, it fails when I try to go from a 51200000-block device to a 614400000-block device.  I'm not sure exactly at what size it starts failing, but it does seem to be size dependent.
Comment 26 Ben Marzinski 2009-01-22 18:00:22 EST
Some more information.  I was wrong.  It seems to happen randomly.  It just looked like it was size dependent for a couple of runs.  Also, when this happens, all IO to the device seems to fail.  If you try to do a dd from the multipath device, it will fail as well.  However, unmounting the filesystem fixes this.
Comment 27 Jeffrey Moyer 2009-01-23 10:41:38 EST
I found this in your system logs:

Jan 22 09:46:47 ask-06 kernel: kjournald starting.  Commit interval 5 seconds
Jan 22 09:46:47 ask-06 kernel: EXT3 FS on dm-2, internal journal
Jan 22 09:46:47 ask-06 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jan 22 09:47:17 ask-06 kernel: end_request: I/O error, dev sdh, sector 8208
Jan 22 09:47:17 ask-06 kernel: device-mapper: dm-multipath: Failing path 8:112.
Jan 22 09:47:17 ask-06 kernel: Buffer I/O error on device dm-2, logical block 1027
Jan 22 09:47:17 ask-06 kernel: lost page write due to I/O error on dm-2
Jan 22 09:47:17 ask-06 kernel: end_request: I/O error, dev sdh, sector 12312
Jan 22 09:47:17 ask-06 kernel: Buffer I/O error on device dm-2, logical block 1539
Jan 22 09:47:17 ask-06 kernel: lost page write due to I/O error on dm-2
Jan 22 09:47:17 ask-06 kernel: end_request: I/O error, dev sdh, sector 8
Jan 22 09:47:17 ask-06 kernel: Buffer I/O error on device dm-2, logical block 1
Jan 22 09:47:17 ask-06 kernel: lost page write due to I/O error on dm-2
Jan 22 09:47:17 ask-06 kernel: Buffer I/O error on device dm-2, logical block 1026
Jan 22 09:47:17 ask-06 kernel: lost page write due to I/O error on dm-2
Jan 22 09:48:51 ask-06 kernel: SCSI device sdh: 1228800000 512-byte hdwr sectors (629146 MB)
Jan 22 09:48:51 ask-06 kernel: SCSI device sdh: drive cache: write back
Jan 22 09:48:51 ask-06 kernel: sdh: detected capacity change from 52428800000 to 629145600000

/dev/mpath/mpath7 is a symbolic link to /dev/dm-2.  It looks like those I/O errors were present before any resizing was done.  Is that right?
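
To see where the resize falls relative to the I/O errors, the capacity-change
message at the end of the log excerpt above can be pulled out with a small
filter. This is a minimal sketch: the helper name capacity_changes is
hypothetical, and it assumes each kernel message sits on a single line (log
lines wrapped at 80 columns would need to be rejoined first).

```sh
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "device old_bytes new_bytes" for every capacity-change message.
capacity_changes() {
    sed -n 's/.*kernel: \([^:]*\): detected capacity change from \([0-9]*\) to \([0-9]*\).*/\1 \2 \3/p'
}

# Example usage (the log file path is an assumption):
#   capacity_changes < /var/log/messages
```

For the excerpt above, the new size in bytes matches the sector count the sd
driver reported just before it: 1228800000 sectors of 512 bytes are
629145600000 bytes.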

You can read the first 4k of the multipath disk just fine:

[root@ask-06 mnt]# dd if=/dev/mapper/mpath7 of=/dev/null bs=4k count=1
1+0 records in
1+0 records out

But try to read the next 4k and you get a failure:

[root@ask-06 mnt]# dd if=/dev/mapper/mpath7 of=/dev/null bs=4k count=2
dd: reading `/dev/mapper/mpath7': Input/output error
1+0 records in
1+0 records out

I/O to the underlying sd device works just fine:

[root@ask-06 mnt]# dd if=/dev/sdh of=/dev/null bs=4k count=2
2+0 records in
2+0 records out
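
The manual dd probing above can be generalized to map which blocks of a device
return errors. This is a minimal sketch under stated assumptions: the helper
name find_unreadable_blocks is hypothetical, and it uses only plain buffered
dd reads like the commands above, one block per invocation.

```sh
#!/bin/sh
# Hypothetical helper: read COUNT blocks of BS bytes from a device or
# file, one block at a time, and print the index of every block whose
# read fails (dd exits with a non-zero status).
find_unreadable_blocks() {
    # $1 = device or file, $2 = block size in bytes, $3 = number of blocks
    i=0
    while [ "$i" -lt "$3" ]; do
        if ! dd if="$1" of=/dev/null bs="$2" skip="$i" count=1 2>/dev/null; then
            echo "$i"
        fi
        i=$((i + 1))
    done
}

# Example usage, probing the first 2048 4k blocks of the multipath device:
#   find_unreadable_blocks /dev/mapper/mpath7 4096 2048
```

Running the same probe against the underlying sd device gives a quick
comparison of which layer is returning the errors.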
Comment 28 Jeffrey Moyer 2009-01-23 14:33:04 EST
I'd like to know what caused the I/O errors in the first place.  Ben mentioned that it might have happened during the online resize, as he has to unmap the LUN before growing it.

At any rate, the problem is that, once you set the PG_error bit for the page cache page, a regular read on the device file will always and forever see an error.  I proposed this patch upstream:
  http://lkml.org/lkml/2009/1/23/288

A simple way around the problem is to mmap the device and read from the locations that are giving I/O errors (but that's hardly acceptable!).

So, if the I/O errors are indeed from taking the LUN offline, then we will have to update our documentation to perhaps suggest suspending the device mapper device before doing the resize of the storage device.
Comment 31 Jeffrey Moyer 2009-02-09 16:08:47 EST
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Red Hat Enterprise Linux 4.8 can detect online growing or shrinking of an underlying block device.  However, there is no method to automatically detect that a device has changed size, so manual steps are required to recognize this and resize any file systems which reside on the given device(s).  When a resized block device is detected, a message like the following will appear in the system logs:

VFS: busy inodes on changed media or resized disk sdi

If the block device was grown, then this message can be safely ignored. However, if the block device was shrunk without shrinking any data set on the block device first, the data residing on the device may be corrupted.

It is only possible to do an online resize of a filesystem that was created on the entire LUN (or block device). If there is a partition table on the block device, then the file system will have to be unmounted to update the partition table.
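
Since the resize is not detected automatically, one way to notice it is to
search the system logs for the message quoted above. This is a minimal sketch:
the helper name resized_disks is hypothetical, and the log file location in
the usage example is an assumption.

```sh
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print the name of
# each disk reported as resized, one name per line, without duplicates.
resized_disks() {
    sed -n 's/.*VFS: busy inodes on changed media or resized disk \([^ ]*\).*/\1/p' | sort -u
}

# Example usage (the log file path is an assumption):
#   resized_disks < /var/log/messages
```

Each device listed still needs the manual steps described above before any
file system on it can use the new size.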
Comment 32 Chris Ward 2009-02-20 08:31:42 EST
~~ Attention Partners!  ~~
RHEL 4.8 Partner Alpha has been released on partners.redhat.com. There should
be a fix present in the Beta, which addresses this bug. If you have already completed testing your other URGENT priority bugs, and you still haven't had a chance yet to test this bug, please do so at your earliest convenience, to ensure that only the highest possible quality bits are shipped in the upcoming public Beta drop.

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. Further questions can be directed to your
Red Hat Partner Manager.

Thanks, more information about Beta testing to come.
 - Red Hat QE Partner Management
Comment 34 Naveen Reddy 2009-03-05 00:53:39 EST
Verified it in RHEL4.8 successfully.

Steps  followed - (Taken from Comment 18)

1. Map a LUN from a NetApp Controller (Fibre Channel target)
2. Discover it on the host
3. mkfs -t ext3 /dev/mapper/mpath5
4. mount /dev/mapper/mpath5 mnt1/
5. touch x
   touch y
   dd if=/dev/zero of=foo bs=1M count=100

6. Resized the LUN and then did a rescan on the host
7. dmsetup table mpath5 > /root/newtab
8. Modified newtab to have new end sector of the device.
9. dmsetup suspend /dev/mapper/mpath5
   dmsetup reload /dev/mapper/mpath5 /root/newtab
   dmsetup resume /dev/mapper/mpath5
   ext2online /dev/mapper/mpath5

Online resize happened successfully.
Comment 37 errata-xmlrpc 2009-05-18 15:31:51 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html