Bug 455692

Summary: [NetApp 5.3 bug] online resize of filesystem does not work (user space)
Product: Red Hat Enterprise Linux 5 Reporter: Ritesh Raj Sarraf <rsarraf>
Component: device-mapper-multipath Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: high Docs Contact:
Priority: high    
Version: 5.3 CC: agk, akarlsso, andriusb, bmarzins, bmr, bstevens, christophe.varoqui, coughlan, cward, ddomingo, dwysocha, edamato, egoggin, heinzm, jmoyer, junichi.nomura, kueda, lmb, mbroz, mchristi, nandkumar.mane, nstraz, pasik, prockai, rlerch, tanvi, tao, tom, tranlan, xdl-redhat-bugzilla
Target Milestone: beta   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
There is a new multipathd command, "resize map <mapname>". This command is used to cause dm-multipath to adjust to a change in the size of the underlying block storage device.

After resizing the underlying block device, you can resize your multipath device by running:

# multipathd -k"resize map <mapname>"

For example:

# multipathd -k"resize map mpath0"

There is one restriction on the use of the device mapper multipath resize command. You cannot resize a device-mapper device while there are commands queued to that device. That is, do not use the resize command when no_path_retry is set to "queue" and there are no active paths to the device.
Story Points: ---
Clone Of:
: 479684 (view as bug list) Environment:
Last Closed: 2009-01-20 22:08:21 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 444964    
Bug Blocks: 373081    

Description Ritesh Raj Sarraf 2008-07-17 06:34:55 UTC
+++ This bug was initially created as a clone of Bug #444964 +++

Description of problem:

From the resize2fs manpage:
The resize2fs program will resize ext2 or ext3 file systems.  It can be
       used  to  enlarge or shrink an unmounted file system located on device.
       If the filesystem is mounted, it can be used to expand the size of  the
       mounted filesystem, assuming the kernel supports on-line resizing.  (As
       of this writing, the Linux  2.6  kernel  supports  on-line  resize  for
       filesystems mounted using ext3 only.).



It has been seen that online resize of filesystem doesn't work. The resize2fs
tool is supposed to resize ext2/ext3 filesystems while they are mounted and are
in use by the system. But if the filesystem is mounted (i.e. device is in use)
and the mounted device is resized on the target, the kernel is not able to
detect the new size of the device. To reflect the new size, we need to unmount
and then remount the filesystem.


Version-Release number of selected component (if applicable):
[root@199-119 ~]# lsb_release -a
LSB Version:    :core-3.1-ia32:core-3.1-noarch:graphics-3.1-ia32:graphics-3.1-noarch
Distributor ID: RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 5.2 Beta (Tikanga)
Release:        5.2
Codename:       Tikanga


How reproducible:
Always reproducible

Steps to Reproduce:
1 mount an iSCSI LUN
2 Increase the LUN size on the target
3 rescan the iSCSI sessions on the initiator
4 resize the filesystem using resize2fs ( resize2fs reports that the filesystem
is already occupying all the blocks and that there is nothing to resize)
5 unmount the file system
6 mount it again
7 resize the filesystem using resize2fs ( now it works )

  
Actual results:
Device size change is not reflected to the filesystem utilities and dm-multipath

Expected results:
Device size change should be reflected to filesystem resize utilities so that
the filesystem can be grown/expanded to the new size.


Additional info:

Point to note here is that after executing step 3, the SCSI subsystem reflects
the new size. This can be verified with the value present in /sys/block/DEVICE/size
If we resize the LUN while it is mounted, SCSI reflects the change after the
rescan, but resize2fs does not. This forfeits the whole idea of "Online Resize"
because we can't see the new size on the _filesystem_
 
Similar is the case when using multipath. When we try to expand a multipathed
LUN, to reflect the size change in multipath, we need to flush (release) the LUN
(using `multipath -F`) and then re-discover it (using `multipath -v3`) which
would be a disruptive operation for the LUN.
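
One way to confirm the observation above is to read the sector count that the SCSI layer reports in sysfs and convert it to bytes, before and after the rescan. The following is a minimal sketch, assuming the size file counts 512-byte sectors; the function name sysfs_size_bytes and its optional second argument (an alternate sysfs root, useful for testing) are hypothetical and not part of any tool mentioned in this report.

```shell
#!/bin/sh
# Print the size in bytes of a block device as reported in sysfs.
# Usage: sysfs_size_bytes DEVICE [SYSFS_ROOT]
# Assumption: <root>/block/DEVICE/size holds a count of 512-byte sectors.
sysfs_size_bytes() (
    dev=$1
    root=${2:-/sys}
    # Fail (non-zero status) if the size file cannot be read.
    sectors=$(cat "$root/block/$dev/size") || exit 1
    echo $((sectors * 512))
)
```

Running "sysfs_size_bytes sdc" before and after step 3 should show whether the SCSI layer has picked up the new LUN size, independently of what resize2fs or multipath report.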

-- Additional comment from coughlan on 2008-05-19 11:06 EST --
I wonder whether this is related to the recent patch set:

http://lkml.org/lkml/2008/5/8/40

-- Additional comment from jmoyer on 2008-05-29 10:10 EST --
(In reply to comment #2)
> I wonder whether this is related to the recent patch set:
> 
> http://lkml.org/lkml/2008/5/8/40

I put together a test kernel with those patches applied.  It can be found at:

  http://people.redhat.com/jmoyer/dio/rhel5/

I have not yet run the kernel through the reproducer described here.  I'll do so
first chance I get.

-- Additional comment from andriusb on 2008-05-29 11:26 EST --
Ritesh, could you please try out the test kernel?

-- Additional comment from jmoyer on 2008-05-29 13:02 EST --
(In reply to comment #4)
> Ritesh, could you please try out the test kernel?

Ritesh, only do so if you have spare cycles.  I'd rather have you test after I
am convinced that the patches solve the problem.

Thanks.

-- Additional comment from jmoyer on 2008-06-27 11:35 EST --
OK, I finally got the required hardware setup to test this, and it works for me.
 Please test out the kernels posted to my people page (see comment #3) and let
me know if they resolve the issue for you.

Thanks!

-- Additional comment from tanvi on 2008-07-09 03:39 EST --
We verified it and here is the result

In case of scsi devices the file system can be resized online with the new
kernel package.
But when using multipathed devices the size is still not reflected unless maps
are flushed and rediscovered ( which ultimately requires unmount of the
multipathed device)

-- Additional comment from jmoyer on 2008-07-09 03:44 EST --
(In reply to comment #8)
> We verified it and here is the result
> 
> In case of scsi devices the file system can be resized online with the new
> kernel package.
That was always the case, right?  I thought you were testing iSCSI.

> But when using multipathed devices the size is still not reflected unless maps
> are flushed and rediscovered ( which ultimately requires unmount of the
> multipathed device)

Can you provide your configuration, please?

-- Additional comment from tanvi on 2008-07-09 05:31 EST --
(In reply to comment #9)
> That was always the case, right?  I thought you were testing iSCSI.
>
Yes, used to. But wasn't true with the actual test we did. Online resize
wouldn't work on the standard kernels shipped with RHEL5.2. A umount was required.
 
> Can you provide your configuration, please?

iSCSI LUN was mapped with two paths with multipath enabled on top of it.

If you need the exact command outputs please let me know.


-- Additional comment from andriusb on 2008-07-14 09:29 EST --
Tanvi, yeah - I think jmoyer will need your configuration.

-- Additional comment from jmoyer on 2008-07-15 15:24 EST --
(In reply to comment #10)
> (In reply to comment #9)
> > That was always the case, right?  I thought you were testing iSCSI.
> >
> Yes, used to. But wasn't true with the actual test we did. Online resize
> wouldn't work on the standard kernels shipped with RHEL5.2. A umount was required.
>  
> > Can you provide your configuration, please?
> 
> iSCSI LUN was mapped with two paths with multipath enabled on top of it.

Is the server multi-homed or is the client?  I tried using two network
interfaces on the client, but iscsiadm is having difficulties connecting to the
storage in this configuration.  Perhaps it's my poor understanding of iSCSI
configuration, or perhaps it's a bug.  At any rate, more information from you on
your configuration will certainly help expedite the matter.

> If you need the exact command outputs please let me know.

Not necessarily the output, but the precise commands you are running would be
helpful.

Thanks!

-- Additional comment from rsarraf on 2008-07-16 06:00 EST --
The initiator (i.e. the Linux host) is multipathed. In this setup, the target 
(server), is a NetApp SAN Controller.
There are multiple interfaces on the target. For each interface on the target, 
you get one path.

So in total, I have a multipathed device on the initiator, for which I can 
discover the new size till the SCSI layer, when the LUN is resized on the 
target.

But for the multipathed devices, I'm forced to free and re-discover the maps 
to reflect the new sizes.

-- Additional comment from jmoyer on 2008-07-16 13:31 EST --
OK, sorry for being so dense on this issue.  I setup a single disk multipath
device and retested.  Basically, this *can* work, but it requires some work.

1) Connect to your iSCSI storage, creating the /dev/sd* devices.
2) use multipath to create your multipath associations.  This will result in a
/dev/mpath/mpathX device.
3) resize the iSCSI LUN
4) rescan the iSCSI session (this triggers the update for the block devices)
-and here it gets ugly-
5) dump the device mapper table for the mpath device: dmsetup table mpathX
6) The first two numbers in each line are the start sector and the length (in
512 byte sectors) of the mapped region.
7) suspend the device mapper target: dmsetup suspend /dev/mpath/mpathX
8) load a new table with the larger length: Change the second number to
reflect the number of 512 byte sectors in the disk.  So, if you resized to 2G,
this will be 4208640.  In my case, I saved the dumped table to ~/newtab, and
modified its contents to be:
0 4208640 multipath 0 0 1 1 round-robin 0 1 1 8:0 1000
and ran:
dmsetup reload mpathX ~/newtab
9) resume the target:  dmsetup resume /dev/mpath/mpathX
10) resize the file system: resize2fs /dev/mpath/mpathX 2G

So, it *can* be done.  We should make this way more user friendly, though.

Ritesh and Tanvi, I'd like to open a separate bugzilla to track the changes to
the device mapper utilities to support this.  Is that okay with you?  That way
we can get the kernel changes in without waiting explicitly for that support.

Thanks for your patience on the matter.  Cheers.
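
The table edit in steps 5 to 8 above can be sketched as a small filter that rewrites the second field of a dumped table. This is only an illustration: set_table_length is a hypothetical helper name, and it assumes the second field of each table line is the length in 512-byte sectors and that fields are separated by single spaces.

```shell
#!/bin/sh
# Rewrite the second field (length in 512-byte sectors) of every
# device-mapper table line read from stdin, leaving other fields untouched.
# Usage: set_table_length NEW_LENGTH < old_table > new_table
set_table_length() {
    awk -v len="$1" '{ $2 = len; print }'
}

# Example pipeline (not executed here; needs root and a real map):
#   dmsetup table mpathX | set_table_length 4208640 > ~/newtab
```

The new length would normally come from the resized path device, for example the value that "blockdev --getsize" prints for one of the underlying /dev/sd* devices after the rescan.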

Comment 4 Jeremy West 2008-07-29 19:55:53 UTC
*** Bug 457129 has been marked as a duplicate of this bug. ***

Comment 6 Ben Marzinski 2008-09-19 04:10:41 UTC
There is now a new multipathd command, "resize map <mapname>", which will do the multipath work to resize a block device.  There is one problem. Because of a kernel device-mapper issue, you cannot resize a device-mapper device with no_flush on.  This means that you will flush any queued IOs when you suspend the device to change its size.  This will only cause problems if a user attempts to resize when all paths are down, and the device is set to queue_if_no_path.

Comment 7 Ben Marzinski 2008-09-19 04:22:47 UTC
*** Bug 445262 has been marked as a duplicate of this bug. ***

Comment 8 Ritesh Raj Sarraf 2008-09-19 13:08:54 UTC
Ben,

I would like to test this feature. I looked upstream and at sourceware, but I don't see the resize stuff anywhere.

Comment 10 Pasi Karkkainen 2008-09-29 13:54:04 UTC
I'd like to test this feature too! Any test-packages for dm-multipath available?

Comment 11 Ben Marzinski 2008-09-29 18:37:58 UTC
There are test packages available at:
http://people.redhat.com/rpeterso/Experimental/RHEL5.x/dm-multipath/

Comment 12 Pasi Karkkainen 2008-09-29 19:00:51 UTC
Thanks! Although I'd need i386 binary.. could you please upload also i386 package, and/or srpm?

Comment 13 Ben Marzinski 2008-09-29 21:27:35 UTC
O.k. the i386 rpms are up as well.

Comment 14 Pasi Karkkainen 2008-09-30 11:11:29 UTC
Thanks.

I just tried it but it doesn't seem to work for me.. 

First I resized (i)SCSI devices, and that went OK. "/proc/partitions" and kernel/dmesg show the new/bigger size for SCSI devices.

Then I tried resizing dm-multipath device:

# multipathd -k"resize map mpath-resize-test"
ok

But 'mpath-resize-test' device is still same/old size after that.. based on "multipath -ll" output.

Any idea how to get multipath-device to see the new size? 
Is there something else I need to do/run? 

I have LVM PV/VG on that "mpath-resize-test" device.. Can that be the problem?

Comment 15 Pasi Karkkainen 2008-09-30 11:20:41 UTC
from /var/log/messages:

multipathd: mpath-resize-test: resize map (operator)
multipathd: mpath-resize-test: map is still the same size (8417280)

And now it seems that device is not working anymore.. "touch /mnt/testvol/test" hangs and doesn't do anything.. also "lvdisplay" just hangs. 

Weird.

Comment 16 Ritesh Raj Sarraf 2008-09-30 15:04:44 UTC
Thanks Ben/Jeff.
Now it works perfect.

Comment 17 Pasi Karkkainen 2008-09-30 15:26:42 UTC
Ritesh: 

- How did you test it? 
- Filesystem directly on top of dm-multipath device? 
- Does "multipath -ll" show the new/bigger size for you after resize map? 

I'm using LVM PV/VG on top of dm-multipath device, and I can't get the dm-multipath device to grow on the fly.. My multipath-device consists of 2 paths (scsi devices) to iSCSI storage. 

Maybe I'll try again..

Thanks!

Comment 18 Ben Marzinski 2008-09-30 16:23:32 UTC
Pasi:

This package only does the userspace part of the fix.  I'm not sure when the kernel part was added, but you need a recent kernel, possibly a beta-release kernel.

You should be able to check if this works by doing a

# blockdev --getsize <device>

after you have resized your scsi devices.  If this shows the new size, then everything should work.  If not, then this isn't going to work without
a newer kernel or extra steps.

For growing a device, the extra steps involve running

# multipathd -k
> del path <path1_from_the_device>
> add path <path1_from_the_device>
> del path <path2_from_the_device>
> add path <path2_from_the_device>
...
for all the paths in the device. This allows the size to change. Then you 
can run
> resize map <device>
> CTRL-D

It's odd that the device is all locked up. Can you run
# dmsetup info <device>

The "State" should say "ACTIVE".  If it says "SUSPENDED", then that's the problem.  You can run

# dmsetup resume <device>

To get the device back to the active state (hopefully.  It might not work at all
without the kernel fix). In any case, let me know what you find.
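
The del/add/resize sequence above can be generated for any number of paths. The sketch below only prints the multipathd commands; resize_commands is a hypothetical helper name, and the map and path names in the usage comment are examples.

```shell
#!/bin/sh
# Print the multipathd commands for growing a map on a kernel that needs
# each path removed and re-added before the resize.
# Usage: resize_commands MAP PATH...
resize_commands() (
    map=$1
    shift
    for p in "$@"; do
        echo "del path $p"
        echo "add path $p"
    done
    echo "resize map $map"
)

# Example (not executed here; needs root and a running multipathd):
#   resize_commands mpath0 sdc sdd | while read -r cmd; do
#       multipathd -k"$cmd"
#   done
```

Printing the commands first makes it easy to review the list of paths before anything is sent to multipathd, which matters because removing the wrong path from a live map is disruptive.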

Comment 19 Pasi Karkkainen 2008-09-30 19:34:07 UTC
I'm running patched kernel from http://people.redhat.com/jmoyer/dio/rhel5/ so that should be all fine. 

Like said, I was able to resize SCSI (iSCSI) devices on the fly. Kernel sees the new/grown size for SCSI devices. "blockdev --getsize" reports the new size, dmesg shows it and "/proc/partitions" shows the new size.

The problem is I didn't do that del/add path magic earlier. I just tried 'resize map' straight out.

"dmsetup info" says the device is SUSPENDED, but "dmsetup resume" doesn't help. it's still SUSPENDED.

I was able to del path from the mpath device, but now I can't add that path back:

/var/log/messages:
multipathd: mpath-resize-test: uev_add_path sleep
multipathd: mpath-resize-test: failed in domap for addition of new path sdc

And now for example "multipathd -k" just hangs and doesn't do anything.. 
"multipath -ll" works though. 

Any more ideas? :) 

Thanks!

Comment 20 Ritesh Raj Sarraf 2008-10-01 04:31:35 UTC
Pasi,

I'm not sure how you're doing it. The only difference between my test setup and yours is LVM.

I've tested it on a multipathed device with a filesystem on top of it. Online resize works perfect.

Even for LVM on top of multipath, I think it should work perfectly. You'll only have to extend the VG/LV after you resize the multipathed device.

Comment 21 Pasi Karkkainen 2008-10-01 07:06:14 UTC
Ritesh: 

Thanks for the reply.

I know I need to pvresize the PV/VG after resizing multipath device. 

So far I was able to resize SCSI devices (that make up the multipath device) just fine online.

But the problem is I can't get the dm-multipath device to resize/grow.. removing a path was successful, but trying to add it back fails and hangs "multipathd -k".

I think I'll re-test again, first without LVM just with a filesystem on top of dm-multipath device, and if that works, then continue with LVM on top of dm-multipath.

Maybe there's something wrong with my multipath.conf causing the problem with resizing (adding paths back). At least the dm-multipath device has gone into SUSPENDED mode and I can't get it back to ACTIVE.

Comment 22 Pasi Karkkainen 2008-10-01 13:06:51 UTC
Now I got it working! 

You can see the steps I did for successful online resize here:
http://pasik.reaktio.net/rhel5-online-iscsi-resize-test.txt

I tried with both ext3 filesystem directly on top of dm-mpath device, and LVM PV/VG on top of dm-mpath device. Both worked OK now.

Why it didn't work for the first time is still a bit of mystery.. I had to actually power cycle the whole server, because mpath device went into SUSPENDED mode and I couldn't get it back to ACTIVE.. and there was IO waiting for execution so things were just stalling and nothing happening.. 'reboot' command didn't work either.

I suspect the problem was that I didn't reload/restart multipathd to pick up the proper configuration for mpath-device before I started to experiment with online resizing.. 

Anyway, now it seems to work.. 

Thanks a lot!

Comment 24 nandkumar mane 2008-11-27 12:59:41 UTC
Verified in RHEL5U3 snapshot3. I followed the steps mentioned in comment #22; things seem to be working fine after deleting/adding all mapped paths to the concerned multipathed device and resizing that multipathed device.

Going forward we should work on a script which will do all this stuff automatically.

Thank you.

Comment 27 Tom Coughlan 2008-12-22 21:38:28 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
There is a new multipathd command, "resize map <mapname>". This command is used to cause dm-multipath to adjust to a change in the size of the underlying block storage device. 

Use the following procedure after you resize the underlying block device:

# multipathd -k
multipathd> del path <path-name>
ok
multipathd> add path <path-name>
ok

<repeat for each path to the device>

multipathd> resize map <map name>

For example:

# multipath -ll

mpath-resize-test (36090a018c032e4801e8e341700008039) dm-5 EQLOGIC,100E-00
[size=2.0G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
 \_ 2:0:0:0 sdd 8:48  [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 3:0:0:0 sdc 8:32  [active][ready]


# multipathd -k
multipathd> del path sdc
ok
multipathd> add path sdc
ok
multipathd> del path sdd
ok
multipathd> add path sdd
ok
multipathd> resize map mpath-resize-test
ok

There is one restriction on the use of the device mapper multipath resize command. You cannot resize a device-mapper device while there are commands queued to that device. That is, do not use the resize command when no_path_retry is set to "queue" and there are no active paths to the device.

Comment 29 Ben Marzinski 2009-01-16 17:04:02 UTC
Edited.  With the 5.3 kernel, you don't need to remove and re-add the paths anymore. You can just resize the LUN and then resize the multipath device.
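
For a file system that sits directly on the multipath device, the shortened procedure might look like the sketch below. It only prints the commands rather than running them. The helper name online_grow_plan is hypothetical, the iscsiadm rescan invocation is an assumption (this report does not say which rescan command was used), and the /dev/mpath/ device path follows the naming used earlier in this report.

```shell
#!/bin/sh
# Print (do not run) the commands for an online grow of a file system that
# sits directly on a multipath map, assuming a 5.3 kernel where the paths
# do not need to be removed and re-added.
# Usage: online_grow_plan MAPNAME
online_grow_plan() (
    map=$1
    echo "iscsiadm -m session --rescan"       # assumption: rescan all sessions
    echo "multipathd -k\"resize map $map\""   # grow the multipath map
    echo "resize2fs /dev/mpath/$map"          # grow ext3 to fill the map
)
```

With LVM on top of the map, the last step would instead be a pvresize of the PV followed by extending the LV, as discussed in comments #20 and #21.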

Comment 30 Ben Marzinski 2009-01-16 17:04:02 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,39 +1,11 @@
 There is a new multipathd command, "resize map <mapname>". This command is used to cause dm-multipath to adjust to a change in the size of the underlying block storage device. 
 
-Use the the following procedure, after you resize the underlying block device: 
+After resizing the underlying block device, you can resize your multipath device by running:
 
-# multipathd -k
-multipathd> del path <path-name>
-ok
-multipathd> add path <path-name>
-ok
+# multipathd -k"resize map <mapname>"
 
-<repeat for each path to the device>
-
-multipathd> resize map <map name>
-
 For example:
 
-# multipath -ll
-
-mpath-resize-test (36090a018c032e4801e8e341700008039) dm-5 EQLOGIC,100E-00
-[size=2.0G][features=0][hwhandler=0][rw]
-\_ round-robin 0 [prio=1][active]
- \_ 2:0:0:0 sdd 8:48  [active][ready]
-\_ round-robin 0 [prio=1][enabled]
- \_ 3:0:0:0 sdc 8:32  [active][ready]
-
-
-# multipathd -k
-multipathd> del path sdc
-ok
-multipathd> add path sdc
-ok
-multipathd> del path sdd
-ok
-multipathd> add path sdd
-ok
-multipathd> resize map mpath-resize-test
-ok
+# multipathd -k"resize map mpath0"
 
 There is one restriction on the use of the device mapper multipath resize command. You cannot resize a device-mapper device while there are commands queued to that device. That is, do not use the resize command when no_path_retry is set to "queue" and there are no active paths to the device.

Comment 31 errata-xmlrpc 2009-01-20 22:08:21 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0232.html