Bug 676683 - resize2fs shrink ext3 is CPU bound
Summary: resize2fs shrink ext3 is CPU bound
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: e2fsprogs
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-02-10 18:40 UTC by John Reiser
Modified: 2011-02-11 03:24 UTC (History)
4 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2011-02-10 22:26:22 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
output from "vmstat 5" during resize2fs (87.29 KB, text/plain)
2011-02-11 03:05 UTC, John Reiser
no flags Details

Description John Reiser 2011-02-10 18:40:07 UTC
Description of problem: A resize2fs shrink (offline under gparted) was CPU bound for the last half of total elapsed wall-clock time.  The first half was disk bound (as expected): copying occupied blocks from above to below the new fence.  Observing the first half with "vmstat 5" showed both 'bi' and 'bo' between 10,000 and 20,000 (KiB/s) on consumer PATA drive.  Observing the last half of the shrink with /usr/bin/top showed 83% CPU usage while vmstat showed bi always less than 8500 (often in reverse Fibonacci-like sequence for over 30 seconds) and usually < 400, and bo often zero.  Being CPU bound for that long (with no observed output) is unexpected; some algorithm and/or data structure should be changed.  Process size was 27MB total, 25MB resident; the box has 1GB PC2100 SDRAM.


Version-Release number of selected component (if applicable):
e2fsprogs-1.41.12

How reproducible:


Steps to Reproduce:
1. gparted shrink ext3 from 153.85 GiB (53% used) to 87.70 GiB.  (Detailed statistics appear below.)
2.
3.
  
Actual results: Elapsed hh:mm:ss of  02:16:40  (about 137 minutes) with first 75 minutes disk bound and last 62 minutes CPU bound (83% on 1.6GHz Pentium4.)


Expected results: Elapsed time at most 110% of time required to copy occupied blocks.  In this case, 83 minutes (75 + 7.5).


Additional info:  The gparted details:
=====
GParted 0.7.0

Libparted 2.3
Shrink /dev/sdb6 from 153.85 GiB to 87.70 GiB  02:36:14    ( SUCCESS )
     	
calibrate /dev/sdb6  00:00:00    ( SUCCESS )
     	
path: /dev/sdb6
start: 68067468
end: 390716864
size: 322649397 (153.85 GiB)
check file system on /dev/sdb6 for errors and (if possible) fix them  00:10:32    ( SUCCESS )
     	
e2fsck -f -y -v /dev/sdb6
     	
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

1071971 inodes used (5.32%)
43038 non-contiguous files (4.0%)
627 non-contiguous directories (0.1%)
# of inodes with ind/dind/tind blocks: 48752/697/2
21510913 blocks used (53.34%)
0 bad blocks
8 large files

1004728 regular files
60494 directories
0 character device files
0 block device files
8 fifos
581 links
6732 symbolic links (5062 fast symbolic links)
0 sockets
--------
1072543 files
e2fsck 1.41.12 (17-May-2010)
shrink file system  02:16:40    ( SUCCESS )
     	
resize2fs /dev/sdb6 91956153K
     	
Resizing the filesystem on /dev/sdb6 to 22989038 (4k) blocks.
The filesystem on /dev/sdb6 is now 22989038 blocks long.

resize2fs 1.41.12 (17-May-2010)
shrink partition from 153.85 GiB to 87.70 GiB  00:00:01    ( SUCCESS )
     	
old start: 68067468
old end: 390716864
old size: 322649397 (153.85 GiB)
new start: 68067468
new end: 251979775
new size: 183912308 (87.70 GiB)
check file system on /dev/sdb6 for errors and (if possible) fix them  00:09:00    ( SUCCESS )
     	
e2fsck -f -y -v /dev/sdb6
     	
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

1071971 inodes used (9.32%)
43856 non-contiguous files (4.1%)
627 non-contiguous directories (0.1%)
# of inodes with ind/dind/tind blocks: 48752/697/2
21238940 blocks used (92.39%)
0 bad blocks
8 large files

1004728 regular files
60494 directories
0 character device files
0 block device files
8 fifos
581 links
6732 symbolic links (5062 fast symbolic links)
0 sockets
--------
1072543 files
e2fsck 1.41.12 (17-May-2010)
grow file system to fill the partition  00:00:01    ( SUCCESS )
     	
resize2fs /dev/sdb6
     	
resize2fs 1.41.12 (17-May-2010)
The filesystem is already 22989038 blocks long. Nothing to do!

========================================

Comment 1 Eric Sandeen 2011-02-10 22:26:22 UTC
Hm, in my testing, I didn't see anything like 83%.  Occasional jumps to maybe 10% during pass 4 (Updating inode references) but that's it... Different CPU, smaller fs though.

If this is something you can repeat (?) it might be worth doing some profiling to see where it's spending time?

Running with -p would at least show you which phase it's in when this happens.

FWIW, this is probably best reported upstream, since it's not unique to Fedora.

I don't know that I'll be able to spend much time on this; fs shrink performance has never really been on the radar...  for that reason I think I will close this as UPSTREAM; if you can report it to the linux-ext4.org list, that would probably be good.  I always feel bad about doing that, but sometimes it's best to set realistic expectations.  :)

(If you're able to do some profiling & identify where time is being spent, feel free to add updates, if your further investigation points to something obvious or fixable, I'll take another look)

Comment 2 John Reiser 2011-02-11 03:02:59 UTC
I can reproduce the problem.

The original filesystem has been shrunk, and I did not save a pre-shrink copy.  However, I created a new ext3 of nearly the same original size (153GB), copied ("cp -r") the entire contents as files from the shrunk system to the new system, and resized the new system similarly to the original shrink.  Pass 3, Scanning inode table, is CPU-limited.

I used different hardware for the second resize shrink: a 250GB external drive with USB2.0 interface on a 2.5GHz Athlon II x86_64 with 4GB DDR2 RAM.  I ran the second resize shrink with active oprofile-0.9.6 on Fedora 14, kernel-2.6.35.10-74.fc14.x86_64 #1 SMP Thu Dec 23 16:04:50 UTC 2010.

The output from the resize was:
-----
# e2fsck -f /dev/sdc2
e2fsck 1.41.12 (17-May-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
test-resize: 1071971/9863168 files (0.1% non-contiguous), 21597266/39423744 blocks
# time resize2fs -p /dev/sdc2 91956153K
resize2fs 1.41.12 (17-May-2010)
Resizing the filesystem on /dev/sdc2 to 22989038 (4k) blocks.
Begin pass 2 (max = 7024795)
Relocating blocks             XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 3 (max = 1204)
Scanning inode table          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 4 (max = 62998)
Updating inode references     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/sdc2 is now 22989038 blocks long.


real	69m52.167s
user	26m29.431s
sys	2m3.456s

# 
-----


The opreport from oprofile was:
-----
# opreport -l image:/sbin/resize2fs
Overflow stats not available
CPU: AMD64 processors, speed 2499.56 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        symbol name
31677    55.9437  ext2fs_extent_translate
11008    19.4409  resize_fs
9302     16.4280  process_block
2443      4.3145  get_new_block
1341      2.3683  ext2fs_add_extent_entry
450       0.7947  check_and_change_inodes
212       0.3744  resize_progress_func
159       0.2808  ext2fs_progress_update
26        0.0459  mark_table_blocks
4         0.0071  progress_callback
1         0.0018  ext2fs_iterate_extent

# opreport
Overflow stats not available
CPU: AMD64 processors, speed 2499.56 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CLK_UNHALT...|
  samples|      %|
------------------
 38686670 75.1753 libext2fs.so.2.4
  8991168 17.4715 no-vmlinux
   604698  1.1750 libc-2.13.so
   526096  1.0223 libpixman-1.so.0.18.4
   283113  0.5501 libgklayout.so
   176661  0.3433 libxpcom_core.so
   175469  0.3410 Xorg

# 
-----

Because it is somewhat large, I will attach the (annotated) output from "vmstat 5" during the resize2fs, instead of displaying it inline here.

Comment 3 John Reiser 2011-02-11 03:05:18 UTC
Created attachment 478170 [details]
output from "vmstat 5" during resize2fs

Output from vmstat, with annotations marking the Pass of resize2fs.

Comment 4 John Reiser 2011-02-11 03:24:53 UTC
I reported this to linux-ext4.org.


Note You need to log in before you can comment on or make changes to this bug.