Bug 234627
| Summary: | Speed up gfs2_fsck | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Robert Peterson <rpeterso> | ||||||
| Component: | gfs2-utils | Assignee: | Robert Peterson <rpeterso> | ||||||
| Status: | CLOSED NOTABUG | QA Contact: | GFS Bugs <gfs-bugs> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 5.1 | CC: | kanderso, swhiteho | ||||||
| Target Milestone: | --- | Keywords: | FutureFeature | ||||||
| Target Release: | --- | ||||||||
| Hardware: | All | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Enhancement | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2008-07-24 16:28:55 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
||||||||
|
Description
Robert Peterson
2007-03-30 17:57:28 UTC
Created attachment 151303 [details]
First go at a fix
This patch probably needs a little cleanup: I have long horrid comments
in there explaining the math reduction (mostly for my own sanity).
Created attachment 151304 [details]
First go at a fix
This patch probably needs a little cleanup: I have long horrid comments
in there explaining the math reduction (mostly for my own sanity).
Here's some gprof profiling data for gfs2_fsck:
Flat profile excerpts:
Each sample counts as 0.01 seconds.
Without the patch: --------------------------------------------------------
% cumulative self self total
time seconds seconds calls ms/call ms/call name
31.57 1.19 1.19 gfs2_bitmap_get
27.06 2.21 1.02 __udivdi3
13.53 2.72 0.51 __umoddi3
2.92 2.83 0.11 bget_generic
2.65 2.93 0.10 139301 0.00 0.00 inode_hash_search
2.65 3.03 0.10 gfs2_bitmap_set
2.39 3.12 0.09 gfs2_block_check
2.12 3.20 0.08 bfind
1.99 3.28 0.08 26594 0.00 0.00 build_and_check_metalist
1.86 3.35 0.07 gfs2_bitmap_clear
1.06 3.39 0.04 160 0.25 0.34 check_block_status
0.80 3.42 0.03 4247564 0.00 0.00 warm_fuzzy_stuff
0.80 3.45 0.03 gfs2_block_mark
0.80 3.48 0.03 gfs2_disk_hash
0.80 3.51 0.03 gfs2_find_next_block_type
0.80 3.54 0.03 __divdi3
With the patch: --------------------------------------------------------
% cumulative self self total
time seconds seconds calls ms/call ms/call name
30.48 0.57 0.57 gfs2_bitmap_get
11.23 0.78 0.21 bget_generic
7.49 0.92 0.14 26594 0.01 0.01 build_and_check_metalist
5.88 1.03 0.11 139301 0.00 0.00 inode_hash_search
5.35 1.13 0.10 gfs2_block_check
4.81 1.22 0.09 gfs2_disk_hash
4.28 1.30 0.08 gfs2_bitmap_set
3.74 1.37 0.07 gfs2_bitmap_clear
3.21 1.43 0.06 217620 0.00 0.00 hash_more_internal
2.14 1.47 0.04 print_fsck_log
1.60 1.50 0.03 add_buffer
1.60 1.53 0.03 bfind
1.60 1.56 0.03 gfs2_bitfit
1.60 1.59 0.03 gfs2_block_set
1.60 1.62 0.03 gfs2_find_next_block_type
1.34 1.65 0.03 1880692 0.00 0.00 check_data
1.07 1.67 0.02 83750 0.00 0.00 check_dentry
1.07 1.69 0.02 55551 0.00 0.00 handle_di
1.07 1.71 0.02 160 0.13 0.22 check_block_status
1.07 1.73 0.02 gfs2_check_meta
Call graph excerpts:
granularity: each sample hit covers 2 byte(s) for 0.27% of 3.77 seconds
(Note: This is only the top 4 in the list):
Without fix: -------------------------------------------------------
index % time self children called name
[1] 31.6 1.19 0.00 gfs2_bitmap_get [1]
[2] 27.1 1.02 0.00 __udivdi3 [2]
[3] 13.5 0.51 0.00 __umoddi3 [3]
[4] 9.3 0.00 0.35 main [4]
With the fix: -------------------------------------------------------
[1] 30.5 0.57 0.00 gfs2_bitmap_get [1]
[2] 25.1 0.00 0.47 main [2]
0.00 0.29 1/1 pass1 [3]
0.01 0.12 1/1 pass2 [10]
0.00 0.03 1/1 pass5 [27]
0.00 0.01 1/1 pass4 [44]
0.00 0.01 1/1 destroy [41]
0.00 0.00 1/1 pass3 [56]
0.00 0.00 1/1 read_cmdline [74]
0.00 0.00 1/1 initialize [71]
0.00 0.00 1/1 pass1b [72]
0.00 0.00 1/1 pass1c [73]
0.00 0.29 1/1 main [2]
[3] 15.3 0.00 0.29 1 pass1 [3]
0.01 0.28 55551/55551 scan_meta [4]
0.00 0.00 55551/4247564 warm_fuzzy_stuff [38]
0.01 0.28 55551/55551 pass1 [3]
[4] 15.3 0.01 0.28 55551 scan_meta [4]
0.02 0.26 55551/55551 handle_di [5]
As you can see from the data, this change is saving lots of cpu time.
However, with gfs2_fsck, the bottleneck is probably more in IO rather
than cpu time. So I don't know if this will make a different in real
run time. Also, the above profile was taken before I fixed some
things in the patch that made it run slower.
Okay, here are some real numbers: Without the fix: real 2m20.487s user 0m6.144s sys 0m1.464s With the fix: real 2m18.141s user 0m3.356s sys 0m1.476s So the cpu time was almost cut in half, but it only saved 2 seconds of "real time". This is approximately a one percent increase in speed for 8G. If this was an 8-hour gfs_fsck, this would equate to saving about 5 minutes. devel & 5.3 ACKs please. Deferring to 5.4; there are some other things that would make gfs_fsck more efficient. For example, getting rid of the buf.c interface that mimics kernel buffer_heads would likely improve performance by a lot. There are two major things we can do to improve the speed of gfs2_fsck.
(1) Reduce the memory requirements. A huge part of the slowdown on many
customer systems is the fact that it uses way too much memory. That
pushes the system into using swap space, which drags the system down
to its knees. I'm working on the memory issue.
(2) Make gfs2_fsck multi-threaded. For a lot of its operation, gfs2_fsck
is using 1% of the CPU, and mostly what it's doing is waiting for
disk IO. So if I make it create several threads, all working on
separate Resource Groups simultaneously, maybe it can use the cpu(s)
more effectively.
I've decided to close this bug. The userland bitmap speedup code (attachments 1 and 2) has been incorporated into the fix for bug #404611. I did this to try to get my fsck testing faster for that bug. At one point I had a 2TB file system that had been running benchp14 for days, and I tried to run gfs2_fsck on it. After days of no progress, problems running out of memory (swapping) and CPU gridlocked at 100%, I killed it. So I reduced the memory requirements, found a machine that had more memory, and to reduce the thing from being CPU-bound, I implemented the fix attached below. It did help, but not enough for that particular file system. It managed to make it to 7% through pass 1 after running overnight, but it was now page-bound rather than being CPU bound. I ran the code on a smaller file system and my gfs2_fsck time went from 19 seconds down to 6 seconds with this fix. This fix has never been shipped because there are larger issues with speeding up gfs2_fsck. With bug #404611, I decided it was better to get something out in the field while I worked on the other issues. The other issues will be addressed in bug #455300. |