Bug 223220 - Optimise bmap for the gfs2 journal
Summary: Optimise bmap for the gfs2 journal
Keywords:
Status: CLOSED DUPLICATE of bug 253990
Alias: None
Product: Fedora
Classification: Fedora
Component: GFS-kernel
Version: rawhide
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Robert Peterson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2007-01-18 14:50 UTC by Steve Whitehouse
Modified: 2007-12-12 19:51 UTC (History)
0 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-12-12 19:51:53 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
Bob's proposed patch (4.70 KB, patch)
2007-12-09 19:47 UTC, Robert Peterson
no flags Details | Diff

Description Steve Whitehouse 2007-01-18 14:50:31 UTC
Currently the GFS2 journal is mapped in the same way as any other file on the
disk. Since we know that the currently active journal cannot change while its in
use, it makes sense to scan all of the indirect pointer blocks at mount time and
then create an in-core extent map.

This can then be used for bmap operations which will then be much faster (and
use less memory) than reading the indirect blocks all the time. Since we expect
the journal to be laid out linearly on disk in general, the entire journal
should be easily representable with very few extents.

Comment 1 Robert Peterson 2007-12-09 19:47:37 UTC
Created attachment 282321 [details]
Bob's proposed patch

This patch is my first attempt at solving this problem.  The good news
is that it seems to function perfectly.  In other words, it seems to be
figuring out the journal blocks in all cases using the new method.
It hasn't been put through a "serious" test though.  The bad news is that
it doesn't seem to be speeding things up much, at least when using my
bobzone program, which hardly registers a difference.  It's not a very
stressful test though, so it's not too surprising.

Steve, I can assume ownership of this one if you want.	First I wanted
you to have a look at my approach and tell me if this is what you had
in mind.  If you like it, I can send it to cluster-devel for inclusion
upstream, but it should probably be tested further before I do.

Comment 2 Steve Whitehouse 2007-12-10 09:00:32 UTC
Some comments:

+struct gfs2_journal_extent {
+	struct list_head extent_list;
+
+	unsigned int lblock; /* First logical block */
+	u64 dblock; /* First disk block */
+	u64 blocks;
+};

The logical block needs to be a u64, the "blocks" cannot be more than a u32 due
to the max size of an rgrp. Also this structure probably can be private to
bmap.c (see below).

There is no need for log_bmap to have a fall back in case of a missing extent
list. Just fail in that case as it should never happen. I'd suggest moving
log_bmap and your new map_journal_extents() into bmap.c to keep all the block
mapping code in one place.

Also, you don't have to map the journal one block at a time. You might as well
try to map as many blocks as posible with a single call to gfs2_block_map(). The
function gfs2_write_alloc_required() has some hints on how to do that.




Comment 3 Steve Whitehouse 2007-12-10 09:05:52 UTC
Actually, scratch that last remark, gfs2_allocate_page_backing() is a far better
example (only in upstream atm).


Comment 4 Robert Peterson 2007-12-12 19:51:53 UTC
My first prototype of this fix had lblock as 64 bits.  However,
AFAICT the lblock can only be 32-bits due to the way it's passed in
because log_bmap gets passed the lblock as an unsigned int, which comes
from sdp->sd_log_flush_head, which is also an unsigned int.
If this is a problem, we have several other code issues we need to fix
and we should fix them across the board.  Should that be the case,
please open a new bugzilla record for that work.

As for this one, my code to implement this was built in to my performance
enhancements for bug #253990.  As such, it also appears in the upstream
GFS2 as well.  As per your suggestion, I removed the fall-back case
from comment #2.  For that reason, I'm closing this as a duplicate of
253990.  Perhaps I should have separated the two, but we were under a
time crunch.

Also, I'm not worried about doing the journal mapping one block at a
time since it's only done at mount time.  Perhaps we can consider
improving it as an enhancement in the future.


*** This bug has been marked as a duplicate of 253990 ***


Note You need to log in before you can comment on or make changes to this bug.