Currently the GFS2 journal is mapped in the same way as any other file on disk. Since we know that the currently active journal cannot change while it's in use, it makes sense to scan all of the indirect pointer blocks at mount time and then create an in-core extent map. This can then be used for bmap operations, which will be much faster (and use less memory) than reading the indirect blocks every time. Since we expect the journal to be laid out roughly linearly on disk, the entire journal should be representable with very few extents.
Created attachment 282321 [details]
Bob's proposed patch

This patch is my first attempt at solving this problem. The good news is that it seems to function correctly: it figures out the journal blocks in all cases using the new method. It hasn't been put through a serious test yet, though. The bad news is that it doesn't seem to speed things up much, at least with my bobzone program, which hardly registers a difference. That's not a very stressful test, so this isn't too surprising. Steve, I can assume ownership of this one if you want. First I wanted you to have a look at my approach and tell me if this is what you had in mind. If you like it, I can send it to cluster-devel for inclusion upstream, but it should probably be tested further before I do.
Some comments:

+struct gfs2_journal_extent {
+	struct list_head extent_list;
+
+	unsigned int lblock; /* First logical block */
+	u64 dblock; /* First disk block */
+	u64 blocks;
+};

The logical block needs to be a u64, and "blocks" cannot be more than a u32 due to the max size of an rgrp. Also, this structure can probably be private to bmap.c (see below). There is no need for log_bmap to have a fall-back in case of a missing extent list; just fail in that case, as it should never happen. I'd suggest moving log_bmap and your new map_journal_extents() into bmap.c to keep all the block mapping code in one place. Also, you don't have to map the journal one block at a time. You might as well map as many blocks as possible with a single call to gfs2_block_map(). The function gfs2_write_alloc_required() has some hints on how to do that.
Actually, scratch that last remark: gfs2_allocate_page_backing() is a far better example (only in upstream atm).
My first prototype of this fix had lblock as 64 bits. However, AFAICT the lblock can only be 32 bits due to the way it's passed in: log_bmap gets the lblock as an unsigned int, which comes from sdp->sd_log_flush_head, which is also an unsigned int. If this is a problem, we have several other code issues we need to fix, and we should fix them across the board. Should that be the case, please open a new bugzilla record for that work.

As for this one, my code to implement this was built into my performance enhancements for bug #253990. As such, it also appears in upstream GFS2. As per your suggestion, I removed the fall-back case from comment #2. For that reason, I'm closing this as a duplicate of 253990. Perhaps I should have separated the two, but we were under a time crunch. Also, I'm not worried about doing the journal mapping one block at a time, since it's only done at mount time. Perhaps we can consider improving it as an enhancement in the future.

*** This bug has been marked as a duplicate of 253990 ***