413021 – Journal recovery fails when flush did not occur and records don't end on sblk boundary

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Enterprise MRG
Classification:	Red Hat
Component:	qpid-cpp
Sub Component:
Version:	beta
Hardware:	All
OS:	Linux
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Target Release:	---
Assignee:	Kim van der Riet
QA Contact:	Kim van der Riet
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-12-05 21:57 UTC by Kim van der Riet
Modified:	2012-12-07 17:46 UTC (History)
CC List:	0 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:

Description Kim van der Riet 2007-12-05 21:57:12 UTC

Under normal conditions, the journal flushes after inactivity or when closing
down. This fills the journal with filler records until an sblk boundary is
crossed and then writes to disk. However, if a crash or stoppage occurs such
that this write is interrupted and the last full record does not coincide with
an sblk boundary, then the journal cannot be recovered.

This can be fixed by filling the remaining space with filler records during the
recovery process, but this requires that the journal files be written to during
recovery, which is not presently allowed.

Comment 1 Kim van der Riet 2007-12-05 22:23:34 UTC

Currently recovery does not use O_DIRECT (since performance is not an issue
during this process), and this relieves the sblk boundary restriction for reads
and writes. This should make it easier to add the required fill records in this
case.
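
The restriction described above can be sketched as a small predicate. This is
only an illustration, not code from the journal: the name io_alignment_ok and
the constant kSblkSizeBytes are hypothetical, the 512-byte value is the sblock
size quoted later in this report, and only the offset and length restriction
is modelled.

```cpp
#include <cstddef>

// Assumed sblk size in bytes (the value quoted in Comment 2 of this report).
constexpr std::size_t kSblkSizeBytes = 512;

// Hypothetical helper: returns true if an I/O operation of `length` bytes at
// file offset `offset` satisfies the sblk boundary restriction.
// Without O_DIRECT, any offset and length are accepted.
constexpr bool io_alignment_ok(bool uses_o_direct,
                               std::size_t offset,
                               std::size_t length) {
    return !uses_o_direct ||
           (offset % kSblkSizeBytes == 0 && length % kSblkSizeBytes == 0);
}
```

With O_DIRECT off, as in recovery, a 128-byte write at offset 128 passes this
check, which is what allows single-dblock fill records to be written.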

Comment 2 Kim van der Riet 2007-12-06 21:51:20 UTC

The strategy to fix this is as follows:

Step 1:
Check the record tail of each record during the analysis phase. A bad tail
indicates either a corrupted record header or an incomplete record write. If a
bad tail is found in any file *other* than the last logical file, then this is a
fatal error (and should never happen). However, if this occurs in the last
logical file, then this indicates an incomplete write at the file overwrite
boundary.
In this context, the first logical file is the last complete file to not be
overwritten (i.e. the oldest complete file) and thus the first to be read during
recovery, while the last logical file is the most recent file to be overwritten,
and possibly contains an overwrite boundary. It is the last file to be read
during recovery.
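
The decision in Step 1 can be sketched as follows. This is a minimal
illustration under assumptions, not the journal's actual code: TailVerdict and
classify_tail are hypothetical names, and files are assumed to be indexed in
logical order, with index 0 the first (oldest) logical file and index
file_count - 1 the last logical file.

```cpp
#include <cstddef>

// Hypothetical result of checking one record tail during the analysis phase.
enum class TailVerdict {
    Ok,                            // tail matches, record is complete
    TruncatedAtOverwriteBoundary,  // bad tail in the last logical file
    Fatal                          // bad tail in any other file
};

// tail_ok:    whether the record tail passed its check.
// file_index: position of the file in logical (oldest to newest) order.
// file_count: number of journal files being analysed.
constexpr TailVerdict classify_tail(bool tail_ok,
                                    std::size_t file_index,
                                    std::size_t file_count) {
    return tail_ok ? TailVerdict::Ok
         : (file_index + 1 == file_count)
               ? TailVerdict::TruncatedAtOverwriteBoundary
               : TailVerdict::Fatal;
}
```

For a four-file journal, a bad tail in file index 3 is treated as an
incomplete write that recovery can repair, while a bad tail in file index 1 is
treated as fatal.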

Step 2:
If the record that has been truncated starts on a dblock boundary that is not
also an sblock boundary, then filler records need to be written to the file
which will overwrite the truncated record. These will start at the record
header, each consuming one dblock, until the next sblock boundary is reached. At
this point, the journal is once again usable, as O_DIRECT reads and writes, which
must be sblock-aligned, can again take place without interleaving bad or
corrupted records.
Presently, the sblock size is 512 bytes (although this can be set to any
multiple of 512 bytes), and there are 4 dblocks per sblock, i.e. each dblock is
128 bytes.
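
The arithmetic in Step 2 can be sketched as follows, using the sizes quoted
above. This is an illustration only: the names kSblkSizeBytes, kDblksPerSblk,
kDblkSizeBytes, is_dblk_aligned and fillers_needed are hypothetical, and the
offset is assumed to be the file offset of the truncated record's header.

```cpp
#include <cstddef>

// Sizes as stated in this report; the sblock size is configurable in
// multiples of 512 bytes, so these are assumptions for the sketch.
constexpr std::size_t kSblkSizeBytes = 512;
constexpr std::size_t kDblksPerSblk = 4;
constexpr std::size_t kDblkSizeBytes = kSblkSizeBytes / kDblksPerSblk;  // 128

// True if `offset` lies on a dblock boundary.
constexpr bool is_dblk_aligned(std::size_t offset) {
    return offset % kDblkSizeBytes == 0;
}

// Number of one-dblock filler records needed to pad from the header of the
// truncated record (at dblock-aligned `record_offset`) up to the next sblock
// boundary. Returns 0 if the record already starts on an sblock boundary.
constexpr std::size_t fillers_needed(std::size_t record_offset) {
    return (record_offset % kSblkSizeBytes == 0)
               ? 0
               : (kSblkSizeBytes - record_offset % kSblkSizeBytes) /
                     kDblkSizeBytes;
}
```

For example, a truncated record whose header sits at offset 640 (one dblock
past the sblock boundary at 512) needs three fillers, at offsets 640, 768 and
896, to reach the next sblock boundary at 1024.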

Comment 3 Kim van der Riet 2007-12-07 19:58:31 UTC

RHM svn r1442
Cruisecontrol 64-bit build 337
Cruisecontrol 32-bit build 49
