Bug 250537 - [PATCH] jbd: wait for already submitted t_sync_datalist buffer to complete (Possibility of in-place data destruction)
Summary: [PATCH] jbd: wait for already submitted t_sync_datalist buffer to complete (P...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Eric Sandeen
QA Contact: Martin Jenner
URL: http://git.kernel.org/?p=linux/kernel...
Whiteboard: upstream patch
Depends On:
Blocks: 246139 296411 345141 372911 420521 422431 422441
 
Reported: 2007-08-02 05:00 UTC by Masaki MAENO
Modified: 2018-10-19 23:15 UTC

Fixed In Version: RHBA-2008-0314
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-05-21 14:48:03 UTC
Target Upstream Version:
Embargoed:




Links
System: Red Hat Product Errata
ID: RHBA-2008:0314
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Updated kernel packages for Red Hat Enterprise Linux 5.2
Last Updated: 2008-05-20 18:43:34 UTC

Description Masaki MAENO 2007-08-02 05:00:59 UTC
Strong Request:

Please take the following patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=6f5a9da1af5a8c286575c30c2706dc1fbef9164b;hp=6d3a25f1fb75206ae8b2b1cdd1431b3852e1a45a
(I think it is very safe to apply this patch.)

Otherwise, in-place data blocks may be corrupted when the filesystem
operates in "ordered" mode (the default journaling mode).


Description of problem:

> [PATCH] jbd: wait for already submitted t_sync_datalist buffer to complete
> Hisashi Hifumi [Fri, 22 Dec 2006 09:11:50 +0000 (01:11 -0800)]
> 
> In the current jbd code, if a buffer on BJ_SyncData list is dirty and not
> locked, the buffer is refiled to BJ_Locked list, submitted to the IO and
> waited for IO completion.
>
> But the fsstress test showed the case that when a buffer was already
> submitted to the IO just before the buffer_dirty(bh) check, the buffer was
> not waited for IO completion.
>
> Following patch solves this problem.  If it is assumed that a buffer is
> submitted to the IO before the buffer_dirty(bh) check and still being
> written to disk, this buffer is refiled to BJ_Locked list.

>--- a/fs/jbd/commit.c
>+++ b/fs/jbd/commit.c
>@@ -248,8 +248,12 @@ write_out_data:
>                                bufs = 0;
>                                goto write_out_data;
>                        }
>-               }
>-               else {
>+               } else if (!locked && buffer_locked(bh)) {
>+                       __journal_file_buffer(jh, commit_transaction,
>+                                               BJ_Locked);
>+                       jbd_unlock_bh_state(bh);
>+                       put_bh(bh);
>+               } else {
>                        BUFFER_TRACE(bh, "writeout complete: unfile");
>                        __journal_unfile_buffer(jh);
>                        jbd_unlock_bh_state(bh);

If a t_sync_datalist buffer (= in-place data) was submitted for I/O just
before the buffer_dirty(bh) check, it is no longer dirty, but its I/O has
not completed yet. The unpatched code unfiles such a buffer without waiting.
Although the buffer (= in-place data block) has not yet been written to disk,
there is a possibility that the buffer is recycled and overwritten.
The write ordering is also violated (intended: "in-place data block" -->
"journal data block" / current version (before patch): "journal data block"
--> "in-place data").
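
The branch added by the patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct model_bh`, `enum model_action`, `decide_unpatched()` and `decide_patched()` are hypothetical names invented for illustration, not kernel code, and the model collapses the kernel's locking and list handling into two flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; NOT the kernel's struct buffer_head. */
struct model_bh {
    bool dirty;   /* data has not been submitted for write-out yet */
    bool locked;  /* I/O has been submitted and is still in flight */
};

enum model_action {
    MODEL_SUBMIT_AND_WAIT, /* commit submits the buffer and waits (BJ_Locked) */
    MODEL_WAIT_ONLY,       /* already in flight: refile to BJ_Locked and wait */
    MODEL_UNFILE           /* treated as "writeout complete": unfile */
};

/* Before the patch: a non-dirty buffer is always unfiled,
 * even if someone else's write is still in flight. */
enum model_action decide_unpatched(const struct model_bh *bh)
{
    assert(bh != NULL);
    if (bh->dirty)
        return MODEL_SUBMIT_AND_WAIT;
    return MODEL_UNFILE;
}

/* After the patch: the new "else if" branch catches a buffer that the
 * commit did not lock itself but that is still locked for I/O. */
enum model_action decide_patched(const struct model_bh *bh)
{
    assert(bh != NULL);
    if (bh->dirty)
        return MODEL_SUBMIT_AND_WAIT;
    if (bh->locked)
        return MODEL_WAIT_ONLY;
    return MODEL_UNFILE;
}
```

For a buffer with `dirty == false` and `locked == true`, the unpatched model returns `MODEL_UNFILE` (the commit proceeds without waiting), while the patched model returns `MODEL_WAIT_ONLY`, which corresponds to refiling the buffer to BJ_Locked so that the commit waits for the I/O to complete.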


Version-Release number of selected component (if applicable):
  RHEL5.0      : kernel-2.6.18-8.el5
  RHEL5.1 Beta1: kernel-2.6.18-36.el5


How reproducible:

Reproducible when the Linux system is under a high I/O load.


Actual results:

The in-place data block is corrupted.


Expected results:

The in-place data block is not corrupted.


Additional info:

Patch history of fs/jbd/commit.c in Linus Torvalds' tree:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=history;f=fs/jbd/commit.c;hb=6f5a9da1af5a8c286575c30c2706dc1fbef9164b

Comment 1 Masaki MAENO 2007-08-16 09:33:10 UTC
I investigated this problem.
The result: the buffer (= bh) is not released, because of the buffer_busy()
condition, even when its reference count (= bh->b_count) is zero. So I think
that the in-place data block is not corrupted.

However, ordered mode does not work correctly, because there is a high
possibility that "in-place data" is written after the "journal of meta-data".
This matters if the ext3 filesystem crashes due to a sudden power loss.
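
The buffer_busy() argument above can be sketched as follows. This is a simplified user-space model, assuming that the kernel treats a buffer as busy when it is referenced, dirty, or locked for I/O; `struct model_buf` and `model_buffer_busy()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state that decides whether a buffer
 * may be released; NOT the kernel's struct buffer_head. */
struct model_buf {
    int  refcount; /* models bh->b_count */
    bool dirty;    /* models the dirty state bit */
    bool locked;   /* models the lock bit held while I/O is in flight */
};

/* Assumption: a buffer is busy if it is referenced, dirty, or locked. */
bool model_buffer_busy(const struct model_buf *b)
{
    assert(b != NULL);
    return b->refcount != 0 || b->dirty || b->locked;
}
```

Under this assumption, a buffer with a zero reference count whose write is still in flight (`locked == true`) is still busy and is not released, which matches the conclusion that the data block itself is not recycled while the ordering problem remains.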


Comment 3 Eric Sandeen 2007-08-24 06:43:19 UTC
The patch is not yet committed in our kernel.

Has the customer actually seen this problem in practice, and encountered an
error?  I did not expect this bug to be an urgent issue; it looked almost like a
hypothetical case.

I'll review this bug more thoroughly tomorrow.

Thank you,
-Eric

Comment 4 Masaki MAENO 2007-08-27 08:33:14 UTC
How will you review the bug?

I have encountered this error. I confirmed that the bad path (meta-data
journal --> in-place data) was taken with a probability of 1% - 2% during a
high I/O stress test in our experimental environment.

I think that Red Hat should fix it, because ordered mode is not correct with
this bug, and the possibility of introducing a new problem by taking this
patch is very small.


Excuses are unnecessary.
You must correct the mistake in RHEL5.1 if you understand that the bug exists.
(customer's voice)

Comment 5 Eric Sandeen 2007-08-27 16:33:03 UTC
I reviewed this upstream change last Friday, and it does look correct and safe
to me.  I'll submit it for peer review & kernel inclusion today.

Thanks,
-Eric

Comment 7 Masaki MAENO 2007-08-28 00:56:31 UTC
Thank you very much. Good job! ;-)

RHEL5.1 Errata Release? or RHEL5.2 Release?
Please tell me the schedule.

Comment 8 Masaki MAENO 2007-08-28 00:58:36 UTC
I made a mistake.

RHEL5.1 Release? or RHEL5.1 Errata Release? or RHEL5.2 Release?
Please tell me the schedule.

Comment 9 Eric Sandeen 2007-08-28 01:37:50 UTC
Right now it would likely be scheduled for 5.2 as it was filed after the general
5.1 cutoff.  If you have other needs, please let your support contact know and
they can request the appropriate action.

Thanks,
-Eric

Comment 11 Masaki MAENO 2007-08-30 10:06:59 UTC
I see.

Comment 13 RHEL Program Management 2007-10-12 07:34:51 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 32 errata-xmlrpc 2008-05-21 14:48:03 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html


