Bug 250537 - [PATCH] jbd: wait for already submitted t_sync_datalist buffer to complete (Possibility of in-place data destruction)
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Eric Sandeen
QA Contact: Martin Jenner
URL: http://git.kernel.org/?p=linux/kernel...
Whiteboard: upstream patch
Keywords: ZStream
Depends On:
Blocks: 246139 296411 345141 372911 420521 422431 422441
Reported: 2007-08-02 01:00 EDT by Masaki MAENO
Modified: 2010-10-22 13:07 EDT
Fixed In Version: RHBA-2008-0314
Doc Type: Bug Fix
Last Closed: 2008-05-21 10:48:03 EDT


Description Masaki MAENO 2007-08-02 01:00:59 EDT
Strong Request:

Please take the following patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=6f5a9da1af5a8c286575c30c2706dc1fbef9164b;hp=6d3a25f1fb75206ae8b2b1cdd1431b3852e1a45a
(I think that it is very safe to apply this patch.)

Otherwise, in-place data blocks may be corrupted when the filesystem
operates in "ordered" mode (the default journaling mode).


Description of problem:

> [PATCH] jbd: wait for already submitted t_sync_datalist buffer to complete
> Hisashi Hifumi [Fri, 22 Dec 2006 09:11:50 +0000 (01:11 -0800)]
> 
> In the current jbd code, if a buffer on BJ_SyncData list is dirty and not
> locked, the buffer is refiled to BJ_Locked list, submitted to the IO and
> waited for IO completion.
>
> But the fsstress test showed the case that when a buffer was already
> submitted to the IO just before the buffer_dirty(bh) check, the buffer was
> not waited for IO completion.
>
> Following patch solves this problem.  If it is assumed that a buffer is
> submitted to the IO before the buffer_dirty(bh) check and still being
> written to disk, this buffer is refiled to BJ_Locked list.

>--- a/fs/jbd/commit.c
>+++ b/fs/jbd/commit.c
>@@ -248,8 +248,12 @@ write_out_data:
>                                bufs = 0;
>                                goto write_out_data;
>                        }
>-               }
>-               else {
>+               } else if (!locked && buffer_locked(bh)) {
>+                       __journal_file_buffer(jh, commit_transaction,
>+                                               BJ_Locked);
>+                       jbd_unlock_bh_state(bh);
>+                       put_bh(bh);
>+               } else {
>                        BUFFER_TRACE(bh, "writeout complete: unfile");
>                        __journal_unfile_buffer(jh);
>                        jbd_unlock_bh_state(bh);

If a t_sync_datalist buffer (= in-place data) was submitted for IO just
before the buffer_dirty(bh) check, it is no longer dirty but its IO has not
yet completed, and the commit does not wait for it.
Although the buffer (= in-place data block) has not yet been written to disk,
there is a possibility that the buffer is recycled and overwritten.
Also, the order of writes to disk is reversed (intended: "in-place data
block" --> "journal data block" / current version (before patch): "journal
data block" --> "in-place data").
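
To make the case handled by the patch concrete, here is a minimal user-space sketch of the per-buffer decision in the write_out_data loop. It is not kernel code: the types, the function names (decide_before_patch, decide_after_patch, count_unwaited) and the two-flag buffer model are assumptions made for illustration, and the model only covers buffers whose lock is not held by the commit thread itself (the `!locked` side of the patched condition).

```c
/* User-space sketch only. All names below are hypothetical simplifications
 * of the logic in fs/jbd/commit.c, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum action {
    ACT_SUBMIT_AND_WAIT, /* dirty: refile to BJ_Locked, submit IO, wait */
    ACT_WAIT_ONLY,       /* IO already in flight: refile to BJ_Locked, wait */
    ACT_UNFILE           /* treated as "writeout complete": unfile the buffer */
};

struct buf_state {
    bool dirty;  /* models buffer_dirty(bh) */
    bool locked; /* models buffer_locked(bh), i.e. IO still in flight */
};

/* Before the patch: a clean buffer is unfiled even if its IO is in flight. */
enum action decide_before_patch(struct buf_state b)
{
    if (b.dirty)
        return ACT_SUBMIT_AND_WAIT;
    return ACT_UNFILE;
}

/* After the patch: a clean but still locked buffer is kept and waited for. */
enum action decide_after_patch(struct buf_state b)
{
    if (b.dirty)
        return ACT_SUBMIT_AND_WAIT;
    if (b.locked)
        return ACT_WAIT_ONLY;
    return ACT_UNFILE;
}

/* Count buffers that would be unfiled while their IO is still in flight. */
size_t count_unwaited(const struct buf_state *bufs, size_t n,
                      enum action (*decide)(struct buf_state))
{
    size_t missed = 0;

    assert(bufs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (bufs[i].locked && decide(bufs[i]) == ACT_UNFILE)
            missed++;
    return missed;
}
```

In this model, a buffer with dirty = false and locked = true is the one the fsstress run exposed: the pre-patch decision unfiles it, while the patched decision keeps it on the list that the commit waits on.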


Version-Release number of selected component (if applicable):
  RHEL5.0      : kernel-2.6.18-8.el5
  RHEL5.1 Beta1: kernel-2.6.18-36.el5


How reproducible:

Occurs when the Linux system is under a high I/O load.


Actual results:

The in-place data block is corrupted.


Expected results:

The in-place data block is not corrupted.


Additional info:

Patch history of fs/jbd/commit.c in Linus Torvalds' tree:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=history;f=fs/jbd/commit.c;hb=6f5a9da1af5a8c286575c30c2706dc1fbef9164b
Comment 1 Masaki MAENO 2007-08-16 05:33:10 EDT
I investigated this problem.
As a result, I found that the buffer (= bh) is not released, because of the
buffer_busy() condition, even when the reference count of the buffer
(= bh->b_count) is zero. So I think that the in-place data block is not
corrupted.

However, ordered mode does not work correctly, because there is a high
possibility that "in-place data" is written after the "journal of meta-data".
This matters if the ext3 filesystem goes down because of a sudden power loss.
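
The ordering concern above can be illustrated with a small user-space sketch. It assumes a deliberately simplified model with a single data write and a single journal commit of the related meta-data; the enum and function names are hypothetical and do not correspond to anything in the jbd code.

```c
/* User-space sketch only: a hypothetical model of write ordering in
 * "ordered" mode, with one data block and one meta-data journal commit. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum write_kind { WRITE_DATA, WRITE_JOURNAL_COMMIT };

/* Returns true if a power loss after the first `crash_after` writes leaves
 * committed meta-data whose in-place data never reached the disk. */
bool crash_exposes_stale_data(const enum write_kind *order, size_t n,
                              size_t crash_after)
{
    bool data_on_disk = false;
    bool metadata_committed = false;

    assert(crash_after <= n);
    for (size_t i = 0; i < crash_after; i++) {
        if (order[i] == WRITE_DATA)
            data_on_disk = true;
        else
            metadata_committed = true;
    }
    return metadata_committed && !data_on_disk;
}

/* Returns true if at least one crash point in the sequence is unsafe. */
bool order_is_unsafe(const enum write_kind *order, size_t n)
{
    for (size_t c = 0; c <= n; c++)
        if (crash_exposes_stale_data(order, n, c))
            return true;
    return false;
}
```

With the sequence data then journal commit, no crash point is unsafe in this model. With journal commit then data, a power loss between the two writes leaves meta-data that refers to data which was never written, which is the situation ordered mode is meant to rule out.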
Comment 3 Eric Sandeen 2007-08-24 02:43:19 EDT
The patch is not yet committed in our kernel.

Has the customer actually seen this problem in practice, and encountered an
error?  I did not expect this bug to be an urgent issue, it looked almost like a
hypothetical case.

I'll review this bug more thoroughly tomorrow.

Thank you,
-Eric
Comment 4 Masaki MAENO 2007-08-27 04:33:14 EDT
How will you review the bug?

I encountered this error. I confirmed that this bad path (meta-data journal -->
in-place data) was taken with a probability of 1% - 2% in a high I/O stress test
in our experimental environment.

I think that Red Hat should fix it, because ordered mode is not correct with
this bug, and the risk of introducing a new problem by taking this patch is
very small.


Excuses are unnecessary.
You must correct the mistake in RHEL5.1 if you understand that the bug exists.
(customer's voice)
Comment 5 Eric Sandeen 2007-08-27 12:33:03 EDT
I reviewed this upstream change last Friday, and it does look correct and safe
to me.  I'll submit it for peer review & kernel inclusion today.

Thanks,
-Eric
Comment 7 Masaki MAENO 2007-08-27 20:56:31 EDT
Thank you very much. Good job! ;-)

RHEL5.1 Errata Release? or RHEL5.2 Release?
Please tell me the schedule.
Comment 8 Masaki MAENO 2007-08-27 20:58:36 EDT
I made a mistake.

RHEL5.1 Release? or RHEL5.1 Errata Release? or RHEL5.2 Release?
Please tell me the schedule.
Comment 9 Eric Sandeen 2007-08-27 21:37:50 EDT
Right now it would likely be scheduled for 5.2 as it was filed after the general
5.1 cutoff.  If you have other needs, please let your support contact know and
they can request the appropriate action.

Thanks,
-Eric
Comment 11 Masaki MAENO 2007-08-30 06:06:59 EDT
I see.
Comment 13 RHEL Product and Program Management 2007-10-12 03:34:51 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 32 errata-xmlrpc 2008-05-21 10:48:03 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html
