Strong Request: Please take the following patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=6f5a9da1af5a8c286575c30c2706dc1fbef9164b;hp=6d3a25f1fb75206ae8b2b1cdd1431b3852e1a45a
(I think that it is very safe to apply this patch.)

Otherwise, in-place data blocks may be corrupted when the filesystem operates in "ordered" mode (the default journaling mode).

Description of problem:

> [PATCH] jbd: wait for already submitted t_sync_datalist buffer to complete
> Hisashi Hifumi [Fri, 22 Dec 2006 09:11:50 +0000 (01:11 -0800)]
>
> In the current jbd code, if a buffer on BJ_SyncData list is dirty and not
> locked, the buffer is refiled to BJ_Locked list, submitted to the IO and
> waited for IO completion.
>
> But the fsstress test showed the case that when a buffer was already
> submitted to the IO just before the buffer_dirty(bh) check, the buffer was
> not waited for IO completion.
>
> Following patch solves this problem. If it is assumed that a buffer is
> submitted to the IO before the buffer_dirty(bh) check and still being
> written to disk, this buffer is refiled to BJ_Locked list.

>--- a/fs/jbd/commit.c
>+++ b/fs/jbd/commit.c
>@@ -248,8 +248,12 @@ write_out_data:
>                 bufs = 0;
>                 goto write_out_data;
>             }
>-        }
>-        else {
>+        } else if (!locked && buffer_locked(bh)) {
>+            __journal_file_buffer(jh, commit_transaction,
>+                        BJ_Locked);
>+            jbd_unlock_bh_state(bh);
>+            put_bh(bh);
>+        } else {
>             BUFFER_TRACE(bh, "writeout complete: unfile");
>             __journal_unfile_buffer(jh);
>             jbd_unlock_bh_state(bh);

If a t_sync_datalist buffer (= in-place data) was already submitted for I/O just before the buffer_dirty(bh) check, it is no longer dirty at the check, but its I/O has not completed, and the commit does not wait for it. Although the buffer (= in-place data block) has not yet been written to disk, it may be recycled and overwritten. In addition, the order of the disk writes is reversed (intended: "in-place data block" --> "journal data block"; current version, before the patch: "journal data block" --> "in-place data").
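For readers following the patch, the branch it adds can be sketched in user space. This is a minimal model under stated assumptions, not kernel code: `struct model_bh`, `enum action` and `classify()` are hypothetical names invented for illustration, and the model ignores the real locking (jbd_lock_bh_state, the buffer lock itself) and the batching of buffers. It only shows which of the three outcomes a buffer on the sync data list gets, before and after the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; NOT the kernel's struct buffer_head. */
struct model_bh {
    bool dirty;   /* models buffer_dirty(bh) */
    bool locked;  /* models buffer_locked(bh): write I/O still in flight */
};

enum action {
    ACT_SUBMIT_AND_WAIT,  /* refile to BJ_Locked, submit I/O, wait for it */
    ACT_WAIT_ONLY,        /* refile to BJ_Locked, wait for I/O already in flight */
    ACT_UNFILE            /* "writeout complete: unfile" */
};

/*
 * Decide what the commit path does with one sync-data buffer.
 * 'patched' selects the behaviour with or without the extra
 * "else if (!locked && buffer_locked(bh))" branch.
 */
static enum action classify(const struct model_bh *bh, bool patched)
{
    assert(bh != NULL);

    /* Simplification: the commit path locks the buffer only if it is dirty. */
    bool we_locked = bh->dirty;

    if (we_locked && bh->dirty)
        return ACT_SUBMIT_AND_WAIT;
    if (patched && !we_locked && bh->locked)
        return ACT_WAIT_ONLY;   /* the branch added by the patch */
    return ACT_UNFILE;
}
```

In this model, a buffer with dirty == false and locked == true (submitted by someone else just before the check) is unfiled without waiting when patched is false, and is waited for when patched is true. That is the case the commit message describes.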
Version-Release number of selected component (if applicable):
RHEL5.0: kernel-2.6.18-8.el5
RHEL5.1 Beta1: kernel-2.6.18-36.el5

How reproducible:
Put a high I/O load on the Linux system.

Actual results:
The in-place data block is corrupted.

Expected results:
The in-place data block is not corrupted.

Additional info:
Patch history of fs/jbd/commit.c in Linus Torvalds' tree:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=history;f=fs/jbd/commit.c;hb=6f5a9da1af5a8c286575c30c2706dc1fbef9164b
I investigated this problem further. The result: the buffer (= bh) is not released, because the buffer_busy() check still treats it as busy even when the buffer's reference count (= bh->b_count) is zero. So I now think that the in-place data block does not get corrupted. However, ordered mode still does not work correctly: there is a high possibility that the "in-place data" is written after the "journal of meta-data", which matters if the ext3 filesystem crashes, for example on a sudden power loss.
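The buffer_busy() argument above can be sketched as follows. This is a hedged user-space model, assuming (as I read fs/buffer.c in this kernel series) that buffer_busy() reports a buffer as busy when it is referenced, dirty, or locked. `struct model_buf` and `model_buffer_busy()` are hypothetical names and are not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of struct buffer_head that matter here. */
struct model_buf {
    int  b_count;  /* models bh->b_count, the reference count */
    bool dirty;    /* models the BH_Dirty state bit */
    bool locked;   /* models the BH_Lock state bit: I/O in flight */
};

/* Assumed rule: a buffer is busy if it is referenced, dirty, or locked. */
static bool model_buffer_busy(const struct model_buf *b)
{
    assert(b != NULL);
    return b->b_count != 0 || b->dirty || b->locked;
}
```

Under that assumption, a buffer whose write is still in flight (locked, with b_count == 0) is still reported busy and so is not freed. This matches the conclusion that the data block itself is not lost, only the write ordering.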
The patch is not yet committed in our kernel. Has the customer actually seen this problem in practice, and encountered an error? I did not expect this bug to be an urgent issue; it looked almost like a hypothetical case. I'll review this bug more thoroughly tomorrow. Thank you, -Eric
How will you review the bug? I have encountered this error. I confirmed that the bad path (meta-data journal --> in-place data) is taken with a probability of 1% - 2% under a high I/O stress test in our experimental environment. I think that Red Hat should fix it, because ordered mode is not correct with this bug and the risk of introducing a problem by taking this patch is very small. No excuses are needed. If you understand that the bug exists, you must fix it in RHEL5.1. (customer's voice)
I reviewed this upstream change last Friday, and it does look correct and safe to me. I'll submit it for peer review & kernel inclusion today. Thanks, -Eric
Thank you very much. Good job! ;-) RHEL5.1 Errata Release? or RHEL5.2 Release? Please tell me the schedule.
I made a mistake. RHEL5.1 Release? or RHEL5.1 Errata Release? or RHEL5.2 Release? Please tell me the schedule.
Right now it would likely be scheduled for 5.2 as it was filed after the general 5.1 cutoff. If you have other needs, please let your support contact know and they can request the appropriate action. Thanks, -Eric
I see.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0314.html