Bug 618798

Summary: IO Barriers for filesystems causes severe performance drop with DB2
Product: Red Hat Enterprise Linux 6
Reporter: Sanjay Rao <srao>
Component: kernel
Assignee: chellwig <chellwig>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Docs Contact:
Priority: low
Version: 6.0
CC: esandeen, rwheeler
Target Milestone: rc
Keywords: RHELNAK
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The Linux barrier mount option allows file systems to run safely on storage devices that have an enabled write cache without battery backup. For example, local SATA or SAS disks with the write cache enabled need to run with barriers in order to survive a system outage or crash. File systems that live on top of high-end storage, for example an external disk array or an internal hardware RAID card with internal battery support, neither need nor benefit from barriers. For this second class of devices, the file system should be mounted with the "-o nobarrier" mount option.
Last Closed: 2011-02-01 11:32:40 UTC Type: ---

Description Sanjay Rao 2010-07-27 18:36:42 UTC
Description of problem:

Enabling IO barriers for file systems causes a severe drop in DB2 performance. This was seen on both ext3 and ext4 file systems.


On a side note, leaving IO barriers on and turning off the journal on ext4 returned performance to the numbers seen with barriers off.


Version-Release number of selected component (if applicable):


How reproducible:
Easily reproducible

Steps to Reproduce:
1. Mount file systems with default options (barriers on)
2. Run the DB2 workload
  
Actual results:

The number of transactions on ext3
with barriers - 33289.67
without barriers - 214695.73

The number of transactions on ext4
with barriers - 52438.69
with barriers (without journals) - 232095.19
without barriers - 242671.33


As the numbers show, leaving barriers on cuts throughput to about 16% of the no-barrier rate on ext3 and about 22% on ext4, that is, to roughly one-fifth or less. On ext4, leaving barriers on but mounting without a journal brought performance back to within about 5% of the no-barrier level.
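
The ratios behind that summary can be checked with a few lines of C. This is only a sketch of the arithmetic on the figures reported above; the function names (retained_fraction, slowdown_factor, report) are made up for this illustration and are not part of any benchmark tooling.

```c
#include <assert.h>
#include <stdio.h>

/* Fraction of the no-barrier throughput that remains with barriers on. */
double retained_fraction(double with_barriers, double without_barriers)
{
    assert(without_barriers > 0.0);
    return with_barriers / without_barriers;
}

/* How many times faster the no-barrier run is than the barrier run. */
double slowdown_factor(double with_barriers, double without_barriers)
{
    assert(with_barriers > 0.0);
    return without_barriers / with_barriers;
}

/* Print one summary line for a file system. */
void report(const char *fs, double with_barriers, double without_barriers)
{
    printf("%s: %.1f%% of no-barrier throughput (%.2fx slowdown)\n",
           fs,
           100.0 * retained_fraction(with_barriers, without_barriers),
           slowdown_factor(with_barriers, without_barriers));
}
```

Calling report("ext3", 33289.67, 214695.73) and report("ext4", 52438.69, 242671.33) prints the retained percentage and slowdown for each file system; retained_fraction(232095.19, 242671.33) gives the figure for ext4 with barriers on and no journal.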


Expected results:


Additional info:

Comment 2 Ondrej Vasik 2010-07-27 18:43:52 UTC
The filesystem component is just a package with the basic directory layout. It has nothing to do with ext3/4, so I am reassigning this to kernel.

Comment 3 Eric Sandeen 2010-07-27 18:45:50 UTC
(In reply to comment #0)

> On a side note - leaving IO barriers on and turning off journals on ext4,
> returned performance back to the numbers seen with barriers off.

I think this is perfectly expected, because only JBD2 issues barrier IO directly; other barriers come from blkdev_issue_flush in the sync paths, but:

ext4_sync_file()
{
        ...

        if (!journal)
                return simple_fsync(file, dentry, datasync);

        ...

        if (jbd2_log_start_commit(journal, commit_tid))
                jbd2_log_wait_commit(journal, commit_tid);
        else if (journal->j_flags & JBD2_BARRIER)
                blkdev_issue_flush(inode->i_sb->s_bdev, NULL);
        return ret;
}

We never get that far if we don't have a journal.  IOW, "barriers on" with no journal is a no-op.

-Eric
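
The control flow described in this comment can be modelled in user space as follows. This is a simplified sketch, not kernel code: struct model_journal, enum flush_source and model_sync_file() are hypothetical names, and the model assumes that a journal commit only results in barrier IO when the barrier flag is set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Where, if anywhere, a cache flush or barrier comes from in this model. */
enum flush_source {
    FLUSH_NONE,            /* no barrier IO is issued on this path */
    FLUSH_JOURNAL_COMMIT,  /* barrier issued as part of the journal commit */
    FLUSH_EXPLICIT         /* explicit flush after the commit check */
};

/* Hypothetical stand-in for the journal state the excerpt consults. */
struct model_journal {
    bool barrier_enabled;  /* models the JBD2_BARRIER flag */
    bool commit_needed;    /* models jbd2_log_start_commit() returning true */
};

/* Mirrors the branches of the quoted ext4_sync_file() excerpt. */
enum flush_source model_sync_file(const struct model_journal *journal)
{
    if (journal == NULL)            /* no journal: early return, no barrier */
        return FLUSH_NONE;
    if (!journal->barrier_enabled)  /* barriers off: nothing is flushed */
        return FLUSH_NONE;
    return journal->commit_needed ? FLUSH_JOURNAL_COMMIT : FLUSH_EXPLICIT;
}
```

In this model, passing NULL for the journal always yields FLUSH_NONE regardless of the barrier setting, which is the "no-op" case from the comment.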

Comment 4 Sanjay Rao 2010-07-27 18:52:49 UTC
I also noticed that the DB2 data files grow during the run, whereas other databases such as Sybase and Oracle fully pre-allocate their data files. As a result, Oracle and Sybase make very few metadata modifications during the actual run compared to DB2.

Comment 5 RHEL Program Management 2010-07-27 18:57:44 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 6 Ric Wheeler 2010-07-27 18:59:01 UTC
It would be interesting to see whether DB2 can support better pre-allocation; if so, that might be a short-term workaround.
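
For reference, full pre-allocation of a data file can be sketched with the POSIX posix_fallocate() call. This only illustrates the general technique; it says nothing about how DB2, Oracle or Sybase actually allocate their files, and preallocate_file() and file_size() are hypothetical helper names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or open) `path` and reserve `size` bytes of disk space up front,
 * so that later writes inside that range do not have to grow the file.
 * Returns 0 on success or an errno-style error number on failure.
 */
int preallocate_file(const char *path, off_t size)
{
    assert(path != NULL && size > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return errno;

    /* posix_fallocate() returns the error number directly, not via errno. */
    int err = posix_fallocate(fd, 0, size);

    if (close(fd) != 0 && err == 0)
        err = errno;
    return err;
}

/* Return the current size of `path` in bytes, or -1 if stat() fails. */
off_t file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? st.st_size : (off_t)-1;
}
```

A call such as preallocate_file("/tmp/data.dbf", 1048576) (the path is only an example) leaves a 1 MiB file whose blocks are already allocated before the workload starts.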

Comment 8 Eric Sandeen 2010-08-03 20:58:32 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
For now we will need to document (if not already done) that enterprise-class storage should be mounted with -o nobarrier.

Comment 11 Ryan Lerch 2010-09-02 02:15:59 UTC
    Diffed Contents:
@@ -1 +1 @@
-For now we will need to document (if not already done) that enterprise-class storage should be mounted with -o nobarrier.
+Enterprise-class storage should always be mounted with using the -o nobarrier option

Comment 12 Ric Wheeler 2010-09-23 19:37:37 UTC
    Diffed Contents:
@@ -1 +1,5 @@
-Enterprise-class storage should always be mounted with using the -o nobarrier option
+The Linux barrier mount option allows file systems to run safely on storage devices that have an enabled, non-battery backed write cache.  For example, local S-ATA or SAS disks with write cache enabled will need to run with barriers in order to survive a system outage or crash.
+
+File systems that live on top of high end storage, for example an external disk array or internal hardware RAID card with internal battery support, do not benefit or need this.
+
+For this second class of devices, the file system should be mounted with "-o nobarrier" as a mount option.
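
To check whether a given set of mount options follows this advice, the option string can be scanned for the barrier setting. The sketch below is illustrative only: barriers_disabled() is a hypothetical helper, it assumes the ext3/ext4 spellings "nobarrier", "barrier", "barrier=0" and "barrier=1", it treats barriers as on by default (as in the steps to reproduce), and it lets the last matching option win.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if the `len`-byte token starting at `tok` is exactly `name`. */
static bool opt_is(const char *tok, size_t len, const char *name)
{
    return strlen(name) == len && strncmp(tok, name, len) == 0;
}

/*
 * Scan a comma-separated mount option string and report whether
 * barriers end up disabled. Unrelated options are ignored.
 */
bool barriers_disabled(const char *opts)
{
    assert(opts != NULL);

    bool disabled = false;  /* assumed default: barriers on */
    const char *p = opts;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");  /* length of the current token */

        if (opt_is(p, len, "nobarrier") || opt_is(p, len, "barrier=0"))
            disabled = true;
        else if (opt_is(p, len, "barrier") || opt_is(p, len, "barrier=1"))
            disabled = false;

        p += len;
        if (*p == ',')
            p++;                       /* skip the separator */
    }
    return disabled;
}
```

For example, barriers_disabled("rw,noatime,nobarrier") is true, while barriers_disabled("rw,relatime") is false because nothing overrides the assumed default.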

Comment 13 RHEL Program Management 2011-01-07 04:32:58 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 14 Ric Wheeler 2011-01-07 17:53:28 UTC
I think that this has been resolved in 6.1 by our updated barrier patches.

Sanjay, do you still see this huge loss with DB2 and barriers? If not, we should close this as fixed in 6.1.

Thanks!

Comment 16 Ric Wheeler 2011-02-01 11:32:40 UTC
This is fixed by the patches from bug 657046.

*** This bug has been marked as a duplicate of bug 657046 ***