Bug 167186
Summary: | ext2 and ext3 file systems are extremely slow when deleting large files on a busy file system | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | Jos VanWezel <jvw> |
Component: | kernel | Assignee: | Stephen Tweedie <sct> |
Status: | CLOSED DEFERRED | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 3.0 | CC: | petrides |
Target Milestone: | --- | Keywords: | FutureFeature |
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Enhancement | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-09-05 16:19:44 UTC | Type: | --- |
Description
Jos VanWezel
2005-08-31 13:28:32 UTC
Deleting large files requires ext2/3 to walk a fairly large amount of scattered metadata. The indirect blocks needed to map files to disk blocks are not laid out sequentially, so a delete has to read them in from random locations on disk. The reads are also synchronous: each one only gets queued once the previous block has been processed. On a very busy disk, each read has to compete for the disk's attention separately, so the time taken builds up rapidly.

Because it is metadata read IO that takes the time, ext2 is affected just as much as ext3. The ext3 journal only has an impact on write performance; it doesn't come into the picture for reads at all.

There are several possible ways to improve the situation. In RHEL-4, two new disk IO schedulers will both improve this by allowing a task to submit multiple reads without being preempted too much: the CFQ ("completely fair queuing") scheduler, which is the new default, and especially the "anticipatory" scheduler, which is optimised for interactive performance. There are prototype patches to let ext3 maintain its disk mapping information more compactly via "extent maps", which would greatly reduce the amount of metadata to be read to complete a delete (this is what XFS already does). And we might asynchronously read ahead some of the metadata in question.

However, all of these changes are far too invasive to consider for an established, stable release like RHEL-3. The scheduler improvements in RHEL-4 are likely to help a lot; other than that, it will be up to future enhancements (extents in particular) to really address this particular performance property of ext3.
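To give a rough sense of scale for the metadata walk described above, here is a minimal Python sketch that counts the indirect blocks an ext2/ext3-style mapping needs for a file of a given size. It assumes the classic layout of 12 direct pointers plus single, double and triple indirect trees with 4-byte block pointers, and a 4 KiB block size. The function names and the per-read latency figures are hypothetical and chosen only for illustration; they are not measurements from this report.

```python
def indirect_block_count(file_size_bytes, block_size=4096, ptr_size=4, direct_ptrs=12):
    """Count the indirect (mapping) blocks needed for a fully allocated file.

    Assumes the classic ext2/ext3 scheme: direct pointers in the inode,
    then single, double and triple indirect trees.
    """
    ptrs_per_block = block_size // ptr_size
    data_blocks = -(-file_size_bytes // block_size)  # ceiling division
    remaining = max(0, data_blocks - direct_ptrs)
    total = 0
    for level in (1, 2, 3):  # single, double, triple indirect
        if remaining == 0:
            break
        covered = min(remaining, ptrs_per_block ** level)
        # Walk up the tree: leaf indirect blocks first, then their parents.
        n = covered
        for _ in range(level):
            n = -(-n // ptrs_per_block)
            total += n
        remaining -= covered
    if remaining:
        raise ValueError("file too large for triple-indirect mapping")
    return total


def serial_read_time(file_size_bytes, per_read_latency_s):
    """Estimate time spent if every indirect block is a separate, dependent read.

    per_read_latency_s is a hypothetical figure supplied by the caller.
    """
    return indirect_block_count(file_size_bytes) * per_read_latency_s


if __name__ == "__main__":
    one_gib = 1 << 30
    print(indirect_block_count(one_gib))   # indirect blocks for a 1 GiB file
    print(serial_read_time(one_gib, 0.010))  # hypothetical idle disk, 10 ms per read
    print(serial_read_time(one_gib, 0.100))  # hypothetical busy disk, 100 ms per read
```

Under these assumptions a 1 GiB file needs 257 indirect blocks. Since each read waits for the previous one, the total scales directly with whatever per-read latency the busy disk imposes, which is why the delete time grows so quickly under load.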