Bug 459339 - Misordered exceptions in snapshots may cause leaking of disk space
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Hardware: All Linux
Priority: medium
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Mikuláš Patočka
QA Contact: Corey Marthaler
Reported: 2008-08-16 17:38 EDT by Mikuláš Patočka
Modified: 2010-11-29 18:04 EST (History)
Doc Type: Bug Fix
Last Closed: 2010-11-29 18:04:56 EST

Description Mikuláš Patočka 2008-08-16 17:38:34 EDT
Chunks for new exceptions are allocated sequentially at the time they are first accessed. They are committed to the on-disk exception table when their copying finishes. Because that copying can finish out of order (the block layer or the disk may reorder requests), exceptions are sometimes written out of order into the on-disk exception storage.

Normally there is no problem with out-of-order exceptions: each exception record contains both the new and the old chunk number, so the exceptions will be loaded correctly on the next activation and there is no data corruption.

However, if the computer crashes at a certain point (after an exception record with a high chunk number has been written but before the record for an exception with a lower chunk number has been written), the space allocated for the low-chunk-number exception is lost and will never be used.

As a result, a snapshot that would otherwise have sufficient space can overflow.
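
The effect can be illustrated with a minimal Python model. It is only a sketch: all function names are hypothetical, and the recovery rule (allocation continues one past the highest snapshot chunk found in the on-disk records) is assumed from the behaviour described in this report, not copied from the kernel source.

```python
def load_exceptions(records):
    """Rebuild the origin -> snapshot chunk mapping from on-disk records.

    Each record is an (old_chunk, new_chunk) pair, so the order in which
    the records were written does not affect the resulting mapping.
    """
    return {old: new for old, new in records}


def next_free_after_load(records, first_data_chunk):
    """Assumed recovery rule: continue allocating after the highest
    snapshot chunk referenced by any record."""
    highest = max((new for _, new in records), default=first_data_chunk - 1)
    return highest + 1


def leaked_chunks(records, first_data_chunk):
    """Data chunks below the recovered next-free pointer that no record uses."""
    used = {new for _, new in records}
    limit = next_free_after_load(records, first_data_chunk)
    return [c for c in range(first_data_chunk, limit) if c not in used]


# Exceptions for origin chunks 0..3 were allocated at snapshot chunks 2..5.
# Copying of the last two finished first, and only their records reached
# the disk before the crash.
on_disk = [(2, 4), (3, 5)]
print(load_exceptions(on_disk))          # the surviving mapping is still correct
print(next_free_after_load(on_disk, 2))  # 6
print(leaked_chunks(on_disk, 2))         # [2, 3]
```

In this model the loaded mapping is correct regardless of record order, but chunks 2 and 3 sit below the recovered allocation pointer and are never handed out again.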

Example (assume 4k chunk size):
- write simultaneously to chunks 0 ... 511 on the origin.
- exceptions are allocated at places 2 ... 257 and 259 ... 514 in the chunk storage. All the exceptions are being copied simultaneously
- now assume that copying of exceptions at origin 256 ... 511 (snapshot 259 ... 514) finishes first
- records containing origin 256...511 and snapshot 259...514 are written into exception metadata area at chunk 1

now assume *CRASH*

After reboot, the 256 exceptions from the metadata area at chunk 1 are loaded correctly. The next free snapshot chunk number is set to 515. The snapshot chunks 2 ... 257 are lost and can never be used.
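
The numbers in this example can be reproduced with a short Python sketch. It assumes that each on-disk exception record occupies 16 bytes (two 64-bit chunk numbers), that chunk 0 holds the snapshot header, and that one chunk is reserved for exception metadata ahead of every 256 data chunks (so the allocator skips chunks 1 and 258). These assumptions are consistent with the figures above but are not stated in this report, and all names are hypothetical.

```python
CHUNK_SIZE = 4096    # 4k chunks, as in the example
RECORD_SIZE = 16     # assumption: two 64-bit chunk numbers per record
HEADER_CHUNKS = 1    # assumption: chunk 0 is the snapshot header

PER_AREA = CHUNK_SIZE // RECORD_SIZE  # 256 exception records per metadata chunk


def metadata_chunk(area):
    """Chunk reserved for the area-th block of exception records."""
    return HEADER_CHUNKS + area * (PER_AREA + 1)


def data_chunk(index):
    """Snapshot chunk given to the index-th allocated exception (0-based),
    skipping the chunks reserved for metadata."""
    area, slot = divmod(index, PER_AREA)
    return metadata_chunk(area) + 1 + slot


# Origin chunks 0..511 are written at once; exceptions are allocated in order.
allocated = {origin: data_chunk(origin) for origin in range(512)}

# Copying for origin 256..511 finishes first; only those records reach the disk.
committed = {o: s for o, s in allocated.items() if o >= 256}

# After the crash, allocation resumes after the highest committed chunk.
next_free = max(committed.values()) + 1
lost = sorted(set(allocated.values()) - set(committed.values()))

print(allocated[0], allocated[255], allocated[256], allocated[511])  # 2 257 259 514
print(next_free)                      # 515
print(lost[0], lost[-1], len(lost))   # 2 257 256
```

Under these assumptions the crash leaves 256 chunks (1 MiB with 4k chunks) allocated but unreachable.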
Comment 1 RHEL Product and Program Management 2009-06-15 17:01:30 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
Comment 3 RHEL Product and Program Management 2010-07-15 11:02:31 EDT
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **
Comment 6 Mikuláš Patočka 2010-11-29 18:04:24 EST
A fix would require re-architecting the snapshot support. And the bug isn't really serious: if the machine crashes, a few chunks may remain allocated but unused, but there is no data corruption. So I'm NAKing it.
Comment 7 RHEL Product and Program Management 2010-11-29 18:04:56 EST
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.
