Bug 1334534 - collection split causes orphaned files in filestore causing inconsistent scrubs and crashes on pg removal
Status: CLOSED ERRATA
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RADOS
Version: 1.3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: 1.3.3
Assigned To: Samuel Just
QA Contact: shylesh
Docs Contact: Bara Ancincova
Depends On: 1318055
Blocks: 1372735
Reported: 2016-05-09 19:20 EDT by Ken Dreyer (Red Hat)
Modified: 2017-07-30 11:07 EDT (History)
Fixed In Version: RHEL: ceph-0.94.7-5.el7cp Ubuntu: ceph_0.94.7-3redhat1trusty
Doc Type: Bug Fix
Doc Text:
.PG collection split no longer produces any orphaned files
Due to a bug in the underlying source code, a placement group (PG) collection split could produce orphaned files. Consequently, the PG could be incorrectly marked as inconsistent during scrubbing, or the OSD nodes could terminate unexpectedly. The bug has been fixed, and PG collection split no longer produces any orphaned files.
Clone Of: 1318055
Last Closed: 2016-09-29 08:58:28 EDT
Type: Bug
External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 14766 | None | None | None | 2016-05-09 19:20 EDT
Red Hat Product Errata | RHSA-2016:1972 | normal | SHIPPED_LIVE | Moderate: Red Hat Ceph Storage 1.3.3 security, bug fix, and enhancement update | 2016-09-29 12:51:21 EDT

Description Ken Dreyer (Red Hat) 2016-05-09 19:20:15 EDT
This clone is to fix this BZ in Hammer / RHCS 1.3.

+++ This bug was initially created as a clone of Bug #1318055 +++

Description of problem:

When a collection split occurs during a collection_move_rename on an EC (erasure-coded) object, the old link can be left orphaned in the parent directory (see the commits for details).

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a new cluster.
2. Create an EC pool with one PG.
3. Fill up the pool by alternately creating a new object and overwriting an existing object. All of the objects should have names longer than 300 characters.
4. When the PG collection splits, there should be an orphaned file in the <pgid>_head/ directory for that PG on each OSD that holds it.
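
The workload in step 3 could be sketched in Python roughly as follows. This is only an illustration, not the test used by the developers: the helper names, the 320-character name length, and the pool and payload arguments are all hypothetical. It plans the alternating create/overwrite sequence and builds the argv for the `rados put` command, but does not run anything against a cluster.

```python
import random


def long_name(index, length=320):
    """Return a deterministic object name longer than 300 characters.

    The length of 320 is an arbitrary choice for this sketch.
    """
    prefix = f"obj{index:08d}_"
    return prefix + "x" * (length - len(prefix))


def workload(num_objects, seed=0):
    """Yield (action, name) pairs that alternate create and overwrite."""
    rng = random.Random(seed)
    created = []
    for i in range(num_objects):
        name = long_name(i)
        created.append(name)
        yield ("create", name)
        # Overwrite an object that already exists (possibly the newest one).
        yield ("overwrite", rng.choice(created))


def rados_put_command(pool, name, payload_path):
    """Build the argv for `rados put`; pool and payload_path are placeholders."""
    return ["rados", "-p", pool, "put", name, payload_path]
```

Each planned operation could then be passed to `subprocess.run(rados_put_command(...))`, on the assumption that writing an existing object name with `rados put` is an acceptable way to overwrite it in the pool under test.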

Actual results:

Step 4 is proof that the bug occurred. The following symptoms may also occur, depending on the exact objects created:

Scrubbing the pg may result in the pg being reported incorrectly as inconsistent.  Removing the orphaned file and re-scrubbing should cause the pg to stop being inconsistent.

Causing the pg mapping to change may cause the ENOTEMPTY crash in the linked bug.  Again, the file remaining in the collection directory will be the orphaned file.
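
As a rough aid for checking the collection directory, the Python sketch below lists the regular files that sit directly in a given directory. The function name and the idea of pointing it at a `<pgid>_head/` path are assumptions made for illustration. It does not decide which file, if any, is orphaned, so anything it reports still needs to be inspected by hand.

```python
import os


def top_level_files(collection_dir):
    """Return the regular files directly inside collection_dir, sorted.

    Subdirectories are not descended into, and symlinks are not followed.
    """
    return sorted(
        entry.path
        for entry in os.scandir(collection_dir)
        if entry.is_file(follow_symlinks=False)
    )
```

For example, `top_level_files(path_to_pg_head_dir)` returns a list of paths that can be compared against the objects the PG is expected to hold.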

Expected results:

No orphaned file, crashes, or pg inconsistencies.



--- Additional comment from Samuel Just on 2016-03-15 17:52:51 EDT ---

https://github.com/ceph/ceph/pull/8136 -- merged.  Not backported to hammer yet.

--- Additional comment from Samuel Just on 2016-04-08 10:41:56 EDT ---

Still needs backport to hammer.
Comment 1 Samuel Just 2016-06-03 18:34:19 EDT
Pending https://github.com/ceph/ceph/pull/9277
Comment 2 Ken Dreyer (Red Hat) 2016-06-16 11:12:19 EDT
Change was merged to hammer upstream and will be in v0.94.8.
Comment 4 Ken Dreyer (Red Hat) 2016-08-17 10:14:14 EDT
Sam, is there a Teuthology test for this? (How would I find that information myself? Would I see some reference to the upstream Redmine ticket in ceph-qa-suite.git?)
Comment 5 Federico Lucifredi 2016-08-17 18:39:43 EDT
Medium priority for 1.3.3 testing.
Comment 10 shylesh 2016-09-13 08:24:49 EDT
Followed the bug description and didn't find any orphan objects or crashes.
Hence marking this as verified.
Verified on 0.94.9-1.el7cp.x86_64
Comment 13 errata-xmlrpc 2016-09-29 08:58:28 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-1972.html
