Bug 1457097 - PG repair does not repair the objects whose attributes are corrupted.
Summary: PG repair does not repair the objects whose attributes are corrupted.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RADOS
Version: 2.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 2.3
Assignee: David Zafman
QA Contact: Parikshith
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-05-31 06:35 UTC by Parikshith
Modified: 2017-07-30 15:17 UTC

Fixed In Version: RHEL: ceph-10.2.7-24.el7cp Ubuntu: ceph_10.2.7-26redhat1
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-19 13:33:45 UTC


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2017:1497 | normal | SHIPPED_LIVE | Red Hat Ceph Storage 2.3 bug fix and enhancement update | 2017-06-19 17:24:11 UTC
Ceph Project Bug Tracker | 20089 | None | None | None | 2017-05-31 17:06:18 UTC

Description Parikshith 2017-05-31 06:35:14 UTC
Description of problem:

PG repair does not repair the objects whose attributes are corrupted.

Version-Release number of selected component (if applicable): ceph version 10.2.7-21.el7cp 


How reproducible:

Steps to Reproduce:
1. Created an EC pool (5+2) and wrote some data.
2. Created a snapshot of this pool.
3. Picked one of the shards and corrupted its user.ceph.snapset xattr.
4. After running a scrub on the primary, the status reports an inconsistent object:

ceph -s:
     health HEALTH_ERR
            1 pgs inconsistent
            1 scrub errors
     monmap e3: 3 mons at {aircobra=10.70.39.1:6789/0,cornell=10.70.39.6:6789/0,corsair=10.70.39.7:6789/0}
            election epoch 24, quorum 0,1,2 aircobra,cornell,corsair
     osdmap e388: 9 osds: 8 up, 8 in
            flags sortbitwise,require_jewel_osds
      pgmap v11682: 300 pgs, 2 pools, 288 GB data, 73910 objects
            673 GB used, 8224 GB / 8898 GB avail
                 299 active+clean
                   1 active+clean+inconsistent


5. Ran pg repair on the affected PG (3.11).
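
The steps above can be sketched as a command sequence. This is only an illustration, not the reporter's exact procedure: the pool name, profile name, object name, source file, snapshot name and PG count are all hypothetical placeholders. The report does not say how the xattr was corrupted, so step 3 is left as a manual step. The function only builds the argument lists and runs nothing.

```python
import shlex


def repro_commands(pool="ecpool", profile="ec52", pgid="3.11",
                   obj="obj1", src="/tmp/data.bin", snap="snap1", pg_num=64):
    """Return the reproduction steps as argv lists; nothing is executed.

    All default values are hypothetical placeholders, not taken from the report
    (except the PG id 3.11 and the 5+2 layout).
    """
    return [
        # Step 1: EC profile with k=5 data chunks and m=2 coding chunks, then a pool.
        ["ceph", "osd", "erasure-code-profile", "set", profile, "k=5", "m=2"],
        ["ceph", "osd", "pool", "create", pool, str(pg_num), str(pg_num),
         "erasure", profile],
        ["rados", "-p", pool, "put", obj, src],
        # Step 2: pool snapshot.
        ["rados", "-p", pool, "mksnap", snap],
        # Step 3 (corrupting user.ceph.snapset on one shard) is done by hand on
        # the OSD host; the method is not stated in the report, so it is omitted.
        # Step 4: scrub the PG so the inconsistency is detected.
        ["ceph", "pg", "scrub", pgid],
        # Step 5: ask the primary to repair the PG.
        ["ceph", "pg", "repair", pgid],
    ]


if __name__ == "__main__":
    for argv in repro_commands():
        print(" ".join(shlex.quote(a) for a in argv))
```

Printing the commands instead of running them keeps the sketch safe to try on a machine without a cluster.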


Actual results:

The corrupted object is not repaired (pg 3.11): the repair reports 1 error and 0 fixed, and the PG stays active+clean+inconsistent.

ceph -w: 
2017-05-30 16:16:27.264217 mon.2 [INF] from='client.? 10.70.39.2:0/1474454893' entity='client.admin' cmd=[{"prefix": "pg repair", "pgid": "3.11"}]: dispatch
2017-05-30 16:16:31.773155 mon.0 [INF] pgmap v11683: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail
2017-05-30 16:16:32.796121 mon.0 [INF] pgmap v11684: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail
2017-05-30 16:16:29.614098 osd.6 [INF] 3.11 repair starts
2017-05-30 16:16:32.158385 osd.6 [ERR] 3.11 repair 1 errors, 0 fixed
2017-05-30 16:16:34.853660 mon.0 [INF] pgmap v11685: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail
2017-05-30 16:16:35.893849 mon.0 [INF] pgmap v11686: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail
2017-05-30 16:16:36.929105 mon.0 [INF] pgmap v11687: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail
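
To check a captured log for this symptom, a small parser can flag repairs that fixed fewer errors than they found. This is a minimal sketch that assumes the line format shown in the ceph -w output above; the regular expression and the function name unrepaired are illustrative, not part of Ceph.

```python
import re

# Matches OSD repair result lines of the form seen in this report, e.g.
# "... osd.6 [ERR] 3.11 repair 1 errors, 0 fixed"
REPAIR_RESULT = re.compile(
    r"osd\.(?P<osd>\d+) \[(?P<level>\w+)\] (?P<pgid>\S+) "
    r"repair (?P<errors>\d+) errors, (?P<fixed>\d+) fixed"
)


def unrepaired(lines):
    """Return (pgid, errors, fixed) for each repair that left errors unfixed."""
    results = []
    for line in lines:
        m = REPAIR_RESULT.search(line)
        if m and int(m["fixed"]) < int(m["errors"]):
            results.append((m["pgid"], int(m["errors"]), int(m["fixed"])))
    return results


sample = [
    "2017-05-30 16:16:29.614098 osd.6 [INF] 3.11 repair starts",
    "2017-05-30 16:16:32.158385 osd.6 [ERR] 3.11 repair 1 errors, 0 fixed",
]
print(unrepaired(sample))  # [('3.11', 1, 0)]
```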


Expected results:


Additional info:

Comment 2 John Poelstra 2017-05-31 15:38:23 UTC
Based on initial review, this appears to be a blocker. Development is doing root cause analysis now.

Comment 14 errata-xmlrpc 2017-06-19 13:33:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1497

