Description of problem:
PG repair does not repair objects whose attributes are corrupted.

Version-Release number of selected component (if applicable):
ceph version 10.2.7-21.el7cp

How reproducible:

Steps to Reproduce:
1. Created an EC pool (5+2) and wrote some data.
2. Created a snapshot of this pool.
3. Picked one of the shards and corrupted its user.ceph.snapset xattr.
4. After running a scrub on the primary, the status reports an inconsistent object.

ceph -s:
    health HEALTH_ERR
           1 pgs inconsistent
           1 scrub errors
    monmap e3: 3 mons at {aircobra=10.70.39.1:6789/0,cornell=10.70.39.6:6789/0,corsair=10.70.39.7:6789/0}
           election epoch 24, quorum 0,1,2 aircobra,cornell,corsair
    osdmap e388: 9 osds: 8 up, 8 in
           flags sortbitwise,require_jewel_osds
    pgmap v11682: 300 pgs, 2 pools, 288 GB data, 73910 objects
           673 GB used, 8224 GB / 8898 GB avail
                299 active+clean
                  1 active+clean+inconsistent

5. Ran pg repair on the affected PG (3.11).

Actual results:
The corrupted object does not get repaired (pg 3.11).

ceph -w:
2017-05-30 16:16:27.264217 mon.2 [INF] from='client.? 10.70.39.2:0/1474454893' entity='client.admin' cmd=[{"prefix": "pg repair", "pgid": "3.11"}]: dispatch
2017-05-30 16:16:31.773155 mon.0 [INF] pgmap v11683: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail
2017-05-30 16:16:32.796121 mon.0 [INF] pgmap v11684: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail
2017-05-30 16:16:29.614098 osd.6 [INF] 3.11 repair starts
2017-05-30 16:16:32.158385 osd.6 [ERR] 3.11 repair 1 errors, 0 fixed
2017-05-30 16:16:34.853660 mon.0 [INF] pgmap v11685: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail
2017-05-30 16:16:35.893849 mon.0 [INF] pgmap v11686: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail
2017-05-30 16:16:36.929105 mon.0 [INF] pgmap v11687: 300 pgs: 1 active+clean+inconsistent, 299 active+clean; 288 GB data, 673 GB used, 8224 GB / 8898 GB avail

Expected results:

Additional info:
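For anyone re-testing this, the failure can be detected mechanically from the cluster log rather than by eye. The following is a minimal sketch, not part of any Ceph tooling: the function name unrepaired_pgs and the regular expression are my own assumptions, modeled only on the "repair N errors, M fixed" line quoted in this report. It flags any PG where fewer errors were fixed than were found.

```python
import re

# Pattern modeled on the OSD log line from this report, e.g.:
#   2017-05-30 16:16:32.158385 osd.6 [ERR] 3.11 repair 1 errors, 0 fixed
# The exact line format is an assumption based on that single sample.
REPAIR_RE = re.compile(
    r"(?P<osd>osd\.\d+) \[(?P<level>[A-Z]+)\] (?P<pgid>\d+\.[0-9a-f]+) "
    r"repair (?P<errors>\d+) errors, (?P<fixed>\d+) fixed"
)


def unrepaired_pgs(log_text):
    """Return {pgid: (errors, fixed)} for repairs that left errors unfixed.

    Hypothetical helper for illustration; it only scans text that the
    caller has already captured (for example from `ceph -w`).
    """
    result = {}
    for match in REPAIR_RE.finditer(log_text):
        errors = int(match.group("errors"))
        fixed = int(match.group("fixed"))
        if fixed < errors:
            result[match.group("pgid")] = (errors, fixed)
    return result


# Two OSD lines copied from the ceph -w output in this report.
sample = (
    "2017-05-30 16:16:29.614098 osd.6 [INF] 3.11 repair starts\n"
    "2017-05-30 16:16:32.158385 osd.6 [ERR] 3.11 repair 1 errors, 0 fixed\n"
)

print(unrepaired_pgs(sample))  # prints {'3.11': (1, 0)}
```

The "repair starts" line is ignored because it carries no error counts; only the summary line is matched.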
Based on initial review, this appears to be a blocker. Development is doing root cause analysis now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1497