Description of problem:
A deep scrub is needed to clear the list of inconsistent PGs.

Version-Release number of selected component (if applicable):
10.1.1.1

How reproducible:
Always

Steps to Reproduce:
1. Had a cluster in which 15 PGs were marked inconsistent.
2. Identified that the problem was one particular disk that had gone bad.
3. Removed that OSD from the CRUSH map. Data rebalancing took place, but those 15 PGs were still shown as inconsistent.
4. Querying one of those PGs returned an error:

    rados list-inconsistent-obj 6.59
    []error 2: (2) No such file or directory

   I was seeing the error because that particular OSD was no longer there.

Actual results:
Had to run a deep scrub on the inconsistent PGs to get the cluster clean.

Expected results:
This should have been taken care of automatically.

Additional info:
This problem is twofold:

still marked inconsistent after removing the bad OSD
====================================================

We share the current status with the monitor after scrubbing, but we don't clear the PG_STATE_INCONSISTENT flag after peering. We don't track why the inconsistency arose or who caused it, so we cannot revert the flag once the bad guy is gone, and it would be very tricky to do it that way. So a stupid but safer approach is to keep the flag until it is reset by a deep scrub, which is what set it in the first place.

rados list-inconsistent-obj
===========================

"rados list-inconsistent-obj" targets the primary OSD to get the latest scrub result.

- After peering, the interval changed, so the object storing the result of the last scrub is zapped. That's why we get an empty return value.
- Since the command does not send the epoch number, as the scrub script should, we can hardly check whether this inconsistency is outdated or not.

Not a blocker - recommend moving to 2.z
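The workaround from this report (deep-scrubbing every PG that is still flagged inconsistent) could be scripted roughly as below. This is only a sketch under assumptions: it expects `ceph health detail` to print lines shaped like `pg 6.59 is active+clean+inconsistent, acting [3,7,12]`, which may differ between releases, and the helper names `inconsistent_pgs` and `deep_scrub_commands` are made up for illustration. The only cluster command it builds is `ceph pg deep-scrub <pgid>`.

```python
import re

# Assumed line shape in `ceph health detail` output (may vary by release):
#   pg 6.59 is active+clean+inconsistent, acting [3,7,12]
PG_LINE = re.compile(r"^\s*pg\s+(\d+\.[0-9a-f]+)\s+is\s+([a-z_+]+)")


def inconsistent_pgs(health_detail):
    """Return the PG ids whose state string contains 'inconsistent'."""
    pgs = []
    for line in health_detail.splitlines():
        match = PG_LINE.match(line)
        if match and "inconsistent" in match.group(2).split("+"):
            pgs.append(match.group(1))
    return pgs


def deep_scrub_commands(pgids):
    """Build one `ceph pg deep-scrub <pgid>` argument list per PG id."""
    return [["ceph", "pg", "deep-scrub", pgid] for pgid in pgids]
```

To use it, feed the text captured from `ceph health detail` to `inconsistent_pgs`, then pass each returned argument list to `subprocess.run(cmd, check=True)`. Printing the commands first and reviewing them is a cautious way to start, since a deep scrub adds I/O load to the OSDs involved.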
Kefu can you please confirm that you meant to close this one as NOTABUG? The previous comment says "recommend moving to 2.z", so I wanted to double-check this.
Ken, yes, I confirm. sorry for the confusion. I forgot to remove that line after editing the reasons to close this bug as NOTABUG.
Hi Kefu, I think it's a BUG, but as designed. Should we mark it as NOTABUG? Thanks, Tanay
Tanay, sorry for the latency. makes sense. changing it to WONTFIX.