Bug 1600040

Summary: [bluestore]: 1 PG stuck in active+clean+remapped state after running reweight-by-utilization
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Parikshith <pbyregow>
Component: RADOS
Assignee: Josh Durgin <jdurgin>
Status: CLOSED NOTABUG
QA Contact: Parikshith <pbyregow>
Severity: high
Priority: unspecified
Version: 3.1
CC: ceph-eng-bugs, dzafman, hnallurv, jdurgin, kchai, nojha
Target Milestone: rc
Target Release: 3.1
Hardware: Unspecified
OS: Unspecified
Last Closed: 2018-07-11 23:32:03 UTC
Type: Bug
Attachments (flags: none):
  crush map
  pg query
  osd dump

Description Parikshith 2018-07-11 09:31:21 UTC
Created attachment 1458031 [details]
crush map

Description of problem:
One PG has been stuck in the active+clean+remapped state for over 12 hours after running reweight-by-utilization.

Version-Release number of selected component (if applicable):
ceph version 12.2.5-27.el7cp

Steps to Reproduce:
1. Created a cluster with only BlueStore OSDs and filled it with 50-60% data.
2. Ran ceph osd reweight-by-utilization with the default threshold (a command sketch follows below).
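
A minimal reproduction sketch; the pool name (ecpool), PG count, and bench duration are illustrative assumptions, while the profile parameters match the newprofile output further down:

$ceph osd erasure-code-profile set newprofile k=4 m=2 crush-failure-domain=host
$ceph osd pool create ecpool 64 64 erasure newprofile
$ceph osd pool set ecpool allow_ec_overwrites true
$rados bench -p ecpool 600 write --no-cleanup   # fill the cluster to roughly 50-60%
$ceph osd reweight-by-utilization               # default oload threshold (120)
$ceph pg dump | grep remapped                   # look for PGs stuck in a remapped state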

Actual results:

One of the PGs (13.5), belonging to an EC pool with overwrites enabled, is stuck in the active+clean+remapped state.

$ceph osd erasure-code-profile get newprofile
crush-device-class=
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8

$ceph osd tree
ID  CLASS WEIGHT   TYPE NAME         STATUS REWEIGHT PRI-AFF 
 -1       10.91510 root default                              
 -3        2.72878     host magna106                         
  1   hdd  0.90959         osd.1         up  1.00000 1.00000 
  2   hdd  0.90959         osd.2         up  1.00000 1.00000 
  3   hdd  0.90959         osd.3         up  1.00000 1.00000 
 -9        1.81918     host magna107                         
  9   hdd  0.90959         osd.9         up  1.00000 1.00000 
 10   hdd  0.90959         osd.10        up  0.90002 1.00000 
 -7        0.90959     host magna108                         
  8   hdd  0.90959         osd.8         up  1.00000 1.00000 
 -5        2.72878     host magna113                         
  0   hdd  0.90959         osd.0         up  1.00000 1.00000 
  4   hdd  0.90959         osd.4         up  1.00000 1.00000 
  5   hdd  0.90959         osd.5         up  1.00000 1.00000 
-13        1.81918     host magna114                         
  7   hdd  0.90959         osd.7         up  1.00000 1.00000 
 11   hdd  0.90959         osd.11        up  1.00000 1.00000 
-11        0.90959     host magna115                         
  6   hdd  0.90959         osd.6         up  0.95001 1.00000 

$ceph pg dump | grep active+clean+remapped
dumped all
13.5       2691                  0        0      2691       0  3178894720  9560     9560       active+clean+remapped 2018-07-11 07:59:52.847350   377'9560  509:16941 [4,3,10,11,NONE,8]          4 [4,3,10,11,8,8]              4        0'0 2018-07-09 13:11:58.937740             0'0 2018-07-09 13:11:58.937740             0 

$ceph pg 13.5 query 
{
    "state": "active+clean+remapped",
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "epoch": 537,
    "up": [
        4,
        3,
        10,
        11,
        2147483647,
        8
    ],
    "acting": [
        4,
        3,
        10,
        11,
        8,
        8
    ],
    "actingbackfill": [
        "3(1)",
        "4(0)",
        "8(4)",
        "8(5)",
        "10(2)",
        "11(3)"
    ],

Attached the complete pg query, ceph osd dump, and osd crush dump outputs.

Additional info: I did not manually change the CRUSH map.

Comment 3 Parikshith 2018-07-11 09:32:23 UTC
Created attachment 1458032 [details]
pg query

Comment 4 Parikshith 2018-07-11 09:33:24 UTC
Created attachment 1458033 [details]
osd dump

Comment 5 Josh Durgin 2018-07-11 23:32:03 UTC
Being remapped isn't an error. In this case CRUSH happens not to reach an assignment for shard 4 (the NONE/2147483647 entry in the up set), so the OSDs override it and add one: osd.8 serves both shard 4 and shard 5 in the acting set.
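
For reference, a hedged diagnostic sketch for anyone hitting the same state; the file name crushmap.bin, the pool name, and the rule id are placeholders, while the crush map and osd dump are the attachments above. With crush-failure-domain=host and k=4+m=2, each PG needs 6 distinct hosts, and the osd tree above shows exactly 6 hosts, so once reweight-by-utilization lowers some OSD weights CRUSH can fail to fill one shard, consistent with the NONE entry in the up set.

$ceph osd pool get <ec-pool-name> crush_rule   # rule used by the EC pool
$ceph osd dump | grep pg_temp                  # the override appears as a pg_temp entry for 13.5
$crushtool -d crushmap.bin -o crushmap.txt     # decompile the attached map for inspection
$crushtool -i crushmap.bin --test --rule <rule-id> --num-rep 6 --show-bad-mappings
                                               # lists inputs where CRUSH cannot fill all 6 shards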