Bug 2231346

Summary: [GSS][ODF 4.12] When an OSD goes down, PGs are not redistributed to other OSDs
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Rafrojas <rafrojas>
Component: Rook
Assignee: Santosh Pillai <sapillai>
Status: ASSIGNED
Resolution: ---
QA Contact: Tejas <tchandra>
Severity: low
Docs Contact:
Priority: unspecified
Version: 6.2
CC: bhubbard, ceph-eng-bugs, cephqe-warriors, nojha, rsachere, rzarzyns, sapillai, tnielsen, vumrao
Target Milestone: ---
Flags: sapillai: needinfo? (rafrojas)
Target Release: 7.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Rafrojas 2023-08-11 10:25:02 UTC
Description of problem:
When taking down one storage node hosting an OSD, PGs are not redistributed to other OSDs.

Version-Release number of selected component (if applicable):
ODF 4.12

How reproducible:
Manually taking one OSD down

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Meanwhile, in the rook-ceph-operator logs I only see the flag being set:

2023-06-28T13:23:33.675488974Z 2023-06-28 13:23:33.675473 D | ceph-cluster-controller: updating ceph cluster "openshift-storage" status and condition to &{Health:{Status:HEALTH_WARN Checks:map[MDS_SLOW_METADATA_IO:{Severity:HEALTH_WARN Summary:{Message:1 MDSs report slow metadata IOs}} OSDMAP_FLAGS:{Severity:HEALTH_WARN Summary:{Message:norecover flag(s) set}} OSD_DOWN:{Severity:HEALTH_WARN Summary:{Message:2 osds down}} OSD_HOST_DOWN:{Severity:HEALTH_WARN Summary:{Message:2 hosts (2 osds) down}} OSD_RACK_DOWN:{Severity:HEALTH_WARN Summary:{Message:2 racks (2 osds) down}} PG_AVAILABILITY:{Severity:HEALTH_WARN Summary:{Message:Reduced data availability: 98 pgs inactive}} PG_DEGRADED:{Severity:HEALTH_WARN Summary:{Message:Degraded data redundancy: 602/1569 objects degraded (38.368%), 118 pgs degraded, 311 pgs undersized}}]} FSID:c4699502-9c8b-4fcf-8040-38115e2cc0a1 ElectionEpoch:110 Quorum:[0 1 2] QuorumNames:[c d e] MonMap:{Epoch:7 FSID: CreatedTime: ModifiedTime: Mons:[]} OsdMap:{OsdMap:{Epoch:0 NumOsd:0 NumUpOsd:0 NumInOsd:0 Full:false NearFull:false NumRemappedPgs:0}} PgMap:{PgsByState:[{StateName:active+undersized Count:137} {StateName:active+undersized+degraded Count:76} {StateName:undersized+peered Count:56} {StateName:active+clean Count:42} {StateName:undersized+degraded+peered Count:42}] Version:0 NumPgs:353 DataBytes:728470648 UsedBytes:13220044800 AvailableBytes:2671134515200 TotalBytes:2684354560000 ReadBps:0 WriteBps:0 ReadOps:0 WriteOps:0 RecoveryBps:0 RecoveryObjectsPerSec:0 RecoveryKeysPerSec:0 CacheFlushBps:0 CacheEvictBps:0 CachePromoteBps:0} MgrMap:{Epoch:0 ActiveGID:0 ActiveName: ActiveAddr: Available:true Standbys:[]} Fsmap:{Epoch:6173 ID:1 Up:1 In:1 Max:1 ByRank:[{FilesystemID:1 Rank:0 Name:ocs-storagecluster-cephfilesystem-b Status:up:active Gid:144641} {FilesystemID:1 Rank:0 Name:ocs-storagecluster-cephfilesystem-a Status:up:standby-replay Gid:906740}] UpStandby:0}}, True, ClusterCreated, Cluster created successfully
2023-06-28T13:23:33.794875301Z 2023-06-28 13:23:33.794863 D | ceph-cluster-controller: Health: "HEALTH_WARN", code: "OSDMAP_FLAGS", message: "norecover flag(s) set"
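
For reference, the global OSD map flags reported above can be inspected and, if they are stale, cleared from the rook-ceph-tools pod. A minimal sketch, assuming the standard toolbox deployment in the openshift-storage namespace:

# show which OSD map flags are currently set
$ ceph osd dump | grep flags
$ ceph health detail
# clear the norecover flag so recovery/backfill can resume (only if it is not intentionally set)
$ ceph osd unset norecover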

In the logs of the OSD pod you can see how the flag is set and unset:

[amanzane@supportshell-1 pods]$ grep -ir NORECOVER rook-ceph-osd-1-5469f6d9c9-b6rfx/osd/osd/logs/current.log
2023-06-28T13:23:12.575619884Z debug 2023-06-28T13:23:12.574+0000 7f9074980700  1 osd.1 704 pausing recovery (NORECOVER flag set)
2023-06-28T13:24:05.002745490Z debug 2023-06-28T13:24:05.002+0000 7f9074980700  1 osd.1 705 unpausing recovery (NORECOVER flag unset)
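
Once the NORECOVER flag is unset, recovery should resume. A quick sketch for confirming that from the toolbox and for re-running the same search against the live pod (the pod and container names are taken from the example above and may differ per cluster):

# overall cluster health and PG recovery state
$ ceph -s
$ ceph pg stat
# same search against the running OSD pod
$ oc -n openshift-storage logs rook-ceph-osd-1-5469f6d9c9-b6rfx -c osd | grep -i norecover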

In the mon logs:

2023-07-13T10:50:00.000317521Z debug 2023-07-13T10:49:59.998+0000 7f611f2cd700  0 log_channel(cluster) log [WRN] :     rack rack0 has flags noout
2023-07-13T10:50:00.842077591Z cluster 2023-07-13T10:50:00.000165+0000 mon.a (mon.0) 2275 : cluster [WRN] Health detail: HEALTH_WARN 1 osds down; 1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set; 1 host (1 osds) down; 1 rack (1 osds) down; Degraded data redundancy: 279/1464 objects degraded (19.057%), 86 pgs degraded, 210 pgs undersized
2023-07-13T10:50:00.842141364Z cluster 2023-07-13T10:50:00.000251+0000 mon.a (mon.0) 2278 : cluster [WRN] [WRN] OSD_FLAGS: 1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
2023-07-13T10:50:00.842153981Z cluster 2023-07-13T10:50:00.000295+0000 mon.a (mon.0) 2279 : cluster [WRN]     rack rack0 has flags noout
2023-07-13T11:00:00.000327680Z debug 2023-07-13T10:59:59.998+0000 7f611f2cd700  0 log_channel(cluster) log [WRN] : Health detail: HEALTH_WARN 1 osds down; 1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set; 1 host (1 osds) down; 1 rack (1 osds) down; Degraded data redundancy: 279/1467 objects degraded (19.018%), 86 pgs degraded, 210 pgs undersized
2023-07-13T11:00:00.000327680Z debug 2023-07-13T10:59:59.998+0000 7f611f2cd700  0 log_channel(cluster) log [WRN] : [WRN] OSD_FLAGS: 1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
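
Note that this OSD_FLAGS warning refers to noout set at the CRUSH level (rack rack0), not the global cluster-wide noout flag. A sketch for inspecting and clearing it, using the rack name from the log above:

# CRUSH-level flags appear in the OSD map dump
$ ceph osd dump | grep -i noout
$ ceph health detail
# clear noout from the rack if it is not meant to stay set
$ ceph osd unset-group noout rack0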

I've prepared a lab and ran the same test (same version, ODF 4.12), but in my case the noout flag is not set.

Comment 4 Raimund Sacherer 2023-08-14 11:36:12 UTC
Do we have any updates for this BZ?

Thank you, 

Raimund

Comment 5 Travis Nielsen 2023-08-14 22:30:10 UTC
The noout flag is expected to clear after some timeout, such as 30 minutes.
Santosh PTAL
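
For context: PGs are only remapped and backfilled to the remaining OSDs once the down OSD is marked out, which a lingering noout flag prevents. A hedged way to check the relevant interval and the current flag state from the toolbox:

# interval (in seconds) after which a down OSD is marked out; the default is 600
$ ceph config get mon mon_osd_down_out_interval
# confirm whether noout is still set globally or on a CRUSH node
$ ceph osd dump | grep -i noout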