Bug 1983961
| Summary: | [ROKS] ODF/OCP - Rebuilding Data resiliency shows 99% for long time | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Elvir Kuric <ekuric> |
| Component: | ceph | Assignee: | Travis Nielsen <tnielsen> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Raz Tamir <ratamir> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.7 | CC: | afrahman, assingh, bniver, jefbrown, madam, MCGINNES, muagarwa, nojha, ocs-bugs, odf-bz-bot, tnielsen |
| Target Milestone: | --- | Flags: | tnielsen: needinfo? (ekuric) muagarwa: needinfo? (ekuric) |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-10-06 09:57:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
What is the impact of this taking so long? E.g., if other nodes went down during this period, do we risk any data/availability loss? From the UI perspective, the message looks good.
Your PGs are:
pgs: 655 active+clean
1 active+clean+scrubbing+deep+repair
Resiliency = active+clean/total = 655/656
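That 655/656 ratio is what the console rounds to "99%". A minimal sketch of reproducing the number from the rook-ceph toolbox, assuming a Nautilus-style JSON layout for "ceph status" and that jq/bc are available (the field names here are an assumption, not taken from this cluster):
# Count the PGs that are exactly active+clean and divide by the total PG count.
CLEAN=$(ceph status -f json | jq '[.pgmap.pgs_by_state[] | select(.state_name == "active+clean") | .count] | add // 0')
TOTAL=$(ceph status -f json | jq '.pgmap.num_pgs')
echo "scale=2; 100 * $CLEAN / $TOTAL" | bc   # 655/656 -> 99.84, displayed as 99%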
We should move this BZ to operator/ceph. They can investigate it better.
Since one PG is still not active+clean, the recovery is still in progress:
pgs: 655 active+clean
1 active+clean+scrubbing+deep+repair
Can you collect a must-gather on the cluster? Ceph would need more details to troubleshoot.
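For completeness, the OCS must-gather is usually collected like this (the image tag below is an assumption; use the one matching the installed OCS release):
oc adm must-gather --image=registry.redhat.io/ocs4/ocs-must-gather-rhel8:v4.7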
(In reply to Travis Nielsen from comment #6)
> Since one PG is still not active+clean, the recovery is still in progress:
>
> pgs: 655 active+clean
> 1 active+clean+scrubbing+deep+repair
You should be able to run I/O against a PG that is in an active+*something* state. In this case, the PG is in active+clean+scrubbing+deep+repair, which means one of the following:
1. during a deep-scrub, issues were found that needed a repair (hence "+repair")
2. we are seeing the cosmetic issue tracked in https://tracker.ceph.com/issues/50446, which puts a PG in active+clean+scrubbing+deep+repair state instead of active+clean+scrubbing+deep when a deep-scrub is run on it
You can set nodeep-scrub in the cluster during the tests to rule out either of the above.
> Can you collect a must-gather on the cluster? Ceph would need more details
> to troubleshoot.
Thank you Travis and Neha!
Setting
# ceph osd set noscrub
# ceph osd set nodeep-scrub
helps; "Rebuilding data resiliency" reaches 100% noticeably faster.
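If the flags are only being set to rule out the cosmetic scrub state, remember to clear them again after the test and confirm they are gone (standard Ceph CLI):
# ceph osd unset noscrub
# ceph osd unset nodeep-scrub
# ceph osd dump | grep flags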
After additional tests we figured out the following:
1. the node is replaced, e.g. with the command:
ibmcloud ks worker replace --worker kube-c3hidk8d0cktobbskgr0-dmodff1-default-00000f73 --cluster dm-odff1
2. the OSDs on that node go down:
cluster:
id: 5506601c-7254-498c-aeea-d9331b4be16e
health: HEALTH_WARN
noscrub,nodeep-scrub flag(s) set
3 osds down
1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
3 hosts (3 osds) down
Degraded data redundancy: 5126099/40375602 objects degraded (12.696%), 275 pgs degraded, 262 pgs undersized
services:
mon: 3 daemons, quorum a,b,d (age 5h)
mgr: a(active, since 10m)
mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-b=up:active} 1 up:standby-replay
osd: 24 osds: 21 up (since 6m), 24 in (since 14h); 9 remapped pgs
flags noscrub,nodeep-scrub
rgw: 1 daemon active (ocs.storagecluster.cephobjectstore.a)
data:
pools: 10 pools, 656 pgs
objects: 13.46M objects, 51 TiB
usage: 134 TiB used, 194 TiB / 328 TiB avail
pgs: 5126099/40375602 objects degraded (12.696%)
364 active+clean
237 active+undersized+degraded
28 active+recovery_wait+degraded
16 active+undersized
8 active+recovery_wait+undersized+degraded+remapped
1 active+recovering+degraded
1 active+recovering+undersized+degraded+remapped
1 active+recovery_wait
io:
client: 29 KiB/s rd, 9.2 MiB/s wr, 5 op/s rd, 13 op/s wr
recovery: 47 MiB/s, 11 objects/s
3. after some time, the node will start and the OSDs will come back up:
cluster:
id: 5506601c-7254-498c-aeea-d9331b4be16e
health: HEALTH_WARN
noscrub,nodeep-scrub flag(s) set
1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
Degraded data redundancy: 35856/40380021 objects degraded (0.089%), 203 pgs degraded
services:
mon: 3 daemons, quorum a,b,d (age 5h)
mgr: a(active, since 23m)
mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-b=up:active} 1 up:standby-replay
osd: 24 osds: 24 up (since 9s), 24 in (since 15h); 197 remapped pgs
flags noscrub,nodeep-scrub
rgw: 1 daemon active (ocs.storagecluster.cephobjectstore.a)
data:
pools: 10 pools, 656 pgs
objects: 13.46M objects, 51 TiB
usage: 154 TiB used, 221 TiB / 375 TiB avail
pgs: 35856/40380021 objects degraded (0.089%)
450 active+clean
197 active+recovery_wait+undersized+degraded+remapped
5 active+recovery_wait+degraded
2 active+recovery_wait
1 active+recovering
1 active+recovering+degraded
io:
client: 6.7 MiB/s wr, 0 op/s rd, 7 op/s wr
recovery: 134 B/s, 10 objects/s
(ceph -s captured Wed 2021-07-28 10:52:06)
4. it takes time until all of these "objects degraded" are repaired; with noscrub/nodeep-scrub set, this completes faster.
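A quick way to see which PGs are still holding the percentage below 100% is to list everything that is not active+clean (standard Ceph CLI; the column index assumes the default pgs_brief layout):
# ceph pg dump pgs_brief 2>/dev/null | awk '$2 != "active+clean"'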
We also found that after the node-replace test, the OSDs are not spread equally across nodes:
$ oc get pods -n openshift-storage -o wide |grep osd
rook-ceph-osd-0-fd95fb959-9dqwx 2/2 Running 1 5d21h 172.17.99.137 10.240.64.10 <none> <none>
rook-ceph-osd-1-d45fbd654-tt4dk 2/2 Running 1 5d21h 172.17.99.140 10.240.64.10 <none> <none>
rook-ceph-osd-12-5d9cbdc748-8f846 2/2 Running 1 5d22h 172.17.99.138 10.240.64.10 <none> <none>
rook-ceph-osd-8-78c8d778f9-tcjvp 2/2 Running 1 5d21h 172.17.99.139 10.240.64.10 <none> <none>
rook-ceph-osd-10-78647675cf-cwh4h 2/2 Running 0 5d19h 172.17.89.109 10.240.128.12 <none> <none>
rook-ceph-osd-11-7f88545866-72s2x 2/2 Running 0 5d21h 172.17.89.74 10.240.128.12 <none> <none>
rook-ceph-osd-19-5675f5d695-d7g4m 2/2 Running 0 5d21h 172.17.89.73 10.240.128.12 <none> <none>
rook-ceph-osd-3-5656f8598f-jph8k 2/2 Running 0 15h 172.17.89.76 10.240.128.12 <none> <none>
rook-ceph-osd-4-547ff5697f-bc2xm 2/2 Running 0 5d21h 172.17.89.75 10.240.128.12 <none> <none>
rook-ceph-osd-9-5fbb9c9d9c-lkwgs 2/2 Running 0 15h 172.17.89.111 10.240.128.12 <none> <none>
rook-ceph-osd-13-7574d8f5c-qjfj9 2/2 Running 3 5d19h 172.17.104.9 10.240.0.19 <none> <none>
rook-ceph-osd-14-7d6858967f-x9jsq 2/2 Running 0 5d19h 172.17.104.12 10.240.0.19 <none> <none>
rook-ceph-osd-21-6474d6c979-xc6p2 2/2 Running 1 5d19h 172.17.104.11 10.240.0.19 <none> <none>
rook-ceph-osd-22-57b89578f6-gdvz4 2/2 Running 3 5d19h 172.17.104.10 10.240.0.19 <none> <none>
rook-ceph-osd-15-76fbdc74f4-62qqc 2/2 Running 0 5d21h 172.17.88.144 10.240.64.9 <none> <none>
rook-ceph-osd-17-7998f86fc6-rsmmb 2/2 Running 1 5d21h 172.17.88.142 10.240.64.9 <none> <none>
rook-ceph-osd-6-6647966dd6-d6bcz 2/2 Running 2 5d21h 172.17.88.140 10.240.64.9 <none> <none>
rook-ceph-osd-7-7b758d7bb9-x9xgn 2/2 Running 0 5d21h 172.17.88.143 10.240.64.9 <none> <none>
rook-ceph-osd-16-6f5c8f7548-5ssps 2/2 Running 3 6d22h 172.17.88.79 10.240.0.17 <none> <none>
rook-ceph-osd-18-7b87f6c669-24mlb 2/2 Running 3 6d22h 172.17.88.78 10.240.0.17 <none> <none>
rook-ceph-osd-20-6957574bc9-79mdm 2/2 Running 1 6d22h 172.17.88.80 10.240.0.17 <none> <none>
rook-ceph-osd-23-85f6c89956-kvsh4 2/2 Running 2 5d19h 172.17.88.94 10.240.0.17 <none> <none>
rook-ceph-osd-2-8fcdd546f-4qqjn 2/2 Running 0 15h 172.17.113.137 10.240.128.14 <none> <none>
rook-ceph-osd-5-756f9cd4ff-nvctd 2/2 Running 0 15h 172.17.113.136 10.240.128.14 <none> <none>
You can see that node "10.240.128.12" got 2 more OSDs than the others (if we delete the newest OSD pods on 10.240.128.12, they move to 10.240.128.14).
In the second attempt the output was:
rook-ceph-osd-0-fd95fb959-28hdw 2/2 Running 0 85m 172.17.88.155 10.240.64.9 <none> <none>
rook-ceph-osd-15-76fbdc74f4-62qqc 2/2 Running 0 6d1h 172.17.88.144 10.240.64.9 <none> <none>
rook-ceph-osd-17-7998f86fc6-rsmmb 2/2 Running 1 6d1h 172.17.88.142 10.240.64.9 <none> <none>
rook-ceph-osd-6-6647966dd6-d6bcz 2/2 Running 3 6d1h 172.17.88.140 10.240.64.9 <none> <none>
rook-ceph-osd-7-7b758d7bb9-x9xgn 2/2 Running 1 6d1h 172.17.88.143 10.240.64.9 <none> <none>
rook-ceph-osd-1-d45fbd654-ll4qx 2/2 Running 0 85m 172.17.103.73 10.240.64.11 <none> <none>
rook-ceph-osd-12-5d9cbdc748-lmzhw 2/2 Running 0 85m 172.17.103.74 10.240.64.11 <none> <none>
rook-ceph-osd-8-78c8d778f9-xs97w 2/2 Running 1 85m 172.17.103.72 10.240.64.11 <none> <none>
rook-ceph-osd-10-78647675cf-cwh4h 2/2 Running 0 5d23h 172.17.89.109 10.240.128.12 <none> <none>
rook-ceph-osd-11-7f88545866-72s2x 2/2 Running 0 6d 172.17.89.74 10.240.128.12 <none> <none>
rook-ceph-osd-19-5675f5d695-d7g4m 2/2 Running 0 6d 172.17.89.73 10.240.128.12 <none> <none>
rook-ceph-osd-4-547ff5697f-bc2xm 2/2 Running 0 6d 172.17.89.75 10.240.128.12 <none> <none>
rook-ceph-osd-13-7574d8f5c-qjfj9 2/2 Running 3 5d23h 172.17.104.9 10.240.0.19 <none> <none>
rook-ceph-osd-14-7d6858967f-x9jsq 2/2 Running 0 5d23h 172.17.104.12 10.240.0.19 <none> <none>
rook-ceph-osd-21-6474d6c979-xc6p2 2/2 Running 1 5d23h 172.17.104.11 10.240.0.19 <none> <none>
rook-ceph-osd-22-57b89578f6-gdvz4 2/2 Running 3 5d23h 172.17.104.10 10.240.0.19 <none> <none>
rook-ceph-osd-16-6f5c8f7548-5ssps 2/2 Running 4 7d1h 172.17.88.79 10.240.0.17 <none> <none>
rook-ceph-osd-18-7b87f6c669-24mlb 2/2 Running 4 7d1h 172.17.88.78 10.240.0.17 <none> <none>
rook-ceph-osd-20-6957574bc9-79mdm 2/2 Running 1 7d1h 172.17.88.80 10.240.0.17 <none> <none>
rook-ceph-osd-23-85f6c89956-kvsh4 2/2 Running 2 5d23h 172.17.88.94 10.240.0.17 <none> <none>
rook-ceph-osd-2-8fcdd546f-4qqjn 2/2 Running 0 18h 172.17.113.137 10.240.128.14 <none> <none>
rook-ceph-osd-3-5656f8598f-zmx9k 2/2 Running 0 95m 172.17.113.145 10.240.128.14 <none> <none>
rook-ceph-osd-5-756f9cd4ff-nvctd 2/2 Running 0 18h 172.17.113.136 10.240.128.14 <none> <none>
rook-ceph-osd-9-5fbb9c9d9c-r8rs9 2/2 Running 0 95m 172.17.113.143 10.240.128.14 <none> <none>
Is it reasonable to expect that after replacing a node (ibmcloud ks worker replace --worker kube-c3hidk8d0cktobbskgr0-dmodff1-default-00000f73 --cluster dm-odff1) the new node gets the same number of OSDs, i.e. that OSDs stay evenly balanced across nodes?
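For reference, a small sketch of counting OSD pods per node (assuming the standard app=rook-ceph-osd label on the OSD pods; $7 is the NODE column of "oc get pods -o wide"):
$ oc get pods -n openshift-storage -l app=rook-ceph-osd -o wide --no-headers | awk '{count[$7]++} END {for (n in count) print n, count[n]}'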
Not a 4.8 blocker. A few questions:
1. Please show the output of "ceph osd tree" from the toolbox, to show whether there is any hierarchy other than the nodes.
2. You followed the docs to purge the old OSDs from Ceph, right? For example, ceph osd tree should not still show the OSDs from the bad node.
3. Please share the CephCluster CR. This will show the topology spread constraints and other settings used to create the new OSDs.
This might be related to an OCS bug that was fixed recently for spreading OSDs evenly across racks, though I'll have to look for that BZ another day...
1. see [1]
2. only "ibmcloud ks worker replace < worker id >" is executed, from a machine (laptop) that has rights to run commands against the cluster on IBM Cloud; the replacement of the machine in the OCP cluster then happens in the background.
3. [2]
[1]
# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 375.00000 root default
-5 375.00000 region us-south
-30 125.00000 zone us-south-1
-55 15.62500 host ocs-deviceset-0-data-0mfz2p
23 hdd 15.62500 osd.23 up 1.00000 1.00000
-57 15.62500 host ocs-deviceset-0-data-4tb4cl
22 hdd 15.62500 osd.22 up 1.00000 1.00000
-29 15.62500 host ocs-deviceset-0-data-6mbqx6
13 hdd 15.62500 osd.13 up 1.00000 1.00000
-47 15.62500 host ocs-deviceset-1-data-39j6rg
18 hdd 15.62500 osd.18 up 1.00000 1.00000
-41 15.62500 host ocs-deviceset-1-data-4265dj
16 hdd 15.62500 osd.16 up 1.00000 1.00000
-45 15.62500 host ocs-deviceset-2-data-1ljbq8
21 hdd 15.62500 osd.21 up 1.00000 1.00000
-37 15.62500 host ocs-deviceset-2-data-2gnr96
14 hdd 15.62500 osd.14 up 1.00000 1.00000
-49 15.62500 host ocs-deviceset-2-data-7gbks9
20 hdd 15.62500 osd.20 up 1.00000 1.00000
-4 125.00000 zone us-south-2
-3 15.62500 host ocs-deviceset-0-data-26kq8b
0 hdd 15.62500 osd.0 up 1.00000 1.00000
-9 15.62500 host ocs-deviceset-0-data-5bxw5z
1 hdd 15.62500 osd.1 up 1.00000 1.00000
-39 15.62500 host ocs-deviceset-1-data-0dkqxr
15 hdd 15.62500 osd.15 up 1.00000 1.00000
-51 15.62500 host ocs-deviceset-1-data-2hwzcm
8 hdd 15.62500 osd.8 up 1.00000 1.00000
-27 15.62500 host ocs-deviceset-1-data-65q279
7 hdd 15.62500 osd.7 up 1.00000 1.00000
-17 15.62500 host ocs-deviceset-2-data-0vqtjj
6 hdd 15.62500 osd.6 up 1.00000 1.00000
-35 15.62500 host ocs-deviceset-2-data-4d2sp6
12 hdd 15.62500 osd.12 up 1.00000 1.00000
-53 15.62500 host ocs-deviceset-2-data-6x467v
17 hdd 15.62500 osd.17 up 1.00000 1.00000
-12 125.00000 zone us-south-3
-43 15.62500 host ocs-deviceset-0-data-1bxjfw
19 hdd 15.62500 osd.19 up 1.00000 1.00000
-19 15.62500 host ocs-deviceset-0-data-3gsspv
2 hdd 15.62500 osd.2 up 1.00000 1.00000
-33 15.62500 host ocs-deviceset-0-data-7cljwc
11 hdd 15.62500 osd.11 up 1.00000 1.00000
-25 15.62500 host ocs-deviceset-1-data-12dvjw
3 hdd 15.62500 osd.3 up 1.00000 1.00000
-23 15.62500 host ocs-deviceset-1-data-575th8
9 hdd 15.62500 osd.9 up 1.00000 1.00000
-11 15.62500 host ocs-deviceset-1-data-7jvrkg
4 hdd 15.62500 osd.4 up 1.00000 1.00000
-21 15.62500 host ocs-deviceset-2-data-3ddvnb
10 hdd 15.62500 osd.10 up 1.00000 1.00000
-15 15.62500 host ocs-deviceset-2-data-59qvrt
5 hdd 15.62500 osd.5 up 1.00000 1.00000
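Regarding question 2 above: the tree lists 24 OSDs, all up, so it does not appear to contain stale entries from the replaced node. For reference, a hedged sketch of the usual sequence for purging an OSD that did belong to a removed node (standard Ceph CLI; <ID> is a placeholder for the numeric OSD id):
# ceph osd out <ID>            # <ID> = the OSD that lived on the removed node
# ceph osd purge <ID> --yes-i-really-mean-it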
[2]
$ oc get storagecluster -n openshift-storage -o yaml
apiVersion: v1
items:
- apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
annotations:
uninstall.ocs.openshift.io/cleanup-policy: delete
uninstall.ocs.openshift.io/mode: graceful
creationTimestamp: "2021-07-06T08:56:17Z"
finalizers:
- storagecluster.ocs.openshift.io
generation: 3
managedFields:
- apiVersion: ocs.openshift.io/v1
fieldsType: FieldsV1
fieldsV1:
f:spec:
.: {}
f:encryption: {}
f:externalStorage: {}
f:monPVCTemplate:
.: {}
f:metadata: {}
f:spec:
.: {}
f:accessModes: {}
f:resources:
.: {}
f:requests: {}
f:storageClassName: {}
f:volumeMode: {}
f:status: {}
manager: manager
operation: Update
time: "2021-07-06T08:56:17Z"
- apiVersion: ocs.openshift.io/v1
fieldsType: FieldsV1
fieldsV1:
f:spec:
f:monPVCTemplate:
f:spec:
f:resources:
f:requests:
f:storage: {}
manager: oc
operation: Update
time: "2021-07-06T09:32:58Z"
- apiVersion: ocs.openshift.io/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.: {}
f:uninstall.ocs.openshift.io/cleanup-policy: {}
f:uninstall.ocs.openshift.io/mode: {}
f:finalizers: {}
f:spec:
f:arbiter: {}
f:encryption:
f:kms: {}
f:managedResources:
.: {}
f:cephBlockPools: {}
f:cephConfig: {}
f:cephFilesystems: {}
f:cephObjectStoreUsers: {}
f:cephObjectStores: {}
f:storageDeviceSets: {}
f:version: {}
f:status:
.: {}
f:conditions: {}
f:failureDomain: {}
f:failureDomainKey: {}
f:failureDomainValues: {}
f:images:
.: {}
f:ceph:
.: {}
f:actualImage: {}
f:desiredImage: {}
f:noobaaCore:
.: {}
f:actualImage: {}
f:desiredImage: {}
f:noobaaDB:
.: {}
f:actualImage: {}
f:desiredImage: {}
f:nodeTopologies:
.: {}
f:labels:
.: {}
f:kubernetes.io/hostname: {}
f:topology.kubernetes.io/region: {}
f:topology.kubernetes.io/zone: {}
f:phase: {}
f:relatedObjects: {}
manager: ocs-operator
operation: Update
time: "2021-07-06T09:59:57Z"
name: ocs-storagecluster
namespace: openshift-storage
resourceVersion: "21454329"
selfLink: /apis/ocs.openshift.io/v1/namespaces/openshift-storage/storageclusters/ocs-storagecluster
uid: 143ed9e8-cd7a-4242-9108-c95bdda70106
spec:
arbiter: {}
encryption:
kms: {}
externalStorage: {}
managedResources:
cephBlockPools: {}
cephConfig: {}
cephFilesystems: {}
cephObjectStoreUsers: {}
cephObjectStores: {}
monPVCTemplate:
metadata: {}
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 25Gi
storageClassName: ibmc-vpc-block-metro-10iops-tier
volumeMode: Filesystem
status: {}
storageDeviceSets:
- config: {}
count: 8
dataPVCTemplate:
metadata: {}
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 16000Gi
storageClassName: ibmc-vpc-block-metro-3iops-tier
volumeMode: Block
status: {}
name: ocs-deviceset
placement: {}
portable: true
preparePlacement: {}
replica: 3
resources: {}
version: 4.7.0
status:
conditions:
- lastHeartbeatTime: "2021-08-04T19:11:24Z"
lastTransitionTime: "2021-08-04T16:11:30Z"
message: Reconcile completed successfully
reason: ReconcileCompleted
status: "True"
type: ReconcileComplete
- lastHeartbeatTime: "2021-08-04T19:11:24Z"
lastTransitionTime: "2021-07-30T20:04:18Z"
message: Reconcile completed successfully
reason: ReconcileCompleted
status: "True"
type: Available
- lastHeartbeatTime: "2021-08-04T19:11:24Z"
lastTransitionTime: "2021-07-30T20:04:18Z"
message: Reconcile completed successfully
reason: ReconcileCompleted
status: "False"
type: Progressing
- lastHeartbeatTime: "2021-08-04T19:11:24Z"
lastTransitionTime: "2021-07-30T20:04:18Z"
message: Reconcile completed successfully
reason: ReconcileCompleted
status: "False"
type: Degraded
- lastHeartbeatTime: "2021-08-04T19:11:24Z"
lastTransitionTime: "2021-07-30T20:04:18Z"
message: Reconcile completed successfully
reason: ReconcileCompleted
status: "True"
type: Upgradeable
failureDomain: zone
failureDomainKey: topology.kubernetes.io/zone
failureDomainValues:
- us-south-3
- us-south-2
- us-south-1
images:
ceph:
actualImage: registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:725f93133acc0fb1ca845bd12e77f20d8629cad0e22d46457b2736578698eb6c
desiredImage: registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:725f93133acc0fb1ca845bd12e77f20d8629cad0e22d46457b2736578698eb6c
noobaaCore:
actualImage: registry.redhat.io/ocs4/mcg-core-rhel8@sha256:6ff8645efdde95fa97d496084d3555b7680895f0b79c147f2a880b43742af3a4
desiredImage: registry.redhat.io/ocs4/mcg-core-rhel8@sha256:6ff8645efdde95fa97d496084d3555b7680895f0b79c147f2a880b43742af3a4
noobaaDB:
actualImage: registry.redhat.io/rhel8/postgresql-12@sha256:f486bbe07f1ddef166bab5a2a6bdcd0e63e6e14d15b42d2425762f83627747bf
desiredImage: registry.redhat.io/rhel8/postgresql-12@sha256:f486bbe07f1ddef166bab5a2a6bdcd0e63e6e14d15b42d2425762f83627747bf
nodeTopologies:
labels:
kubernetes.io/hostname:
- 10.240.128.7
- 10.240.64.4
- 10.240.64.5
- 10.240.0.6
- 10.240.0.7
- 10.240.128.6
- 10.240.0.12
- 10.240.0.13
- 10.240.128.10
- 10.240.0.14
- 10.240.128.11
- 10.240.0.15
- 10.240.0.17
- 10.240.64.10
- 10.240.64.9
- 10.240.128.12
- 10.240.128.13
- 10.240.0.19
- 10.240.128.14
- 10.240.64.11
- 10.240.64.12
topology.kubernetes.io/region:
- us-south
topology.kubernetes.io/zone:
- us-south-3
- us-south-2
- us-south-1
phase: Ready
relatedObjects:
- apiVersion: ceph.rook.io/v1
kind: CephCluster
name: ocs-storagecluster-cephcluster
namespace: openshift-storage
resourceVersion: "21453982"
uid: 83035032-0845-40f7-8e40-a3caa0de138d
- apiVersion: noobaa.io/v1alpha1
kind: NooBaa
name: noobaa
namespace: openshift-storage
resourceVersion: "21454327"
uid: bc2a2fce-e507-41f4-8543-233169a292b0
kind: List
metadata:
resourceVersion: ""
selfLink: ""
Could you also share the CephCluster CR in addition to the StorageCluster CR? The CephCluster would have the topology spread constraints to show exactly how the OSDs are expected to be spread across hosts. Or, if the TSCs are only specified for zones, that would explain why the hosts are not evenly spread. At least the OSDs are spread evenly across zones, as expected.
(In reply to Travis Nielsen from comment #12)
> Could you also share the CephCluster CR in addition to the StorageCluster
> CR? The CephCluster would have the topology spread constraints to show
> exactly how the OSDs are expected to be spread across hosts. Or if the TSCs
> are only specified for zones, then this explains why the hosts are not
> evenly spread. At least the OSDs are spread evenly across zones as expected.
This is a ROKS installation - would you mind writing out the commands you want me to run on this cluster? I do not have the CephCluster CR, as the cluster is installed via the addon.
Also, from #11 we see below [1] that the cluster originally had 6 nodes, but the storagecluster configuration preserves the old nodes once --replace is issued and a new node is created. From [1] one could conclude that we have many nodes (but there are actually only 6); all the others were cluster members at some point in time but are not any more.
---
[1]
kubernetes.io/hostname:
- 10.240.128.7
- 10.240.64.4
- 10.240.64.5
- 10.240.0.6
- 10.240.0.7
- 10.240.128.6
- 10.240.0.12
- 10.240.0.13
- 10.240.128.10
- 10.240.0.14
- 10.240.128.11
- 10.240.0.15
- 10.240.0.17
- 10.240.64.10
- 10.240.64.9
- 10.240.128.12
- 10.240.128.13
- 10.240.0.19
- 10.240.128.14
- 10.240.64.11
- 10.240.64.12
---
The CephCluster CR can be retrieved with this:
oc -n openshift-storage get cephcluster -o yaml
Please reopen once we have all the details.
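Once the CephCluster CR is available, a hedged sketch of pulling out just the OSD placement and topology spread constraints (the jsonpath assumes the Rook v1 CephCluster schema and the CR name shown in relatedObjects above):
oc -n openshift-storage get cephcluster ocs-storagecluster-cephcluster -o jsonpath='{.spec.storage.storageClassDeviceSets[*].placement.topologySpreadConstraints}'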
Description of problem:
Panel "Data Resiliency" in the OCP web interface shows "Rebuilding data resiliency 99%" for a very long time.
Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.7.16 True False 14d Cluster version is 4.7.16
[elvir@makina datapresent]$ oc get storagecluster -n openshift-storage
NAME AGE PHASE EXTERNAL CREATED AT VERSION
ocs-storagecluster 14d Ready 2021-07-06T08:56:17Z 4.7.0
How reproducible:
Always, 2 of 2 tries.
Steps to Reproduce:
1. Write 50 TB of data to the Ceph storage backend (available storage is 100 TB, so storage is not full). We believe this is visible with a smaller data set too.
2. Replace one of the nodes in the cluster (delete the old node, create a new node) via "ibmcloud ks worker replace < worker id >" from the command line.
3. Monitor Data Resiliency in the console - it has been stuck on 99% for over 3 hours now - please check the graph.
Actual results:
Data Resiliency in the console has been stuck on 99% for over 3 hours - please check the graph.
Expected results:
Data Resiliency finishes faster.
Additional info:
The ODF/OCP cluster was OK - possible to use it. It is unclear why the web console was reporting that status for so long, even though the ODF cluster was up and HEALTHY for hours before and after the test. The Ceph cluster was HEALTHY while the OCP web console was reporting the above status.
$ ceph -s
cluster:
id: 5506601c-7254-498c-aeea-d9331b4be16e
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,d (age 23h)
mgr: a(active, since 21h)
mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-b=up:active} 1 up:standby-replay
osd: 24 osds: 24 up (since 16h), 24 in (since 16h)
rgw: 1 daemon active (ocs.storagecluster.cephobjectstore.a)
data:
pools: 10 pools, 656 pgs
objects: 12.59M objects, 48 TiB
usage: 144 TiB used, 231 TiB / 375 TiB avail
pgs: 655 active+clean
1 active+clean+scrubbing+deep+repair
io:
client: 938 B/s rd, 12 KiB/s wr, 2 op/s rd, 1 op/s wr