Bug 2259187 - [RDR] OSD migration has halted partway through
Summary: [RDR] OSD migration has halted partway through
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ODF 4.15.0
Assignee: Santosh Pillai
QA Contact: Pratik Surve
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-01-19 11:52 UTC by Pratik Surve
Modified: 2024-03-19 15:31 UTC
CC List: 4 users

Fixed In Version: 4.15.0-125
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-03-19 15:31:44 UTC
Embargoed:




Links
Github red-hat-storage/rook pull 557 (open): Bug 2259187: osd: wait for the pgs to be clean (2024-01-23 00:48:08 UTC)
Github rook/rook pull 13590 (open): osd: fix operator reconcile during OSD migration (2024-01-19 11:58:32 UTC)
Red Hat Product Errata RHSA-2024:1383 (2024-03-19 15:31:47 UTC)

Description Pratik Surve 2024-01-19 11:52:36 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
[RDR] OSD migration has halted partway through

Version of all relevant components (if applicable):

OCP version:- 4.15.0-0.nightly-2024-01-13-050900
ODF version:- 4.15.0-113
CEPH version:- ceph version 17.2.6-167.el9cp (5ef1496ea3e9daaa9788809a172bd5a1c3192cf7) quincy (stable)
ACM version:- 2.9.1
SUBMARINER version:- v0.16.2

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
yes

Is there any workaround available to the best of your knowledge?
Restart the rook-ceph-operator pod
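
A minimal sketch of the workaround, assuming the default openshift-storage namespace and the standard app=rook-ceph-operator pod label:

  $ oc delete pod -n openshift-storage -l app=rook-ceph-operator
  # The operator deployment recreates the pod; per the reporter, the fresh
  # reconcile resumes the stalled OSD migration.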

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Can this issue be reproduced?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy a cluster without enabling it for RDR
2. Write some data
3. Start the migration from the UI (see the sketch below for the underlying CR change)
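
For step 3, a hedged sketch of the CephCluster change that the UI migration toggle presumably drives; the CR name ocs-storagecluster-cephcluster and the spec.storage.store field values are assumptions based on the upstream Rook OSD store-migration feature, not taken from this report:

  $ oc patch cephcluster ocs-storagecluster-cephcluster -n openshift-storage \
      --type merge \
      -p '{"spec":{"storage":{"store":{"type":"bluestore-rdr","updateStore":"yes-really-update-store"}}}}'
  # Rook then re-provisions the OSDs one at a time with the bluestore-rdr store.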


Actual results:
OSD migration stopped partway through, leaving a mix of store types:

storeType:
          bluestore: 2
          bluestore-rdr: 4
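
The storeType counts above are reported in the CephCluster status; a hedged way to read them (the jsonpath and CR name are assumptions):

  $ oc get cephcluster ocs-storagecluster-cephcluster -n openshift-storage \
      -o jsonpath='{.status.storage.osd.storeType}'
  # A finished migration would report all 6 OSDs as bluestore-rdr; the stalled
  # run above is stuck at bluestore: 2, bluestore-rdr: 4.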

# ceph status
  cluster:
    id:     303b3709-c0e9-4a3c-9352-923b1e9eef1f
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum d,e,f (age 25h)
    mgr: a(active, since 5d), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 6 osds: 6 up (since 21h), 6 in (since 21h)
    rgw: 1 daemon active (1 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   12 pools, 169 pgs
    objects: 100.64k objects, 384 GiB
    usage:   1.1 TiB used, 11 TiB / 12 TiB avail
    pgs:     169 active+clean
 
  io:
    client:   305 MiB/s rd, 6.5 MiB/s wr, 598 op/s rd, 21 op/s wr
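
Since the linked fix ("osd: wait for the pgs to be clean") keys the migration on PG state, the same check can be run manually from the toolbox; the rook-ceph-tools deployment name is an assumption:

  $ oc rsh -n openshift-storage deploy/rook-ceph-tools ceph pg stat
  # All PGs should be active+clean (as in the status above) before the next
  # OSD is migrated.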
 


Expected results:
Migration should not stop partway through

Additional info:

Must-gather logs location: http://rhsqe-repo.lab.eng.blr.redhat.com/ocs4qe/pratik/bz/migation_stop/jan19/19-01-2024_11-58-17

Comment 6 errata-xmlrpc 2024-03-19 15:31:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383

