Description of problem (please be detailed as possible and provide log snippets):

[DR] CephBlockPool resource reports the wrong mirroringStatus

Version of all relevant components (if applicable):
ODF version: 4.9.0-248.ci
OCP version: 4.9.0-0.nightly-2021-11-12-222121

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
4

Can this issue be reproduced?
yes

Can this issue reproduce from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy a DR cluster
2. Deploy workloads
3. Perform a failover
4. Delete the workload
5. Check the CephBlockPool resource

Actual results:
  mirroringStatus:
    lastChecked: "2021-11-30T11:24:04Z"
    summary:
      daemon_health: OK
      health: ERROR
      image_health: ERROR
      states:
        error: 2
        replaying: 3

Expected results:
The mirroringStatus should reflect the actual state of the RBD images; no images should be reported in an error state when none of them are in error.

Additional info:
When we run "rbd mirror image status" for all RBD images in the pool, we don't see any image in an error state:

bash-4.4$ for i in $(rbd ls -p ocs-storagecluster-cephblockpool); do rbd mirror image status ocs-storagecluster-cephblockpool/$i 2>/dev/null; done
2021-11-30T11:21:34.492+0000 7f6fb909c2c0 20 librbd::api::Image: list_images: list 0x7ffdfc8cc080
2021-11-30T11:21:34.495+0000 7f6fb909c2c0 20 librbd::api::Image: list_images_v2: io_ctx=0x7ffdfc8cc080
2021-11-30T11:21:34.496+0000 7f6fb909c2c0 20 librbd::api::Trash: list: list 0x7ffdfc8cc080
2021-11-30T11:21:34.496+0000 7f6fb909c2c0 20 librbd::api::Trash: list_trash_image_specs: list_trash_image_specs 0x7ffdfc8cc080

csi-vol-dummy-34c13019-232f-42ca-9102-9050ce1eea88:
  global_id:   8782ec3e-7b16-4d9e-9299-b115f40bafb0
  state:       up+replaying
  description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1638271140,"remote_snapshot_timestamp":1638271200,"replay_state":"syncing","syncing_percent":30,"syncing_snapshot_timestamp":1638271200}
  service:     a on prsurve-vm-dev-v6775-worker-528v4
  last_update: 2021-11-30 11:21:08
  peer_sites:
    name: 34c13019-232f-42ca-9102-9050ce1eea88
    state: up+stopped
    description: local image is primary
    last_update: 2021-11-30 11:21:21

csi-vol-dummy-436a1e97-7d6a-41f2-8420-3cc2cdfae539:
  global_id:   230b088e-c2e7-450f-8e9b-ddf6b1ffee98
  state:       up+stopped
  description: local image is primary
  service:     a on prsurve-vm-dev-v6775-worker-528v4
  last_update: 2021-11-30 11:21:10
  peer_sites:
    name: 34c13019-232f-42ca-9102-9050ce1eea88
    state: up+replaying
    description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1638271260,"remote_snapshot_timestamp":1638271260,"replay_state":"idle"}
    last_update: 2021-11-30 11:21:19
  snapshots:
    9424 .mirror.primary.230b088e-c2e7-450f-8e9b-ddf6b1ffee98.48108f1e-2ba0-4c97-b7d6-bfc5eb1ca0ee (peer_uuids:[e1fbbb13-a7bc-4c57-8128-e7a8a033b7b2])

test_1110:
  global_id:   24a3c8b6-e6e1-4739-9319-6408c3e6b38f
  state:       up+replaying
  description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1638212565,"remote_snapshot_timestamp":1638212565,"replay_state":"idle"}
  service:     a on prsurve-vm-dev-v6775-worker-528v4
  last_update: 2021-11-30 11:21:08
  peer_sites:
    name: 34c13019-232f-42ca-9102-9050ce1eea88
    state: up+stopped
    description: local image is primary
    last_update: 2021-11-30 11:21:21
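For reference, a minimal way to print only the mirroring summary that the operator has written onto the CephBlockPool CR (a sketch; the openshift-storage namespace is an assumption based on a default ODF install):

  # Show the reported mirroring summary from the CephBlockPool status
  oc get cephblockpool ocs-storagecluster-cephblockpool \
    -n openshift-storage \
    -o jsonpath='{.status.mirroringStatus.summary}{"\n"}'

This is the value that disagrees with the per-image "rbd mirror image status" output above.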
The output comes from the command "rbd mirror pool status <poolName>" without any changes, so if there is a discrepancy we should move this to the "ceph" component. Please move or close. Thanks.
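For comparison, a way to fetch that raw pool-level output directly from the Ceph toolbox (a sketch; the rook-ceph-tools deployment name and namespace are assumptions for a typical ODF cluster):

  # Raw output the operator copies into the CR's mirroringStatus
  oc -n openshift-storage exec deploy/rook-ceph-tools -- \
    rbd mirror pool status ocs-storagecluster-cephblockpool --verbose

If this output also reports images in error while the per-image status does not, the discrepancy is on the Ceph side rather than in the operator.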
Closing due to inactivity; feel free to re-open if you have any concerns. If so, do it under the "ceph" component if you feel someone needs to investigate further. Thanks.
Proposing as a blocker for 4.10.0, as this is definitely something we can't have at the GA support level.
Sure Ilya, if this is fixed with BZ #2008587 then I can move this to ON_QA as well. QE team, please retest with the latest ODF build which has the fix for BZ #2008587. If you can still see the problem, please re-open or raise a new BZ.
Ilya, as this bug is a blocker for 4.9.z TP, we need the fix for BZ #2008587 to be backported to 5.0z as well, since 4.9.z uses 5.0z. Unless we get that backport, QE can't test it in 4.9.z.
Moving to ASSIGNED based on comment #52.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1372