Bug 2308348 - Add a Prometheus alert for detecting when the same RBD image is used in two or more namespaces within or across GW groups [NEEDINFO]
Summary: Add a Prometheus alert for detecting when the same RBD image is used in two o...
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: NVMeOF
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 8.1
Assignee: Vallari
QA Contact: Sunil Kumar Nagaraju
Docs Contact: ceph-doc-bot
URL:
Whiteboard:
Depends On:
Blocks: 2317218
 
Reported: 2024-08-28 15:28 UTC by Rahul Lepakshi
Modified: 2025-04-21 06:34 UTC (History)
CC: 16 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
.Restrict adding namespaces with RBD images already assigned to other NVMe-oF gateway groups to avoid data corruption
Currently, using the same Ceph Block Device (`rbd`) images in different NVMe-oF gateway groups may lead to unexpected results. As a workaround, avoid using the same Ceph Block Device images in multiple NVMe-oF gateway groups.
Clone Of:
Environment:
Last Closed:
Embargoed:
rlepaksh: needinfo? (aviv.caro)
rlepaksh: needinfo? (vaagrawa)




Links:
Red Hat Issue Tracker RHCEPH-9606 (last updated 2024-08-28 15:29:00 UTC)

Description Rahul Lepakshi 2024-08-28 15:28:01 UTC
Description of problem:
This might be tough to implement, as RBD image metadata is spread across different omap state files, but we need a way to restrict the creation of namespaces with RBD images that are already used as namespaces in other GW groups, because such volumes can be accessed by multiple clients, causing data inconsistency. At large scale, say 1K to 4K RBD images, it will be difficult for customers to keep track of used/unused images when creating namespaces.
For now, it has been decided to add an alert in 8.1. I raised https://bugzilla.redhat.com/show_bug.cgi?id=2359211 to implement the actual resolution.
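The restriction described above amounts to a duplicate-use check across gateway groups. A minimal sketch of that check in Python (hypothetical helper; the real gateway state lives in per-group omap files, so an actual implementation would have to aggregate namespace listings from each group first):

```python
def find_shared_images(gw_groups):
    """Return the (pool, image) pairs used as namespaces in more than one GW group.

    gw_groups: dict mapping a gateway-group name to a list of
    (pool, image) tuples backing that group's namespaces.
    """
    usage = {}
    for group, images in gw_groups.items():
        for pool_image in images:
            # Record every group that claims this RBD image.
            usage.setdefault(pool_image, set()).add(group)
    # Keep only images claimed by two or more groups.
    return {img: groups for img, groups in usage.items() if len(groups) > 1}
```

A pre-flight check like this could reject a `namespace add` (or drive an alert) when the requested image already appears in another group's namespace list.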

Version-Release number of selected component (if applicable):
ceph version 19.1.0-42.el9cp
cp.stg.icr.io/cp/ibm-ceph/nvmeof-rhel9:1.2.17-11


How reproducible: Always


Steps to Reproduce:
1. Deploy 4 nvmeof services with 4 GW groups, each having 2+ gateways
2. Configure subsystems and add namespaces with a set of RBD images within one gateway group
3. On another GW group, configure subsystems and add namespaces with the same set of RBD images

Actual results: The same RBD images can be used to create namespaces in different GW groups


Expected results: Namespace addition should fail when the RBD image is already used by a namespace in another GW group


Additional info:


Comment 7 Vallari 2024-11-20 11:07:59 UTC
PR opened upstream to add a Prometheus alert for detecting when the same RBD image is used in two or more namespaces: https://github.com/ceph/ceph/pull/60777

Comment 8 Vallari 2025-01-16 11:13:20 UTC
Merged PR upstream: https://github.com/ceph/ceph/pull/60777 to add the Prometheus alert NVMeoFMultipleNamespacesOfRBDImage
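For illustration, a Prometheus alerting rule of this shape could detect the condition. This is a hedged sketch only: the metric name `ceph_nvmeof_bdev_metadata` and the label names `pool_name`/`rbd_name` are assumptions, and the actual rule merged in the PR above may differ in expression, thresholds, and labels.

```yaml
# Sketch of a duplicate-RBD-image alert; metric and label names are assumed,
# not taken from the merged PR.
groups:
  - name: nvmeof
    rules:
      - alert: NVMeoFMultipleNamespacesOfRBDImage
        # Fires when more than one namespace reports the same pool/image pair.
        expr: count by (pool_name, rbd_name) (ceph_nvmeof_bdev_metadata) > 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: >-
            RBD image {{ $labels.pool_name }}/{{ $labels.rbd_name }} is used by
            more than one NVMe-oF namespace
```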

