Bug 2283976

Summary: Abort nvmeof deployment when attempting to deploy nvmeof on a node that is already running an nvmeof service
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: harika chebrolu <hchebrol>
Component: Cephadm Assignee: Adam King <adking>
Status: VERIFIED --- QA Contact: harika chebrolu <hchebrol>
Severity: urgent Docs Contact: Rivka Pollack <rpollack>
Priority: unspecified    
Version: 7.1 CC: akane, bhkaur, cephqe-warriors, mobisht, msaini, rlepaksh, rpollack, tserlin
Target Milestone: ---   
Target Release: 8.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-19.2.1-121.el9cp Doc Type: Bug Fix
Doc Text:
.NVMe-oF deployment no longer stops when deploying on a node with an existing NVMe-oF service
Previously, there was no built-in restriction in Cephadm to prevent the deployment of NVMe-oF services, such as gateway groups, multiple times on the same host. As a result, this would sometimes lead to deployment and high availability issues, as previously deployed services on the nodes were disrupted. With this fix, restrictions are in place, and deploying the same gateway nodes across multiple gateway groups is not supported. Gateway entities, such as subsystems and namespaces, remain intact, and the `nvme-statemap` is preserved, as expected.
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2317218    
Bug Blocks: 2351689, 2267614, 2298578, 2298579    

Description harika chebrolu 2024-05-30 11:36:30 UTC
Description of problem:

A second NVMe-oF service gets created when we try to create a service using a different pool.

Version-Release number of selected component (if applicable):

version : cp.stg.icr.io/cp/ibm-ceph/nvmeof-rhel9:1.2.13-2

How reproducible:


Steps to Reproduce:
1. Create a pool and a service on the GW nodes.
2. Create a different pool and apply a service using it from the client node.

Actual results:

Two services are created.

Expected results:
We do not support two daemons running on the same node. In case this happens by mistake at a customer site (which we do not want), the GW code should abort the deployment with an error message when it is applied with a different pool name and the node already has an nvmeof service. Applying again with the same pool_name must still be allowed, because we may re-apply for various reasons such as GW scaling.
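
A minimal sketch of the check requested above, to make the expected behavior concrete. This is not the Cephadm implementation: `NvmeofSpec`, `DeploymentAborted` and `check_nvmeof_placement` are hypothetical names, and the spec is reduced to a pool name plus a set of hosts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NvmeofSpec:
    """Hypothetical, simplified stand-in for an nvmeof service spec."""
    pool: str
    hosts: frozenset


class DeploymentAborted(Exception):
    """Raised when a host already runs an nvmeof service for another pool."""


def check_nvmeof_placement(new_spec, existing_specs):
    """Abort if any target host already serves nvmeof for a different pool."""
    for existing in existing_specs:
        if existing.pool == new_spec.pool:
            # Same pool: treated as a re-apply (e.g. GW scaling), so allowed.
            continue
        clash = sorted(new_spec.hosts & existing.hosts)
        if clash:
            raise DeploymentAborted(
                f"nvmeof service for pool '{existing.pool}' already runs on "
                f"{', '.join(clash)}; refusing to deploy pool '{new_spec.pool}'"
            )


existing = [NvmeofSpec("pool1", frozenset({"gw1", "gw2"}))]
# Re-applying the same pool with an extra host passes without an error.
check_nvmeof_placement(
    NvmeofSpec("pool1", frozenset({"gw1", "gw2", "gw3"})), existing
)
```

With this sketch, applying a spec for a second pool on `gw1` or `gw2` raises `DeploymentAborted`, while the same second pool on hosts with no existing nvmeof service is accepted.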


Additional info:

Comment 1 Aviv Caro 2024-05-30 12:32:44 UTC
The issue here is restricting deployment of the nvmeof service on more than one pool for 7.1. We need to see how to restrict this for 7.1.z. For 7.1 we will add this to the release notes (RN).