Bug 2144435

Summary: cephadm rm-cluster sometimes gets stuck with Lock <num> not acquired on /run/cephadm/<fsid>.lock
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vasishta <vashastr>
Component: Cephadm
Assignee: Adam King <adking>
Status: NEW
QA Contact: Manisha Saini <msaini>
Severity: low
Docs Contact:
Priority: unspecified
Version: 5.3
CC: cephqe-warriors, saraut
Target Milestone: ---   
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Type: Bug

Description Vasishta 2022-11-21 09:12:19 UTC
Description of problem:
'cephadm rm-cluster --force' sometimes gets stuck, repeatedly failing to acquire the lock on /run/cephadm/<fsid>.lock.

Version-Release number of selected component (if applicable):
cephadm-16.2.10-69.el8cp.noarch

How reproducible:
Reproduced on 4 nodes of the same cluster.

Steps to Reproduce:
1. Configure a cluster, then try to purge it with 'cephadm rm-cluster --force' (tried without cephadm-ansible).

Actual results:
2022-11-21 07:19:14,420 7fa4a0c50b80 DEBUG Acquiring lock 140344940079760 on /run/cephadm/28e06588-5bfa-11ed-8f8a-004e013d8491.lock
2022-11-21 07:19:14,420 7fa4a0c50b80 DEBUG Lock 140344940079760 not acquired on /run/cephadm/28e06588-5bfa-11ed-8f8a-004e013d8491.lock, waiting 0.05 seconds ...
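
The two log lines above are consistent with a polling loop around a non-blocking file lock: try to take the lock, and if it is held elsewhere, sleep 0.05 seconds and try again. The following is a minimal sketch of that pattern, not cephadm's actual code. It assumes an exclusive flock() on the lock file; the function names acquire_lock and release_lock are hypothetical, and only the 0.05 second interval is taken from the log. With no timeout, such a loop never returns while another holder keeps the lock, which would match the hang reported here.

```python
import fcntl
import os
import time
from typing import Optional


def acquire_lock(path: str, timeout: float = -1.0,
                 poll_interval: float = 0.05) -> Optional[int]:
    """Poll for an exclusive flock on `path`.

    Returns the open file descriptor once the lock is held, or None if a
    non-negative timeout expires. A negative timeout means wait forever.
    """
    start = time.monotonic()
    while True:
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            # Non-blocking attempt: raises OSError if the lock is held
            # through another open file description.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except OSError:
            os.close(fd)
        if timeout >= 0 and time.monotonic() - start >= timeout:
            return None
        # Assumed interval, matching "waiting 0.05 seconds" in the log.
        time.sleep(poll_interval)


def release_lock(fd: int) -> None:
    """Drop the lock and close the descriptor returned by acquire_lock."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

Calling acquire_lock with a finite timeout returns None instead of spinning, which is one way a caller could report the stuck lock rather than wait indefinitely.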

Expected results:
rm-cluster should complete without getting stuck waiting for the lock.

Additional info:
Workaround: remove /run/cephadm/<fsid>.lock, then run rm-cluster again.
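
The workaround can be scripted roughly as follows. This is a sketch under stated assumptions, not a supported tool: the helper names lock_path, remove_lock, and retry_rm_cluster are hypothetical, the lock directory default is taken from the log output above, and the script is assumed to run as root on the affected node with no other cephadm process legitimately holding the lock.

```python
import subprocess
from pathlib import Path


def lock_path(fsid: str, lock_dir: str = "/run/cephadm") -> Path:
    """Build the lock file path in the form shown in the log output."""
    return Path(lock_dir) / f"{fsid}.lock"


def remove_lock(fsid: str, lock_dir: str = "/run/cephadm") -> bool:
    """Delete the lock file if present; return True if a file was removed."""
    try:
        lock_path(fsid, lock_dir).unlink()
        return True
    except FileNotFoundError:
        return False


def retry_rm_cluster(fsid: str) -> int:
    """Remove the lock file, then re-run rm-cluster and return its exit code."""
    remove_lock(fsid)
    result = subprocess.run(
        ["cephadm", "rm-cluster", "--fsid", fsid, "--force"]
    )
    return result.returncode
```

For example, retry_rm_cluster("28e06588-5bfa-11ed-8f8a-004e013d8491") would apply the workaround to the cluster from the log above.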