2291163 – On the scale cluster, rm -rf on the NFS mount point is stuck indefinitely when cluster is filled around 90% <eom>

Bug 2291163 - On the scale cluster, rm -rf on the NFS mount point is stuck indefinitely when cluster is filled around 90% <eom>

Summary: On the scale cluster, rm -rf on the NFS mount point is stuck indefinitely whe...

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	CephFS
Sub Component:
Version:	7.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	8.1
Assignee:	Kotresh HR
QA Contact:	Manisha Saini
Docs Contact:	Rivka Pollack
URL:
Whiteboard:
Depends On:
Blocks:	2351689
TreeView+	depends on / blocked

Reported:	2024-06-10 13:03 UTC by Manisha Saini
Modified:	2025-06-28 01:45 UTC (History)
CC List:	12 users (show)
Fixed In Version:	ceph-19.2.1-21.el9cp
Doc Type:	Bug Fix
Doc Text:	.Async write deadlock fixed under OSD full conditions Previously, when asynchronous writes were ongoing and the OSD became full, the client received a notification to cancel the writes. The cancellation method and the callback invoked after the write was canceled both attempted to acquire the same lock. As a result, this led to a deadlock, causing the client to hang indefinitely during an OSD full scenario. With this fix, the deadlock in the client code has been resolved. Consequently, asynchronous writes during an OSD full scenario no longer cause the client to hang.
Clone Of:
Environment:
Last Closed:	2025-06-26 12:13:12 UTC
Embargoed:
Dependent Products:
Flags:	khiremat: needinfo- khiremat: needinfo- gfarnum: needinfo? (spunadik) gfarnum: needinfo- khiremat: needinfo-

Attachments	(Terms of Use)

Links
System	ID	Priority	Status	Summary	Last Updated
Ceph Project Bug Tracker	68641	None	None	None	2025-03-07 08:37:29 UTC
Red Hat Issue Tracker	RHCEPH-9154	None	None	None	2024-06-10 13:03:32 UTC
Red Hat Product Errata	RHSA-2025:9775	None	None	None	2025-06-26 12:13:21 UTC

Description Manisha Saini 2024-06-10 13:03:00 UTC

Description of problem:
============

Test setup Details -
2000 NFS exports mapped on 2000 subvolume in the backend.Mounted the exports on 100 clients ( 1 client has 20 mounts) via v4.1 protocol.
Ran FIO in parallel on all the exports from 100 clients. Post FIO, cleanup was performed by performing rm -rf /mnt/<nfs_mount_point>/*.

The mount point were 100% filled.

While the cleanup was running, the rm -rf operation has been hung for over 12 hours and remains in this state. Given this situation, what potential solutions can be implemented to resolve this issue, aside from adding additional OSDs?

On checking the logs, observed that the NFS container died and restarted in ganesha.log twice while cleanup was running.

Ceph health status--
=======

[ceph: root@cali013 /]# ceph -s
  cluster:
    id:     4e687a60-638e-11ee-8772-b49691cee574
    health: HEALTH_WARN
            19 backfillfull osd(s)
            11 nearfull osd(s)
            9 pool(s) backfillfull

  services:
    mon: 1 daemons, quorum cali013 (age 4d)
    mgr: cali013.qakwdk(active, since 4d), standbys: cali016.rhribl, cali015.hvvbwh
    mds: 1/1 daemons up, 1 standby
    osd: 35 osds: 35 up (since 4d), 35 in (since 6w)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   9 pools, 1233 pgs
    objects: 6.74M objects, 26 TiB
    usage:   77 TiB used, 9.1 TiB / 86 TiB avail
    pgs:     1233 active+clean

  io:
    client:   170 B/s rd, 0 op/s rd, 0 op/s wr


[ceph: root@cali013 /]# ceph df
--- RAW STORAGE ---
CLASS    SIZE    AVAIL    USED  RAW USED  %RAW USED
hdd    44 TiB  4.7 TiB  39 TiB    39 TiB      89.29
ssd    42 TiB  4.4 TiB  38 TiB    38 TiB      89.60
TOTAL  86 TiB  9.1 TiB  77 TiB    77 TiB      89.44

--- POOLS ---
POOL                 ID   PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr                  1     1  7.1 MiB        3   21 MiB      0     91 GiB
.rgw.root             2    32  2.6 KiB        6   72 KiB      0     91 GiB
default.rgw.log       3    32  3.6 KiB      209  408 KiB      0     91 GiB
default.rgw.control   4    32      0 B        8      0 B      0     91 GiB
default.rgw.meta      5    32  1.6 KiB        3   28 KiB      0     91 GiB
rbd                   8    32     19 B        1   12 KiB      0     91 GiB
.nfs                  9    32  1.1 MiB    2.00k   24 MiB      0     91 GiB
cephfs.cephfs.meta   94    16  263 MiB    4.09k  789 MiB   0.28     91 GiB
cephfs.cephfs.data   95  1024   26 TiB    6.73M   77 TiB  99.65     91 GiB



[ceph: root@cali013 /]# ceph health detail
HEALTH_WARN 19 backfillfull osd(s); 11 nearfull osd(s); 9 pool(s) backfillfull
[WRN] OSD_BACKFILLFULL: 19 backfillfull osd(s)
    osd.0 is backfill full
    osd.1 is backfill full
    osd.2 is backfill full
    osd.5 is backfill full
    osd.7 is backfill full
    osd.9 is backfill full
    osd.10 is backfill full
    osd.14 is backfill full
    osd.16 is backfill full
    osd.17 is backfill full
    osd.18 is backfill full
    osd.19 is backfill full
    osd.22 is backfill full
    osd.27 is backfill full
    osd.28 is backfill full
    osd.30 is backfill full
    osd.31 is backfill full
    osd.32 is backfill full
    osd.34 is backfill full
[WRN] OSD_NEARFULL: 11 nearfull osd(s)
    osd.6 is near full
    osd.8 is near full
    osd.11 is near full
    osd.12 is near full
    osd.13 is near full
    osd.15 is near full
    osd.20 is near full
    osd.21 is near full
    osd.23 is near full
    osd.25 is near full
    osd.26 is near full
[WRN] POOL_BACKFILLFULL: 9 pool(s) backfillfull
    pool '.mgr' is backfillfull
    pool '.rgw.root' is backfillfull
    pool 'default.rgw.log' is backfillfull
    pool 'default.rgw.control' is backfillfull
    pool 'default.rgw.meta' is backfillfull
    pool 'rbd' is backfillfull
    pool '.nfs' is backfillfull
    pool 'cephfs.cephfs.meta' is backfillfull
    pool 'cephfs.cephfs.data' is backfillfull


Version-Release number of selected component (if applicable):
===========================


[ceph: root@cali013 /]# ceph --version
ceph version 18.2.1-194.el9cp (04a992766839cd3207877e518a1238cdbac3787e) reef (stable)

[ceph: root@cali013 /]# rpm -qa | grep nfs
libnfsidmap-2.5.4-25.el9.x86_64
nfs-utils-2.5.4-25.el9.x86_64
nfs-ganesha-selinux-5.7-5.el9cp.noarch
nfs-ganesha-5.7-5.el9cp.x86_64
nfs-ganesha-rgw-5.7-5.el9cp.x86_64
nfs-ganesha-ceph-5.7-5.el9cp.x86_64
nfs-ganesha-rados-grace-5.7-5.el9cp.x86_64
nfs-ganesha-rados-urls-5.7-5.el9cp.x86_64


How reproducible:
============
1/1


Steps to Reproduce:
1. Create NFS cluster on 2 nodes 
2. Create 1 subvolume group and create 2000 subvolume
3. Mount the 2000 exports on 100 clients
4. Run fio on all 2000 exports in parallel
5. Stop fio and Perform rm -rf * on mount point one by one

Actual results:
============

rm operations are hung indefinitely


Expected results:
============

rm should be completed and space should be freed up to bring back the cluster in HEALTHY state

Additional info:

=============

Client mount shows 100% filled.

[root@ceph-nfs-client-ymkppj-node16 ~]# df -hT
Filesystem               Type      Size  Used Avail Use% Mounted on
devtmpfs                 devtmpfs  4.0M     0  4.0M   0% /dev
tmpfs                    tmpfs     1.8G     0  1.8G   0% /dev/shm
tmpfs                    tmpfs     732M   13M  720M   2% /run
/dev/vda4                xfs        40G  2.8G   37G   8% /
/dev/vda3                xfs       495M  287M  209M  58% /boot
/dev/vda2                vfat      200M  7.1M  193M   4% /boot/efi
10.8.130.236:/export_621 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_621
10.8.130.236:/export_622 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_622
10.8.130.236:/export_623 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_623
10.8.130.236:/export_624 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_624
10.8.130.236:/export_625 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_625
10.8.130.236:/export_626 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_626
10.8.130.236:/export_627 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_627
10.8.130.236:/export_628 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_628
10.8.130.236:/export_629 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_629
10.8.130.236:/export_630 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_630
10.8.130.236:/export_631 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_631
10.8.130.236:/export_632 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_632
10.8.130.236:/export_633 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_633
10.8.130.236:/export_634 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_634
10.8.130.236:/export_635 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_635
10.8.130.236:/export_636 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_636
10.8.130.236:/export_637 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_637
10.8.130.236:/export_638 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_638
10.8.130.236:/export_639 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_639
10.8.130.236:/export_640 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_640
10.8.130.236:/export_661 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_661
10.8.130.236:/export_662 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_662
10.8.130.236:/export_663 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_663
10.8.130.236:/export_664 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_664
10.8.130.236:/export_665 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_665
10.8.130.236:/export_666 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_666
10.8.130.236:/export_667 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_667
10.8.130.236:/export_668 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_668
10.8.130.236:/export_669 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_669
10.8.130.236:/export_670 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_670
10.8.130.236:/export_671 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_671
10.8.130.236:/export_672 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_672
10.8.130.236:/export_673 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_673
10.8.130.236:/export_674 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_674
10.8.130.236:/export_675 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_675
10.8.130.236:/export_676 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_676
10.8.130.236:/export_677 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_677
10.8.130.236:/export_678 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_678
10.8.130.236:/export_679 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_679
10.8.130.236:/export_680 nfs4       26T   26T   91G 100% /mnt/nfs_scale_fio_680
tmpfs                    tmpfs     366M     0  366M   0% /run/user/0
tmpfs                    tmpfs     366M     0  366M   0% /run/user/1000

Comment 1 Storage PM bot 2024-06-10 13:03:14 UTC

Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 42 errata-xmlrpc 2025-06-26 12:13:12 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 8.1 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2025:9775

Note You need to log in before you can comment on or make changes to this bug.