Bug 1705792

Summary: ls command taking too much time on nfs-ganesha mount path directory having 800 inodes and exits with "memory exhausted" error
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Prashant Dhange <pdhange>
Component: NFS-Ganesha Assignee: Frank Filz <ffilz>
Status: CLOSED CURRENTRELEASE QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 3.1 CC: dang, ffilz, jlayton, kkeithle, linuxkidd, mbenjamin
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-06-24 11:25:25 UTC Type: Bug

Comment 15 Daniel Gryniewicz 2019-05-13 12:35:22 UTC
Okay, if this is reproducible with dir_chunk=0, then it's not a problem in MDCACHE.  It's almost certainly in libcephfs or Ceph itself, since with that set, readdir is a straight pass-through in Ganesha.  This means there won't be any patch to Ganesha, since there's nothing to fix.

"Disabling MDCACHE" is not really a thing.  The MDCACHE layer does 3 things: 

1. Provides a handle cache, and with it all the handle locking Ganesha needs.

2. Provides an attribute cache.

3. Provides a dirent cache.

Item 1 cannot be disabled, as Ganesha depends on it for proper execution.  Item 2 can be disabled with Attr_Expiration = 0, which you've done, and item 3 can be disabled with dir_chunk = 0, which you've done.  So you've disabled as much of MDCACHE as can be disabled.
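
For reference, the two settings discussed above might look roughly like this in ganesha.conf. This is only a sketch: the block names and the full parameter spellings (Attr_Expiration_Time, Dir_Chunk) are assumptions taken from the upstream Ganesha configuration documentation, not from this bug, and the reporter's actual file may place them differently.

```
# Sketch only: block and parameter spellings are assumed, not quoted from this bug.
EXPORT_DEFAULTS {
    # Attribute cache off (referred to above as Attr_Expiration = 0).
    Attr_Expiration_Time = 0;
}

MDCACHE {
    # Dirent chunking off, so readdir is passed straight through to the FSAL.
    Dir_Chunk = 0;
}
```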

Comment 19 Daniel Gryniewicz 2019-05-15 12:10:05 UTC
There are certainly readdir fixes upstream between 2.7.1 and 2.7.3 (I think some of them, at least, are downstream).  But they're all related to dir_chunk, and so cannot have any effect when dir_chunk=0 is set.

Comment 37 Daniel Gryniewicz 2019-05-17 13:03:02 UTC
1. Are they using a "%url rados" entry in their config?  If so, does the rados config set dir_chunk?  If the "%url" entry comes later in the file, the rados config is probably overriding the dir_chunk set in the local file.

2. 2.7.1 has known readdir chunk issues.  2.7.3 should be fine.
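
One rough way to check point 1 is to look for the last dir_chunk assignment once the local file and the rados-hosted config are put together. The sketch below is Python and makes several assumptions: that the later assignment wins (as suggested above), that the included content can be modeled as plain text appended after the local config, and that a simple line-based match is good enough. The helper name effective_setting and both sample configs are hypothetical; this is not how Ganesha parses its configuration.

```python
import re


def effective_setting(config_text, key):
    """Return (line_number, value) for the last assignment of key, or None.

    Assumes that a later assignment overrides an earlier one; matching is
    case-insensitive and only handles simple "Key = value;" lines.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(key) + r"\s*=\s*([^;#]+);", re.IGNORECASE
    )
    last = None
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        match = pattern.match(line)
        if match:
            last = (lineno, match.group(1).strip())
    return last


# Hypothetical local ganesha.conf content.
local_conf = """\
MDCACHE {
    Dir_Chunk = 0;
}
"""

# Hypothetical content pulled in by a "%url rados" line placed after the block above.
rados_conf = """\
MDCACHE {
    Dir_Chunk = 128;
}
"""

merged = local_conf + rados_conf
print(effective_setting(merged, "dir_chunk"))
```

If the reported line falls inside the rados-hosted portion, the local dir_chunk = 0 is likely not the value in effect.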

Comment 44 Kaleb KEITHLEY 2020-06-24 11:25:25 UTC
If this is still an issue, please reopen.