Description of problem:

The MDS unnecessarily trims all unpinned dentries under read-heavy workloads. This is especially problematic for large directories (>1M dentries), where the majority of dentries loaded for each dir fragment are not pinned because the client is only reading a small selection of them.

Version-Release number of selected component (if applicable):

3.0

How reproducible:

100%

Steps to Reproduce:

1. Create a large directory hierarchy (cp -av /usr/include /mnt/cephfs).
2. Unmount clients so nothing is pinned in cache.
3. Set a low memory limit:

   ceph daemon mds.b config set mds_cache_memory_limit $((20*2**20))

4. Observe in the logs that the entire cache is trimmed:

   $ < out/mds.b.log grep trim_lru
   [...]
   2017-10-14 16:31:32.402 7f0cb5efc700 7 mds.0.cache trim_lru trimming 0 items from LRU size=14797 mid=0 pintail=10 pinned=942
   2017-10-14 16:31:33.194 7f0cb5efc700 7 mds.0.cache trim_lru trimmed 14787 items
   2017-10-14 16:31:37.402 7f0cb5efc700 7 mds.0.cache trim_lru trimming 0 items from LRU size=10 mid=0 pintail=10 pinned=10
   2017-10-14 16:31:37.402 7f0cb5efc700 7 mds.0.cache trim_lru trimmed 0 items

Bug is here: https://github.com/ceph/ceph/blob/master/src/mds/MDCache.cc#L6461

The count variable underflows when trim_lru runs while the cache is too full. As a result, trim_lru keeps going until it has trimmed every unpinned dentry.
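To illustrate the underflow described above, here is a minimal sketch. It is not the MDCache code: ToyCache, trim_buggy and trim_guarded are hypothetical names, and the loop shape (keep trimming while the cache is too full or count is non-zero) is an assumption inferred from the description. The sizes are chosen to echo the log in step 4.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the MDS cache LRU. Only entry
// counts are modeled; the real cache is limited by memory, not by count.
struct ToyCache {
  std::uint64_t unpinned;  // entries eligible for trimming
  std::uint64_t pinned;    // entries that can never be trimmed
  std::uint64_t limit;     // "too full" above this many entries
  bool too_full() const { return unpinned + pinned > limit; }
};

// Assumed buggy shape: count is decremented on every iteration, even when
// the loop was entered only because the cache is too full.
std::uint64_t trim_buggy(ToyCache& c, std::uint64_t count) {
  std::uint64_t trimmed = 0;
  while (c.too_full() || count > 0) {
    if (c.unpinned == 0) break;  // nothing left to expire
    --c.unpinned;
    ++trimmed;
    --count;  // with count == 0 this wraps to UINT64_MAX
  }
  return trimmed;
}

// One possible guard (an assumption, not necessarily the shipped fix):
// only decrement count while it is still positive.
std::uint64_t trim_guarded(ToyCache& c, std::uint64_t count) {
  std::uint64_t trimmed = 0;
  while (c.too_full() || count > 0) {
    if (c.unpinned == 0) break;
    --c.unpinned;
    ++trimmed;
    if (count > 0) --count;
  }
  return trimmed;
}
```

With ToyCache{14787, 10, 14796} (one entry over the limit) and count == 0, trim_buggy removes all 14787 unpinned entries, whereas trim_guarded stops after a single entry once the cache is back under its limit.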
@Patrick, is this fix independent of the platform (RHEL, Ubuntu, and container)? Can it be tested on any one of them? Please let us know at your earliest convenience.
Followed the steps in the bug description. After unmounting the FS, collected the MDS logs:

2017-10-25 10:11:45.460500 7f3ed54f9700 7 mds.0.cache trim_lru trimming 96 items from LRU size=1943 mid=74 pintail=14 pinned=1837
2017-10-25 10:11:45.461343 7f3ed54f9700 7 mds.0.cache trim_lru trimmed 96 items
2017-10-25 10:11:45.489778 7f3ed54f9700 7 mds.0.cache trim_lru trimming 310 items from LRU size=1847 mid=217 pintail=14 pinned=1536
2017-10-25 10:11:45.494010 7f3ed54f9700 7 mds.0.cache trim_lru trimmed 310 items
2017-10-25 10:11:55.366490 7f3ed8d00700 7 mds.0.cache trim_lru trimming 0 items from LRU size=1537 mid=7 pintail=14 pinned=1527
2017-10-25 10:11:55.366504 7f3ed8d00700 7 mds.0.cache trim_lru trimmed 0 items
2017-10-25 10:12:05.366623 7f3ed8d00700 7 mds.0.cache trim_lru trimming 0 items from LRU size=1537 mid=7 pintail=14 pinned=1527
2017-10-25 10:12:05.366639 7f3ed8d00700 7 mds.0.cache trim_lru trimmed 0 items
2017-10-25 10:12:15.366702 7f3ed8d00700 7 mds.0.cache trim_lru trimming 0 items from LRU size=1537 mid=7 pintail=14 pinned=1527

Each pass now trims exactly the number of items it announced. Moving this bug to verified state.
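For anyone re-checking a log by hand, the comparison of announced vs. actual trim counts can be sketched as below. This is a hypothetical helper, not part of Ceph; TrimPass, parse_trim_passes and excess are made-up names, and the patterns assume the two trim_lru message formats exactly as they appear in the logs in this report, one entry per line. Some excess may be legitimate when the cache is over its limit, so treat a non-zero result as a hint rather than proof.

```cpp
#include <cassert>
#include <cstdint>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// One trim pass: what trim_lru announced vs. what it reported afterwards.
struct TrimPass {
  std::uint64_t requested;
  std::uint64_t trimmed;
};

// Pair each "trimming N items" line with the next "trimmed M items" line.
std::vector<TrimPass> parse_trim_passes(const std::string& log) {
  static const std::regex start_re(R"(trim_lru trimming (\d+) items from LRU)");
  static const std::regex done_re(R"(trim_lru trimmed (\d+) items)");
  std::vector<TrimPass> passes;
  std::istringstream in(log);
  std::string line;
  bool have_start = false;
  std::uint64_t requested = 0;
  while (std::getline(in, line)) {
    std::smatch m;
    if (std::regex_search(line, m, start_re)) {
      requested = std::stoull(m[1].str());
      have_start = true;
    } else if (have_start && std::regex_search(line, m, done_re)) {
      passes.push_back({requested, std::stoull(m[1].str())});
      have_start = false;
    }
  }
  return passes;
}

// Number of items trimmed beyond what the pass announced (0 if none).
std::uint64_t excess(const TrimPass& p) {
  return p.trimmed > p.requested ? p.trimmed - p.requested : 0;
}
```

Run against the original log from the description, the first pass shows 0 requested and 14787 trimmed; against the log above, every complete pass shows an excess of 0.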
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3387