Bug 1461507

Summary: [Ganesha] : Ganesha logs flooded with "unable to find reclaimable dupreq LRU entry" messages
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Ambarish <asoman>
Component: nfs-ganeshaAssignee: Kaleb KEITHLEY <kkeithle>
Status: CLOSED ERRATA QA Contact: Manisha Saini <msaini>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.3CC: amukherj, bturner, chpai, dang, ffilz, info, jthottan, kkeithle, mbenjamin, msaini, pmulay, rhinduja, rhs-bugs, sheggodu, skoduri, storage-qa-internal
Target Milestone: ---Keywords: ZStream
Target Release: RHGS 3.4.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: 3.3.0-defer
Fixed In Version: nfs-ganesha-2.5.4-1 Doc Type: Known Issue
Doc Text:
When duplicate request cache (DRC) entries maintained by NFS-Ganesha server reaches the high watermark limit, the server tries to reclaim old entries which may still be in use. As a result, every time the server cannot reclaim an entry, it logs a warning. This may flood the log file at times if there are too many requests being processed. Workaround: Increase the DRC limit by executing the following steps: 1) Edit the /run/gluster/shared_storage/nfs-ganesha/ganesha.conf file and add the following parameters in NFS_Core_Param block: NFS_Core_Param { DRC_TCP_Hiwat = 1024; #default is 256 } 2) Restart the NFS-Ganesha process on all the nodes in the NFS-Ganesha cluster using the following command: # systemctl restart nfs-ganesha This decreases the chances of these warnings filling up the logs.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-09-04 06:53:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1417153, 1503135    

Description Ambarish 2017-06-14 15:55:46 UTC
Description of problem:
-----------------------

2 Node Ganesha cluster.

Ran Bonnie,iozone,dbench,finds,rms from three v4 mounts.

Ganesha logs are flooded with these messages :

<snip> 

14/06/2017 07:17:52 : epoch 50420000 : gqas007.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-26891[work-154] nfs_dupreq_finish :DUPREQ :WARN :DRC retire entries: unable to find reclaimable dupreq LRU entry after 5 tries on DRC=0x7f46a40008c0, drc->size=2467
14/06/2017 07:17:52 : epoch 50420000 : gqas007.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-26891[work-90] nfs_dupreq_finish :DUPREQ :WARN :DRC retire entries: unable to find reclaimable dupreq LRU entry after 5 tries on DRC=0x7f46a40008c0, drc->size=2466
14/06/2017 07:17:52 : epoch 50420000 : gqas007.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-26891[work-223] nfs_dupreq_finish :DUPREQ :WARN :DRC retire entries: unable to find reclaimable dupreq LRU entry after 5 tries on DRC=0x7f46a40008c0, drc->size=2462
14/06/2017 07:17:52 : epoch 50420000 : gqas007.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-26891[work-92] nfs_dupreq_finish :DUPREQ :WARN :DRC retire entries: unable to find reclaimable dupreq LRU entry after 5 tries on DRC=0x7f4650002a80, drc->size=1048
14/06/2017 07:19:06 : epoch 50420000 : gqas007.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-26891[work-236] nfs_dupreq_finish :DUPREQ :WARN :DRC retire entries: unable to find reclaimable dupreq LRU entry after 5 tries on DRC=0x7f4650002a80, drc->size=1026
14/06/2017 07:19:21 : epoch 50420000 : gqas007.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-26891[work-143] nfs_dupreq_finish :DUPREQ :WARN :DRC retire entries: unable to find reclaimable dupreq LRU entry after 5 tries on DRC=0x7f4650002a80, drc->size=1050
14/06/2017 07:19:21 : epoch 50420000 : gqas007.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-26891[work-58] nfs_dupreq_finish :DUPREQ :WARN :DRC retire entries: unable to find reclaimable dupreq LRU entry after 5 tries on DRC=0x7f46a40008c0, drc->size=1026
14/06/2017 07:19:21 : epoch 50420000 : gqas007.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-26891[work-14] nfs_dupreq_finish :DUPREQ :WARN :DRC retire entries: unable to find reclaimable dupreq LRU entry after 5 tries on DRC=0x7f46a40008c0, drc->size=1025
14/06/2017 07:25:59 : epoch 50420000 : gqas007.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-26891[work-203] nfs_dupreq_finish :DUPREQ :WARN :DRC retire entries: unable to find reclaimable dupreq LRU entry after 5 tries on DRC=0x7f46a40008c0, drc->size=1028
</snip>

[root@gqas007 ~]# cat /var/log/ganesha.log |grep -i "unable to find reclaimable"|wc -l
2927
[root@gqas007 ~]# 

[root@gqas007 ~]# 
[root@gqas007 ~]# ll -h /var/log/ganesha.log 
-rw-r--r-- 1 root root 814K Jun 14 11:51 /var/log/ganesha.log
[root@gqas007 ~]#



Version-Release number of selected component (if applicable):
-------------------------------------------------------------

nfs-ganesha-2.4.4-8.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-25.el7rhgs.x86_64


How reproducible:
-----------------

Fairly.


Additional info:
-----------------

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 3b04b36a-1837-48e8-b437-fbc091b2f992
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas007.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas009.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas007.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas009.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: on
server.allow-insecure: on
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
[root@gqas007 ~]#

Comment 9 Frank Filz 2017-06-19 13:08:44 UTC
A simple code fix to this could be to just change the log level of that message.

Comment 10 Atin Mukherjee 2017-06-19 13:16:20 UTC
(In reply to Frank Filz from comment #9)
> A simple code fix to this could be to just change the log level of that
> message.

But then will you not actually be suppressing the log for a genuine failure as well?

Comment 11 Frank Filz 2017-06-19 13:35:13 UTC
(In reply to Atin Mukherjee from comment #10)
> (In reply to Frank Filz from comment #9)
> > A simple code fix to this could be to just change the log level of that
> > message.
> 
> But then will you not actually be suppressing the log for a genuine failure
> as well?

Hmm, considering what Matt said, it is actually warning about something that is really happening, so I guess the config fix is the best workaround, so never mind about my idea for a quick code fix...

Comment 23 Daniel Gryniewicz 2017-09-19 12:33:37 UTC
You can also turn logging down for the DUPREQ component like this:

LOG {
        Components {
                DUPREQ = CRIT;
        }
}

(That log line is at the WARN level, so CRIT or lower will disable that warning).

Comment 25 Kaleb KEITHLEY 2017-10-05 11:28:51 UTC
POST with rebase to nfs-ganesha-2.5.x

Comment 32 Daniel Gryniewicz 2018-08-29 12:42:07 UTC
Is this intended to be a doctext for 3.4?  Because 3.4 is base on 2.5, which has this fixed, and doesn't even have that log message in it anymore.  This should just be closed fixed with no doctext in 3.4.

Comment 35 errata-xmlrpc 2018-09-04 06:53:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2610