Bug 1159173

Summary: [USS] Logging is completely messed up after enabling USS and carrying out a few cases; snapd and nfs logs exhausted 47 GB of logs in 3 hours
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Rahul Hinduja <rhinduja>
Component: snapshot
Assignee: Sachin Pandit <spandit>
Status: CLOSED ERRATA
QA Contact: Rahul Hinduja <rhinduja>
Severity: urgent
Priority: unspecified
Version: rhgs-3.0
CC: amainkar, nsathyan, rhs-bugs, rjoseph, senaik, spandura, storage-qa-internal, surs
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.0.3
Hardware: x86_64
OS: Linux
Whiteboard: USS
Fixed In Version: glusterfs-3.6.0.35-1
Doc Type: Bug Fix
Last Closed: 2015-01-15 13:41:37 UTC
Type: Bug
Bug Depends On: 1166197, 1175736
Bug Blocks: 1154965, 1162694

Description Rahul Hinduja 2014-10-31 06:39:03 UTC

Comment 2 Rahul Hinduja 2014-10-31 07:01:48 UTC
Description of problem:
=======================

snapd logs are filled with the following error messages.

From snapX*.log
================

[root@inception ~]# tail -100 /var/log/glusterfs/snap2-c0e76752-8ec5-42be-8b1f-13a13670aed8.log | grep "2014-10-30 14:55:01"
[2014-10-30 14:55:01.085073] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[2014-10-30 14:55:01.085139] E [snapview-server.c:153:svs_lookup_gfid] 0-vol2-snapview-server: failed to do lookup and get the handle on the snapshot (null) (path: <gfid:5677aeba-c1dd-43c6-a65b-ecd7ef0557b6>, gfid: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6)
[2014-10-30 14:55:01.085142] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[2014-10-30 14:55:01.085161] E [snapview-server.c:153:svs_lookup_gfid] 0-vol2-snapview-server: failed to do lookup and get the handle on the snapshot (null) (path: <gfid:5677aeba-c1dd-43c6-a65b-ecd7ef0557b6>, gfid: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6)
[2014-10-30 14:55:01.085237] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[2014-10-30 14:55:01.085261] E [snapview-server.c:153:svs_lookup_gfid] 0-vol2-snapview-server: failed to do lookup and get the handle on the snapshot (null) (path: <gfid:5677aeba-c1dd-43c6-a65b-ecd7ef0557b6>, gfid: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6)
[2014-10-30 14:55:01.085334] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[2014-10-30 14:55:01.085353] E [snapview-server.c:153:svs_lookup_gfid] 0-vol2-snapview-server: failed to do lookup and get the handle on the snapshot (null) (path: <gfid:5677aeba-c1dd-43c6-a65b-ecd7ef0557b6>, gfid: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6)
[2014-10-30 14:55:01.085399] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[2014-10-30 14:55:01.085427] E [snapview-server.c:153:svs_lookup_gfid] 0-vol2-snapview-server: failed to do lookup and get the handle on the snapshot (null) (path: <gfid:5677aeba-c1dd-43c6-a65b-ecd7ef0557b6>, gfid: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6)
[2014-10-30 14:55:01.085665] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[2014-10-30 14:55:01.085688] E [snapview-server.c:153:svs_lookup_gfid] 0-vol2-snapview-server: failed to do lookup and get the handle on the snapshot (null) (path: <gfid:5677aeba-c1dd-43c6-a65b-ecd7ef0557b6>, gfid: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6)
[2014-10-30 14:55:01.085695] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[2014-10-30 14:55:01.085714] E [snapview-server.c:153:svs_lookup_gfid] 0-vol2-snapview-server: failed to do lookup and get the handle on the snapshot (null) (path: <gfid:5677aeba-c1dd-43c6-a65b-ecd7ef0557b6>, gfid: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6)
[2014-10-30 14:55:01.085736] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[2014-10-30 14:55:01.085752] E [snapview-server.c:153:svs_lookup_gfid] 0-vol2-snapview-server: failed to do lookup and get the handle on the snapshot (null) (path: <gfid:5677aeba-c1dd-43c6-a65b-ecd7ef0557b6>, gfid: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6)
[2014-10-30 14:55:01.085782] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[2014-10-30 14:55:01.085802] E [snapview-server.c:153:svs_lookup_gfid] 0-vol2-snapview-server: failed to do lookup and get the handle on the snapshot (null) (path: <gfid:5677aeba-c1dd-43c6-a65b-ecd7ef0557b6>, gfid: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6)
[2014-10-30 14:55:01.085827] W [glfs-handleops.c:1086:glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6 failed: No such file or directory
[root@inception ~]#


From volx-snapd.log:
====================

[root@inception ~]# tail -100 /var/log/glusterfs/vol2-snapd.log | grep "2014-10-30 14:44:25"
[2014-10-30 14:44:25.930293] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934178] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934209] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934226] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934304] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934357] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934418] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934426] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934438] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934562] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.934921] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935004] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935039] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935108] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935202] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935237] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935324] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935362] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935469] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935695] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935881] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935902] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.935954] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.936063] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:25.936099] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[2014-10-30 14:44:55.932522] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol2-server: 5677aeba-c1dd-43c6-a65b-ecd7ef0557b6: failed to resolve (Stale file handle)
[root@inception ~]#


Note:
=====

Log entries at this rate are reported every second.

There are only two files, a and b, with the following content:

[root@wingo vol2]# cat a b
abc
def

ajkadj
skjfksfj
[root@wingo vol2]# 

These logs started filling up after hitting the issues mentioned in BZ 1158883 and BZ 1158898.


Size:
=====
[root@inception ~]# du -sh /var/log/glusterfs/snap2-c0e76752-8ec5-42be-8b1f-13a13670aed8.log
23G	/var/log/glusterfs/snap2-c0e76752-8ec5-42be-8b1f-13a13670aed8.log
[root@inception ~]# du -sh /var/log/glusterfs/vol2-snapd.log 
9.6G	/var/log/glusterfs/vol2-snapd.log
[root@inception ~]# 
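
For reference, the flooding rate can be gauged directly from the timestamps and by watching file growth. The commands below are a rough sketch; the paths and the sample timestamp are taken from the output above, and the 60-second interval is arbitrary:

# Count how many entries were stamped within one flooded second of the snapd log.
grep -c "2014-10-30 14:55:01" /var/log/glusterfs/snap2-c0e76752-8ec5-42be-8b1f-13a13670aed8.log

# Watch how fast the snapd logs grow while the flooding is in progress.
while true; do
    du -sh /var/log/glusterfs/snap2-c0e76752-8ec5-42be-8b1f-13a13670aed8.log \
           /var/log/glusterfs/vol2-snapd.log
    sleep 60
done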





Version-Release number of selected component (if applicable):
==============================================================

glusterfs-3.6.0.30-1.el6rhs.x86_64

Steps to Reproduce:
===================
1. Carry the scenario mentioned in bz 1158883 and 1158898

Actual results:
===============

The logs grew to 33 GB within a few hours, leaving no space on the root device, which is 50 GB.

Expected results:
=================

First, we need to investigate why these error messages appear at all. Even if they are valid error messages, the rate at which they are logged is simply not acceptable; it fills up the entire file system.

Comment 3 Rahul Hinduja 2014-10-31 07:09:33 UTC
When looking into dmesg, I observed that the machine has also hit an OOM condition. To highlight: this is a physical machine with 32 GB of memory and 24 processors.

Since there is no space left on the machine, I have copied /var/log to my local system for investigation.

nrpe invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
nrpe cpuset=/ mems_allowed=0-1
Pid: 2580, comm: nrpe Not tainted 2.6.32-504.el6.x86_64 #1
Call Trace:
 [<ffffffff810d40c1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
 [<ffffffff81127300>] ? dump_header+0x90/0x1b0
 [<ffffffff8122ea2c>] ? security_real_capable_noaudit+0x3c/0x70
 [<ffffffff81127782>] ? oom_kill_process+0x82/0x2a0
 [<ffffffff811276c1>] ? select_bad_process+0xe1/0x120
 [<ffffffff81127bc0>] ? out_of_memory+0x220/0x3c0
 [<ffffffff811344df>] ? __alloc_pages_nodemask+0x89f/0x8d0
 [<ffffffff8116c69a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffff811246f7>] ? __page_cache_alloc+0x87/0x90
 [<ffffffff811240de>] ? find_get_page+0x1e/0xa0
 [<ffffffff81125697>] ? filemap_fault+0x1a7/0x500
 [<ffffffff8114eae4>] ? __do_fault+0x54/0x530
 [<ffffffff8114f0b7>] ? handle_pte_fault+0xf7/0xb00
 [<ffffffff811d2db2>] ? fsnotify_clear_marks_by_inode+0x32/0xf0
 [<ffffffff81012bfe>] ? copy_user_generic+0xe/0x20
 [<ffffffff8114fcea>] ? handle_mm_fault+0x22a/0x300
 [<ffffffff8104d0d8>] ? __do_page_fault+0x138/0x480
 [<ffffffff81014a79>] ? read_tsc+0x9/0x20
 [<ffffffff810aaa21>] ? ktime_get_ts+0xb1/0xf0
 [<ffffffff811a5728>] ? poll_select_copy_remaining+0xf8/0x150
 [<ffffffff8152ffbe>] ? do_page_fault+0x3e/0xa0
 [<ffffffff8152d375>] ? page_fault+0x25/0x30
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
CPU    4: hi:    0, btch:   1 usd:   0
CPU    5: hi:    0, btch:   1 usd:   0
CPU    6: hi:    0, btch:   1 usd:   0
CPU    7: hi:    0, btch:   1 usd:   0
CPU    8: hi:    0, btch:   1 usd:   0
CPU    9: hi:    0, btch:   1 usd:   0
CPU   10: hi:    0, btch:   1 usd:   0
CPU   11: hi:    0, btch:   1 usd:   0
CPU   12: hi:    0, btch:   1 usd:   0
CPU   13: hi:    0, btch:   1 usd:   0
CPU   14: hi:    0, btch:   1 usd:   0
CPU   15: hi:    0, btch:   1 usd:   0
CPU   16: hi:    0, btch:   1 usd:   0
CPU   17: hi:    0, btch:   1 usd:   0
CPU   18: hi:    0, btch:   1 usd:   0
CPU   19: hi:    0, btch:   1 usd:   0
CPU   20: hi:    0, btch:   1 usd:   0
CPU   21: hi:    0, btch:   1 usd:   0
CPU   22: hi:    0, btch:   1 usd:   0
CPU   23: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd: 168
CPU    1: hi:  186, btch:  31 usd:  79
CPU    2: hi:  186, btch:  31 usd: 176
CPU    3: hi:  186, btch:  31 usd:  73
CPU    4: hi:  186, btch:  31 usd: 162
CPU    5: hi:  186, btch:  31 usd:  37
CPU    6: hi:  186, btch:  31 usd: 100
CPU    7: hi:  186, btch:  31 usd:  73
CPU    8: hi:  186, btch:  31 usd:  65
CPU    9: hi:  186, btch:  31 usd:  38
CPU   10: hi:  186, btch:  31 usd:  70
CPU   11: hi:  186, btch:  31 usd:  26
CPU   12: hi:  186, btch:  31 usd: 167
CPU   13: hi:  186, btch:  31 usd:  76
CPU   14: hi:  186, btch:  31 usd: 157
CPU   15: hi:  186, btch:  31 usd:  54
CPU   16: hi:  186, btch:  31 usd: 174
CPU   17: hi:  186, btch:  31 usd: 173
CPU   18: hi:  186, btch:  31 usd:  61
CPU   19: hi:  186, btch:  31 usd:  27
CPU   20: hi:  186, btch:  31 usd:  68
CPU   21: hi:  186, btch:  31 usd:  70
CPU   22: hi:  186, btch:  31 usd:  58
CPU   23: hi:  186, btch:  31 usd:  85
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 183
CPU    1: hi:  186, btch:  31 usd: 169
CPU    2: hi:  186, btch:  31 usd: 157
CPU    3: hi:  186, btch:  31 usd: 163
CPU    4: hi:  186, btch:  31 usd: 169
CPU    5: hi:  186, btch:  31 usd:  91
CPU    6: hi:  186, btch:  31 usd: 102
CPU    7: hi:  186, btch:  31 usd: 184
CPU    8: hi:  186, btch:  31 usd: 171
CPU    9: hi:  186, btch:  31 usd: 113
CPU   10: hi:  186, btch:  31 usd: 146
CPU   11: hi:  186, btch:  31 usd:  43
CPU   12: hi:  186, btch:  31 usd: 162
CPU   13: hi:  186, btch:  31 usd: 101
CPU   14: hi:  186, btch:  31 usd: 163
CPU   15: hi:  186, btch:  31 usd: 168
CPU   16: hi:  186, btch:  31 usd: 172
CPU   17: hi:  186, btch:  31 usd: 176
CPU   18: hi:  186, btch:  31 usd: 150
CPU   19: hi:  186, btch:  31 usd: 137
CPU   20: hi:  186, btch:  31 usd: 154
CPU   21: hi:  186, btch:  31 usd:  90
CPU   22: hi:  186, btch:  31 usd:  56
CPU   23: hi:  186, btch:  31 usd:  26
Node 1 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 175
CPU    1: hi:  186, btch:  31 usd: 165
CPU    2: hi:  186, btch:  31 usd: 174
CPU    3: hi:  186, btch:  31 usd: 168
CPU    4: hi:  186, btch:  31 usd: 103
CPU    5: hi:  186, btch:  31 usd: 164
CPU    6: hi:  186, btch:  31 usd: 161
CPU    7: hi:  186, btch:  31 usd:  83
CPU    8: hi:  186, btch:  31 usd:  85
CPU    9: hi:  186, btch:  31 usd: 166
CPU   10: hi:  186, btch:  31 usd: 148
CPU   11: hi:  186, btch:  31 usd: 149
CPU   12: hi:  186, btch:  31 usd: 158
CPU   13: hi:  186, btch:  31 usd:  64
CPU   14: hi:  186, btch:  31 usd:  61
CPU   15: hi:  186, btch:  31 usd: 163
CPU   16: hi:  186, btch:  31 usd: 166
CPU   17: hi:  186, btch:  31 usd:  62
CPU   18: hi:  186, btch:  31 usd: 107
CPU   19: hi:  186, btch:  31 usd: 168
CPU   20: hi:  186, btch:  31 usd: 159
CPU   21: hi:  186, btch:  31 usd: 174
CPU   22: hi:  186, btch:  31 usd:  91
CPU   23: hi:  186, btch:  31 usd: 185
active_anon:7303138 inactive_anon:752638 isolated_anon:0
 active_file:235 inactive_file:17 isolated_file:0
 unevictable:9402 dirty:0 writeback:0 unstable:0
 free:39368 slab_reclaimable:3946 slab_unreclaimable:22451
 mapped:1702 shmem:0 pagetables:25452 bounce:0

Comment 4 Rahul Hinduja 2014-10-31 07:20:32 UTC
Even the NFS logs are filled to 14 GB. In one second, NFS logged 8788 entries for "client-rpc-fops.c:2761:client3_3_lookup_cbk":

[root@inception ~]# tail -10000 /var/log/glusterfs/nfs.log | grep "2014-10-30 14:43:56" | grep "client-rpc-fops.c:2761:client3_3_lookup_cbk" | wc
   8788  140620 2100480
[root@inception ~]# 
[root@inception ~]# du -sh /var/log/glusterfs/nfs.log 
14G	/var/log/glusterfs/nfs.log
[root@inception ~]#
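
To see which call sites dominate the flood, a rough breakdown like the one below can help (a sketch; it assumes the standard glusterfs log format, where the fourth whitespace-separated field is the bracketed file:line:function):

# Rank the call sites that appear most often in nfs.log.
awk '{print $4}' /var/log/glusterfs/nfs.log | sort | uniq -c | sort -rn | head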

Comment 5 Sachin Pandit 2014-11-06 09:33:43 UTC
Part of the problem has been solved, and the patch has been sent for review
upstream. I'll update this bug once we make further progress.

Comment 8 senaik 2014-11-13 10:31:14 UTC
Another scenario where the snapd log grows very large and is filled with the 'Stale file handle' message:

Steps:
======
- Created 256 snapshots on the volume while I/O was going on.
- cd to .snaps from the FUSE and NFS mounts.
- While the NFS server is down, try to cd to .snaps and list the snapshots.
- Check the snapd log; it is filled with the log messages below (a rough CLI sketch of this scenario follows at the end of this comment):

[2014-11-13 09:06:26.442062] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol1-server: b1d45859-bb42-4057-95e5-b667705e2045: failed to resolve (Stale file handle)
[2014-11-13 09:06:26.442101] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol1-server: b1d45859-bb42-4057-95e5-b667705e2045: failed to resolve (Stale file handle)
[2014-11-13 09:06:26.442112] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol1-server: b1d45859-bb42-4057-95e5-b667705e2045: failed to resolve (Stale file handle)
[2014-11-13 09:06:26.442155] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol1-server: b1d45859-bb42-4057-95e5-b667705e2045: failed to resolve (Stale file handle)
[2014-11-13 09:06:26.442624] W [server-resolve.c:122:resolve_gfid_cbk] 0-vol1-server: b1d45859-bb42-4057-95e5-b667705e2045: failed to resolve (Stale file handle)


du -sh /var/log/glusterfs/vol1-snapd.log
5.1G	/var/log/glusterfs/vol1-snapd.log
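
A rough CLI sketch of the scenario above (volume name, mount point and snapshot names are illustrative; the syntax is the gluster CLI as shipped in RHGS 3.0):

# Illustrative reproduction of the 256-snapshot scenario; names are hypothetical.
VOL=vol1
MNT=/mnt/nfs-vol1

gluster volume set $VOL features.uss enable

# Create 256 snapshots while I/O is running on the volume.
for i in $(seq 1 256); do
    gluster snapshot create snap$i $VOL
    gluster snapshot activate snap$i    # only needed if activate-on-create is disabled
done

# Browse the snapshots from the NFS (and FUSE) mounts.
cd $MNT/.snaps && ls

# Bring the gluster NFS server down on the storage node, then retry from the client:
cd $MNT/.snaps && ls

# Check the snapd log size afterwards.
du -sh /var/log/glusterfs/$VOL-snapd.log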

Comment 10 Sachin Pandit 2014-11-19 06:46:17 UTC
This issue is consistently reproduced in the scenario below.

1) Create a volume (2x2).
2) Enable USS and create a few snapshots.
3) Enter any of the snapshots from the USS world (say /<mnt>/.snaps/snap2).
4) Stop the volume.
5) Restore to snapshot snap2.
6) Start the volume.
7) Try "ls" from the directory snap2.

The command hangs and snapd.log is filled with numerous log entries, around 100 entries per second. A rough CLI sketch of these steps follows at the end of this comment.

We are still working on root-causing the issue; I'll update this bug as and when we make progress.
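
A rough CLI sketch of these steps (host, brick, volume and mount names are illustrative; gluster snapshot restore requires the volume to be stopped and may prompt for confirmation):

# Illustrative reproduction of the steps above; names are hypothetical.
VOL=testvol
MNT=/mnt/testvol

# 2x2 distributed-replicate volume.
gluster volume create $VOL replica 2 \
    host1:/bricks/b1 host2:/bricks/b1 host1:/bricks/b2 host2:/bricks/b2
gluster volume start $VOL
gluster volume set $VOL features.uss enable

gluster snapshot create snap1 $VOL
gluster snapshot create snap2 $VOL
gluster snapshot activate snap2     # only needed if activate-on-create is disabled

mount -t glusterfs host1:/$VOL $MNT
cd $MNT/.snaps/snap2                # enter the snapshot through USS

gluster volume stop $VOL
gluster snapshot restore snap2
gluster volume start $VOL

ls                                  # hangs; snapd log starts flooding
du -sh /var/log/glusterfs/$VOL-snapd.log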

Comment 11 Vijaikumar Mallikarjuna 2014-11-21 06:23:00 UTC
*** Bug 1115951 has been marked as a duplicate of this bug. ***

Comment 12 Sachin Pandit 2014-11-21 07:08:09 UTC
The NFS client was looking for the snapshot in the wrong place and was not updating the subvolume once the proper path was resolved. The patch that resolves https://bugzilla.redhat.com/show_bug.cgi?id=1165704 also fixes this issue.

Comment 13 Rahul Hinduja 2014-12-17 11:36:39 UTC
Verified with build: glusterfs-3.6.0.38-1.el6rhs.x86_64

Performed the steps mentioned in comment 10 and in BZ 1158883 and BZ 1158898. The total size of the logs is 8.9 MB:

[root@inception ~]# du -sh /var/log/glusterfs
8.9M	/var/log/glusterfs
[root@inception ~]# 

Moving the bug to verified state

Comment 15 errata-xmlrpc 2015-01-15 13:41:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0038.html