Bug 1323423

Summary: nfs-ganesha+Tiering: Ganesha mount hangs during read_large fs-sanity tool on tiered volume.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Shashank Raj <sraj>
Component: nfs-ganesha Assignee: Jiffin <jthottan>
Status: CLOSED WORKSFORME QA Contact: storage-qa-internal <storage-qa-internal>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: rhgs-3.1 CC: jthottan, kkeithle, ndevos, nlevinki, sashinde, skoduri, sraj
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-28 13:08:54 UTC Type: Bug

Description Shashank Raj 2016-04-02 20:07:41 UTC
Description of problem:
The Ganesha mount hangs while the read_large tool of the fs-sanity suite runs on a tiered volume.

Version-Release number of selected component (if applicable):
3.7.9-1

How reproducible:
Once

Steps to Reproduce:
1. Configure nfs-ganesha on the cluster nodes.
2. Create a tiered volume and mount it using NFS version 4.
3. Execute the fs-sanity test suite on the mount point.
4. Observe that during the read_large test, the ganesha mount hangs with "server not responding" messages.
5. The ganesha service is active and running on the mounted node:

[root@dhcp37-180 tmp]# service nfs-ganesha status
Redirecting to /bin/systemctl status  nfs-ganesha.service
● nfs-ganesha.service - NFS-Ganesha file server
   Loaded: loaded (/usr/lib/systemd/system/nfs-ganesha.service; disabled; vendor preset: disabled)
   Active: active (running) since Sat 2016-04-02 02:27:15 IST; 12h ago
     Docs: http://github.com/nfs-ganesha/nfs-ganesha/wiki
 Main PID: 21795 (ganesha.nfsd)

6. showmount gives the error message below on the mounted node:

[root@dhcp37-180 tmp]# showmount -e localhost
rpc mount export: RPC: Timed out

7. Both cd /mnt and df hang on the client.
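
The checks in steps 6 and 7 block indefinitely when the mount is hung, so a scripted version of them needs a time limit. The following is a minimal sketch, assuming GNU coreutils `timeout` is available on the node; the `probe` helper and the `PROBE_TIMEOUT` variable are hypothetical names introduced here and are not part of fs-sanity or nfs-ganesha.

```shell
#!/bin/sh
# probe CMD...: run CMD under a time limit and report whether it came back.
# PROBE_TIMEOUT (seconds, default 10) is a hypothetical tunable.
probe() {
    rc=0
    # -k 2: if CMD ignores the initial TERM signal, send KILL 2 seconds later.
    timeout -k 2 "${PROBE_TIMEOUT:-10}" "$@" >/dev/null 2>&1 || rc=$?
    case "$rc" in
        # GNU timeout exits 124 when the limit expires, or 137 (128+9)
        # when it had to fall back to KILL.
        124|137) echo "HUNG: $*"; return 1 ;;
        # 126/127: CMD was found but not executable, or not found at all.
        126|127) echo "could not run: $*"; return 2 ;;
        *)       echo "returned (exit $rc): $*"; return 0 ;;
    esac
}

# Example invocations mirroring steps 6 and 7 (commented out so that
# sourcing this file has no side effects):
# probe showmount -e localhost
# probe df /mnt
```

A process blocked in uninterruptible sleep on a hung NFS mount may linger even after KILL is sent, so the sketch only reports the hang and does not try to clean up afterwards.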


Actual results:

The Ganesha mount hangs while the read_large tool of the fs-sanity suite runs on a tiered volume.

Expected results:

The read_large test should pass without hanging.

Additional info:

Comment 2 Shashank Raj 2016-04-02 20:16:14 UTC
The sosreport, ganesha log, and packet trace are available at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1323423

Comment 3 Soumya Koduri 2016-04-04 06:29:21 UTC
In the packet trace, I do not see any NFS calls, but I do see a lot of gluster traffic. Could you please run the read_large test alone and check the behaviour, capturing a packet trace at the same time? Thanks!

Comment 4 Soumya Koduri 2016-04-17 12:36:10 UTC
Please provide the results of executing read_large test alone.

Comment 5 Jiffin 2016-04-18 16:30:15 UTC
Also, please provide the setup if you hit the issue again.

I used the following steps to try to reproduce the issue:

1.) create a replicated volume (1x2) and start it
2.) export the volume using ganesha
3.) mount the volume using v4
4.) create a directory "dir" and, inside it, a large file named "src" using dd
5.) perform attach tier and wait for its completion
6.) cat src > dst (what read_large does)

I did not get a hang when I performed the above steps in sequential order.

I hit the hang only twice in all my runs, and only when I performed steps 4 and 5 (and possibly 6 as well) together.
(I did not debug the hang at that time because I was analyzing BZ1323424, and the hang did not recur when I tried to reproduce the issue again.)
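
The concurrent variant described above (steps 4 and 5 overlapping) can be sketched as a script. This is only an illustration under assumptions: the volume name, host names, brick paths, mount point, and file size are hypothetical placeholders, and the gluster commands follow the 3.7-era CLI as I understand it (`gluster volume tier VOLNAME attach`, `ganesha.enable`), which should be checked against the installed version. The sketch defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# All values below are hypothetical placeholders, not taken from the report.
VOL=${VOL:-testvol}
MNT=${MNT:-/mnt/testvol}
SERVER=${SERVER:-server1}
COLD_BRICKS=${COLD_BRICKS:-"server1:/bricks/cold1 server2:/bricks/cold2"}
HOT_BRICKS=${HOT_BRICKS:-"server1:/bricks/hot1 server2:/bricks/hot2"}

# run CMD...: print the command when DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 1-3: create and start a 1x2 volume, export it, mount it over NFSv4.
# The brick lists are left unquoted on purpose so that each brick becomes
# its own argument.
setup_volume() {
    run gluster volume create "$VOL" replica 2 $COLD_BRICKS
    run gluster volume start "$VOL"
    run gluster volume set "$VOL" ganesha.enable on
    run mount -t nfs -o vers=4 "$SERVER:/$VOL" "$MNT"
}

# Steps 4-6, with the large write (4) and the tier attach (5) overlapping.
repro_concurrent() {
    run mkdir -p "$MNT/dir"
    # Step 4 in the background: create the large source file.
    run dd if=/dev/zero of="$MNT/dir/src" bs=1M count=1024 &
    writer=$!
    # Step 5 while the write is still in flight: attach the hot tier.
    run gluster volume tier "$VOL" attach replica 2 $HOT_BRICKS
    wait "$writer"
    # Step 6: the copy that read_large performs, per the comment above.
    run sh -c "cat '$MNT/dir/src' > '$MNT/dir/dst'"
}
```

Calling `setup_volume` and then `repro_concurrent` with `DRY_RUN=0` would execute the commands for real. Because the hang was seen only twice, a single run of this sequence is not expected to reproduce it reliably.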

Comment 6 Shashank Raj 2016-04-19 19:39:14 UTC
Executed the read_large test individually around 10 times but did not hit the hang. I will keep an eye on this bug during the testing cycle and update Bugzilla accordingly.
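
For an intermittent hang like this one, repeated runs are easier to track when each iteration has a time limit and the loop stops at the first run that does not return. A minimal sketch, assuming GNU coreutils `timeout`; `repeat_until_hang` is a hypothetical helper, and the way read_large is invoked is left to the caller because the report does not show it.

```shell
#!/bin/sh
# repeat_until_hang COUNT LIMIT CMD...
# Run CMD up to COUNT times, allowing LIMIT seconds per run.
# Stops and returns 1 at the first run that exceeds the limit.
repeat_until_hang() {
    count=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        rc=0
        timeout -k 2 "$limit" "$@" >/dev/null 2>&1 || rc=$?
        # 124: limit expired; 137: timeout had to fall back to KILL.
        if [ "$rc" -eq 124 ] || [ "$rc" -eq 137 ]; then
            echo "run $i of $count: no result within ${limit}s"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $count runs finished within ${limit}s each"
}
```

Other non-zero exit codes from the command are treated as completed runs here, since the sketch only looks for hangs and not for test failures.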