Description of problem:
nfs-ganesha-gluster is consistently crashing on a fairly straightforward distributed iozone workload.

Version-Release number of selected component (if applicable):
glusterfs-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-ganesha-3.7dev-0.994.gitf522001.el6.x86_64
nfs-ganesha-gluster-2.2_dev_23-1.el6.x86_64
nfs-ganesha-vfs-2.2_dev_23-1.el6.x86_64
nfs-ganesha-2.2_dev_23-1.el6.x86_64
kernel-2.6.32-504.3.3.el6.x86_64

How reproducible:
Consistently. Issues seen range from iozone hangs to nfs-ganesha crashes; I have not had a single successful run in more than 5 tries.

Steps to Reproduce:
1. Create a single-brick gluster volume and export it with nfs-ganesha. Mount the exported volume on the clients.
2. Run iozone in distributed mode on the clients:
   iozone -+m <distributed iozone conf file> -+h <main_client> -i 0 -w -+n -c -C -e -s 10g -r 64k -t 12

Actual results:
ganesha.nfsd crashes at some point in the test. Some lines from /var/log/messages:

Apr 23 01:48:18 gprfs046 kernel: ganesha.nfsd[7864]: segfault at 200000108 ip 00000039ad0097a6 sp 00007fab1e1f9630 error 4 in libgfapi.so.0.0.0[39ad000000+1a000]
Apr 23 01:48:29 gprfs046 abrt[8043]: Saved core dump of pid 7845 (/usr/bin/ganesha.nfsd) to /var/spool/abrt/ccpp-2015-04-23-01:48:18-7845 (2133598208 bytes)

Expected results:
The test should complete with 120g of data stored in 12 files in the volume.

Additional info:
Simplifying the steps to reproduce (eliminating multiple clients and distributed iozone), I am able to reproduce the crash in a very simple config:

1. Single-brick gluster volume, exported with nfs-ganesha.
2. Single client mounting the ganesha-exported volume (on /mnt/glustervol in the example below).
3. Run iozone on the client, like this:
   iozone -i 0 -w -+n -c -C -e -s 8G -r 64k -t 8 -F /mnt/glustervol/f{1,2,3,4,5,6,7,8}-8g.ioz

So that's 8 iozone threads, writing 8g each.
nfs version in my script was set to 3. With nfs v4, I'm seeing much better behavior. I've had multiple runs of the tests above -- both single-client iozone and distributed iozone -- complete with nfs version set to 4.
Created attachment 1022158 [details] packet_trace
I tried to reproduce the same with a v3 mount. There is no crash in my setup, but it just hangs. There are no notable errors in the ganesha and gfapi logs, and the volume is still exported via the ganesha server. In the packet trace, I found that the client keeps sending the request, but the server replies with a TCP zero window size, and this repeats in a loop. I have attached the packet trace.

If anyone needs to look into my setup:
server: heka-client-13.gdev.lab.eng.bos.redhat.com, exporting volume "vol"
client: heka-client-15.gdev.lab.eng.bos.redhat.com, mount point "/mnt/nfs/1/"
As it says in comment 0, I have seen ganesha crash many times in my testing. I saved a core dump and have made it available for analysis. This looks like a seg fault encountered in libgfapi. Any leads from the coredump analysis?
I can't analyze the core. bt gives me:

#0  0x00000039ad0097a6 in ?? ()
#1  0x00007faaf81af380 in ?? ()
#2  0x00000039ad00e0b4 in ?? ()
#3  0xa9eeb31fb78cd404 in ?? ()
#4  0x014f7beed34de417 in ?? ()
#5  0x04d48cb71fb3eea9 in ?? ()
#6  0x000000000000fd03 in ?? ()
#7  0x0001030000000001 in ?? ()
#8  0x0000000000000001 in ?? ()
#9  0x0000000000000000 in ?? ()

I am using the same version of ganesha (2.2-dev-23).

And one more thing: the iozone test completed today (after 5 days) in my setup.

iozone -i 0 -w -+n -c -C -e -s 512M -r 64k -t 4 -F /mnt/nfs/1/f{1,2,3,4}-.ioz

    Iozone: Performance Test of File I/O
            Version $Revision: 3.429 $
            Compiled for 64 bit mode.
            Build: linux-AMD64

    Contributors: William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                  Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                  Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                  Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                  Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                  Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                  Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
                  Vangel Bojaxhi, Ben England, Vikentsi Lapa.

    Run began: Thu Apr 30 02:38:27 2015

    Setting no_unlink
    No retest option selected
    Include close in write timing
    Include fsync in write timing
    File size set to 524288 kB
    Record Size 64 kB
    Command line used: iozone -i 0 -w -+n -c -C -e -s 512M -r 64k -t 4 -F /mnt/nfs/1/f1-.ioz /mnt/nfs/1/f2-.ioz /mnt/nfs/1/f3-.ioz /mnt/nfs/1/f4-.ioz
    Output is in kBytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 kBytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
    Throughput test with 4 processes
    Each process writes a 524288 kByte file in 64 kByte records

    Children see throughput for 4 initial writers =       3.74 kB/sec
    Parent sees throughput for 4 initial writers  =       3.73 kB/sec
    Min throughput per process                    =       0.84 kB/sec
    Max throughput per process                    =       1.02 kB/sec
    Avg throughput per process                    =       0.94 kB/sec
    Min xfer                                      =  433920.00 kB
    Child[0] xfer count =  492992.00 kB, Throughput =       0.96 kB/sec
    Child[1] xfer count =  433920.00 kB, Throughput =       0.84 kB/sec
    Child[2] xfer count =  524288.00 kB, Throughput =       1.02 kB/sec
    Child[3] xfer count =  481792.00 kB, Throughput =       0.93 kB/sec

iozone test complete.
Upgraded to:
glusterfs*-3.7.0beta1-0.3.git7aeae00.el6.x86_64
nfs-ganesha*-2.2.0-0.el6.x86_64
Debuginfo installed as well.

Was able to reproduce the crash with single-client iozone runs, same as the one in comment 2. gdb on the coredump gives:

(gdb) bt
#0  glfs_iatt_to_stat (fs=0x0, iatt=0x7f7cab9da5e0, stat=0x7f7cab9da6b8) at glfs-fops.c:58
#1  0x00007f7ccb8bc0b4 in pub_glfs_fstat (glfd=0x7f7c84002270, stat=0x7f7cab9da6b8) at glfs-fops.c:292
#2  0x00007f7ccbacf956 in getattrs (obj_hdl=0x7f7c680022d8) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/FSAL/FSAL_GLUSTER/handle.c:661
#3  0x00000000004e79c8 in cache_inode_refresh_attrs (entry=0x7f7c680025a0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/include/cache_inode.h:967
#4  0x00000000004e8d3b in cache_inode_rdwr_plus (entry=0x7f7c680025a0, io_direction=CACHE_INODE_WRITE, offset=104857600, io_size=1048576, bytes_moved=0x7f7cab9dbdc8, buffer=0x7f7c400c1290, eof=0x7f7cab9dbdc7, sync=0x7f7cab9dbdc6, info=0x0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/cache_inode/cache_inode_rdwr.c:271
#5  0x00000000004e9111 in cache_inode_rdwr (entry=0x7f7c680025a0, io_direction=CACHE_INODE_WRITE, offset=104857600, io_size=1048576, bytes_moved=0x7f7cab9dbdc8, buffer=0x7f7c400c1290, eof=0x7f7cab9dbdc7, sync=0x7f7cab9dbdc6) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/cache_inode/cache_inode_rdwr.c:304
#6  0x0000000000461b22 in nfs3_write (arg=0x7f7c40000f38, worker=0x7f7ca40008c0, req=0x7f7c40000df8, res=0x7f7ca40022f0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/Protocols/NFS/nfs3_write.c:234
#7  0x00000000004574fc in nfs_rpc_execute (req=0x7f7c40479910, worker_data=0x7f7ca40008c0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1268
#8  0x0000000000458296 in worker_run (ctx=0x2824ed0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1535
#9  0x000000000051f272 in fridgethr_start_routine (arg=0x2824ed0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/support/fridgethr.c:562
#10 0x0000003cc86079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003cc82e89dd in clone () from /lib64/libc.so.6
(gdb)
Created attachment 1023433 [details] ganesha conf file
(In reply to Manoj Pillai from comment #9)
> Created attachment 1023433 [details]
> ganesha conf file

This is the ganesha conf file used for the test in comment 8.
Summary
========
Debugged and have the fix.

Problem technical details / Fix
===============================
There is a race between getattrs() and a close() on a cache entry object, which leads to the glfd getting wiped out from under the feet of getattrs(), causing the crash.

This is not a bug in ganesha or gfapi, but is due to an optimized way in which we try to code up getattrs() in FSAL_GLUSTER. If a given glfd is valid and the loflags do not indicate a FSAL_O_CLOSED state, we try to use glfs_fstat(), which unfortunately needs objhandle->glfd to be valid. I believe the logic in doing this was to avoid the resolve/lookup under the covers for getattrs(). However, it doesn't look feasible, as cache_inode_close() -> file_close() -> glfs_close() can sneak in and destroy the glfd.

The fix I have is to simply fall back to using glfs_h_stat() for all getattrs in FSAL_GLUSTER. (A small sketch of the shape of this change follows after this comment.)

Testbed State
==============
I have installed the fix in Manoj's testbed. It is available under /root/ganesha-bin. Before starting ganesha.nfsd, one needs to set LD_LIBRARY_PATH to /root/ganesha-bin/lib64:$LD_LIBRARY_PATH and then start the server.

Ran iozone with the config Manoj had specified for NFSv3. I'd appreciate it if Manoj could give this a spin to see if he has any comments on the fix/stability etc. I will wait for a day before pushing the fix upstream into nfs-ganesha tag V2.3-dev-2 or V2.3-dev-3.

Sanity Testing for the fix
===========================
"iozone -i 0 -w -+n -c -C -e -s 8G -r 64k -t 8 -F /mnt/glustervol/f{1,2,3,4,5,6,7,8}-8g.ioz" runs successfully for both V3 and V4.

Output (sorry, slightly long):
==============================
V3:
[root@gprfc066 ~]# /root/mpillai/bin/iozone -i 0 -w -+n -c -C -e -s 8G -r 64k -t 8 -F /mnt/glustervol/f{1,2,3,4,5,6,7,8}-8g.ioz

    Iozone: Performance Test of File I/O
            Version $Revision: 3.420 $
            Compiled for 64 bit mode.
            Build: linux-AMD64

    Contributors: William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                  Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                  Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                  Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                  Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                  Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                  Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
                  Vangel Bojaxhi, Ben England, Vikentsi Lapa.

    Run began: Wed May 13 04:25:03 2015

    Setting no_unlink
    No retest option selected
    Include close in write timing
    Include fsync in write timing
    File size set to 8388608 KB
    Record Size 64 KB
    Command line used: /root/mpillai/bin/iozone -i 0 -w -+n -c -C -e -s 8G -r 64k -t 8 -F /mnt/glustervol/f1-8g.ioz /mnt/glustervol/f2-8g.ioz /mnt/glustervol/f3-8g.ioz /mnt/glustervol/f4-8g.ioz /mnt/glustervol/f5-8g.ioz /mnt/glustervol/f6-8g.ioz /mnt/glustervol/f7-8g.ioz /mnt/glustervol/f8-8g.ioz
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
    Throughput test with 8 processes
    Each process writes a 8388608 Kbyte file in 64 Kbyte records

    Children see throughput for 8 initial writers =  417587.90 KB/sec
    Parent sees throughput for 8 initial writers  =  409992.15 KB/sec
    Min throughput per process                    =   51458.87 KB/sec
    Max throughput per process                    =   53289.76 KB/sec
    Avg throughput per process                    =   52198.49 KB/sec
    Min xfer                                      = 8131776.00 KB
    Child[0] xfer count = 8319488.00 KB, Throughput =   52967.62 KB/sec
    Child[1] xfer count = 8155264.00 KB, Throughput =   51922.92 KB/sec
    Child[2] xfer count = 8140864.00 KB, Throughput =   51458.87 KB/sec
    Child[3] xfer count = 8298560.00 KB, Throughput =   52468.45 KB/sec
    Child[4] xfer count = 8388608.00 KB, Throughput =   53289.76 KB/sec
    Child[5] xfer count = 8131776.00 KB, Throughput =   51777.37 KB/sec
    Child[6] xfer count = 8184896.00 KB, Throughput =   52100.61 KB/sec
    Child[7] xfer count = 8171840.00 KB, Throughput =   51602.31 KB/sec

V4:
[root@gprfc066 ~]# /root/mpillai/bin/iozone -i 0 -w -+n -c -C -e -s 8G -r 64k -t 8 -F /mnt/glustervol/f{1,2,3,4,5,6,7,8}-8g.ioz

    Iozone: Performance Test of File I/O
            Version $Revision: 3.420 $
            Compiled for 64 bit mode.
            Build: linux-AMD64

    Contributors: William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                  Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                  Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                  Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                  Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                  Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                  Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
                  Vangel Bojaxhi, Ben England, Vikentsi Lapa.

    Run began: Wed May 13 05:27:20 2015

    Setting no_unlink
    No retest option selected
    Include close in write timing
    Include fsync in write timing
    File size set to 8388608 KB
    Record Size 64 KB
    Command line used: /root/mpillai/bin/iozone -i 0 -w -+n -c -C -e -s 8G -r 64k -t 8 -F /mnt/glustervol/f1-8g.ioz /mnt/glustervol/f2-8g.ioz /mnt/glustervol/f3-8g.ioz /mnt/glustervol/f4-8g.ioz /mnt/glustervol/f5-8g.ioz /mnt/glustervol/f6-8g.ioz /mnt/glustervol/f7-8g.ioz /mnt/glustervol/f8-8g.ioz
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.

    Throughput test with 8 processes
    Each process writes a 8388608 Kbyte file in 64 Kbyte records

    Children see throughput for 8 initial writers =  411434.73 KB/sec
    Parent sees throughput for 8 initial writers  =  405304.36 KB/sec
    Min throughput per process                    =   50801.98 KB/sec
    Max throughput per process                    =   52086.17 KB/sec
    Avg throughput per process                    =   51429.34 KB/sec
    Min xfer                                      = 8163392.00 KB
    Child[0] xfer count = 8163392.00 KB, Throughput =   50801.98 KB/sec
    Child[1] xfer count = 8373632.00 KB, Throughput =   51973.79 KB/sec
    Child[2] xfer count = 8182400.00 KB, Throughput =   50802.51 KB/sec
    Child[3] xfer count = 8242432.00 KB, Throughput =   51199.33 KB/sec
    Child[4] xfer count = 8388608.00 KB, Throughput =   52086.17 KB/sec
    Child[5] xfer count = 8382144.00 KB, Throughput =   52039.33 KB/sec
    Child[6] xfer count = 8175744.00 KB, Throughput =   50879.82 KB/sec
    Child[7] xfer count = 8303680.00 KB, Throughput =   51651.80 KB/sec

iozone test complete.
Also note that this issue is not seen with other FSALs (Malahal from IBM was kind enough to test it in his testbed, and no crash was seen with FSAL_GPFS).

Thanks,
Anand
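For reference, here is a rough sketch of the shape of the change described in comment 11. This is only an illustrative sketch, not the actual patch: the struct and its field names (fs, glhandle, glfd, openflags) are stand-ins for the FSAL_GLUSTER object handle members referred to above, while glfs_fstat() and glfs_h_stat() are the public gfapi calls involved.

    #include <sys/stat.h>
    #include <glusterfs/api/glfs.h>
    #include <glusterfs/api/glfs-handles.h>

    /* Illustrative stand-in for the FSAL_GLUSTER object handle. */
    struct gluster_handle_sketch {
            struct glfs *fs;              /* glfs instance for the export */
            struct glfs_object *glhandle; /* handle-based object reference */
            struct glfs_fd *glfd;         /* can be destroyed by a concurrent close() */
            int openflags;                /* e.g. FSAL_O_CLOSED when not open */
    };

    /* Old, optimized path: avoids the resolve/lookup, but races with
     * cache_inode_close() -> glfs_close(), which can destroy glfd while
     * glfs_fstat() is still using it. */
    static int getattrs_racy(struct gluster_handle_sketch *h, struct stat *st)
    {
            if (h->glfd != NULL && h->openflags != 0 /* != FSAL_O_CLOSED */)
                    return glfs_fstat(h->glfd, st);
            return glfs_h_stat(h->fs, h->glhandle, st);
    }

    /* Fix described in comment 11: always go through the object handle,
     * which does not depend on the glfd staying open. */
    static int getattrs_fixed(struct gluster_handle_sketch *h, struct stat *st)
    {
            return glfs_h_stat(h->fs, h->glhandle, st);
    }

The trade-off is the extra resolve/lookup that the glfd path was trying to avoid, in exchange for not depending on the lifetime of a descriptor that another thread may close.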
(In reply to Anand Subramanian from comment #11)

I'm continuing to see crashes, so as a first step can you (Anand) check whether the ganesha server that is currently running on my setup has your fix? I set LD_LIBRARY_PATH as in comment 11, but better to make sure.

[I'm wondering though: why would cache_inode_close() get called in the middle of a write test on NFSv3? The protocol itself does not have a close. Maybe this is not _the_ problem we are hitting, although it sounds like a good fix to have.]
Answering your second question first while I continue to look at your setup: cache_inode_close() has nothing to do with protocol-specific RPCs like OPEN or CLOSE. It is a piece of infrastructure specific to nfs-ganesha. Whenever a backend fs object needs to be accessed, it is accessed via an intermediate entry in the cache inode layer, and this cache inode entry needs to be maintained in conjunction with any fops on the file in the backend filesystem. If you are interested, here is the stack:

Thread 1 (Thread 0x7f7cab9dd700 (LWP 5797)):
#0  glfs_iatt_to_stat (fs=0x0, iatt=0x7f7cab9da5e0, stat=0x7f7cab9da6b8) at glfs-fops.c:58
#1  0x00007f7ccb8bc0b4 in pub_glfs_fstat (glfd=0x7f7c84002270, stat=0x7f7cab9da6b8) at glfs-fops.c:292
#2  0x00007f7ccbacf956 in getattrs (obj_hdl=0x7f7c680022d8) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/FSAL/FSAL_GLUSTER/handle.c:661
#3  0x00000000004e79c8 in cache_inode_refresh_attrs (entry=0x7f7c680025a0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/include/cache_inode.h:967
#4  0x00000000004e8d3b in cache_inode_rdwr_plus (entry=0x7f7c680025a0, io_direction=CACHE_INODE_WRITE, offset=104857600, io_size=1048576, bytes_moved=0x7f7cab9dbdc8, buffer=0x7f7c400c1290, eof=0x7f7cab9dbdc7, sync=0x7f7cab9dbdc6, info=0x0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/cache_inode/cache_inode_rdwr.c:271
#5  0x00000000004e9111 in cache_inode_rdwr (entry=0x7f7c680025a0, io_direction=CACHE_INODE_WRITE, offset=104857600, io_size=1048576, bytes_moved=0x7f7cab9dbdc8, buffer=0x7f7c400c1290, eof=0x7f7cab9dbdc7, sync=0x7f7cab9dbdc6) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/cache_inode/cache_inode_rdwr.c:304
#6  0x0000000000461b22 in nfs3_write (arg=0x7f7c40000f38, worker=0x7f7ca40008c0, req=0x7f7c40000df8, res=0x7f7ca40022f0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/Protocols/NFS/nfs3_write.c:234
#7  0x00000000004574fc in nfs_rpc_execute (req=0x7f7c40479910, worker_data=0x7f7ca40008c0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1268
#8  0x0000000000458296 in worker_run (ctx=0x2824ed0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1535
#9  0x000000000051f272 in fridgethr_start_routine (arg=0x2824ed0) at /usr/src/debug/nfs-ganesha-2.2.0-0.1.1-Source/support/fridgethr.c:562
#10 0x0000003cc86079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003cc82e89dd in clone () from /lib64/libc.so.6

In cache_inode_rdwr_plus, the code path does a cache_inode_open(). That function in turn does a reopen if the filesystem supports it; otherwise it does a close() and then an open(), at the cost of losing associated locks. But that is the way it works, or has to work (the cache inode layer). This is where your close() comes from. (A small sketch of this open/reopen decision follows below.) HTH.
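A minimal sketch of the open-or-reopen decision described above, based only on the description in this comment and not on the actual cache_inode code; the types and helper names are hypothetical. The point it illustrates is that a plain NFSv3 WRITE can still end up in a close()+open() cycle on the cached entry, and that close() is what can destroy the glfd a concurrent getattrs() is using.

    /* Hypothetical types and helpers, named only for illustration. */
    struct cache_entry_sketch {
            int is_open;
            int openflags;
    };

    static int flags_compatible(int have, int want) { return (have & want) == want; }
    static int fsal_supports_reopen(void) { return 0; }
    static int fsal_reopen(struct cache_entry_sketch *e, int flags) { e->openflags = flags; return 0; }
    /* In FSAL_GLUSTER the close maps to glfs_close() underneath. */
    static void fsal_close(struct cache_entry_sketch *e) { e->is_open = 0; }
    static int fsal_open(struct cache_entry_sketch *e, int flags) { e->is_open = 1; e->openflags = flags; return 0; }

    /* Shape of the decision made on the write path
     * (cache_inode_rdwr -> cache_inode_open), as described above. */
    static int cache_inode_open_sketch(struct cache_entry_sketch *entry, int wanted)
    {
            if (entry->is_open && flags_compatible(entry->openflags, wanted))
                    return 0;                           /* already open with the right flags */

            if (fsal_supports_reopen())
                    return fsal_reopen(entry, wanted);  /* reopen in place, keeps locks */

            /* Fallback: close then open, at the cost of losing associated locks.
             * It is this close() that can invalidate the glfd another worker
             * thread has already captured for glfs_fstat(). */
            fsal_close(entry);
            return fsal_open(entry, wanted);
    }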
Just to bring the thread to a conclusion: Manoj was using the older version of ganesha.nfsd and hence was still seeing the crashes. I have updated the process to pick up the newer version; he will now be running fresh tests to try out the fix.

Thanks,
Anand.
With the new version, I've not seen any crash/hang in my runs so far. I tried both the simplified single-client test in comment 2, as well as the original multiple-client, distributed iozone test from comment 0. Both ran fine. Great!
Based on https://bugzilla.redhat.com/show_bug.cgi?id=1214616#c16 moving this BZ to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1495.html