Bug 1118387 - Server crashes if NFSv4 client does a call to OP4_WRITE on a directory if the related inode is not cached
Alias: None
Product: nfs-ganesha
Classification: Community
Component: Protocols-NFS4
Version: 2.1
Hardware: All
OS: All
Target Milestone: ---
Assignee: Frank Filz
QA Contact:
Depends On:
Reported: 2014-07-10 14:57 UTC by Philippe DENIEL
Modified: 2015-10-05 21:45 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed:


Description Philippe DENIEL 2014-07-10 14:57:27 UTC

I found a bug by chance (I made a typo when typing a command line). The 
annoying part: it's actually a major bug.
It happens with both NFSv4 and NFSv4.1, whichever FSAL you use.
I am using the latest 2.2-1 from Frank's repo. The stack trace from gdb is 
at the end of this mail.

Reproducer is:
     - start Ganesha and wait out the grace period
     - mount ganesha's share (let's say you mounted it under /mnt)
     - immediately do "echo something > /mnt/dir" where "/mnt/dir" is a 
directory that already exists
Instead of returning "Is a directory", ganesha crashes.
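
For reference, the client-side call behind the shell redirection can be sketched in C as below. This is only an illustration of the request and of the expected answer (EISDIR, which the shell prints as "Is a directory"); the helper name open_for_write_errno is made up for this sketch and is not part of ganesha or of any client tool.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to open `path` the way a shell redirection
 * ("> path") would, i.e. write-only, create if missing, truncate.
 * Returns 0 if the open succeeded, otherwise the errno value reported. */
int open_for_write_errno(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd >= 0) {
        close(fd);
        return 0;
    }
    return errno;
}
```

Called on an existing directory such as "/mnt/dir", a healthy server should make this return EISDIR rather than take the server down.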

If you "ls" right after the mount (or do anything else that calls readdir() 
and populates the cache inode on the server side), you get the regular 
behaviour and no crash.
To reproduce the bug, you must operate on a non-cached directory.

This bug is pretty annoying: if the server restarts, the client 
reconnects, waits for the grace period and then retries the same write 
request to the directory... and the server crashes again.
The loop can be broken by doing an ls to populate the cache during the 
grace period.

Segfault observed in gdb:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff90dfa700 (LWP 29901)]
0x00000000004c6636 in cache_inode_setattr (entry=0x0, 
attr=0x7fff90df8a60, is_open_write=true) at 
63              struct fsal_obj_handle *obj_handle = entry->obj_handle;
Missing separate debuginfos, use: debuginfo-install 
glibc-2.12-1.25.el6_1.3.x86_64 keyutils-libs-1.4-1.el6.x86_64 
krb5-libs-1.9-22.el6.x86_64 libattr-2.4.44-7.el6.x86_64 
libblkid-2.17.2-12.14.el6_5.x86_64 libcap-2.16-5.2.el6.x86_64 
libcom_err-1.41.12-11.el6.x86_64 libcom_err-1.41.90.wc3-7.el6.x86_64 
libselinux-2.0.94-5.el6.x86_64 libuuid-2.17.2-12.14.el6_5.x86_64 
(gdb) where
#0  0x00000000004c6636 in cache_inode_setattr (entry=0x0, 
attr=0x7fff90df8a60, is_open_write=true)
     at /opt/GANESHA/src/cache_inode/cache_inode_setattr.c:63
#1  0x0000000000463a1b in open4_create (arg=0x7fff54002628, 
data=0x7fff90df8d50, res=0x7fff6c000c78, parent=0x7eb370, 
     filename=0x7fff6c0011f0 "dir") at 
#2  0x0000000000463b34 in open4_claim_null (arg=0x7fff54002628, 
data=0x7fff90df8d50, res=0x7fff6c000c78, entry=0x7fff90df8c38)
     at /opt/GANESHA/src/Protocols/NFS/nfs4_op_open.c:802
#3  0x0000000000464c0d in nfs4_op_open (op=0x7fff54002620, 
data=0x7fff90df8d50, resp=0x7fff6c000c70)
     at /opt/GANESHA/src/Protocols/NFS/nfs4_op_open.c:1188
#4  0x00000000004547cc in nfs4_Compound (arg=0x7fff540009f0, 
worker=0x7fff6c0008c0, req=0x7fff54000938, res=0x7fff6c000a20)
     at /opt/GANESHA/src/Protocols/NFS/nfs4_Compound.c:679
#5  0x0000000000449ba8 in nfs_rpc_execute (req=0x7fff540008c0, 
worker_data=0x7fff6c0008c0) at 
#6  0x000000000044a5ac in worker_run (ctx=0x885340) at 
#7  0x00000000004f07f9 in fridgethr_start_routine (arg=0x885340) at 
#8  0x00000030368077e1 in start_thread () from /lib64/libpthread.so.0
#9  0x00000030364e577d in clone () from /lib64/libc.so.6
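
The trace shows entry=0x0 reaching cache_inode_setattr from open4_create, and line 63 dereferencing it. A defensive check of the kind sketched below would turn that into an error status instead of a segfault. This is a minimal sketch, not the actual ganesha code or its fix: every demo_* type, constant and function is a hypothetical stand-in invented for this illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real nfs-ganesha types. */
enum demo_type { DEMO_REGULAR_FILE, DEMO_DIRECTORY };

struct demo_obj_handle { enum demo_type type; };
struct demo_entry { struct demo_obj_handle *obj_handle; };

enum demo_status {
    DEMO_OK = 0,
    DEMO_ERR_ISDIR,       /* would map to an "is a directory" reply */
    DEMO_ERR_SERVERFAULT  /* would map to a generic server error reply */
};

/* Validate the target of an open-with-create before any attribute is
 * set on it: a missing entry or a directory yields an error status
 * instead of a NULL dereference. */
enum demo_status demo_check_open_target(const struct demo_entry *entry)
{
    if (entry == NULL || entry->obj_handle == NULL)
        return DEMO_ERR_SERVERFAULT;
    if (entry->obj_handle->type == DEMO_DIRECTORY)
        return DEMO_ERR_ISDIR;
    return DEMO_OK;
}
```

Whether the right place for such a check is the caller or the setattr routine itself is a design question this sketch does not try to answer.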

Comment 1 Frank Filz 2015-10-05 21:45:27 UTC
Does this still occur in V2.3-rc5?
