| Summary: | NFS crashes as a vmware ESX data store | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Shehjar Tikoo <shehjart> |
| Component: | nfs | Assignee: | Shehjar Tikoo <shehjart> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | low | ||
| Version: | nfs-alpha | CC: | gluster-bugs |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | --- | |
| Regression: | RTP | Mount Type: | nfs |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | |||
Just before the segfault, the log says:

====================================
[2010-05-24 16:10:08] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - LOOKUP
[2010-05-24 16:10:08] D [nfs3-helpers.c:2069:nfs3_log_fh_entry_call] nfs-nfsv3: XID: 54ef0e85, LOOKUP: args: FH: hashcount 0, xlid 0, gen 5462523656854831105, ino 1, name: ..
[2010-05-24 16:10:08] T [nfs3.c:1030:nfs3_lookup] nfs-nfsv3: FH to Volume: posix
[2010-05-24 16:10:08] T [nfs3-helpers.c:2803:nfs3_fh_resolve_inode] nfs-nfsv3: FH needs inode resolution
[2010-05-24 16:10:08] T [nfs3-helpers.c:2307:nfs3_fh_resolve_inode_done] nfs-nfsv3: FH inode resolved
[2010-05-24 16:10:08] T [nfs.c:407:nfs_user_create] nfs: uid: 0, gid 0, gids: 0
pending frames:
patchset: git://git.sv.gnu.org/gluster.git
======================================

Basically, it crashes when a lookup arrives with ".." as the filename. This is supposed to be handled the same way as the Solaris 0-length file handles; the difference is that this file handle is not 0-length. Not sure why this hasn't been a problem in earlier vmware tests.

The problem is caused by a regression in nfsx. The only reason vmkernel sends a lookup on the (rootfh, "..") pair is that we now return ".." and "." in the NFS readdirplus reply. Earlier, vmware worked fine because these two entries were not being returned. I eventually started returning them because I noticed that the linux kernel did not display "." and ".." in its list of dirents on ls -la if the NFS server did not return them. There is a conflict now: either I return ".." and "." so the linux kernel works correctly, or I omit them so vmkernel can work without crashing nfsx.

There could be another problem too. When nfsx returns "." and "..", the linux nfs client gladly shows them as it should. OTOH, the vmkernel client thinks that ".." in the root directory should be different from the root, since the ino of the ".." entry in the readdirp reply does not match the ino of the itable->root. I know for a fact that this is a bug, but I am not sure if it is the root cause of this problem.

(In reply to comment #3)
> There could be another problem too. When nfsx returns "." and "..", linux nfs
> client gladly shows it as it should. OTOH, vmkernel client thinks that ".." in
> a root directory should be different from the root since the ino of the
> ".." entry in readdirp reply does not match the ino of the itable->root. I know
> for a fact that this is a bug but not sure if it the root cause of this
> problem.

A fix for the above problem does not fix it either, so it is confirmed that vmkernel sends the lookup for ".." and "." regardless of the inode numbers returned in readdirp. Anyway, this means I need to handle the crash by differentiating between a lookup on (anyfh, "..") and a lookup on (rootfh, ".."). I am handling the former already but not the latter. Fix on the way. (A sketch of this special case follows the backtrace below.)

A vmware vm using a simple posix config crashes nfsx during the power-on phase.
Backtrace:
#0 0x00007fd8281e6aeb in inode_ref (inode=0x0) at inode.c:410
410 table = inode->table;
(gdb) bt
#0 0x00007fd8281e6aeb in inode_ref (inode=0x0) at inode.c:410
#1 0x00007fd8269c5337 in nfs3_lookup_parentdir_resume (carg=0x7fd8255d8020) at nfs3.c:935
#2 0x00007fd8269d908c in nfs3_fh_resolve_inode_done (cs=0x7fd8255d8020, inode=0x2426bb0) at nfs3-helpers.c:2312
#3 0x00007fd8269dab32 in nfs3_fh_resolve_inode (cs=0x7fd8255d8020) at nfs3-helpers.c:2809
#4 0x00007fd8269dac8a in nfs3_fh_resolve_and_resume (cs=0x7fd8255d8020, fh=0x7fd82855ae40, entry=0x0, resum_fn=0x7fd8269c5206 <nfs3_lookup_parentdir_resume>)
at nfs3-helpers.c:2846
#5 0x00007fd8269c59b6 in nfs3_lookup (req=0x2435370, fh=0x7fd82855ae40, fhlen=22, name=0x7fd82855aea0 "..") at nfs3.c:1037
#6 0x00007fd8269c5b4f in nfs3svc_lookup (req=0x2435370) at nfs3.c:1077
#7 0x00007fd82679bef4 in rpcsvc_handle_rpc_call (conn=0x2432a60) at rpcsvc.c:1876
#8 0x00007fd82679cdb0 in rpcsvc_record_update_state (conn=0x2432a60, dataread=0) at rpcsvc.c:2356
#9 0x00007fd82679cf1b in rpcsvc_conn_data_poll_in (conn=0x2432a60) at rpcsvc.c:2399
#10 0x00007fd82679d35b in rpcsvc_conn_data_handler (fd=10, idx=3, data=0x2432a60, poll_in=1, poll_out=0, poll_err=0) at rpcsvc.c:2528
#11 0x00007fd8281fc736 in event_dispatch_epoll_handler (event_pool=0x2426aa0, events=0x24271c0, i=0) at event.c:804
#12 0x00007fd8281fc928 in event_dispatch_epoll (event_pool=0x2426aa0) at event.c:867
#13 0x00007fd8281fcc47 in event_dispatch (event_pool=0x2426aa0) at event.c:975
#14 0x00007fd826797f62 in rpcsvc_stage_proc (arg=0x2424de0) at rpcsvc.c:64
#15 0x00007fd827da0a04 in start_thread (arg=<value optimized out>) at pthread_create.c:300
#16 0x00007fd827b0a80d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x0000000000000000 in ?? ()
(gdb)
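The top frames show nfs3_lookup_parentdir_resume handing a NULL inode to inode_ref: when the ".." lookup is against the root file handle, the root has no parent in the inode table, so the parent pointer is NULL. The following is a minimal sketch of the special case described above, not the committed patch; the helper name is made up, the resolvedloc field names are taken from the gdb session later in this report, and the usual glusterfs/nfsx headers are assumed to be in scope.

/* Sketch only (not patch 3355): when the directory whose ".." is being
 * looked up is the root, there is no parent inode in the itable, so fall
 * back to the root inode itself instead of dereferencing a NULL parent. */
static inode_t *
nfs3_parentdir_inode (nfs3_call_state_t *cs)
{
        /* resolvedloc.parent is NULL for the root directory (see gdb output) */
        if (cs->resolvedloc.parent == NULL)
                return inode_ref (cs->resolvedloc.inode->table->root);

        return inode_ref (cs->resolvedloc.parent);
}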
Log attached.
Crash is fixed but vm is not getting powered on.

Created attachment 212 [details]
trace for failed vm power on test.
Screen shot of ESX system logs shows relevant error messages. Trying to make sense of them.

Created attachment 213 [details]
Created attachment 214 [details]
First sign of trouble in testvm trace file:
####### lookup for dir testvm in root dir#############################
2543 54.569905 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0x3a4f3b4f/testvm
2544 54.570570 192.168.1.109 -> 192.168.1.181 NFS V3 LOOKUP Reply (Call In 2543), FH:0xc3046fa2
2545 54.570773 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0xc3046fa2/shared.vmft
2546 54.571484 192.168.1.109 -> 192.168.1.181 NFS V3 LOOKUP Reply (Call In 2545) Error:NFS3ERR_NOENT
######## lookup for (testvm fh, "..") to get root fh ##################
2551 54.604868 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0xc3046fa2/..
######## returns a file handle different from the one used in the first lookup ###########
2552 54.605723 192.168.1.109 -> 192.168.1.181 NFS V3 LOOKUP Reply (Call In 2551), FH:0xf404a7cc
################ Next bug: why should this different fh still map to the root fh, the readdir is actually ######
################ returning contents of root dir #################
2553 54.606584 192.168.1.181 -> 192.168.1.109 NFS V3 READDIRPLUS Call, FH:0xf404a7cc
2558 54.607858 192.168.1.109 -> 192.168.1.181 NFS V3 READDIRPLUS Reply (Call In 2553) ubu2 . testvm ubu vm2 .. ubuntu-10.04-desktop-amd64.iso
2559 54.608271 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0xf404a7cc/ubu2
2560 54.608992 192.168.1.109 -> 192.168.1.181 NFS V3 LOOKUP Reply (Call In 2559), FH:0xc30407a3
2561 54.609193 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0xf404a7cc/testvm
Fun!
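The first committed patch ("Funge . and .. ino/gen in readdir of root") targets the readdirp half of this behaviour. A rough sketch of that idea follows; it is not the patch itself, the helper name is hypothetical, ino 1 is taken from the root entry seen in the log above, and the glusterfs nfsx headers are assumed to be in scope.

/* Sketch of the readdirp-side idea, not the committed change: when replying
 * to READDIRPLUS on the root file handle, report the root's own ino for "."
 * and ".." so the client does not treat ".." as a different directory. */
static void
nfs3_funge_root_dot_dotdot (struct nfs3_fh *dirfh, gf_dirent_t *entry)
{
        if (!nfs3_fh_is_root_fh (dirfh))
                return;

        if ((strcmp (entry->d_name, ".") == 0) ||
            (strcmp (entry->d_name, "..") == 0))
                entry->d_ino = 1;  /* root ino, as seen in the log above */
}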
###########################################
############# GDB OUTPUT ##################
###########################################
Breakpoint 1, nfs3svc_lookup_parentdir_cbk (frame=0x26016f8, cookie=0x24c20c0, this=0x24c0fc0, op_ret=0, op_errno=2, inode=0x24c3bb0,
buf=0x7f6a259a99a0, xattr=0x0, postparent=0x7f6a259a9930) at nfs3.c:887
887 struct nfs3_fh newfh = {{0}, };
Current language: auto
The current source language is "auto; currently c".
(gdb) n
888 nfsstat3 status = NFS3_OK;
(gdb) n
889 nfs3_call_state_t *cs = NULL;
(gdb)
891 cs = frame->local;
(gdb)
892 if (op_ret == -1) {
(gdb)
897 if (!nfs3_fh_is_root_fh (&cs->fh))
(gdb) p cs->fh
$1 = {ident = ":O", hashcount = 1, xlatorid = 0, gen = 5474837392516972579, ino = 204821, entryhash = {257, 0 <repeats 20 times>}}
(gdb) p cs->resolvefh
$2 = {ident = ":O", hashcount = 1, xlatorid = 0, gen = 5474837392516972579, ino = 204821, entryhash = {257, 0 <repeats 20 times>}}
############## the two file handles are same as expected ####################
(gdb) n
898 nfs3_fh_build_parent_fh (&cs->fh, buf, &newfh);
############# BUT not getting recognized as root fh #########################
(gdb) p cs->resolvedloc
$3 = {path = 0x2600a00 "/", name = 0x2600a01 "", ino = 1, inode = 0x24c3bb0, parent = 0x0}
############ resolved loc is correctly pointing to the root dir #############
92 int
93 nfs3_fh_is_root_fh (struct nfs3_fh *fh)
94 {
95 if (!fh)
96 return 0;
97
98 if (fh->hashcount == 0)
########### BUGGY ROOT FH CHECKING CONDITION ####################
(gdb)
99 return 1;
100
101 return 0;
102 }
103
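The check above treats only handles with hashcount == 0 as the root handle, which is why the handle in this session is not recognized as root even though its resolvedloc correctly points at "/". One possible workaround, sketched here as an assumption rather than the committed fix, is to decide "is this the root?" from the inode the handle actually resolved to instead of from the raw handle fields; the helper name is hypothetical.

/* Sketch only: use the resolved inode, which the gdb session above shows
 * correctly pointing at the root, rather than the hashcount of the fh. */
static int
nfs3_resolved_to_root (nfs3_call_state_t *cs)
{
        inode_t *inode = cs->resolvedloc.inode;

        if (!inode)
                return 0;

        return (inode == inode->table->root);
}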
PATCH: http://patches.gluster.com/patch/3354 in master (nfs3: Funge . and .. ino/gen in readdir of root)
PATCH: http://patches.gluster.com/patch/3355 in master (nfs3: Special-case the lookup for parent dir of root)

Regression Test

NFSx crashed because it did not handle a particular style of file handles that vmkernel/vmware esx can send in nfs lookup requests.

Test Case

1. Create a simple posix+nfsx volume file (a sketch follows this list).
2. On a Windows machine, start the vsphere client and connect to an ESX server.
3. In the Configuration->Storage tab, create an NFS datastore that mounts the nfsx-exported volume.
4. That's it: if the process of creating the datastore completes without any errors in the vsphere UI or a crash in nfsx, the test is a success.
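For step 1, a minimal sketch of what such a posix+nfsx volume file might look like; the export directory is a placeholder and the nfs/server translator type is an assumption for the nfs-alpha (nfsx) build, kept to the bare minimum for this test.

volume posix
  type storage/posix
  option directory /data/export        # placeholder export path
end-volume

volume nfsx
  type nfs/server                      # assumed nfsx server translator type
  subvolumes posix
end-volume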
Created attachment 208 [details]