| Summary: | NFS crashes as a vmware ESX data store | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Shehjar Tikoo <shehjart> |
| Component: | nfs | Assignee: | Shehjar Tikoo <shehjart> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | low | ||
| Version: | nfs-alpha | CC: | gluster-bugs |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | --- | |
| Regression: | RTP | Mount Type: | nfs |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | |||
Just before the segfault, the log says:

====================================
[2010-05-24 16:10:08] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - LOOKUP
[2010-05-24 16:10:08] D [nfs3-helpers.c:2069:nfs3_log_fh_entry_call] nfs-nfsv3: XID: 54ef0e85, LOOKUP: args: FH: hashcount 0, xlid 0, gen 5462523656854831105, ino 1, name: ..
[2010-05-24 16:10:08] T [nfs3.c:1030:nfs3_lookup] nfs-nfsv3: FH to Volume: posix
[2010-05-24 16:10:08] T [nfs3-helpers.c:2803:nfs3_fh_resolve_inode] nfs-nfsv3: FH needs inode resolution
[2010-05-24 16:10:08] T [nfs3-helpers.c:2307:nfs3_fh_resolve_inode_done] nfs-nfsv3: FH inode resolved
[2010-05-24 16:10:08] T [nfs.c:407:nfs_user_create] nfs: uid: 0, gid 0, gids: 0
pending frames:
patchset: git://git.sv.gnu.org/gluster.git
======================================

Basically, it crashes when a lookup arrives with ".." as the filename. This is supposed to be handled the same way as the Solaris 0-length file handles; the difference is that this file handle is not 0-length. Not sure why this hasn't been a problem in earlier vmware tests.

The problem is caused by a regression in nfsx. The only reason vmkernel sends a lookup on the (rootfh, "..") pair is that we now return ".." and "." in the NFS readdirplus reply. Earlier, vmware worked fine because these two entries were not being returned. I eventually started returning them because I noticed that the linux kernel did not display "." and ".." in its list of dirents on ls -la if the NFS server did not return them. There is a conflict now: either I return ".." and "." so the linux kernel works correctly, or I omit them so vmkernel can work without crashing nfsx.

There could be another problem too. When nfsx returns "." and "..", the linux nfs client gladly shows them as it should. OTOH, the vmkernel client thinks that ".." in the root directory should be different from the root, since the ino of the ".." entry in the readdirp reply does not match the ino of the itable->root. I know for a fact that this is a bug, but I am not sure if it is the root cause of this problem.

(In reply to comment #3)
> There could be another problem too. When nfsx returns "." and "..", linux nfs
> client gladly shows it as it should. OTOH, vmkernel client thinks that ".." in
> a root directory should be different from the root since the ino of the
> ".." entry in readdirp reply does not match the ino of the itable->root. I know
> for a fact that this is a bug but not sure if it the root cause of this
> problem.

A fix for the above problem does not fix it either, so it is confirmed that vmkernel sends the lookup for ".." and "." regardless of the inode numbers returned in readdirp. Anyway, this means I need to handle the crash by differentiating between a lookup on (anyfh, "..") and a lookup on (rootfh, ".."). I am handling the former already but not the latter. Fix on the way. (A sketch of this special case follows the backtrace below.)

A vmware vm using a simple posix config crashes nfsx during the power-on phase.
Backtrace:
#0 0x00007fd8281e6aeb in inode_ref (inode=0x0) at inode.c:410
410 table = inode->table;
(gdb) bt
#0 0x00007fd8281e6aeb in inode_ref (inode=0x0) at inode.c:410
#1 0x00007fd8269c5337 in nfs3_lookup_parentdir_resume (carg=0x7fd8255d8020) at nfs3.c:935
#2 0x00007fd8269d908c in nfs3_fh_resolve_inode_done (cs=0x7fd8255d8020, inode=0x2426bb0) at nfs3-helpers.c:2312
#3 0x00007fd8269dab32 in nfs3_fh_resolve_inode (cs=0x7fd8255d8020) at nfs3-helpers.c:2809
#4 0x00007fd8269dac8a in nfs3_fh_resolve_and_resume (cs=0x7fd8255d8020, fh=0x7fd82855ae40, entry=0x0, resum_fn=0x7fd8269c5206 <nfs3_lookup_parentdir_resume>)
at nfs3-helpers.c:2846
#5 0x00007fd8269c59b6 in nfs3_lookup (req=0x2435370, fh=0x7fd82855ae40, fhlen=22, name=0x7fd82855aea0 "..") at nfs3.c:1037
#6 0x00007fd8269c5b4f in nfs3svc_lookup (req=0x2435370) at nfs3.c:1077
#7 0x00007fd82679bef4 in rpcsvc_handle_rpc_call (conn=0x2432a60) at rpcsvc.c:1876
#8 0x00007fd82679cdb0 in rpcsvc_record_update_state (conn=0x2432a60, dataread=0) at rpcsvc.c:2356
#9 0x00007fd82679cf1b in rpcsvc_conn_data_poll_in (conn=0x2432a60) at rpcsvc.c:2399
#10 0x00007fd82679d35b in rpcsvc_conn_data_handler (fd=10, idx=3, data=0x2432a60, poll_in=1, poll_out=0, poll_err=0) at rpcsvc.c:2528
#11 0x00007fd8281fc736 in event_dispatch_epoll_handler (event_pool=0x2426aa0, events=0x24271c0, i=0) at event.c:804
#12 0x00007fd8281fc928 in event_dispatch_epoll (event_pool=0x2426aa0) at event.c:867
#13 0x00007fd8281fcc47 in event_dispatch (event_pool=0x2426aa0) at event.c:975
#14 0x00007fd826797f62 in rpcsvc_stage_proc (arg=0x2424de0) at rpcsvc.c:64
#15 0x00007fd827da0a04 in start_thread (arg=<value optimized out>) at pthread_create.c:300
#16 0x00007fd827b0a80d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x0000000000000000 in ?? ()
(gdb)
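The top frames show nfs3_lookup_parentdir_resume handing a NULL inode to inode_ref: when the ".." lookup is against the root file handle, the root has no parent in the inode table, so the parent pointer is NULL. The following is a minimal sketch of the special case described above, not the committed patch; the helper name is made up, the resolvedloc field names are taken from the gdb session later in this report, and the usual glusterfs/nfsx headers are assumed to be in scope.

/* Sketch only (not patch 3355): when the directory whose ".." is being
 * looked up is the root, there is no parent inode in the itable, so fall
 * back to the root inode itself instead of dereferencing a NULL parent. */
static inode_t *
nfs3_parentdir_inode (nfs3_call_state_t *cs)
{
        /* resolvedloc.parent is NULL for the root directory (see gdb output) */
        if (cs->resolvedloc.parent == NULL)
                return inode_ref (cs->resolvedloc.inode->table->root);

        return inode_ref (cs->resolvedloc.parent);
}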
Log attached.
Crash is fixed but vm is not getting powered on.

Created attachment 212 [details]
trace for failed vm power on test.
Screen shot of ESX system logs shows relevant error messages. Trying to make sense of them.

Created attachment 213 [details]
Created attachment 214 [details]
First sign of trouble in testvm trace file:
####### lookup for dir testvm in root dir#############################
2543 54.569905 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0x3a4f3b4f/testvm
2544 54.570570 192.168.1.109 -> 192.168.1.181 NFS V3 LOOKUP Reply (Call In 2543), FH:0xc3046fa2
2545 54.570773 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0xc3046fa2/shared.vmft
2546 54.571484 192.168.1.109 -> 192.168.1.181 NFS V3 LOOKUP Reply (Call In 2545) Error:NFS3ERR_NOENT
######## lookup for (testvm fh, "..") to get root fh ##################
2551 54.604868 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0xc3046fa2/..
######## returns a file handle different from the one used in the first lookup ###########
2552 54.605723 192.168.1.109 -> 192.168.1.181 NFS V3 LOOKUP Reply (Call In 2551), FH:0xf404a7cc
################ Next bug: why should this different fh still map to the root fh, the readdir is actually ######
################ returning contents of root dir #################
2553 54.606584 192.168.1.181 -> 192.168.1.109 NFS V3 READDIRPLUS Call, FH:0xf404a7cc
2558 54.607858 192.168.1.109 -> 192.168.1.181 NFS V3 READDIRPLUS Reply (Call In 2553) ubu2 . testvm ubu vm2 .. ubuntu-10.04-desktop-amd64.iso
2559 54.608271 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0xf404a7cc/ubu2
2560 54.608992 192.168.1.109 -> 192.168.1.181 NFS V3 LOOKUP Reply (Call In 2559), FH:0xc30407a3
2561 54.609193 192.168.1.181 -> 192.168.1.109 NFS V3 LOOKUP Call, DH:0xf404a7cc/testvm
Fun!
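The first committed patch ("Funge . and .. ino/gen in readdir of root") targets the readdirp half of this behaviour. A rough sketch of that idea follows; it is not the patch itself, the helper name is hypothetical, ino 1 is taken from the root entry seen in the log above, and the glusterfs nfsx headers are assumed to be in scope.

/* Sketch of the readdirp-side idea, not the committed change: when replying
 * to READDIRPLUS on the root file handle, report the root's own ino for "."
 * and ".." so the client does not treat ".." as a different directory. */
static void
nfs3_funge_root_dot_dotdot (struct nfs3_fh *dirfh, gf_dirent_t *entry)
{
        if (!nfs3_fh_is_root_fh (dirfh))
                return;

        if ((strcmp (entry->d_name, ".") == 0) ||
            (strcmp (entry->d_name, "..") == 0))
                entry->d_ino = 1;  /* root ino, as seen in the log above */
}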
###########################################
############# GDB OUTPUT ##################
###########################################
Breakpoint 1, nfs3svc_lookup_parentdir_cbk (frame=0x26016f8, cookie=0x24c20c0, this=0x24c0fc0, op_ret=0, op_errno=2, inode=0x24c3bb0,
buf=0x7f6a259a99a0, xattr=0x0, postparent=0x7f6a259a9930) at nfs3.c:887
887 struct nfs3_fh newfh = {{0}, };
Current language: auto
The current source language is "auto; currently c".
(gdb) n
888 nfsstat3 status = NFS3_OK;
(gdb) n
889 nfs3_call_state_t *cs = NULL;
(gdb)
891 cs = frame->local;
(gdb)
892 if (op_ret == -1) {
(gdb)
897 if (!nfs3_fh_is_root_fh (&cs->fh))
(gdb) p cs->fh
$1 = {ident = ":O", hashcount = 1, xlatorid = 0, gen = 5474837392516972579, ino = 204821, entryhash = {257, 0 <repeats 20 times>}}
(gdb) p cs->resolvefh
$2 = {ident = ":O", hashcount = 1, xlatorid = 0, gen = 5474837392516972579, ino = 204821, entryhash = {257, 0 <repeats 20 times>}}
############## the two file handles are same as expected ####################
(gdb) n
898 nfs3_fh_build_parent_fh (&cs->fh, buf, &newfh);
############# BUT not getting recognized as root fh #########################
(gdb) p cs->resolvedloc
$3 = {path = 0x2600a00 "/", name = 0x2600a01 "", ino = 1, inode = 0x24c3bb0, parent = 0x0}
############ resolved loc is correctly pointing to the root dir #############
92 int
93 nfs3_fh_is_root_fh (struct nfs3_fh *fh)
94 {
95 if (!fh)
96 return 0;
97
98 if (fh->hashcount == 0)
########### BUGGY ROOT FH CHECKING CONDITION ####################
(gdb)
99 return 1;
100
101 return 0;
102 }
103
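The check above treats only handles with hashcount == 0 as the root handle, which is why the handle in this session is not recognized as root even though its resolvedloc correctly points at "/". One possible workaround, sketched here as an assumption rather than the committed fix, is to decide "is this the root?" from the inode the handle actually resolved to instead of from the raw handle fields; the helper name is hypothetical.

/* Sketch only: use the resolved inode, which the gdb session above shows
 * correctly pointing at the root, rather than the hashcount of the fh. */
static int
nfs3_resolved_to_root (nfs3_call_state_t *cs)
{
        inode_t *inode = cs->resolvedloc.inode;

        if (!inode)
                return 0;

        return (inode == inode->table->root);
}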
PATCH: http://patches.gluster.com/patch/3354 in master (nfs3: Funge . and .. ino/gen in readdir of root)
PATCH: http://patches.gluster.com/patch/3355 in master (nfs3: Special-case the lookup for parent dir of root)

Regression Test

NFSx crashed because it did not handle a particular style of file handles that vmkernel/vmware esx can send in nfs lookup requests.

Test Case

1. Create a simple posix+nfsx volume file (a sketch follows this list).
2. On a Windows machine, start the vsphere client and connect to an ESX server.
3. In the Configuration->Storage tab, create an NFS datastore that mounts the nfsx-exported volume.
4. That's it: if the process of creating the datastore completes without any errors in the vsphere UI or a crash in nfsx, the test is a success.
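For step 1, a minimal sketch of what such a posix+nfsx volume file might look like; the export directory is a placeholder and the nfs/server translator type is an assumption for the nfs-alpha (nfsx) build, kept to the bare minimum for this test.

volume posix
  type storage/posix
  option directory /data/export        # placeholder export path
end-volume

volume nfsx
  type nfs/server                      # assumed nfsx server translator type
  subvolumes posix
end-volume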
Created attachment 208 [details]