Bug 762693 (GLUSTER-961)

Summary: Unmount with invalid export crashes nfsx
Product: [Community] GlusterFS
Reporter: Shehjar Tikoo <shehjart>
Component: nfs
Assignee: Shehjar Tikoo <shehjart>
Severity: medium
Priority: low
Version: nfs-alpha
CC: gluster-bugs, lakshmipathi
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Regression: RTP
Mount Type: nfs

Description Shehjar Tikoo 2010-05-27 02:02:14 EDT
Backtrace on crash after restart of nfsx:
Breakpoint 1, __mnt3svc_umount (ms=0xa9d770, dirpath=0x7fc95611c570 "/distribute", hostname=0x7fc95611c970 "") at mount3.c:505
505     {
(gdb) n
510             if ((!ms) || (!dirpath) || (!hostname))
(gdb) n
513             if (list_empty (&ms->mountlist))
(gdb) n
516             list_for_each_entry (me, &ms->mountlist, mlist) {
(gdb) n
517                     if (dirpath[0] == '/')
(gdb) n
521                     if ((strcmp (me->exname, exname) == 0) &&
(gdb) n
517                     if (dirpath[0] == '/')
(gdb) n
521                     if ((strcmp (me->exname, exname) == 0) &&
(gdb) n
516             list_for_each_entry (me, &ms->mountlist, mlist) {
(gdb) n
526             if (!me)
(gdb) p *me
$1 = {mlist = {next = 0xb64f60, prev = 0xb64f60}, 
  exname = '\000' <repeats 32 times>, "!\000\000\000\000\000\000\000protocol-version\000\000\000\000\000\000\000\000!\000\000\000\000\000\000\000\200ة", '\000' <repeats 21 times>, "Q\000\000\000\000\000\000\000P\354\251\000\000\000\000\000P$\254\000\000\000\000\000\365\064\376K\000\000\000\000\354\371\001\000\000\000\000\000\200\060\274T\311\177\000\000\220\277\251", '\000' <repeats 21 times>, "P\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\001\000\000\000\004", '\000' <repeats 11 times>, "{\235\275T\311\177\000\000\001\000\000\000\001", '\000' <repeats 11 times>, "Q\000\000\000\000\000\000\000 \000\000\000\002\000\000\000\001\000\000\000\006\000\000\000\020\000\000\000t, f\260ة", '\000' <repeats 21 times>..., 
  hostname = "\000\000\000\000\000\000\000\000\070?\345Q\311\177\000\000(?\345Q\311\177\000\000\030?\345Q\311\177\000\000\b?\345Q\311\177", '\000' <repeats 18 times>, "X?\345Q\311\177", '\000' <repeats 66 times>, "H?\345Q\311\177", '\000' <repeats 202 times>, "H>\345Q\311\177\000\000@\200\304Q\311\177\000\000p\241\304Q\311\177\000\000\t\000\034\000\000\000\000\000\020˩\000\000\000\000\000\003\000\000\000\000\000\000\000\270ީ", '\000' <repeats 21 times>"\260, ө\000\000\000\000\000\006\000\000\000k\000\000\000\a\000\000\000\t\000\000\000\220\202\304Q\311\177\000\000Ђ\304Q\311\177\000\000\274\203\304Q\311\177\000\000\001\000\000\000\016@", '\000' <repeats 26 times>"\206, \230\304Q\311\177\000\000\060ѩ\000\000\000\000\000\000\200\304Q\311\177\000\000\210E\345Q\311\177\000\000\000@\305Q\311\177\000\000\340\363\036V\311\177\000\000"...}
(gdb) n
529             gf_log (GF_MNT, GF_LOG_DEBUG, "Unmounting: dir %s, host: %s",
(gdb) n
531             list_del (&me->mlist);
(gdb) n
532             FREE (me);
(gdb) n

Program received signal SIGABRT, Aborted.
0x00007fc955614a75 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007fc955614a75 in raise () from /lib/libc.so.6
#1  0x00007fc9556185c0 in abort () from /lib/libc.so.6
#2  0x00007fc95564e4fb in ?? () from /lib/libc.so.6
#3  0x00007fc9556585b6 in ?? () from /lib/libc.so.6
#4  0x00007fc95565ee53 in free () from /lib/libc.so.6
#5  0x00007fc953d05312 in __mnt3svc_umount (ms=<value optimized out>, dirpath=<value optimized out>, hostname=0x7fc95611c970 "") at mount3.c:532
#6  0x00007fc953d053c0 in mnt3svc_umount (ms=0xa9d770, dirpath=<value optimized out>, hostname=<value optimized out>) at mount3.c:549
#7  0x00007fc953d05e77 in mnt3svc_umnt (req=0xb6fbb0) at mount3.c:597
#8  0x00007fc953ae40da in rpcsvc_handle_rpc_call (conn=0xaa11f0) at rpcsvc.c:1876
#9  0x00007fc953ae47b1 in rpcsvc_record_update_state (conn=0xaa11f0, dataread=0) at rpcsvc.c:2356
#10 0x00007fc953ae4b58 in rpcsvc_conn_data_handler (fd=<value optimized out>, idx=11298, data=0xaa11f0, poll_in=-1, poll_out=0, poll_err=1432811584) at rpcsvc.c:2528
#11 0x00007fc955db584d in event_dispatch_epoll_handler (event_pool=0xa97fe0) at event.c:804
#12 event_dispatch_epoll (event_pool=0xa97fe0) at event.c:867
#13 0x00007fc953ae6012 in rpcsvc_stage_proc (arg=<value optimized out>) at rpcsvc.c:64
#14 0x00007fc9559699ca in start_thread () from /lib/libpthread.so.0
#15 0x00007fc9556c769d in clone () from /lib/libc.so.6
#16 0x0000000000000000 in ?? ()
Comment 1 Shehjar Tikoo 2010-05-27 02:05:55 EDT
Steps to reproduce:

Say the export is /distribute.

1. In the vSphere client, create an NFS datastore that mounts this export.

2. Stop the GlusterFS process running nfsx.

3. Edit the nfsx volfile: rename the distribute volume to dist and update the subvolumes line in nfsx to match. The export will now be /dist.

4. Start the glusterfs process with the nfsx volfile.

5. Mount the new export on a Linux NFS client from the command line:
$ mount <srv>:/dist /mnt

6. In vSphere, unmount the datastore created earlier.

The nfs translator crashes in the unmount path.
Comment 2 Shehjar Tikoo 2010-05-27 04:34:01 EDT
Suppose the export is /distribute and is mounted as such by the ESX NFS client. nfsx is then restarted, but this time the export name is changed to /dist.

After the restart, unmounting from VMware crashes nfsx because VMware sends the old export name in the unmount request, and nfsx does not handle an unmount for an export that is not in its mount list. This was not seen until now because the Linux NFS client does not send an unmount request when it sees that the server has restarted. I think it assumes that the export names may have changed after the restart.
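
The gdb session in the description shows why the existing check does not help: when list_for_each_entry runs to completion without a match, the cursor `me` is left pointing at the container computed from the list head rather than at NULL, so the `if (!me)` test at line 526 falls through and list_del/FREE operate on memory that was never a mount entry. A minimal sketch of one way to guard against this follows. It is not the actual mount3.c code or the committed patch; the list helpers, struct layout, and the names mount_add and mount_remove are hypothetical, and stripping the leading '/' from dirpath is an assumption about what line 517 does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Minimal circular doubly linked list in the style of the one seen in the
 * trace. Everything here is illustrative, not GlusterFS source. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

struct mountentry {
        struct list_head mlist;
        char exname[64];
        char hostname[64];
};

void list_init(struct list_head *h) { h->next = h->prev = h; }

void list_add_tail(struct list_head *n, struct list_head *h)
{
        n->prev = h->prev;
        n->next = h;
        h->prev->next = n;
        h->prev = n;
}

void list_del(struct list_head *n)
{
        assert(n->next != NULL && n->prev != NULL); /* must be linked */
        n->prev->next = n->next;
        n->next->prev = n->prev;
}

/* Hypothetical helper: record a mount. Returns 0 on success, -1 on failure. */
int mount_add(struct list_head *mountlist, const char *exname,
              const char *hostname)
{
        struct mountentry *me = calloc(1, sizeof(*me));
        if (!me)
                return -1;
        snprintf(me->exname, sizeof(me->exname), "%s", exname);
        snprintf(me->hostname, sizeof(me->hostname), "%s", hostname);
        list_add_tail(&me->mlist, mountlist);
        return 0;
}

/* Hypothetical helper: remove a mount. Returns 0 if an entry was removed,
 * -1 if no entry matches (for example, a stale export name). */
int mount_remove(struct list_head *mountlist, const char *dirpath,
                 const char *hostname)
{
        struct list_head *pos;
        struct mountentry *found = NULL;
        const char *exname;

        if (!mountlist || !dirpath || !hostname)
                return -1;

        /* Assumption: the export name is dirpath without its leading '/'. */
        exname = (dirpath[0] == '/') ? dirpath + 1 : dirpath;

        for (pos = mountlist->next; pos != mountlist; pos = pos->next) {
                struct mountentry *me =
                        container_of(pos, struct mountentry, mlist);
                if (strcmp(me->exname, exname) == 0 &&
                    strcmp(me->hostname, hostname) == 0) {
                        found = me; /* set only on a real match */
                        break;
                }
        }

        if (!found)
                return -1; /* unknown export: nothing to unlink or free */

        list_del(&found->mlist);
        free(found);
        return 0;
}
```

With a list that holds only a mount of dist, calling mount_remove with "/distribute" returns -1 and leaves the list untouched, which is the behaviour wanted for the stale unmount request sent by ESX.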
Comment 3 Shehjar Tikoo 2010-05-31 00:20:38 EDT
The problem is not only with vMotion but also with simple self-heal scenarios. Even with just two replicas and a dd run there are two problems.

Setup: an NFS export of a 2-replica volume, no perf translators, just io-threads on posix.

Scenario 1:
1. Start a dd run on the NFS mount.
2. Bring down the replica that is not the NFS server (the NFS server is itself a replica).
3. After a few seconds, restart the downed replica.
4. dd finishes.
5. The file on the downed replica is the same size as on the NFS server replica, but it is corrupted.

Scenario 2:
1. Start a dd run on the NFS mount.
2. Bring down the replica that is not the NFS server (the NFS server is itself a replica).
3. dd finishes.
4. Bring up the downed replica.
5. Do ls -lRh on the mount point.
6. Check the file on the downed replica. It has not been self-healed.
Comment 4 Shehjar Tikoo 2010-05-31 00:21:35 EDT
Comment 3 is not applicable to this bug. Typed in the wrong browser tab.
Comment 5 Anand Avati 2010-06-01 00:24:01 EDT
PATCH: http://patches.gluster.com/patch/3357 in master (mount3: Handle unmount for unknown volume names)
Comment 6 Shehjar Tikoo 2010-06-02 02:01:35 EDT
Regression Test
The crash is caused by nfsx trying to remove a non-existent mount from its mount list.

Test Case
1. Create a posix+nfsx volfile and start glusterfsd.

2. Start the vSphere client on a Windows machine and connect to the ESX server.

3. If the steps in comment 2 do not crash the nfs server, the test is a success.
Comment 7 Lakshmipathi G 2010-07-29 04:11:14 EDT
Verified with nfs-beta-rc10.
Regression url - http://test.gluster.com/show_bug.cgi?id=91