Bug 1372283
Summary: [GSS] NFS Sub-directory mount not working on solaris10 client
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: gluster-nfs
Version: rhgs-3.1
Hardware: x86_64
OS: Solaris
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Bipin Kunal <bkunal>
Assignee: Bipin Kunal <bkunal>
QA Contact: surabhi <sbhaloth>
CC: amukherj, asrivast, bkunal, jthottan, mbenjamin, ndevos, rcyriac, rhinduja, rhs-bugs, rnalakka, skoduri, storage-qa-internal
Keywords: Triaged
Target Milestone: ---
Target Release: RHGS 3.3.0
Fixed In Version: glusterfs-3.8.4-23
Doc Type: Enhancement
Doc Text: Subdirectories of a volume can now be mounted on Solaris 10 clients using WebNFS.
Clones: 1426667, 1445594 (view as bug list)
Bug Depends On: 1426667, 1451224
Bug Blocks: 1369781, 1417145
Type: Bug
Last Closed: 2017-09-21 04:28:23 UTC
Description (Bipin Kunal, 2016-09-01 10:27:19 UTC)
nfs.log gives error : [2016-08-31 10:55:11.554873] E [nfs3.c:341:__nfs3_get_volume_id] (-->/usr/lib64/glusterfs/3.7.9/xlator/nfs/server.so(nfs3_lookup_reply+0x4d) [0x7fe09db206ed] -->/usr/lib64/glusterfs/3.7.9/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x78) [0x7fe09db1f1f8] -->/usr/lib64/glusterfs/3.7.9/xlator/nfs/server.so(__nfs3_get_volume_id+0xae) [0x7fe09db1f13e] ) 0-nfs-nfsv3: invalid argument: xl [Invalid argument] ============================================================================== On the server : # rpcinfo -p localhost program vers proto port service 100000 4 tcp 111 portmapper 100000 3 tcp 111 portmapper 100000 2 tcp 111 portmapper 100000 4 udp 111 portmapper 100000 3 udp 111 portmapper 100000 2 udp 111 portmapper 100005 1 tcp 38466 mountd 100003 3 tcp 2049 nfs 100021 4 tcp 38468 nlockmgr 100227 3 tcp 2049 nfs_acl 100024 1 udp 58999 status 100024 1 tcp 50032 status 100021 1 udp 1018 nlockmgr 100021 1 tcp 600 nlockmgr # rpcinfo | grep nfs 100003 3 tcp 0.0.0.0.8.1 nfs superuser 100227 3 tcp 0.0.0.0.8.1 nfs_acl superuser On the solaris client : # rpcinfo -p localhost program vers proto port service 100000 4 tcp 111 rpcbind 100000 3 tcp 111 rpcbind 100000 2 tcp 111 rpcbind 100000 4 udp 111 rpcbind 100000 3 udp 111 rpcbind 100000 2 udp 111 rpcbind 100024 1 udp 32773 status 100024 1 tcp 32771 status 100133 1 udp 32773 100133 1 tcp 32771 1073741824 1 tcp 32772 100021 1 udp 4045 nlockmgr 100021 2 udp 4045 nlockmgr 100021 3 udp 4045 nlockmgr 100021 4 udp 4045 nlockmgr 100021 1 tcp 4045 nlockmgr 100021 2 tcp 4045 nlockmgr 100021 3 tcp 4045 nlockmgr 100021 4 tcp 4045 nlockmgr 100229 1 tcp 32775 metad 100229 2 tcp 32775 metad 100001 2 udp 32777 rstatd 100001 3 udp 32777 rstatd 100001 4 udp 32777 rstatd 100422 1 tcp 32776 100422 2 tcp 32776 100242 1 tcp 32777 metamedd 100230 1 tcp 32778 metamhd 100068 2 udp 32778 100068 3 udp 32778 100068 4 udp 32778 100068 5 udp 32778 100083 1 tcp 32779 100002 2 tcp 32780 rusersd 100002 3 tcp 32780 rusersd 100002 2 udp 32779 rusersd 100002 3 udp 32779 rusersd 100011 1 udp 32780 rquotad 300598 1 udp 32793 300598 1 tcp 32781 805306368 1 udp 32793 805306368 1 tcp 32781 100249 1 udp 32800 100249 1 tcp 32782 100003 2 udp 2049 nfs 100003 3 udp 2049 nfs 100227 2 udp 2049 nfs_acl 100227 3 udp 2049 nfs_acl 100003 2 tcp 2049 nfs 100003 3 tcp 2049 nfs 100003 4 tcp 2049 nfs 100227 2 tcp 2049 nfs_acl 100227 3 tcp 2049 nfs_acl # rpcinfo -T tcp dhcp6-132.gsslab.pnq.redhat.com nfs rpcinfo: RPC: Program/version mismatch; low version = 0, high version = 0 program 100003 version 0 is not available # rpcinfo | grep nfs 100003 2 udp 0.0.0.0.8.1 nfs 1 100003 3 udp 0.0.0.0.8.1 nfs 1 100227 2 udp 0.0.0.0.8.1 nfs_acl 1 100227 3 udp 0.0.0.0.8.1 nfs_acl 1 100003 2 tcp 0.0.0.0.8.1 nfs 1 100003 3 tcp 0.0.0.0.8.1 nfs 1 100003 4 tcp 0.0.0.0.8.1 nfs 1 100227 2 tcp 0.0.0.0.8.1 nfs_acl 1 100227 3 tcp 0.0.0.0.8.1 nfs_acl 1 On RHEL client : # rpcinfo -p localhost program vers proto port service 100000 4 tcp 111 portmapper 100000 3 tcp 111 portmapper 100000 2 tcp 111 portmapper 100000 4 udp 111 portmapper 100000 3 udp 111 portmapper 100000 2 udp 111 portmapper 100024 1 udp 52190 status 100024 1 tcp 34675 status 100005 1 udp 20048 mountd 100005 1 tcp 20048 mountd 100005 2 udp 20048 mountd 100005 2 tcp 20048 mountd 100005 3 udp 20048 mountd 100005 3 tcp 20048 mountd 100021 1 udp 50508 nlockmgr 100021 3 udp 50508 nlockmgr 100021 4 udp 50508 nlockmgr 100021 1 tcp 56749 nlockmgr 100021 3 tcp 56749 nlockmgr 100021 4 tcp 56749 nlockmgr # rpcinfo | grep nfs 100003 3 tcp6 ::.8.1 nfs 
superuser 100003 4 tcp6 ::.8.1 nfs superuser 100227 3 tcp6 ::.8.1 nfs_acl superuser 100003 3 udp6 ::.8.1 nfs superuser 100003 4 udp6 ::.8.1 nfs superuser 100227 3 udp6 ::.8.1 nfs_acl superuser Maybe it is Day-1 issue. Niels shall be able to confirm. (gdb) bt #0 __nfs3_get_volume_id (nfs3=0x7f40b405d8b0, xl=0x0, volumeid=0x7f40ab5fc720 "") at nfs3.c:336 #1 0x00007f40b887b1f8 in nfs3_request_xlator_deviceid (rq=<value optimized out>) at nfs3.c:706 #2 0x00007f40b887c6ed in nfs3_lookup_reply (req=0x7f40b818006c, stat=NFS3ERR_NOENT, newfh=0x0, stbuf=0x0, postparent=<value optimized out>) at nfs3.c:1236 #3 0x00007f40b88860a2 in nfs3_lookup (req=0x7f40b818006c, fh=<value optimized out>, fhlen=<value optimized out>, name=<value optimized out>) at nfs3.c:1528 #4 0x00007f40b8886516 in nfs3svc_lookup (req=0x7f40b818006c) at nfs3.c:1558 #5 0x00007f40c68d71a4 in rpcsvc_handle_rpc_call (svc=<value optimized out>, trans=<value optimized out>, msg=0x7f40a40016f0) at rpcsvc.c:705 #6 0x00007f40c68d73a3 in rpcsvc_notify (trans=0x7f40a40074d0, mydata=<value optimized out>, event=<value optimized out>, data=0x7f40a40016f0) at rpcsvc.c:799 #7 0x00007f40c68d8af8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:546 #8 0x00007f40bb6587ed in socket_event_poll_in (this=0x7f40a40074d0) at socket.c:2353 #9 0x00007f40bb659abd in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x7f40a40074d0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466 #10 0x00007f40c6b82080 in event_dispatch_epoll_handler (data=0x7f40b4068be0) at event-epoll.c:575 #11 event_dispatch_epoll_worker (data=0x7f40b4068be0) at event-epoll.c:678 #12 0x00007f40c5bf8aa1 in start_thread (arg=0x7f40ab5fe700) at pthread_create.c:301 #13 0x00007f40c5561aad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 (gdb) f 3 #3 0x00007f40b88860a2 in nfs3_lookup (req=0x7f40b818006c, fh=<value optimized out>, fhlen=<value optimized out>, name=<value optimized out>) at nfs3.c:1528 1528 nfs3_lookup_reply (req, stat, NULL, NULL, NULL); (gdb) p name $14 = <value optimized out> (gdb) f 4 #4 0x00007f40b8886516 in nfs3svc_lookup (req=0x7f40b818006c) at nfs3.c:1558 1558 ret = nfs3_lookup (req, &fh, args.what.dir.data.data_len, name); (gdb) p name $15 = "replicate/bipin2", '\000' <repeats 128 times>"\217, X\272\306@\177\000\000~_\265\306@\177\000\000\217X\272\306@\177\000\000\235X\272\306@\177\000\000\b\000\000\000\000\000\000\000:\003\261\306@\177", '\000' <repeats 74 times>, "O\002", '\000' <repeats 22 times>, "\020PA\307@\177", '\000' <repeats 111 times>, "199", '\000' <repeats 40 times>, "\020\000\000\000\060\000\000\000\360\317_\253@\177\000\000\060\317_\253@\177", '\000' <repeats 61 times>, "38924", '\000' <repeats 40 times>, "\030\000\000\000\060\000\000\000\260\320_\253@\177\000\000\360\317_\253@\177", '\000' <repeats 66 times>"\360, \315_\253@\177\000\000`\320_\253@\177\000\000`\320_\253@\177\000\000\335\005]\305@\177\000\000"... 
(gdb) $16 = "replicate/bipin2", '\000' <repeats 128 times>"\217, X\272\306@\177\000\000~_\265\306@\177\000\000\217X\272\306@\177\000\000\235X\272\306@\177\000\000\b\000\000\000\000\000\000\000:\003\261\306@\177", '\000' <repeats 74 times>, "O\002", '\000' <repeats 22 times>, "\020PA\307@\177", '\000' <repeats 111 times>, "199", '\000' <repeats 40 times>, "\020\000\000\000\060\000\000\000\360\317_\253@\177\000\000\060\317_\253@\177", '\000' <repeats 61 times>, "38924", '\000' <repeats 40 times>, "\030\000\000\000\060\000\000\000\260\320_\253@\177\000\000\360\317_\253@\177", '\000' <repeats 66 times>"\360, \315_\253@\177\000\000`\320_\253@\177\000\000`\320_\253@\177\000\000\335\005]\305@\177\000\000"... (gdb) c Continuing. [Switching to Thread 0x7f40ba032700 (LWP 9165)] Breakpoint 5, __nfs3_get_volume_id (nfs3=0x7f40b405d8b0, xl=0x0, volumeid=0x7f40ba030720 "") at nfs3.c:336 336 { (gdb) b nfs3_lookup Breakpoint 6 at 0x7f40b8885dc0: file nfs3.c, line 1489. (gdb) c Continuing. Breakpoint 6, nfs3_lookup (req=0x7f40b8180b78, fh=0x7f40ba031b10, fhlen=0, name=0x7f40ba030b10 "replicate/bipin2") at nfs3.c:1489 1489 { (gdb) n 1496 GF_VALIDATE_OR_GOTO (GF_NFS3, req, out); (gdb) 1489 { (gdb) 1496 GF_VALIDATE_OR_GOTO (GF_NFS3, req, out); (gdb) 1497 GF_VALIDATE_OR_GOTO (GF_NFS3, fh, out); (gdb) 1498 GF_VALIDATE_OR_GOTO (GF_NFS3, name, out); (gdb) 1500 nfs3_log_fh_entry_call (rpcsvc_request_xid (req), "LOOKUP", fh, (gdb) [Switching to Thread 0x7f40ab5fe700 (LWP 9180)] Breakpoint 5, __nfs3_get_volume_id (nfs3=0x7f40b405d8b0, xl=0x0, volumeid=0x7f40ab5fc720 "") at nfs3.c:336 336 { (gdb) c Continuing. Breakpoint 6, nfs3_lookup (req=0x7f40b818006c, fh=0x7f40ab5fdb10, fhlen=0, name=0x7f40ab5fcb10 "replicate/bipin2") at nfs3.c:1489 1489 { (gdb) n 1496 GF_VALIDATE_OR_GOTO (GF_NFS3, req, out); (gdb) 1489 { (gdb) 1496 GF_VALIDATE_OR_GOTO (GF_NFS3, req, out); (gdb) 1497 GF_VALIDATE_OR_GOTO (GF_NFS3, fh, out); (gdb) l 1492 int ret = -EFAULT; 1493 struct nfs3_state *nfs3 = NULL; 1494 nfs3_call_state_t *cs = NULL; 1495 1496 GF_VALIDATE_OR_GOTO (GF_NFS3, req, out); 1497 GF_VALIDATE_OR_GOTO (GF_NFS3, fh, out); 1498 GF_VALIDATE_OR_GOTO (GF_NFS3, name, out); 1499 1500 nfs3_log_fh_entry_call (rpcsvc_request_xid (req), "LOOKUP", fh, 1501 name); (gdb) 1502 nfs3_validate_nfs3_state (req, nfs3, stat, nfs3err, ret); 1503 if (nfs3_solaris_zerolen_fh (fh, fhlen)) 1504 nfs3_funge_solaris_zerolen_fh (nfs3, fh, name, stat, nfs3err); 1505 else 1506 nfs3_validate_gluster_fh (fh, stat, nfs3err); 1507 nfs3_validate_strlen_or_goto (name, NFS_NAME_MAX, nfs3err, stat, ret); 1508 nfs3_map_fh_to_volume (nfs3, fh, req, vol, stat, nfs3err); 1509 nfs3_volume_started_check (nfs3, vol, ret, out); 1510 nfs3_handle_call_state_init (nfs3, cs, req, vol, stat, nfs3err); 1511 (gdb) n 1498 GF_VALIDATE_OR_GOTO (GF_NFS3, name, out); (gdb) 1500 nfs3_log_fh_entry_call (rpcsvc_request_xid (req), "LOOKUP", fh, (gdb) [Switching to Thread 0x7f40ba032700 (LWP 9165)] Breakpoint 5, __nfs3_get_volume_id (nfs3=0x7f40b405d8b0, xl=0x0, volumeid=0x7f40ba030720 "") at nfs3.c:336 336 { (gdb) b nfs3_solaris_zerolen_fh Breakpoint 7 at 0x7f40b887a940: file nfs3.c, line 431. (gdb) c Continuing. Program received signal SIGPIPE, Broken pipe. 0x00007f40c5559b97 in __libc_writev (fd=<value optimized out>, vector=<value optimized out>, count=<value optimized out>) at ../sysdeps/unix/sysv/linux/writev.c:56 56 result = INLINE_SYSCALL (writev, 3, fd, CHECK_N (vector, count), count); (gdb) c Continuing. 
Breakpoint 6, nfs3_lookup (req=0x7f40b8180b78, fh=0x7f40ba031b10, fhlen=0, name=0x7f40ba030b10 "replicate/bipin2") at nfs3.c:1489 1489 { (gdb) c Continuing. Breakpoint 7, nfs3_solaris_zerolen_fh (fh=0x7f40ba031b10, fhlen=0) at nfs3.c:431 431 if (!fh) (gdb) l 426 } 427 428 int 429 nfs3_solaris_zerolen_fh (struct nfs3_fh *fh, int fhlen) 430 { 431 if (!fh) 432 return 0; 433 434 if (nfs3_fh_validate (fh)) 435 return 0; (gdb) 436 437 if (fhlen == 0) 438 return 1; 439 440 return 0; 441 } 442 443 444 /* Function pointer that represents the generic prototypes of functions used 445 * to serialize NFS3 message structures into the XDR format. (gdb) b nfs3_fh_validate Breakpoint 8 at 0x7f40b8879050: file nfs3-fh.c, line 32. (gdb) c Continuing. Breakpoint 8, nfs3_fh_validate (fh=0x7f40ba031b10) at nfs3-fh.c:32 32 if (!fh) (gdb) n 35 if (fh->ident[0] != GF_NFSFH_IDENT0) (gdb) 41 if (fh->ident[2] != GF_NFSFH_IDENT2) (gdb) nfs3_fh_validate (fh=0x7f40ba031b10) at nfs3-fh.c:48 48 } (gdb) nfs3_solaris_zerolen_fh (fh=<value optimized out>, fhlen=0) at nfs3.c:441 441 } (gdb) p fh->ident[2] value has been optimized out (gdb) p fh->ident[1] value has been optimized out (gdb) n nfs3_lookup (req=0x7f40b8180b78, fh=0x7f40ba031b10, fhlen=0, name=0x7f40ba030b10 "replicate/bipin2") at nfs3.c:1504 1504 nfs3_funge_solaris_zerolen_fh (nfs3, fh, name, stat, nfs3err); (gdb) b nfs_mntpath_to_xlator Breakpoint 9 at 0x7f40b8864685: file nfs-common.c, line 80. (gdb) c Continuing. Breakpoint 9, nfs_mntpath_to_xlator (cl=0x7f40b4015150, path=0x7f40ba030b10 "replicate/bipin2") at nfs-common.c:80 80 { (gdb) p cl->xlator $17 = (xlator_t *) 0x7f40b40110e0 (gdb) p cl->xlator->name $18 = 0x7f40b400f500 "brickrmvtest" (gdb) p cl $19 = (xlator_list_t *) 0x7f40b4015150 (gdb) p $19->next->xlator->name $20 = 0x7f40b400b210 "replicate" (gdb) p $19->next $21 = (struct xlator_list *) 0x7f40b4015210 (gdb) p $21->next->xlator->name Cannot access memory at address 0x0 (gdb) Solaris client seems to be doing lookup of path 'replicate/bipin2' -> 'volume/subdir' instead of issuing mount cmd directly on the entire path. And we seem to be having special handling for solaris client. When fhlen is zero, "nfs3_funge_solaris_zerolen_fh()"->"nfs_mntpath_to_xlator" , we are matching up paths with exportlist which contains only volume names. That is the reason lookup of volume/subdir path fails with ENOENT error. This looks like Day1 issue. We need to add support or fix our code to handle subdir paths too for solaris client. Niels, Please have a look at C #6 and let me know your opinion. From https://github.com/gluster/glusterfs/commit/ddb31110db8e1b5995d392ced988f34d2f9145d2#diff-c76fa92cce27f67d089e7d8cac4a44bdR2114 : + /* + * 1. First check if the MOUNT is for whole volume. + * i.e. __mnt3udp_get_export_volume_inode () + * 2. If NOT, then TRY for SUBDIR MOUNT. + * i.e. __mnt3udp_get_export_subdir_inode () + * 3. If a subdir is exported using nfs.export-dir, + * then the mount type would be MNT3_EXPTYPE_DIR, + * so make sure to find the proper path to be + * resolved using __volume_subdir() + * 3. Make sure subdir export is allowed. + */ It seems to be intended to be able to mount a subdir in case it is listed in nfs.export-dir. While testing more I was able to mount sub-directory and it did not need any extra patch. Steps are really weird but it worked on my setup. 
On Solaris client:
1) Try mounting the sub-directory. Expected error: "RPC: Program not registered"
2) Abort the mount.

On Gluster server:
3) Kill the gNFS process associated with the server being used for mounting: kill -9 <pid>. You can get the gNFS pid from gluster vol status <volname>.

On Solaris client:
4) Try mounting the sub-directory. Expected error: "RPC: Program not registered"
5) Abort the mount.

On Gluster server:
6) Restart the glusterd service to start the gNFS process: service glusterd restart
7) Check gluster vol status to verify that gNFS is up. You might have to wait 1-2 seconds or check the status a few times.

On Solaris client:
8) Try mounting the sub-directory. The sub-dir mount is expected to succeed.
9) Access the mount. If you cannot access it and get a connectivity issue with the server, expect that gNFS crashed.

On Gluster server:
10) Check the volume status to see the state of the gNFS process, and check nfs.log for the backtrace.
11) Restart glusterd to restart gNFS, then access the mount from the client.

As updated by Bipin, the sub-dir mount works sometimes but not always. Below are the observations from the packet trace.

1) When the sub-dir mount doesn't work:

[skoduri@skoduri ~]$ tshark -r /tmp/client.pcap "rpc.program == 100003 || rpc.program == 100005 || rpc.program == 100000"
253 20:08:52.412564 10.10.188.199 → 10.65.6.132 NFS 194 V3 LOOKUP Call, DH: 0x00000000/replicate/bipin2
269 20:08:52.637463 10.65.6.132 → 10.10.188.199 NFS 90 V3 LOOKUP Reply (Call In 253) Error: NFS3ERR_NOENT
272 20:08:52.638154 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP
287 20:08:52.863357 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 272) PROGRAM_NOT_AVAILABLE
294 20:08:53.090922 10.10.188.199 → 10.65.6.132 NFS 194 V3 LOOKUP Call, DH: 0x00000000/replicate/bipin2
301 20:08:53.316083 10.65.6.132 → 10.10.188.199 NFS 90 V3 LOOKUP Reply (Call In 294) Error: NFS3ERR_NOENT
304 20:08:53.316732 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP
318 20:08:53.542107 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 304) PROGRAM_NOT_AVAILABLE
407 20:08:58.769031 10.10.188.199 → 10.65.6.132 NFS 194 V3 LOOKUP Call, DH: 0x00000000/replicate/bipin2
409 20:08:58.994121 10.65.6.132 → 10.10.188.199 NFS 90 V3 LOOKUP Reply (Call In 407) Error: NFS3ERR_NOENT
412 20:08:58.994879 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP
416 20:08:59.220470 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 412) PROGRAM_NOT_AVAILABLE
656 20:09:09.447638 10.10.188.199 → 10.65.6.132 NFS 194 V3 LOOKUP Call, DH: 0x00000000/replicate/bipin2
665 20:09:09.673086 10.65.6.132 → 10.10.188.199 NFS 90 V3 LOOKUP Reply (Call In 656) Error: NFS3ERR_NOENT
668 20:09:09.673780 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP
675 20:09:09.899194 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 668) PROGRAM_NOT_AVAILABLE
[skoduri@skoduri ~]$

As can be seen above, the Solaris client tries to reach the MOUNT service via UDP. Since gluster-NFS does not support MOUNT over UDP by default, portmap replies with a PROGRAM_NOT_AVAILABLE error.
2) when sub-dir mount works [skoduri@skoduri ~]$ tshark -r /tmp/client2.pcap "rpc.program == 100003 || rpc.program == 100005 || rpc.program == 100000" 45 20:10:08.196895 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP 47 20:10:08.422716 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 45) PROGRAM_NOT_AVAILABLE 89 20:10:09.106495 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP 90 20:10:09.331793 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 89) PROGRAM_NOT_AVAILABLE 179 20:10:15.008324 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP 187 20:10:15.233649 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 179) PROGRAM_NOT_AVAILABLE 326 20:10:25.915413 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP 337 20:10:26.141768 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 326) PROGRAM_NOT_AVAILABLE 902 20:10:46.368657 10.10.188.199 → 10.65.6.132 NFS 194 V3 LOOKUP Call, DH: 0x00000000/replicate/bipin2 904 20:10:46.595552 10.65.6.132 → 10.10.188.199 NFS 90 V3 LOOKUP Reply (Call In 902) Error: NFS3ERR_NOENT 907 20:10:46.596331 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP 911 20:10:46.821527 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 907) Port:951 912 20:10:46.821815 10.10.188.199 → 10.65.6.132 MOUNT 82 V3 NULL Call 923 20:10:47.046824 10.65.6.132 → 10.10.188.199 MOUNT 66 V3 NULL Reply (Call In 912) 924 20:10:47.047246 10.10.188.199 → 10.65.6.132 MOUNT 174 V3 MNT Call replicate/bipin2 936 20:10:47.284963 10.65.6.132 → 10.10.188.199 MOUNT 134 V3 MNT Reply (Call In 924) But sometimes portmap seem to be responding with "951" for GETPORT call for MOUNT over UDP even though there doesn't seem any service listening on that port. From rpcinfo - [root@dhcp6-132 ~]# rpcinfo -p program vers proto port service 100000 4 tcp 111 portmapper 100000 3 tcp 111 portmapper 100000 2 tcp 111 portmapper 100000 4 udp 111 portmapper 100000 3 udp 111 portmapper 100000 2 udp 111 portmapper 100005 1 tcp 38466 mountd 100003 3 tcp 2049 nfs 100021 4 tcp 38468 nlockmgr 100227 3 tcp 2049 nfs_acl 100024 1 udp 58835 status 100024 1 tcp 50952 status 100021 1 udp 946 nlockmgr 100021 1 tcp 948 nlockmgr [root@dhcp6-132 ~]# Since this behaviour is seen just after gluster-NFS restart, I suspect if that port is opened for a while and that resulted in MNT calls getting succeeded. 
And the reason why volume mount works always is because lookup on the volume path always succeeds as updated in comment#6 84 21:08:10.300234 10.10.188.199 → 10.65.6.132 NFS 194 V3 LOOKUP Call, DH: 0x00000000/replicate/bipin2 88 21:08:10.526815 10.65.6.132 → 10.10.188.199 NFS 90 V3 LOOKUP Reply (Call In 84) Error: NFS3ERR_NOENT 91 21:08:10.527892 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP 101 21:08:10.753877 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 91) PROGRAM_NOT_AVAILABLE 114 21:08:10.981222 10.10.188.199 → 10.65.6.132 NFS 194 V3 LOOKUP Call, DH: 0x00000000/replicate/bipin2 119 21:08:11.206717 10.65.6.132 → 10.10.188.199 NFS 90 V3 LOOKUP Reply (Call In 114) Error: NFS3ERR_NOENT 122 21:08:11.207641 10.10.188.199 → 10.65.6.132 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 UDP 127 21:08:11.432815 10.65.6.132 → 10.10.188.199 Portmap 70 V2 GETPORT Reply (Call In 122) PROGRAM_NOT_AVAILABLE 548 21:08:27.230009 10.10.188.199 → 10.65.6.132 NFS 190 V3 LOOKUP Call, DH: 0x00000000/replicate 553 21:08:27.459050 10.65.6.132 → 10.10.188.199 NFS 234 V3 LOOKUP Reply (Call In 548), FH: 0xbca72b2a If I enable "nfs.mount-udp" option, subdir mount works fine. But there is an issue even after the successful mount. First GETATTR post mount results in NFS server panic. We need to fix that first I think. Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f8cd0a63700 (LWP 8064)] 0x00007f8cd9ec5a2d in nfs3_getattr_resume (carg=0x7f8cd169806c) at nfs3.c:847 847 if (ictx->generation != priv->generation) { (gdb) bt #0 0x00007f8cd9ec5a2d in nfs3_getattr_resume (carg=0x7f8cd169806c) at nfs3.c:847 #1 0x00007f8cd9eccb63 in nfs3_fh_resolve_inode_done (cs=0x7f8cd169806c, inode=<value optimized out>) at nfs3-helpers.c:3624 #2 0x00007f8cd9ecccab in nfs3_fh_resolve_inode (cs=0x7f8cd169806c) at nfs3-helpers.c:3831 #3 0x00007f8cd9eccd35 in nfs3_fh_resolve_resume (cs=0x7f8cd169806c) at nfs3-helpers.c:3863 #4 0x00007f8cd9ecce88 in nfs3_fh_resolve_root (cs=0x7f8cd169806c) at nfs3-helpers.c:3918 #5 0x00007f8cd9ec52a3 in nfs3_getattr (req=0x7f8cd97be06c, fh=0x7f8cd0a62b20) at nfs3.c:899 #6 0x00007f8cd9ec56cd in nfs3svc_getattr (req=0x7f8cd97be06c) at nfs3.c:935 #7 0x00007f8ce7f151a4 in rpcsvc_handle_rpc_call (svc=<value optimized out>, trans=<value optimized out>, msg=0x7f8cc4001810) at rpcsvc.c:705 #8 0x00007f8ce7f153a3 in rpcsvc_notify (trans=0x7f8cd4108060, mydata=<value optimized out>, event=<value optimized out>, data=0x7f8cc4001810) at rpcsvc.c:799 ---Type <return> to continue, or q <return> to quit--- #9 0x00007f8ce7f16af8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:546 #10 0x00007f8cdcc967ed in socket_event_poll_in (this=0x7f8cd4108060) at socket.c:2353 #11 0x00007f8cdcc97abd in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x7f8cd4108060, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466 #12 0x00007f8ce81c0080 in event_dispatch_epoll_handler (data=0x7f8cd4068be0) at event-epoll.c:575 #13 event_dispatch_epoll_worker (data=0x7f8cd4068be0) at event-epoll.c:678 #14 0x00007f8ce7236aa1 in start_thread (arg=0x7f8cd0a63700) at pthread_create.c:301 #15 0x00007f8ce6b9faad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 (gdb) l 842 * haven't been validated since the state change are affected. 
843 */ 844 if (inode_ctx_get(cs->resolvedloc.inode,cs->nfsx,&raw_ctx) == 0) { 845 ictx = (struct nfs_inode_ctx *)raw_ctx; 846 priv = cs->nfsx->private; 847 if (ictx->generation != priv->generation) { 848 ret = nfs_lookup (cs->nfsx, cs->vol, &nfu, 849 &cs->resolvedloc, 850 nfs3svc_getattr_lookup_cbk, cs); 851 goto check_err; (gdb) p ictx $1 = (struct nfs_inode_ctx *) 0x2 (gdb) p cs->resolvedloc.inode $2 = (inode_t *) 0x7f8cd329c108 (gdb) p raw_ctx $3 = 2 (gdb) Maybe we are not setting proper inode_ctx while creating inode in udp mount. Hello Alok, Please have a look at C #25. This is reply from customer. I would request you to get this prioritized, at least we need a timeline for this. Thanks, Bipin Kunal Created attachment 1213450 [details] Script to disable MOUNT/tcp on a Gluster/NFS server After running this script on a Gluster/NFS server, the MOUNT protocol can only be used over UDP (not TCP as standard). This might make it possible to script a test-case and not rely on the Solaris 10 NFS-client. In combination with a hacked nfsshell (running on the NFS-client), it is possible to get the same tcpdump from a Linux system. This also reproduces the segfault in Gluster/NFS. 1. get the sources from https://github.com/nixpanic/nfsshell/tree/wip/stat 2. run 'make' to compile 'nfsshell' 3. run nfssshell like: $ ./nfsshell nfs> host server.example.com nfs> mount volume/subdir nfs> stat . 4. observe Gluster/NFS segfault in /var/log/gluster/nfs.log After restarting the Gluster/NFS service, it sometimes fails to register at rpcbind (because of the segfault it did not unregister). Subsequent mounting then fails because there is no MOUNT service on UDP available. A next restart of Gluster/NFS seems to 'correct' this. The problem is most likely caused by the inode context that gets set in libgfapi through glfs_resolve_component() and glfs_resolve_at() (called from xlators/nfs/server/src/mount3.c:__mnt3udp_get_export_subdir_inode). gfapi is not involved in the rest of the revalidation in this case, and the inode context is not used afterwards. glfs_resolve_at() is an internal function, and could get a parameter that prevents calling inode_ctx_set() in glfs_resolve_component(). I'll try to experiment a little with that and see how that works out. Created attachment 1213554 [details] Fix for segfault when doing subdir mounts over UDP This patch seems to fix the segfault for me when reproducing with nfsshell (see comment #27). I have not decided yet if this approach is acceptable for merging upstream, and have not sent it to review.gluster.org either. Please test this on a RHGS-3.1 build with a Solaris client, thanks! Niels, I tried using the patch given by you on top of 3.1 and 3.1.1 but I get hunk issue for both. Could you please re verify the RHGS/gluster version to be used for this patch. Created attachment 1214391 [details]
Rebased patch on top of RHGS-3.1.3
Please try this one. The other version was written on top of the upstream master branch.
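For context on what the attached fix addresses: the segfault analysed in the gdb session above happens because the value stored in the inode context (raw_ctx == 2) is not a pointer the NFS xlator ever set, so casting it to struct nfs_inode_ctx * and dereferencing it faults. The snippet below is only an illustration of that hazard, written as a guarded version of the quoted nfs3_getattr_resume() block; it is not the attached patch (which instead avoids setting the context from gfapi), and it assumes the GlusterFS-internal types and calls visible in the gdb output (inode_ctx_get, struct nfs_inode_ctx, nfs_lookup), so treat it as a sketch rather than working code.

        uint64_t               raw_ctx = 0;
        struct nfs_inode_ctx  *ictx    = NULL;
        struct nfs_state      *priv    = NULL;   /* type assumed for cs->nfsx->private */

        if (inode_ctx_get (cs->resolvedloc.inode, cs->nfsx, &raw_ctx) == 0) {
                ictx = (struct nfs_inode_ctx *)(uintptr_t)raw_ctx;
                priv = cs->nfsx->private;
                /* After a UDP subdir mount, raw_ctx was observed to be 2, i.e. a
                 * value gfapi stored rather than a pointer set by the NFS xlator;
                 * blindly dereferencing it is what crashes.  A guard would at
                 * least have to check that the value looks like a pointer: */
                if (ictx != NULL && ((uintptr_t)ictx & 0x7) == 0 &&
                    ictx->generation != priv->generation) {
                        /* inode not revalidated since the last graph change:
                         * redo the lookup before replying, as the original code does */
                        ret = nfs_lookup (cs->nfsx, cs->vol, &nfu, &cs->resolvedloc,
                                          nfs3svc_getattr_lookup_cbk, cs);
                        goto check_err;
                }
        }

The actual fix described above goes the other way: stop glfs_resolve_component() from planting its value in the first place, so the NFS xlator never sees a foreign context value.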
Thanks, Niels. Here is the scratch build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11964476

I tested the patch with the Solaris client and was able to perform the subdir mount successfully without any crash; a restart of the process was not even needed.

# mount -o proto=tcp,vers=3 nfs://dhcp6-132.gsslab.pnq.redhat.com:38465/replicate/bipin2/test/ /bipin/mount1
# cd /bipin/mount1
# ls
lzhdf
# echo "zkhgczd" >> lzhdf
# cat lzhdf
zkhgczd
#

Although the subdir mount is working fine, nfs.log does report an error at mount time:

[2016-10-27 10:50:32.160677] E [nfs3.c:341:__nfs3_get_volume_id] (-->/usr/lib64/glusterfs/3.7.9/xlator/nfs/server.so(nfs3_lookup_reply+0x4d) [0x7f1ea30416ed] -->/usr/lib64/glusterfs/3.7.9/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x78) [0x7f1ea30401f8] -->/usr/lib64/glusterfs/3.7.9/xlator/nfs/server.so(__nfs3_get_volume_id+0xae) [0x7f1ea304013e] ) 0-nfs-nfsv3: invalid argument: xl [Invalid argument]

Niels, what is your opinion on the error appearing in nfs.log at mount time? See C #34. Thanks, Bipin

10:49 < ndevos> bkunal: hmm, does that happen on each time of the mounting, or only the first time (before any I/O was done over the mount)?
10:49 < bkunal> ndevos, yes
10:49 < bkunal> ndevos, everytime
10:49 < ndevos> well, thats ugly :-/
10:50 < bkunal> ndevos, I have solaris setup with me. that can be seen easily
10:55 < ndevos> bkunal: can you enable log-level=TRACE for the gluster/nfs and mount again? I would like to see if nfs_mntpath_to_xlator() then logs the "Subvolume search"
10:56 < bkunal> ndevos, sure, Anything else ?
10:56 < ndevos> bkunal: easiest to do so is check the whole glusterfs command in ps, kill it and start it manually with the --log-level=TRACE option
10:56 < ndevos> bkunal: nope, thats all for now

Created attachment 1225603 [details]
successful mount
Created attachment 1225604 [details]
unsuccessful mount
Both logs have the "E [nfs3.c:341:__nfs3_get_volume_id]" error, so it does not seem to be critical, or related to the mount failures. The logic in nfs_mntpath_to_xlator() does not seem to take subdirectory mounts into account. 73 xlator_t * 74 nfs_mntpath_to_xlator (xlator_list_t *cl, char *path) 75 { 76 char *volname = NULL; 77 char *volptr = NULL; 78 size_t pathlen; 79 xlator_t *targetxl = NULL; 80 81 if ((!cl) || (!path)) 82 return NULL; 83 84 volname = strdupa (path); 85 pathlen = strlen (volname); 86 gf_msg_trace (GF_NFS, 0, "Subvolume search: %s", path); 87 if (volname[0] == '/') 88 volptr = &volname[1]; 89 else 90 volptr = &volname[0]; 91 92 if (pathlen && volname[pathlen - 1] == '/') 93 volname[pathlen - 1] = '\0'; 94 95 while (cl) { 96 if (strcmp (volptr, cl->xlator->name) == 0) { 97 targetxl = cl->xlator; 98 break; 99 } 100 101 cl = cl->next; 102 } 103 104 return targetxl; 105 106 } The above while-loop needs to get the volname as input, not the volname/subdir as it is currently. The logs clearly contain Subvolume search: replicate/bipin2/ Splitting the volname string at the first /, instead of dropping the last / should fix this (modify line 92/93). Bipin, do you want to send a patch for this? Sure, Niels. I will try and get back to you. I will leave needinfo on me. Thanks again. Hi Niels, I have made some changes in the function "nfs_mntpath_to_xlator" as per your instruction. I have not yet tested my patch. Please have a look below and let me know if you think this will work. xlator_t * nfs_mntpath_to_xlator (xlator_list_t *cl, char *path) { char *volname = NULL; char *volptr = NULL; size_t pathlen; xlator_t *targetxl = NULL; int i; if ((!cl) || (!path)) return NULL; volname = strdupa (path); pathlen = strlen (volname); gf_msg_trace (GF_NFS, 0, "Subvolume search: %s", path); if (volname[0] == '/') volptr = &volname[1]; else volptr = &volname[0]; -> for (i=0; (pathlen - 1); i++ ) { -> if(volname[i] == '/' && i!=0) { -> break; -> } -> } -> -> if (pathlen && volname[i] == '/') -> volname[i] = '\0'; while (cl) { if (strcmp (volptr, cl->xlator->name) == 0) { targetxl = cl->xlator; break; } cl = cl->next; } return targetxl; } Created attachment 1243538 [details]
bz-1372283-nfs_mntpath_to_xlator
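To make the suggestion above concrete, here is a minimal, self-contained sketch of the string handling: take everything before the first '/' as the volume name, so "replicate/bipin2/" matches the export "replicate". The helper and variable names are illustrative only; this is not the upstream patch.

#include <stdio.h>
#include <string.h>

/* Copy the leading volume component of `path` into `volname` (size `len`). */
static void
mntpath_volume_name (const char *path, char *volname, size_t len)
{
        const char *start = (path[0] == '/') ? path + 1 : path;  /* skip a leading '/' */
        const char *slash = strchr (start, '/');                 /* first '/' after the volume name */
        size_t n = slash ? (size_t)(slash - start) : strlen (start);

        if (n >= len)
                n = len - 1;
        memcpy (volname, start, n);
        volname[n] = '\0';
}

int
main (void)
{
        const char *paths[] = { "replicate/bipin2/", "/replicate/bipin2", "replicate" };
        char vol[256];

        for (int i = 0; i < 3; i++) {
                mntpath_volume_name (paths[i], vol, sizeof (vol));
                printf ("%-20s -> volume '%s'\n", paths[i], vol);
        }
        return 0;
}

In nfs_mntpath_to_xlator() the while-loop would then compare this leading component against cl->xlator->name, and the remaining "/bipin2" part is left to be resolved as the subdirectory.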
Niels, Did you get time to review the attached patch in C #47? Thanks, Bipin Kunal

I did reply by email with some inline comments a while back. Please let me know if you cannot find the mail and I'll dig it up again.

Bipin, Can you please post the upstream patch for the proposed fix and attach the link to this bug.

Created attachment 1250881 [details]
Partial patch to extend the functionality and correctness
Bipin and I discussed over IRC the additional failures that he got when doing the right thing in nfs_mntpath_to_xlator(). This change (tries to) resolve the subdir in the right volume, and sets the filehandle to the correct GFID.
Addressing the TODO will be needed, and Bipin can include it all in his change.
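A conceptual sketch of the "set the filehandle to the correct GFID" step mentioned above: the helper name is invented for illustration, and the field names follow the filehandle layout visible in the nfs3 log messages (exportid, gfid, mountid), so treat this as a sketch rather than the attached partial patch.

/* Conceptual sketch only (not the attached patch): once the subdir has been
 * resolved inside the right volume, the reply filehandle should carry the
 * resolved inode's GFID rather than the all-zero handle the Solaris/WebNFS
 * LOOKUP arrived with. */
static int
nfs3_fill_subdir_fh (struct nfs3_fh *fh, uuid_t exportid, inode_t *subdir_inode)
{
        if (!fh || !subdir_inode)
                return -EINVAL;

        gf_uuid_copy (fh->exportid, exportid);          /* which exported volume */
        gf_uuid_copy (fh->gfid, subdir_inode->gfid);    /* point at the subdir, not the volume root */
        /* fh->ident[] still has to be stamped so nfs3_fh_validate() accepts it. */
        return 0;
}

The important point is that the funged filehandle must identify the subdirectory's inode, so that the GETATTR/LOOKUP calls that follow the mount resolve inside the subdir instead of failing against the volume root.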
Created attachment 1256460 [details]
patch
from nfs.log : [2017-02-22 13:38:51.714383] D [MSGID: 0] [nfs.c:458:nfs_subvolume_set_started] 0-nfs: Starting up: replicate , vols started till now: 1 [2017-02-22 13:45:00.279608] D [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-replicate-client-0: returning as transport is already disconnected OR there are no frames (0 || 0) [2017-02-22 13:45:00.279658] D [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-replicate-client-1: returning as transport is already disconnected OR there are no frames (0 || 0) [2017-02-22 13:45:00.279668] D [logging.c:1830:gf_log_flush_timeout_cbk] 0-logging-infra: Log timer timed out. About to flush outstanding messages if present [2017-02-22 13:45:00.279685] D [logging.c:1792:__gf_log_inject_timer_event] 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5 [2017-02-22 13:46:00.112932] D [MSGID: 0] [nfs3-helpers.c:1665:nfs3_log_fh_entry_call] 0-nfs-nfsv3: XID: 58afdea4, LOOKUP: args: FH: exportid 00000000-0000-0000-0000-000000000000, gfid 00000000-0000-0000-0000-000000000000, mountid 00000000-0000-0000-0000-000000000000, name: replicate/bipin1 [2017-02-22 13:47:04.341267] D [logging.c:1830:gf_log_flush_timeout_cbk] 0-logging-infra: Log timer timed out. About to flush outstanding messages if present [2017-02-22 13:47:04.341329] D [logging.c:1792:__gf_log_inject_timer_event] 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5 [2017-02-22 13:47:04.341503] D [MSGID: 0] [nfs3-helpers.c:1665:nfs3_log_fh_entry_call] 0-nfs-nfsv3: XID: 58a6b2ac, LOOKUP: args: FH: exportid 00000000-0000-0000-0000-000000000000, gfid 00000000-0000-0000-0000-000000000000, mountid 00000000-0000-0000-0000-000000000000, name: replicate/bipin1 [2017-02-22 13:47:06.328219] D [MSGID: 0] [nfs-common.c:110:nfs_mntpath_to_xlator] 0-nfs: Volptr : replicate and cl->xlator->name : replicate [2017-02-22 13:53:37.123540] D [logging.c:1830:gf_log_flush_timeout_cbk] 0-logging-infra: Log timer timed out. About to flush outstanding messages if present [2017-02-22 13:53:37.123585] D [logging.c:1792:__gf_log_inject_timer_event] 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5 [2017-02-22 13:53:37.123664] D [MSGID: 0] [nfs-common.c:110:nfs_mntpath_to_xlator] 0-nfs: Volptr : replicate and cl->xlator->name : replicate [2017-02-22 13:53:37.123720] D [logging.c:1766:gf_log_flush_extra_msgs] 0-logging-infra: Log buffer size reduced. 
About to flush 1 extra log messages
[2017-02-22 13:53:37.123743] D [logging.c:1769:gf_log_flush_extra_msgs] 0-logging-infra: Just flushed 1 extra log messages

pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2017-02-22 13:53:37
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.9
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb8)[0x7fd877aa8b78]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x32f)[0x7fd877ad1bef]
/lib64/libc.so.6(+0x335ce32660)[0x7fd876445660]
/usr/lib64/glusterfs/3.7.9/xlator/nfs/server.so(nfs3_lookup+0x28c)[0x7fd8698200cc]
/usr/lib64/glusterfs/3.7.9/xlator/nfs/server.so(nfs3svc_lookup+0x96)[0x7fd869820716]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x314)[0x7fd8778711a4]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103)[0x7fd8778713a3]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7fd877872af8]
/usr/lib64/glusterfs/3.7.9/rpc-transport/socket.so(+0x97ed)[0x7fd86c5f27ed]
/usr/lib64/glusterfs/3.7.9/rpc-transport/socket.so(+0xaabd)[0x7fd86c5f3abd]
/usr/lib64/libglusterfs.so.0(+0x99080)[0x7fd877b1c080]
/lib64/libpthread.so.0(+0x335d607aa1)[0x7fd876b92aa1]

upstream patch: https://review.gluster.org/#/c/16770/
downstream patch: https://code.engineering.redhat.com/gerrit/#/c/103007/

Tried with previous versions and was able to reproduce the issue. The sub-directory mount on the Solaris 10 client fails as follows:

mount -o proto=tcp,vers=3 nfs://10.70.47.59:/testvol/subdir /mnt/solmount
nfs mount: 10.70.47.59: : RPC: Program not registered
nfs mount: retrying: /mnt/solmount

Then upgraded to gluster build glusterfs-3.8.4-24.el7rhgs.x86_64. Now the sub-directory mount on the Solaris client succeeds:

mount -o proto=tcp,vers=3 nfs://10.70.35.213:/distrep3/subdir /mnt/solmount
# cd /mnt/solmount
# ls
file2 file3 hello j1 j10 j2 j3 j4 j5 j6 j7 j8 j9

Also tried running some I/Os on the sub-directory mount point; they succeed. Will do some more tests before moving this BZ to verified.

Tried running some more I/Os on the sub-directory mount point and it works fine. Moving the BZ to verified. More testing will be done with regression runs.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774