Description of problem:
fs-sanity: rpc test failed with nfs-ganesha

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-19.el7rhgs.x86_64
nfs-ganesha-gluster-2.2.0-12.el7rhgs.x86_64

How reproducible:
Always on this build

Steps to Reproduce:
1. Run fs-sanity : rpc test, on nfs-ganeshaV4 mount, it fails

time /opt/qa/tools/system_light/run.sh -w /mnt/nfs -l /rhgs7.2-19-rpc.log -t rpc
/opt/qa/tools/system_light/scripts /opt/qa/tools/system_light /latest_setup_cases/distaf /mnt/nfs /mnt
-----
/mnt/nfs
/mnt/nfs/run31192/
Tests available:
arequal
bonnie
compile_kernel
dbench
dd
ffsb
fileop
fs_mark
fsx
glusterfs_build
iozone
locks
ltp
multiple_files
openssl
posix_compliance
postmark
read_large
rpc
syscallbench
tiobench
===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/mnt/nfs/run31192
executing rpc
start: 19:20:46

real 0m7.636s
user 0m0.044s
sys 0m0.163s
end: 19:20:54
rpc failed
0
Total 0 tests were successful
Switching over to the previous working directory
Removing /mnt/nfs/run31192/
rmdir: failed to remove ‘/mnt/nfs/run31192/’: Directory not empty
rmdir failed:Directory not empty

real 0m11.702s
user 0m0.056s
sys 0m0.204s

Actual results:
rpc test is failing

Expected results:
rpc test should pass

Additional info:
Created attachment 1122723
packet trace for flock command under rpc
From the packet trace, this doesn't look like a server-side issue. Please provide the exact steps you tried and the results with gluster-nfs, nfs-ganesha, and fuse.
Also, which Linux version is running on the system you are using as the nfs/fuse client?
[root@dhcp46-69 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.1 (Maipo)

Ganesha v4 mount:

[root@dhcp46-69 ~]# mount -t nfs -o vers=4 10.70.44.100:/testvol /mnt/nfs
[root@dhcp46-69 ~]# cd /mnt/nfs
[root@dhcp46-69 nfs]# exec 200>lockfile
[root@dhcp46-69 nfs]# ls
file file1 lockfile run31170 run31192
[root@dhcp46-69 nfs]# flock -s 200
flock: 200: Bad file descriptor

fuse mount:

mount -t glusterfs 10.70.46.59:/testvol /mnt/glusterfs
[root@dhcp46-69 glusternfs]# exec 200>lockfile1
[root@dhcp46-69 glusternfs]# ls
lockfile1
[root@dhcp46-69 glusternfs]# flock -s 200

Glusternfs mount:

[root@dhcp46-69 ~]# mount -t nfs -o vers=3 10.70.46.59:/testvol /mnt/glusternfs
[root@dhcp46-69 ~]# cd /mnt/glusternfs
[root@dhcp46-69 glusternfs]# ls
file file1 lockfile lockfile2 run31170 run31192
[root@dhcp46-69 glusternfs]# exec 200>lockfile3
[root@dhcp46-69 glusternfs]# ls
file file1 lockfile lockfile2 lockfile3 run31170 run31192
[root@dhcp46-69 glusternfs]# flock -s 200

This step hangs.

strace for that flock command:

[root@dhcp46-69 glusternfs]# starce flock -s 200
-bash: starce: command not found
[root@dhcp46-69 glusternfs]# strace flock -s 200
execve("/usr/bin/flock", ["flock", "-s", "200"], [/* 26 vars */]) = 0
brk(0) = 0xf92000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6b85293000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=28064, ...}) = 0
mmap(NULL, 28064, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f6b8528c000
close(3) = 0
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\34\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2107760, ...}) = 0
mmap(NULL, 3932736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f6b84cb2000
mprotect(0x7f6b84e68000, 2097152, PROT_NONE) = 0
mmap(0x7f6b85068000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b6000) = 0x7f6b85068000
mmap(0x7f6b8506e000, 16960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f6b8506e000
close(3) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6b8528b000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6b85289000
arch_prctl(ARCH_SET_FS, 0x7f6b85289740) = 0
mprotect(0x7f6b85068000, 16384, PROT_READ) = 0
mprotect(0x604000, 4096, PROT_READ) = 0
mprotect(0x7f6b85294000, 4096, PROT_READ) = 0
munmap(0x7f6b8528c000, 28064) = 0
brk(0) = 0xf92000
brk(0xfb3000) = 0xfb3000
brk(0) = 0xfb3000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=106065056, ...}) = 0
mmap(NULL, 106065056, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f6b7e78b000
close(3) = 0
flock(200, LOCK_SH ........... hanging here
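The manual steps above could be wrapped in a small helper so that a hang is reported instead of blocking the shell. This is only a sketch and not part of fs-sanity: the function name try_shared_flock, the lock file name, and the 10-second default are hypothetical; fd 9 is used instead of 200 so that it also works in shells that accept only single-digit descriptors; and it assumes the blocked flock can be interrupted by the signal that timeout sends.

```shell
# Hypothetical helper: try a shared flock in DIR and give up after SECS seconds.
try_shared_flock() {
    dir=$1
    secs=${2:-10}
    (
        cd "$dir" || exit 2
        # Open the lock file write-only, as in the manual steps.
        exec 9>flock-check.$$ || exit 2
        # The child inherits fd 9; timeout stops it if it blocks.
        timeout "$secs" flock -s 9
    )
    rc=$?
    case $rc in
        0)   echo "ok" ;;
        124) echo "timeout" ;;      # status GNU timeout returns on expiry
        *)   echo "failed ($rc)" ;;
    esac
}
```

For example, `try_shared_flock /mnt/glusternfs 10` would be expected to print `timeout` while the hang reproduces, and `ok` on a mount where the lock is granted. The helper leaves its lock file behind in the directory.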
Please check your firewalld settings to confirm that all the relevant ports are open on both the server and client machines. Also, if possible, please check the behaviour on a RHEL 6 client.
After flushing iptables on all the servers and the client, it worked fine on both v3 and v4 mounts. Now we need to find which blocked port was causing the fop to fail.
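One way to start narrowing this down is to list the ports that the server's RPC services are registered on and compare them with the firewall rules. A minimal sketch, assuming `rpcinfo -p` output in its usual five-column layout (program, vers, proto, port, service) with a single header line; the function name rpc_ports is hypothetical. For a v3 mount the entries of interest would typically include nlockmgr and status, since locking there is handled outside the main NFS port.

```shell
# Hypothetical helper: reduce `rpcinfo -p` output (read from stdin) to a
# de-duplicated list of "port/proto service" lines, in order of first appearance.
rpc_ports() {
    # NR > 1 skips the header line; seen[] drops repeated proto/port pairs
    # (rpcinfo prints one row per program version).
    awk 'NR > 1 && !seen[$3 "/" $4]++ { print $4 "/" $3, $5 }'
}
```

Running something like `rpcinfo -p 10.70.44.100 | rpc_ports` from the client gives the list to check against the rules that were flushed. If the rpcbind port itself is blocked, rpcinfo will fail, which is a useful data point on its own.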