Red Hat Bugzilla – Bug 983507
Fileop tests fail on smb mount
Last modified: 2013-09-23 18:29:50 EDT
Description of problem:
Fileop is failing on a 6x2 distributed replicate volume on smb mount in a ctdb setup.
Version-Release number of selected component (if applicable):
[root@dhcp159-205 glusterfs]# rpm -qa | grep glusterfs
I tried three times and it failed.
Steps to Reproduce:
1. create a ctdb setup.
2. create a dis-rep volume and do a cifs mount on the client with virtual ip corresponding to the physical ip on which the volume is created.
3. From the mount point run fileop.
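The mount and test invocation in steps 2-3 can be sketched as below. The virtual IP, share name, credentials, and fileop flags are assumptions for illustration, not the actual values from this setup (fileop here is the iozone fileop tool; -f 30 would give the 27000-file tree seen in the logs).

```shell
# Sketch only: VIP, share, and credentials are hypothetical.
# Mount the Samba share exported by the ctdb cluster via its virtual IP:
mount -t cifs //10.70.42.200/gvol /mnt/smb -o user=smbuser,pass=secret

# Run fileop from the mount point:
cd /mnt/smb && fileop -f 30 -s 1
```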
Actual results: Fileop failed.
Expected results: Fileop should pass.
The first time, fileop ran for ~1 hour and then failed.
The second time, it ran for about 2 hours before failing.
The third time, it ran for almost 4 hours and failed.
I tried the fileop test on an nfs mount and a fuse mount as well; it passed on both, but takes 4-5 hours to finish.
Executed the test on following version:
Performed the same steps and fileop failed again.
Can you tell what the error message from fileop was?
E.g., which operations failed.
Here is the output from fileop log,
read: Files = 27000 Total Time = 13.466583252 seconds
Avg read(s)/sec = 2004.96 ( 0.000498762 seconds/op)
Best read(s)/sec = 4236.67 ( 0.000236034 seconds/op)
Worst read(s)/sec = 240.79 ( 0.004153013 seconds/op)
access: Files = 27000 Total Time = 44.049818754 seconds
Avg access(s)/sec = 612.94 ( 0.001631475 seconds/op)
Best access(s)/sec = 1057.03 ( 0.000946045 seconds/op)
Worst access(s)/sec = 77.97 ( 0.012825966 seconds/op)
chmod: Files = 27000 Total Time = 118.453602791 seconds
Avg chmod(s)/sec = 227.94 ( 0.004387170 seconds/op)
Best chmod(s)/sec = 375.80 ( 0.002660990 seconds/op)
Worst chmod(s)/sec = 24.99 ( 0.040018797 seconds/op)
readdir: Files = 900 Total Time = 12.854415417 seconds
Avg readdir(s)/sec = 70.01 ( 0.014282684 seconds/op)
Best readdir(s)/sec = 97.70 ( 0.010235071 seconds/op)
Worst readdir(s)/sec = 35.54 ( 0.028141022 seconds/op)
I have not included all the messages, only the last ones before the error.
It could be that fileop performs some test related to hard links,
and with unix extensions off the behaviour may differ from what it expects.
I have asked Chris to comment here if that is the case.
I am still seeing this on fs sanity runs on glusterfs-18.104.22.168rhs-1.el6rhs.x86_64. To note I don't have ctdb configured, just regular samba.
Can you please specify whether you used the fuse mount (older method) or the vfs module (new one)?
With the version glusterfs-22.214.171.124rhs-1.el6rhs.x86_64
I have tried the test with both "unix extensions = yes"
and "unix extensions = no"; the test failed with the same link failure error.
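For clarity, the toggle being tested lives in the [global] section of smb.conf. This is a minimal sketch of just that setting, not the full configuration from this setup:

```ini
[global]
    # SMB1 Unix Extensions; hard-link support for the Linux CIFS
    # client under SMB1 depends on this being negotiated.
    unix extensions = yes   ; also tested with 'no'; both failed
```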
read: Files = 27000 Total Time = 10.803651094 seconds
Avg read(s)/sec = 2499.16 ( 0.000400135 seconds/op)
Best read(s)/sec = 262144.00 ( 0.000003815 seconds/op)
Worst read(s)/sec = 110.21 ( 0.009073973 seconds/op)
access: Files = 27000 Total Time = 66.434024096 seconds
Avg access(s)/sec = 406.42 ( 0.002460519 seconds/op)
Best access(s)/sec = 661.46 ( 0.001511812 seconds/op)
Worst access(s)/sec = 133.78 ( 0.007474899 seconds/op)
chmod: Files = 27000 Total Time = 281.460999250 seconds
Avg chmod(s)/sec = 95.93 ( 0.010424481 seconds/op)
Best chmod(s)/sec = 151.47 ( 0.006602049 seconds/op)
Worst chmod(s)/sec = 44.66 ( 0.022388935 seconds/op)
readdir: Files = 900 Total Time = 19.414461851 seconds
Avg readdir(s)/sec = 46.36 ( 0.021571624 seconds/op)
Best readdir(s)/sec = 60.36 ( 0.016567945 seconds/op)
Worst readdir(s)/sec = 27.50 ( 0.036361933 seconds/op)
This time I did not run on a ctdb setup either; it failed on a regular samba vfs setup.
It is a bug with hard links in gfapi.
10.70.42.194# ls -l new
-rwxr-xr-x 1 root root 0 Aug 8 11:00 new
13127611488624932007 -rwxr-xr-x 1 root root 0 Aug 8 11:00 new
10.70.42.194# ln new new2
ln: creating hard link `new2' => `new': Operation not supported
13127611488624932007 -rwxr-xr-x 0 root root 0 Aug 8 11:00 new
13127611488624932007 -rwxr-xr-x 0 root root 0 Aug 8 11:00 new2
The link gets created, but we get an "Operation not supported" error.
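The failure mode above (link created, yet an error returned) can be checked generically with a few shell commands: on a healthy mount both the exit status of ln and the resulting link count should be consistent. This is a generic sketch run in a temporary directory, not tied to the gluster mount paths above; run it from the mount under test instead.

```shell
# Generic hard-link sanity check.
set -e
dir=$(mktemp -d)
cd "$dir"

touch new
ln new new2                     # should succeed and return 0
nlinks=$(stat -c %h new)        # link count should now be 2
inode1=$(stat -c %i new)
inode2=$(stat -c %i new2)

echo "nlinks=$nlinks same_inode=$([ "$inode1" = "$inode2" ] && echo yes || echo no)"
# prints: nlinks=2 same_inode=yes
```

On the broken gfapi path, ln exits non-zero with EOPNOTSUPP even though the second name appears, which is exactly why fileop's link phase reports a failure.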
Susant is looking into it.
I have discussed this bug with one of the CIFS kernel client developers. Under SMBv1, support for hard links is a feature of the Unix Extensions for SMB and is, therefore, only supported by a very limited set of clients. Key among those is the Linux Kernel CIFS client. By default, the kernel client negotiates the Unix Extensions with the Samba server, so hard links *should* be enabled in this scenario.
From an SMB perspective, this problem currently only impacts support for hard links via the Linux CIFS Kernel client when the Unix Extensions are negotiated over SMB1. Hard links are not otherwise supported, so this is not a critical fix for SMB in Big Bend.
However, this will be an important fix for future releases, particularly if/when other access methods start using libgfapi.
Note that newer versions of Windows (NTFS) do support hard links, and that support for the Windows CreateHardLink function is available in SMB3. Fixing this will be important for future SMB3 support.
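On the client side, whether the Unix Extensions are negotiated can be controlled through the CIFS mount options; the nounix option disables them. The server, share, and mount points below are hypothetical:

```shell
# Sketch: server, share, and mount points are hypothetical.
# Default: the Linux CIFS client negotiates the SMB1 Unix Extensions.
mount -t cifs //server/share /mnt/unixext -o user=smbuser

# Explicitly disable the Unix Extensions for comparison:
mount -t cifs //server/share /mnt/nounix -o user=smbuser,nounix
```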
FWIW, creating a hardlink works just fine between the Linux CIFS client and samba when exporting a "normal" filesystem, both with and without unix extensions.
Previous tests show that creation of hardlinks works between the Linux CIFS kernel client and Samba when exporting file systems other than Gluster (e.g. XFS or EXT4). As indicated above, in comment #13, the problem has been localized to libgfapi, the interface to the Gluster client.
The remaining open question has to do with the priority of the bug, with respect to others that currently require fixing.
Bug 996063 is now fixed upstream. The downstream patch is at https://code.engineering.redhat.com/gerrit/#/c/11407/
Fileop is now passing for me on glusterfs-126.96.36.199rhs-2.el6rhs.x86_64:
INFO: Switching to /gluster-mount/run6691_fileop
INFO: Currently in directory /gluster-mount/run6691_fileop
Fileop: Working in ., File size is 1, Output is in Ops/sec. (A=Avg, B=Best, W=Worst)
mkdir: Dirs = 27930 Total Time = 431.059383631 seconds
Avg mkdir(s)/sec = 64.79 ( 0.015433562 seconds/op)
Best mkdir(s)/sec = 156.25 ( 0.006400108 seconds/op)
Worst mkdir(s)/sec = 0.27 ( 3.740863085 seconds/op)
chdir: Dirs = 27930 Total Time = 36.934317827 seconds
Avg chdir(s)/sec = 756.21 ( 0.001322389 seconds/op)
Best chdir(s)/sec = 1519.68 ( 0.000658035 seconds/op)
Worst chdir(s)/sec = 0.86 ( 1.164689064 seconds/op)
rmdir: Dirs = 27930 Total Time = 364.032657862 seconds
Avg rmdir(s)/sec = 76.72 ( 0.013033751 seconds/op)
Best rmdir(s)/sec = 129.37 ( 0.007730007 seconds/op)
Worst rmdir(s)/sec = 0.42 ( 2.371411085 seconds/op)
create: Files = 27000 Total Time = 271.101895571 seconds
Avg create(s)/sec = 99.59 ( 0.010040811 seconds/op)
Best create(s)/sec = 164.91 ( 0.006063938 seconds/op)
Worst create(s)/sec = 0.45 ( 2.199863911 seconds/op)
write: Files = 27000 Total Time = 9.480272532 seconds
Avg write(s)/sec = 2848.02 ( 0.000351121 seconds/op)
Best write(s)/sec = 5433.04 ( 0.000184059 seconds/op)
Worst write(s)/sec = 14.76 ( 0.067746878 seconds/op)
close: Files = 27000 Total Time = 26.621973038 seconds
Avg close(s)/sec = 1014.20 ( 0.000985999 seconds/op)
Best close(s)/sec = 3744.91 ( 0.000267029 seconds/op)
Worst close(s)/sec = 0.63 ( 1.577756882 seconds/op)
stat: Files = 27000 Total Time = 22.073333025 seconds
Avg stat(s)/sec = 1223.20 ( 0.000817531 seconds/op)
Best stat(s)/sec = 1996.34 ( 0.000500917 seconds/op)
Worst stat(s)/sec = 6.67 ( 0.149971962 seconds/op)
open: Files = 27000 Total Time = 37.427981853 seconds
Avg open(s)/sec = 721.39 ( 0.001386222 seconds/op)
Best open(s)/sec = 971.80 ( 0.001029015 seconds/op)
Worst open(s)/sec = 4.95 ( 0.201858044 seconds/op)
read: Files = 27000 Total Time = 5.365317583 seconds
Avg read(s)/sec = 5032.32 ( 0.000198715 seconds/op)
Best read(s)/sec = 8924.05 ( 0.000112057 seconds/op)
Worst read(s)/sec = 965.10 ( 0.001036167 seconds/op)
access: Files = 27000 Total Time = 20.328207016 seconds
Avg access(s)/sec = 1328.20 ( 0.000752897 seconds/op)
Best access(s)/sec = 1858.35 ( 0.000538111 seconds/op)
Worst access(s)/sec = 445.02 ( 0.002247095 seconds/op)
chmod: Files = 27000 Total Time = 38.459837675 seconds
Avg chmod(s)/sec = 702.03 ( 0.001424438 seconds/op)
Best chmod(s)/sec = 1014.10 ( 0.000986099 seconds/op)
Worst chmod(s)/sec = 264.19 ( 0.003785133 seconds/op)
readdir: Files = 900 Total Time = 9.273023367 seconds
Avg readdir(s)/sec = 97.06 ( 0.010303359 seconds/op)
Best readdir(s)/sec = 108.15 ( 0.009246111 seconds/op)
Worst readdir(s)/sec = 15.03 ( 0.066555023 seconds/op)
link: Files = 27000 Total Time = 197.369472027 seconds
Avg link(s)/sec = 136.80 ( 0.007309980 seconds/op)
Best link(s)/sec = 220.71 ( 0.004530907 seconds/op)
Worst link(s)/sec = 0.44 ( 2.255393982 seconds/op)
unlink: Files = 27000 Total Time = 98.380872250 seconds
Avg unlink(s)/sec = 274.44 ( 0.003643736 seconds/op)
Best unlink(s)/sec = 512.81 ( 0.001950026 seconds/op)
Worst unlink(s)/sec = 0.51 ( 1.962489128 seconds/op)
delete: Files = 27000 Total Time = 97.155341148 seconds
Avg delete(s)/sec = 277.91 ( 0.003598346 seconds/op)
Best delete(s)/sec = 509.95 ( 0.001960993 seconds/op)
Worst delete(s)/sec = 0.43 ( 2.331680059 seconds/op)
I don't run on ctdb setups, though. Could one of the ctdb people re-run before we mark this as verified?
I ran it on a ctdb setup and on a normal smb setup; the test passed on both.
Total 1 tests were successful
So moving it to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.