Created attachment 1588661 [details]
Windows error 01

Description of problem:
SMBD threads panic when a file operation is performed from a Windows, Linux, or OS X client while the share uses the glusterfs VFS module, either on its own or in conjunction with others, e.g.:

> vfs objects = catia fruit streams_xattr glusterfs

Gluster volume info:

Volume Name: mcv01
Type: Distributed-Replicate
Volume ID: 1580ab45-0a14-4f2f-8958-b55b435cdc47
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: mcn01:/mnt/h1a/mcv01_data
Brick2: mcn02:/mnt/h1b/mcv01_data
Brick3: mcn01:/mnt/h2a/mcv01_data
Brick4: mcn02:/mnt/h2b/mcv01_data
Options Reconfigured:
features.quota-deem-statfs: on
nfs.disable: on
features.inode-quota: on
features.quota: on
cluster.brick-multiplex: off
cluster.server-quorum-ratio: 50%

Version-Release number of selected component (if applicable):
Gluster 6.3
Samba 4.10.6-5

How reproducible:
Every time

Steps to Reproduce:
1. Mount the share as a mapped drive
2. Write to or read from the share

Actual results:
Multiple error messages (attached to this bug).
On OS X or Linux, running 'dd if=/dev/zero of=/mnt/share/test.dat bs=1M count=100' results in a hang. Tailing the OS X console logs reveals that the share is timing out.

Expected results:
The file operation succeeds.

Additional info:
Gluster client logs and SMB debug 10 logs attached.
Created attachment 1588662 [details] Windows error 02
Created attachment 1588663 [details] Samba Debug 10 logs
Created attachment 1588664 [details] Gluster client logs
Tested on Gluster 6.1 with the same issue. Gluster 5.6 works fine.
(In reply to ryan from comment #0)
> Created attachment 1588661 [details]
> Windows error 01
>
> Description of problem:
> SMBD thread panics when a file operation performed from a Windows, Linux or
> OS X client when the share is using the glusterfs VFS module, either on its
> own, or in conjunction with others i.e.:
> > vfs objects = catia fruit streams_xattr glusterfs
> [...]
> Actual results:
> Multiple error messages, attached to bug
> In OS X or Linux, running 'dd if=/dev/zero of=/mnt/share/test.dat bs=1M
> count=100' results in a hang. Tailing OS X console logs reveals that the
> share is timing out.

This is weird. Can you post your smb.conf?
Hi Anoop,

It's very odd. I've got a feeling it's related to the upgrade/downgrade process I've been using to test different Gluster versions for the various bug tickets I've got open.

Currently I'm using the following script to upgrade/downgrade (this one upgrades to 6):

yum remove centos-release-gluster* -y
yum install centos-release-gluster6 -y
yum remove glusterfs* -y
yum install glusterfs-server* -y
yum install sernet-samba-vfs-glusterfs -y
systemctl stop glusterd
systemctl stop glusterfsd
sed -i 's/operating-version=.*/operating-version=60000/gi' /var/lib/glusterd/glusterd.info
systemctl stop glusterfsd
systemctl restart glusterd
gluster volume set all cluster.op-version 60000

Could you please flag any issues with this, or a recommended way of downgrading in particular?

SMB config:

[global]
security = ADS
workgroup = MAGENTA
realm = MAGENTA.LOCAL
netbios name = MAGENTANAS01
max protocol = SMB3
min protocol = SMB2
ea support = yes
clustering = yes
server signing = no
max log size = 10000
glusterfs:loglevel = 7
log file = /var/log/samba/log-%M.smbd
logging = file
log level = 2
template shell = /sbin/nologin
winbind offline logon = false
winbind refresh tickets = yes
winbind enum users = Yes
winbind enum groups = Yes
allow trusted domains = yes
passdb backend = tdbsam
idmap cache time = 604800
idmap negative cache time = 300
winbind cache time = 604800
idmap config magenta:backend = rid
idmap config magenta:range = 10000-999999
idmap config * : backend = tdb
idmap config * : range = 3000-7999
guest account = nobody
map to guest = bad user
force directory mode = 0777
force create mode = 0777
create mask = 0777
directory mask = 0777
hide unreadable = no
store dos attributes = no
unix extensions = no
load printers = no
printing = bsd
printcap name = /dev/null
disable spoolss = yes
glusterfs:volfile_server = localhost
kernel share modes = No
strict locking = auto
oplocks = yes
durable handles = yes
kernel oplocks = no
posix locking = no
level2 oplocks = no
readdir_attr:aapl_rsize = yes
readdir_attr:aapl_finder_info = no
readdir_attr:aapl_max_access = no
fruit:aapl = yes

[QC]
guest ok = no
read only = no
vfs objects = glusterfs
glusterfs:volume = mcv01
path = "/data/qc_only"
valid users = @"QC_ops"
recycle:repository = .recycle
recycle:keeptree = yes
recycle:versions = yes
recycle:directory_mode = 0770
recycle:subdir_mode = 0777
glusterfs:logfile = /var/log/samba/glusterfs-mcv01.%M.log

[QC-GlusterFuse]
guest ok = no
read only = no
vfs objects = glusterfs_fuse
path = "/mnt/mcv01/data/qc_only"
valid users = @"QC_ops"
recycle:repository = .recycle
recycle:keeptree = yes
recycle:versions = yes
recycle:directory_mode = 0770
recycle:subdir_mode = 0777
glusterfs:logfile = /var/log/samba/glusterfs-mcv01.%M.log

[QC-FUSE]
guest ok = no
read only = no
path = "/mnt/mcv01/data/qc_only"
valid users = @"QC_ops"
recycle:repository = .recycle
recycle:keeptree = yes
recycle:versions = yes
recycle:directory_mode = 0770
recycle:subdir_mode = 0777
glusterfs:logfile = /var/log/samba/glusterfs-mcv01-fuse.%M.log

Many thanks,
Ryan
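(Editorial aside on the upgrade/downgrade script in the previous comment: the sed rewrite of glusterd.info is the step most easily left half-done. A hedged sketch of dry-running that exact sed expression on a throwaway copy first — the file path and UUID below are illustrative placeholders, not the live /var/lib/glusterd/glusterd.info:)

```shell
# Dry-run the operating-version rewrite from the upgrade script on a
# throwaway file before touching the real /var/lib/glusterd/glusterd.info.
# The UUID below is a placeholder, not a real node ID.
test_file=/tmp/glusterd.info.dryrun
printf 'UUID=00000000-0000-0000-0000-000000000000\noperating-version=40100\n' > "$test_file"

# identical sed expression to the one in the upgrade script
sed -i 's/operating-version=.*/operating-version=60000/gi' "$test_file"

# confirm the rewrite took effect before repeating it on the real file
grep 'operating-version' "$test_file"
```

If the grep at the end shows the new value, the same expression can be applied to the live file (with glusterd stopped, as in the script).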
Anyone able to offer some assistance with this? We're still seeing the issue on two of our servers after upgrading to Gluster 6.5 and Samba 4.10.7.
Trying to copy a file with a Windows 10 client results in the transfer failing with an error (see screenshot). Looking through the smb logs shows this:

mag-desktop-01 (ipv4:10.0.3.12:57488) connect to service Grading initially as user editor01 (uid=2000, gid=2900) (pid 296596)
[2019/10/17 14:09:35.784481, 2] ../../source3/smbd/smbXsrv_open.c:675(smbXsrv_open_global_verify_record)
  smbXsrv_open_global_verify_record: key 'FA7F6275' server_id 296320 does not exist.
[2019/10/17 14:09:35.784509, 1] ../../librpc/ndr/ndr.c:422(ndr_print_debug)
  &global_blob: struct smbXsrv_open_globalB
      version                  : SMBXSRV_VERSION_0 (0)
      seqnum                   : 0x00000002 (2)
      info                     : union smbXsrv_open_globalU(case 0)
      info0                    : *
          info0: struct smbXsrv_open_global0
              db_rec               : NULL
              server_id: struct server_id
                  pid                  : 0x0000000000048580 (296320)
                  task_id              : 0x00000000 (0)
                  vnn                  : 0xffffffff (4294967295)
                  unique_id            : 0x3f2d4bc50a3ad530 (4552378107993707824)
              open_global_id       : 0xfa7f6275 (4202652277)
              open_persistent_id   : 0x00000000fa7f6275 (4202652277)
              open_volatile_id     : 0x0000000037cfd301 (936366849)
              open_owner           : S-1-5-21-3658843901-2482107748-408451428-1000
              open_time            : Thu Oct 17 14:09:36 2019 BST
              create_guid          : aea7fead-f0de-11e9-b036-b88584997125
              client_guid          : aea7fb79-f0de-11e9-b036-b88584997125
              app_instance_id      : 00000000-0000-0000-0000-000000000000
              disconnect_time      : NTTIME(0)
              durable_timeout_msec : 0x0000ea60 (60000)
              durable              : 0x01 (1)
              backend_cookie       : DATA_BLOB length=452
[0000] 56 46 53 5F 44 45 46 41  55 4C 54 5F 44 55 52 41   VFS_DEFA ULT_DURA
[0010] 42 4C 45 5F 43 4F 4F 4B  49 45 5F 4D 41 47 49 43   BLE_COOK IE_MAGIC
[0020] 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20
[0030] 00 00 00 00 00 00 00 00  96 89 8E 03 00 00 00 00   ........ ........
[0040] 39 E0 DF F2 31 0E 74 B0  00 00 00 00 00 00 00 00   9...1.t. ........
[0050] 00 00 02 00 04 00 02 00  00 00 10 37 00 00 00 00   ........ ...7....
skipping zero buffer bytes
[0080] 96 89 8E 03 00 00 00 00  39 E0 DF F2 31 0E 74 B0   ........ 9...1.t.
[0090] FF 81 00 00 00 00 00 00  01 00 00 00 00 00 00 00   ........ ........
[00A0] D0 07 00 00 00 00 00 00  54 0B 00 00 00 00 00 00   ........ T.......
[00B0] 00 00 00 00 00 00 00 00  7A E3 01 37 00 00 00 00   ........ z..7....
[00C0] F3 67 A8 5D 00 00 00 00  00 00 00 00 00 00 00 00   .g.].... ........
[00D0] F3 67 A8 5D 00 00 00 00  00 00 00 00 00 00 00 00   .g.].... ........
[00E0] F3 67 A8 5D 00 00 00 00  00 00 00 00 00 00 00 00   .g.].... ........
[00F0] F3 67 A8 5D 00 00 00 00  00 00 00 00 00 00 00 00   .g.].... ........
[0100] 99 7F 00 00 00 00 00 00  08 00 00 00 00 00 00 00   ........ ........
[0110] 00 00 00 00 00 00 00 00  06 00 00 00 00 00 00 00   ........ ........
[0120] 06 00 00 00 2F 64 61 74  61 00 00 00 3C 00 00 00   ..../dat a...<...
[0130] 00 00 00 00 3C 00 00 00  6E 65 77 20 66 6F 6C 64   ....<... new fold
[0140] 65 72 20 66 72 6F 6D 20  6D 61 63 2F 77 38 6B 76   er from  mac/w8kv
[0150] 76 2D 73 74 68 2D 34 38  66 70 73 2D 31 30 74 6F   v-sth-48 fps-10to
[0160] 31 72 65 64 63 6F 64 65  5F 46 46 2E 52 44 43 2E   1redcode _FF.RDC.
[0170] 7A 69 70 00 00 00 00 00  00 00 00 00 00 00 00 00   zip..... ........
[0180] 00 00 00 00 00 00 00 00  F3 67 A8 5D 00 00 00 00   ........ .g.]....
[0190] 00 00 00 00 00 00 00 00  F3 67 A8 5D 00 00 00 00   ........ .g.]....
[01A0] 00 00 00 00 00 00 00 00  F3 67 A8 5D 00 00 00 00   ........ .g.]....
[01B0] 00 00 00 00 00 00 00 00  F3 67 A8 5D 00 00 00 00   ........ .g.]....
[01C0] 00 00 00 00                                        ....
              channel_sequence     : 0x0000 (0)
              channel_generation   : 0x0000000000000000 (0)
[2019/10/17 14:09:35.785374, 3] ../../source3/smbd/smb2_create.c:800(smbd_smb2_create_send)
Created attachment 1626823 [details] Screenshot of Windows 10 error
What are your current Samba and GlusterFS versions?

(In reply to ryan from comment #8)
> Trying to copy a file with a windows 10 client results in the transfer
> failing with error (See screenshot).

* Does it happen every time you attempt a copy of the same file?
* Is it something specific to a file/directory type?

> mag-desktop-01 (ipv4:10.0.3.12:57488) connect to service Grading initially
> as user editor01 (uid=2000, gid=2900) (pid 296596)

I don't see a share named [Grading] in the smb.conf from comment #6. If that was newly added, were there any changes to the global parameters?
Hi Anoop,

Versions:
Gluster = 6.5
Samba = 4.10.8

> Does it happen every time you attempt a copy of the same file?
> Is it something specific to a file/directory type?

Yes, this happens with any file being copied, or with any new file being created (the write fails). It happens in multiple directories, 100% of the time.

In an effort to reduce the variables in play, I'd changed the config. Complete config below:

[global]
security = user
username map script = /bin/echo
max protocol = SMB3
min protocol = SMB2
ea support = yes
clustering = no
server signing = no
max log size = 10000
glusterfs:loglevel = 5
log file = /var/log/samba/log-%M.smbd
logging = file
log level = 3
template shell = /sbin/nologin
passdb backend = tdbsam
guest account = nobody
map to guest = bad user
force directory mode = 0777
force create mode = 0777
create mask = 0777
directory mask = 0777
hide unreadable = no
unix extensions = no
load printers = no
printing = bsd
printcap name = /dev/null
disable spoolss = yes
glusterfs:volfile_server = localhost
kernel share modes = No

[Grading]
read only = no
guest ok = yes
vfs objects = catia fruit streams_xattr glusterfs
glusterfs:volume = mcv01
path = "/data"
valid users = "nobody" @"audio" @"QC_ops" @"MAGENTA\domain admins" @"MAGENTA\domain users" @"nas_users"
glusterfs:logfile = /var/log/samba/glusterfs-mcv01.%M.log

Best,
Ryan
Hi Anoop,

Did you get a chance to look into this? Can I assist in any way?
Hi Anoop,

I believe we have found the cause of this, but we need some assistance with the workaround.

When running at op-version 40100 with Gluster 6.5 we don't have any issues. However, when running at the maximum cluster op-version of 60000, we get lots of panics in the SMB logs.

I contacted Sernet about this, and it seems the issue is that they still compile the VFS module against Gluster 3.12. We're going to test with a package compiled against 6.5 to see if the issue goes away.

In the meantime, is it possible to downgrade the op-version?

Many thanks,
Ryan
(In reply to ryan from comment #13)
> and it seems the issue is because they still compile the VFS against Gluster 3.12.

GFAPI uses symbol versions. Unless some API got removed (zero chance of this happening), every old version of a modified API must still be present in newer GlusterFS. Assuming the Samba version is maintained, I am curious how such an incompatibility could lead to panics.

> We're going to try testing with a package compiled against 6.5 to see if the
> issue goes away.

How did it go?

> In the meantime, is it possible to downgrade the op-version?

I would suggest staying at the maximum available op-version, to make use of the latest features in the updated GlusterFS.
Hi Anoop,

Below are the test versions and results:

Gluster 4.1 (op-version 40100) + Sernet Samba Gluster VFS (built against Gluster 3.12) = PASS
Gluster 6.5 (op-version 60000) + Sernet Samba Gluster VFS (built against Gluster 3.12) = FAIL
Gluster 6.5 (op-version 40100) + Sernet Samba Gluster VFS (built against Gluster 3.12) = PASS
Gluster 6.5 (op-version 60000) + Sernet Samba Gluster VFS (built against Gluster 6.5) = PASS

The VFS packages Sernet compiled for us against Gluster 6.5 have resolved this issue.
I also downgraded the op-version by modifying the vol config files, which made the Gluster 3.12-built VFS work again and fixed the issue.

Please let me know if you need any more info/data.

Best regards,
Ryan
(In reply to ryan from comment #15)
> Gluster 4.1 (op-version 40100) + Sernet Samba Gluster VFS (Built against
> Gluster 3.12) = PASS

Expected.

> Gluster 6.5 (op-version 60000) + Sernet Samba Gluster VFS (Built against
> Gluster 3.12) = FAIL
> Gluster 6.5 (op-version 40100) + Sernet Samba Gluster VFS (Built against
> Gluster 3.12) = PASS

Just as the GlusterFS VFS module built against v3.12 works fine with op-version 40100, I would expect it to work with op-version 60000 too. Otherwise this needs some investigation.

> Gluster 6.5 (op-version 60000) + Sernet Samba Gluster VFS (Built against
> Gluster 6.5) = PASS

Fine.

> The VFS packages compiled for us by Sernet, against Gluster 6.5 has resolved
> this issue for us.
> I also downgraded the op-version by modifying the vol config files, which
> resulted in the Gluster 3.12 VFS, which fixed the issue.

Good.

> Please let me know if you need any more info/data.

I remember that you were blocked from testing bz #1680085 due to this bug. Can you revisit bz #1680085 now?
IMO the results you see are consistent with the design of the versioned symbols in gfapi; i.e. that old programs (and other consumers of gfapi such as the Samba glusterfs VFS) that were originally compiled and linked with old libraries can be used with newer versions of gfapi without having to rebuild and relink. For now this does imply that gluster needs to use the same (or close) op-version associated with 3.12 if you're using a gluster VFS that was linked with 3.12 libgfapi.
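(Editorial note: the versioned-symbols design described above can be seen in action on any glibc-based Linux, since glibc uses the same mechanism. The sketch below is only an illustration and makes assumptions — that /bin/sh is dynamically linked against glibc, that binutils' readelf is installed, and that memcpy has multiple version tags on this platform, as it does on x86_64 glibc.)

```shell
# Symbol versioning in action: glibc exports memcpy under more than one
# version tag, so binaries linked against an old tag keep resolving it
# even after the library is upgraded.
# Assumes a glibc-based Linux with binutils (readelf) installed.
libc=$(ldd /bin/sh | awk '/libc\.so/ { print $3; exit }')

# In readelf output, '@@' marks the default version that new links bind
# to; a plain '@' marks an older version kept for existing consumers.
readelf -W --dyn-syms "$libc" | grep 'memcpy@' > /tmp/memcpy_versions.txt
cat /tmp/memcpy_versions.txt
```

The same inspection against libgfapi on the Samba node (e.g. readelf -W --dyn-syms /usr/lib64/libgfapi.so.0 | grep glfs_ — the path is distro-dependent) would show which GFAPI_x.y version tags the installed library still exports, which is what lets a vfs_gluster module built against 3.12 run on top of a 6.x libgfapi.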
Hi Kaleb, Thanks for confirming. Is there a recommended way of downgrading the op-version, other than editing the vol file? Best, Ryan
I'm missing a little detail in this bug report. Compiling the vfs_gluster Samba module against glusterfs-3.12 results in a binary that can be used with glusterfs-6.x (on the Gluster client, i.e. the Samba server).

It is not clear to me what version of the Gluster client was used in the tests from comment #15. Did it match the version of the Gluster server, or was it kept at 3.12?
Hi Niels,

Please see the revised results below; does this answer your question?

Gluster Server 4.1 (op-version 40100) + Sernet Samba Gluster VFS (built against Gluster Client 3.12) = PASS
Gluster Server 6.5 (op-version 60000) + Sernet Samba Gluster VFS (built against Gluster Client 3.12) = FAIL
Gluster Server 6.5 (op-version 40100) + Sernet Samba Gluster VFS (built against Gluster Client 3.12) = PASS
Gluster Server 6.5 (op-version 60000) + Sernet Samba Gluster VFS (built against Gluster Client 6.5) = PASS

Best,
Ryan
Does that also mean the Gluster client packages on the Samba server are kept at the "Built against Gluster Client" version? This is not a requirement from a libgfapi gluster-bindings perspective. It is expected to work correctly when Samba is compiled against glusterfs-3.12 but the resulting vfs_gluster module (built against Gluster Client 3.12) runs on a system that has only the glusterfs-6.x versions installed. The built Samba/vfs_gluster binary should be compatible with glusterfs-6.x.

It is recommended that Gluster clients and Gluster servers run the same Gluster version (even when Samba/vfs_gluster is built against an older version of Gluster).
In our test case, the Gluster client and the Samba server are on the same nodes as the Gluster server, so they are all on the same version as the server.

Best,
Ryan
Thanks! In that case I'm really surprised to hear that different op-versions can cause a panic in Samba... Anoop would be the best person to help with this.
This bug is moved to https://github.com/gluster/glusterfs/issues/898, and will be tracked there from now on. Visit GitHub issues URL for further details