Bug 762127 (GLUSTER-395) - Samba Cluster (ctdb) kernel oplock problems
Summary: Samba Cluster (ctdb) kernel oplock problems
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-395
Product: GlusterFS
Classification: Community
Component: locks
Version: 2.0.7
Hardware: x86_64
OS: Linux
low
medium
Target Milestone: ---
Assignee: Raghavendra G
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2009-11-18 14:28 UTC by Matthias
Modified: 2015-12-01 16:45 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments (Terms of Use)

Description Matthias 2009-11-18 14:28:15 UTC
Hi,

i am trying to get the glfs with a samba ctdb running.
So my setup is like this:

2x Bricks (/data) as Server and client, sharing one volume, this volume will be
the samba share (/mnt/glfs).

Here is the glfs config for both bricks. So server and client config are in the same file. See Configs below
====

The problem now i am encountering is: 
I set up an samba cluster using ctdb.
The cluster is running fine, i can copy data to the cluster very well.
But if i try to copy the data back to my workstation i got an error
that some processes are locking the files. Every file.
So i tried it out with samba strickt locks = no , and i am able to copy.

I dig deeper into the samba log an i found this here.
So there might be a problem with the locking on glfs?
I talked to some guys at the samba/ctdb irc and they suggested me to tell this list that it might be a bug.

Someone might dig into the log.
If you need further infos, pls contact me.

Regards Matthias
======================================
.
.
[2009/11/18 14:52:33,  3] smbd/trans2.c:2599(call_trans2qfsinfo)
  call_trans2qfsinfo: level = 261
[2009/11/18 14:52:33,  9] smbd/trans2.c:820(send_trans2_replies)
  t2_rep: params_sent_thistime = 0, data_sent_thistime = 20, useable_space = 131012
[2009/11/18 14:52:33,  9] smbd/trans2.c:822(send_trans2_replies)
  t2_rep: params_to_send = 0, data_to_send = 20, paramsize = 0, datasize = 20
[2009/11/18 14:52:33,  5] lib/util.c:632(show_msg)
[2009/11/18 14:52:33,  5] lib/util.c:642(show_msg)
  size=76
  smb_com=0x32
  smb_rcls=0
  smb_reh=0
  smb_err=0
  smb_flg=136
  smb_flg2=51201
  smb_tid=1
  smb_pid=5556
  smb_uid=100
  smb_mid=61248
  smt_wct=10
  smb_vwv[ 0]=    0 (0x0)
  smb_vwv[ 1]=   20 (0x14)
  smb_vwv[ 2]=    0 (0x0)
  smb_vwv[ 3]=    0 (0x0)
  smb_vwv[ 4]=   56 (0x38)
  smb_vwv[ 5]=    0 (0x0)
  smb_vwv[ 6]=   20 (0x14)
  smb_vwv[ 7]=   56 (0x38)
  smb_vwv[ 8]=    0 (0x0)
  smb_vwv[ 9]=    0 (0x0)
  smb_bcc=21
[2009/11/18 14:52:33, 10] ../lib/util/util.c:304(_dump_data)
  [0000] 00 2F 00 01 00 FF 00 00   00 08 00 00 00 4E 00 54   ./...... .....N.T
  [0010] 00 46 00 53 00                                    .F.S. 
[2009/11/18 14:52:33,  4] smbd/trans2.c:3131(call_trans2qfsinfo)
  SMBtrans2 info_level = 261
[2009/11/18 14:52:33, 10] lib/util_sock.c:789(read_smb_length_return_keepalive)
  got smb length of 59
[2009/11/18 14:52:33,  6] smbd/process.c:1456(process_smb)
  got message type 0x0 of len 0x3b
[2009/11/18 14:52:33,  3] smbd/process.c:1459(process_smb)
  Transaction 55 of length 63 (0 toread)
[2009/11/18 14:52:33,  5] lib/util.c:632(show_msg)
[2009/11/18 14:52:33,  5] lib/util.c:642(show_msg)
  size=59
  smb_com=0x2e
  smb_rcls=0
  smb_reh=0
  smb_err=0
  smb_flg=24
  smb_flg2=51207
  smb_tid=1
  smb_pid=65279
  smb_uid=100
  smb_mid=61312
  smt_wct=12
  smb_vwv[ 0]=  255 (0xFF)
  smb_vwv[ 1]=57054 (0xDEDE)
  smb_vwv[ 2]= 4327 (0x10E7)
  smb_vwv[ 3]=    0 (0x0)
  smb_vwv[ 4]=    0 (0x0)
  smb_vwv[ 5]=61440 (0xF000)
  smb_vwv[ 6]=61440 (0xF000)
  smb_vwv[ 7]=    0 (0x0)
  smb_vwv[ 8]=    0 (0x0)
  smb_vwv[ 9]=61440 (0xF000)
  smb_vwv[10]=    0 (0x0)
  smb_vwv[11]=    0 (0x0)
  smb_bcc=0
[2009/11/18 14:52:33,  3] smbd/process.c:1273(switch_message)
  switch message SMBreadX (pid 3266) conn 0x2b2e5e467920
[2009/11/18 14:52:33,  4] smbd/uid.c:256(change_to_user)
  change_to_user: Skipping user change - already user
[2009/11/18 14:52:33, 10] lib/dbwrap_ctdb.c:878(fetch_locked_internal)
  Locking db 337771384 key B7384B5F57C863650486
[2009/11/18 14:52:33, 10] locking/brlock.c:1849(brl_get_locks_internal)
  brl_get_locks_internal: 0 current locks on file_id 6563c8575f4b38b7:37a28604:0
[2009/11/18 14:52:33, 10] locking/posix.c:284(is_posix_locked)
  is_posix_locked: File IT-Tools/tmp/Stargate.Universe.S01E07.HDTV.XviD-NoTV.avi, offset = 0, count = 61440, type = READ
[2009/11/18 14:52:33, 10] locking/posix.c:172(posix_lock_in_range)
  posix_lock_in_range: offset_out = 0, count_out = 61440
[2009/11/18 14:52:33,  8] locking/posix.c:234(posix_fcntl_getlock)
  posix_fcntl_getlock 39 0 61440 0
[2009/11/18 14:52:33,  8] lib/util.c:1832(fcntl_getlock)
  fcntl_getlock fd=39 offset=0 count=61440 type=0
[2009/11/18 14:52:33,  3] lib/util.c:1856(fcntl_getlock)
  fcntl_getlock: fd 39 is returned info 0 pid 0
[2009/11/18 14:52:33,  8] locking/posix.c:264(posix_fcntl_getlock)
  posix_fcntl_getlock: Lock query call successful
[2009/11/18 14:52:33, 10] locking/brlock.c:1270(brl_locktest)
  brl_locktest: posix start=0 len=61440 locked for fnum 4327 file IT-Tools/tmp/Stargate.Universe.S01E07.HDTV.XviD-NoTV.avi
[2009/11/18 14:52:33, 10] lib/dbwrap_ctdb.c:825(db_ctdb_record_destr)
  Unlocking db 337771384 key B7384B5F57C863650486
[2009/11/18 14:52:33, 10] locking/locking.c:152(strict_lock_default)
  strict_lock_default: flavour = WINDOWS_LOCK brl start=0 len=61440 locked for fnum 4327 file IT-Tools/tmp/Stargate.Universe.S01E07.HDTV.XviD-NoTV.avi
[2009/11/18 14:52:33,  3] smbd/error.c:60(error_packet_set)
  error packet at smbd/reply.c(3302) cmd=46 (SMBreadX) NT_STATUS_FILE_LOCK_CONFLICT
[2009/11/18 14:52:33,  5] lib/util.c:632(show_msg)
[2009/11/18 14:52:33,  5] lib/util.c:642(show_msg)
  size=35
  smb_com=0x2e
  smb_rcls=84
  smb_reh=0
  smb_err=49152
  smb_flg=136
  smb_flg2=51201
  smb_tid=1
  smb_pid=65279
  smb_uid=100
  smb_mid=61312
  smt_wct=0
  smb_bcc=0
[2009/11/18 14:52:33, 10] lib/util_sock.c:789(read_smb_length_return_keepalive)
  got smb length of 41
[2009/11/18 14:52:33,  6] smbd/process.c:1456(process_smb)
  got message type 0x0 of len 0x29
[2009/11/18 14:52:33,  3] smbd/process.c:1459(process_smb)
  Transaction 56 of length 45 (0 toread)
[2009/11/18 14:52:33,  5] lib/util.c:632(show_msg)
[2009/11/18 14:52:33,  5] lib/util.c:642(show_msg)
  size=41
  smb_com=0x4
  smb_rcls=0
  smb_reh=0
  smb_err=0
  smb_flg=24
  smb_flg2=51207
  smb_tid=1
  smb_pid=65279
  smb_uid=100
  smb_mid=61376
  smt_wct=3
  smb_vwv[ 0]= 4327 (0x10E7)
  smb_vwv[ 1]=65535 (0xFFFF)
  smb_vwv[ 2]=65535 (0xFFFF)
  smb_bcc=0
[2009/11/18 14:52:33,  3] smbd/process.c:1273(switch_message)
  switch message SMBclose (pid 3266) conn 0x2b2e5e467920
[2009/11/18 14:52:33,  4] smbd/uid.c:256(change_to_user)
  change_to_user: Skipping user change - already user
[2009/11/18 14:52:33,  3] smbd/reply.c:4483(reply_close)
  close fd=39 fnum=4327 (numopen=1)
[2009/11/18 14:52:33,  6] smbd/close.c:454(set_close_write_time)
  close_write_time: Sun Feb  7 07:28:15 2106
[2009/11/18 14:52:33, 10] lib/dbwrap_ctdb.c:878(fetch_locked_internal)
  Locking db 1123971781 key B7384B5F57C863650486
[2009/11/18 14:52:33, 10] locking/locking.c:552(parse_share_modes)
  parse_share_modes: delete_on_close: 0, owrt: Sa 07 Nov 2009 14:11:42 CET CET, cwrt: Do 01 Jan 1970 01:00:00 CET CET, tok: 0, num_share_modes: 1
[2009/11/18 14:52:33, 10] locking/locking.c:649(parse_share_modes)
  parse_share_modes: share_mode_entry[0]:  pid = 0:3266, share_access = 0x1, private_options = 0x200044, access_mask = 0x20089, mid = 0x0, type= 0x0, gen_id = 5, uid = 1002, flags = 0, file_id 6563c8575f4b38b7:37a28604:0
[2009/11/18 14:52:33, 10] lib/dbwrap_ctdb.c:825(db_ctdb_record_destr)
  Unlocking db 1123971781 key B7384B5F57C863650486
[2009/11/18 14:52:33, 10] locking/posix.c:495(get_windows_lock_ref_count)
  get_windows_lock_count for file IT-Tools/tmp/Stargate.Universe.S01E07.HDTV.XviD-NoTV.avi = 0
[2009/11/18 14:52:33, 10] locking/posix.c:521(delete_windows_lock_ref_count)
  delete_windows_lock_ref_count for file IT-Tools/tmp/Stargate.Universe.S01E07.HDTV.XviD-NoTV.avi
[2009/11/18 14:52:33,  2] smbd/close.c:612(close_normal_file)
  mn closed file IT-Tools/tmp/Stargate.Universe.S01E07.HDTV.XviD-NoTV.avi (numopen=0) NT_STATUS_OK
[2009/11/18 14:52:33,  5] smbd/files.c:476(file_free)
  freed files structure 4327 (0 used)
[2009/11/18 14:52:33,  5] lib/util.c:632(show_msg)
[2009/11/18 14:52:33,  5] lib/util.c:642(show_msg)
  size=35
  smb_com=0x4
  smb_rcls=0
  smb_reh=0
  smb_err=0
  smb_flg=136
  smb_flg2=51201
  smb_tid=1
  smb_pid=65279
  smb_uid=100
  smb_mid=61376
  smt_wct=0
  smb_bcc=0
[2009/11/18 14:52:43, 10] lib/util_sock.c:789(read_smb_length_return_keepalive)
  got smb length of 39
[2009/11/18 14:52:43,  6] smbd/process.c:1456(process_smb)
  got message type 0x0 of len 0x27
[2009/11/18 14:52:43,  3] smbd/process.c:1459(process_smb)
  Transaction 57 of length 43 (0 toread)
[2009/11/18 14:52:43,  5] lib/util.c:632(show_msg)
[2009/11/18 14:52:43,  5] lib/util.c:642(show_msg)
  size=39
  smb_com=0x74
  smb_rcls=0
  smb_reh=0
  smb_err=0
  smb_flg=24
  smb_flg2=51207
  smb_tid=0
  smb_pid=65279
  smb_uid=101
  smb_mid=61440
  smt_wct=2
  smb_vwv[ 0]=  255 (0xFF)
  smb_vwv[ 1]=    0 (0x0)
  smb_bcc=0
[2009/11/18 14:52:43,  3] smbd/process.c:1273(switch_message)
  switch message SMBulogoffX (pid 3266) conn 0x0
[2009/11/18 14:52:43,  3] smbd/sec_ctx.c:310(set_sec_ctx)
  setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
[2009/11/18 14:52:43,  5] auth/token_util.c:522(debug_nt_user_token)
  NT user token: (NULL)
[2009/11/18 14:52:43,  5] auth/token_util.c:548(debug_unix_user_token)
  UNIX token of user 0
  Primary group is 0 and contains 0 supplementary groups
[2009/11/18 14:52:43,  5] smbd/uid.c:368(change_to_root_user)
  change_to_root_user: now uid=(0,0) gid=(0,0)
[2009/11/18 14:52:43,  3] smbd/reply.c:1943(reply_ulogoffX)
  ulogoffX vuid=101
[2009/11/18 14:52:43, 10] lib/util_sock.c:789(read_smb_length_return_keepalive)
  got smb length of 35
[2009/11/18 14:52:43,  6] smbd/process.c:1456(process_smb)
  got message type 0x0 of len 0x23
[2009/11/18 14:52:43,  3] smbd/process.c:1459(process_smb)
  Transaction 58 of length 39 (0 toread)
[2009/11/18 14:52:43,  5] lib/util.c:632(show_msg)
[2009/11/18 14:52:43,  5] lib/util.c:642(show_msg)

============================
Some locks before that samba log..
node 201
=========
[root@dev1 ~]# cat /proc/locks 
1: POSIX  ADVISORY  READ  23559 08:03:6750221 4 4
2: POSIX  ADVISORY  READ  24361 08:03:65142799 4 4
3: POSIX  ADVISORY  READ  23737 08:03:65142802 4 4
4: POSIX  ADVISORY  READ  23741 08:03:65142799 4 4
5: POSIX  ADVISORY  READ  23737 08:03:65142803 4 4
6: POSIX  ADVISORY  READ  23737 08:03:65142799 4 4
7: POSIX  ADVISORY  WRITE 23737 08:03:65142801 0 0
8: POSIX  ADVISORY  READ  23559 08:03:6750215 4 4
9: POSIX  ADVISORY  READ  23559 08:03:6750216 4 4
10: POSIX  ADVISORY  READ  23559 08:03:6750214 4 4
11: POSIX  ADVISORY  READ  23559 08:03:6750213 4 4
12: POSIX  ADVISORY  READ  23734 08:03:65142799 4 4
13: POSIX  ADVISORY  WRITE 23734 08:03:65142800 0 0
14: POSIX  ADVISORY  WRITE 3042 08:03:65044692 0 EOF
15: FLOCK  ADVISORY  WRITE 3012 08:03:65044688 0 EOF
16: POSIX  ADVISORY  WRITE 2982 08:03:65044684 0 EOF
17: POSIX  ADVISORY  WRITE 2957 08:03:65044682 0 EOF
[root@dev1 ~]# man ls
[root@dev1 ~]# ll /mnt/glfs/Team/IT-Tools/tmp/ -i
insgesamt 35765080
933398006 -rwxrwxrwx 1 ce users 4681566208 15. Nov 15:53 A
933398016 -rwxrwxrwx 1 ce users 4681469952 29. Mär 2009  BUR
933398018 -rwxrwxrwx 1 ce users  210274304 13. Nov 18:03 DAS_
933398010 -rwxrwxrwx 1 ce users 4681486336  6. Nov 19:52 DAS_
933398012 -rwxrwxrwx 1 ce users 4681459712  6. Nov 20:24 DIE_
933398014 -rwxrwxrwx 1 ce users 4681535488  6. Nov 11:27 NIG
933398008 -rwxrwxrwx 1 ce users 4614637568 15. Nov 13:59 P
933398032 -rwxrwxrwx 1 ce users  733945420  3. Okt 22:24 Stargate.U
933398034 -rwxrwxrwx 1 ce users  366611240  1. Nov 16:27 Stargate.
933398036 -rwxrwxrwx 1 ce users  366991868  1. Nov 16:42 Stargate.Un
933398038 -rwxrwxrwx 1 ce users  366997482  1. Nov 16:37 Stargate.Univ
933398040 -rwxrwxrwx 1 ce users  367861402 31. Okt 18:48 Stargate.Unive
933398020 -rwxrwxrwx 1 ce users  366935682  7. Nov 14:11 Stargate.Uni
933398022 -rwxrwxrwx 1 ce users  366799682  8. Nov 15:31 Star.Wars
933398024 -rwxrwxrwx 1 ce users  186093804  8. Nov 14:53 Star.Wars.
933398026 -rwxrwxrwx 1 ce users  183494034  8. Nov 15:02 Star.Wars.
933398028 -rwxrwxrwx 1 ce users  183897142  8. Nov 15:05 Star.War
933398030 -rwxrwxrwx 1 ce users  183855040 14. Nov 13:59 Star
933398004 -rwxrwxrwx 1 ce users 4681517056 15. Nov 18:37 TRAN
[root@dev1 ~]# ll /data/Team/IT-Tools/tmp/ -i
insgesamt 35765080
466699003 -rwxrwxrwx 1 ce users 4681566208 15. Nov 15:53 A
466699008 -rwxrwxrwx 1 ce users 4681469952 29. Mär 2009  BU
466699009 -rwxrwxrwx 1 ce users  210274304 13. Nov 18:03 D
466699005 -rwxrwxrwx 1 ce users 4681486336  6. Nov 19:52 DA
466699006 -rwxrwxrwx 1 ce users 4681459712  6. Nov 20:24 DIE
466699007 -rwxrwxrwx 1 ce users 4681535488  6. Nov 11:27 NIG
466699004 -rwxrwxrwx 1 ce users 4614637568 15. Nov 13:59 P
466699016 -rwxrwxrwx 1 ce users  733945420  3. Okt 22:24 Stargate.Unive
466699017 -rwxrwxrwx 1 ce users  366611240  1. Nov 16:27 Stargate.Univ
466699018 -rwxrwxrwx 1 ce users  366991868  1. Nov 16:42 Stargate.Un
466699019 -rwxrwxrwx 1 ce users  366997482  1. Nov 16:37 Stargate.Un
466699020 -rwxrwxrwx 1 ce users  367861402 31. Okt 18:48 Stargate.Un
466699010 -rwxrwxrwx 1 ce users  366935682  7. Nov 14:11 Stargate.U
466699011 -rwxrwxrwx 1 ce users  366799682  8. Nov 15:31 Star.Wars.
466699012 -rwxrwxrwx 1 ce users  186093804  8. Nov 14:53 Star.Wars.Thevi
466699013 -rwxrwxrwx 1 ce users  183494034  8. Nov 15:02 Star.Wars..Clone.Wars
466699015 -rwxrwxrwx 1 ce users  183855040 14. Nov 13:59 Star.Wars.
466699002 -rwxrwxrwx 1 ce users 4681517056 15. Nov 18:37 TR

---------------------------------------

 # file: /etc/glusterfs/glusterfs-server.vol
volume posix
  type storage/posix
  option directory /data
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 8
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow *
  subvolumes brick
end-volume

# file: /etc/glusterfs/glusterfs-client.vol

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.60.201
  option remote-subvolume brick
end-volume

volume remote3
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.60.202
  option remote-subvolume brick
end-volume

volume replicate1
  type cluster/replicate
  subvolumes remote2 remote3
end-volume

volume distribute
  type cluster/distribute
   subvolumes replicate1
end-volume

volume writebehind
  type performance/write-behind
  option window-size 1MB
  subvolumes distribute
end-volume

volume cache
  type performance/io-cache
  option cache-size 512MB
  subvolumes writebehind
end-volume
================================

Comment 1 Pavan Vilas Sondur 2010-02-02 03:53:03 UTC
Is this problem still seen? Can you upgrade to 3.0 and check if ctdb throws up locking issues? Use 'option trace on' in the locks section of the volfile and attach the log it gives out? It traces all locks and dumps that info into the log file.

Comment 2 Matthias 2010-07-30 04:39:18 UTC
(Mit Bezug zu comment #1)
> Is this problem still seen? Can you upgrade to 3.0 and check if ctdb throws up
> locking issues? Use 'option trace on' in the locks section of the volfile and
> attach the log it gives out? It traces all locks and dumps that info into the
> log file.

I upgraded to 3.0.5 , it seems to work now.

I encountered now some more fancy effects..-> new ticket.

Comment 3 Tejas Bhise 2010-08-10 02:44:17 UTC
Pavan,

From the updates in the end, it looks like it can be closed. Can you take a look.

Regards,
Tejas.


Note You need to log in before you can comment on or make changes to this bug.