Description of problem:
lock test is failing for stripe-replicate

Version-Release number of selected component (if applicable):
Mainline

How reproducible:
often

Steps to Reproduce:
1. create a stripe-replicate volume
2. mount it
3. run lock tests

Actual results:
Init process initalization ....................
--------------------------------------
TEST : TRY TO WRITE ON A READ LOCK:==========
TEST : TRY TO WRITE ON A WRITE LOCK:==========
TEST : TRY TO READ ON A READ LOCK:==========
TEST : TRY TO READ ON A WRITE LOCK:==========
TEST : TRY TO SET A READ LOCK ON A READ LOCK:==========
TEST : TRY TO SET A WRITE LOCK ON A WRITE LOCK:Master: can't set lock : Resource temporarily unavailable
Echec : Resource temporarily unavailable

Expected results:
should not abort

Additional info:
This issue seems to be related to replica; the test passes on stripe/dht volumes.

----------------------
strace output:

for non replica:
open("test", O_RDWR|O_CREAT|O_SYNC, 0600) = 25
write(1, "\n", 1) = 1
write(0, "TEST : TRY TO SET A WRITE LOCK O"..., 47) = 47
write(25, "Ceci est une phrase test \303\251crite"..., 62) = 62
fcntl(25, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}) = 0
...

-----------------
for replica:
open("test", O_RDWR|O_CREAT|O_SYNC, 0600) = 25
write(1, "\n", 1) = 1
write(0, "TEST : TRY TO SET A WRITE LOCK O"..., 47) = 47
write(25, "Ceci est une phrase test \303\251crite"..., 62) = 62
fcntl(25, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = -1 EAGAIN (Resource temporarily unavailable)
dup(2) = 26
fcntl(26, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
fstat(26, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fde74e64000
lseek(26, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
write(26, "Master: can't set lock\n", 23) = 23
write(26, ": Resource temporarily unavailab"..., 35) = 35

------------
Actual run:
root@shishirng:/mnt# strace -o /tmp/trace.afr /opt/qa/tools/locks/locktests -n 10 -f test
Init process initalization ....................
--------------------------------------
TEST : TRY TO WRITE ON A READ LOCK:==========
TEST : TRY TO WRITE ON A WRITE LOCK:==========
TEST : TRY TO READ ON A READ LOCK:==========
TEST : TRY TO READ ON A WRITE LOCK:==========
TEST : TRY TO SET A READ LOCK ON A READ LOCK:==========
TEST : TRY TO SET A WRITE LOCK ON A WRITE LOCK:Master: can't set lock : Resource temporarily unavailable
Echec : Resource temporarily unavailable

---------------------
Error logs:
[2011-12-29 12:43:53.939349] D [afr-lk-common.c:405:transaction_lk_op] 0-new1-replicate-0: lk op is for a transaction
[2011-12-29 12:43:53.939367] D [afr-lk-common.c:606:afr_unlock_inodelk] 0-new1-replicate-0: attempting data unlock range 0 0 by 139822922292356
[2011-12-29 12:43:53.939802] D [afr-lk-common.c:1426:afr_nonblocking_inodelk] 0-new1-replicate-0: attempting data lock range 0 62 by 139822922293936
[2011-12-29 12:43:53.940253] D [fuse-bridge.c:3043:fuse_setlk_cbk] 0-glusterfs-fuse: Returning EAGAIN Flock: start=0, len=0, pid=542, lk-owner=1483090 1126550067613
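The failing syscall sequence from the replica trace (open with O_SYNC, a write, then a non-blocking whole-file write lock) can be replayed without the full locktests run. The following is only a minimal sketch under assumptions: the function name write_then_wrlck and the payload are made up for illustration, and it uses Python's fcntl.lockf(), which issues fcntl(F_SETLK) when LOCK_NB is set. It is not the locktests tool and has not been run against a replicate volume.

```python
import fcntl
import os
import tempfile


def write_then_wrlck(path, payload=b"Ceci est une phrase test\n"):
    """Write to `path`, then try a non-blocking whole-file write lock.

    Returns 0 if the lock is granted, otherwise the errno of the failure
    (EAGAIN is what the replica trace above shows).
    Hypothetical helper, named here for illustration only.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_SYNC, 0o600)
    try:
        os.write(fd, payload)
        try:
            # LOCK_EX corresponds to F_WRLCK; LOCK_NB selects F_SETLK
            # (non-blocking). len=0, start=0, SEEK_SET covers the whole file.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 0, 0, os.SEEK_SET)
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        os.close(fd)  # closing the descriptor also drops the lock


if __name__ == "__main__":
    # Demo on a local temporary directory; point `target` at a file on the
    # mounted volume to compare behaviour.
    target = os.path.join(tempfile.mkdtemp(), "test")
    print("lock result (0 = granted):", write_then_wrlck(target))
```

On a local filesystem the lock is expected to be granted; a non-zero return on the mounted volume would match the EAGAIN seen in the trace.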
The issue is that write-behind does not barrier lk calls behind the completion of the flush-"behind" call. Extending write-behind's barrier to cover all background operations, and making the lk() call enter the wb queue, will fix this problem the right way.
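The ordering problem can be pictured with a toy queue. This is a simplified model under assumptions, not GlusterFS code: the class WriteBehindQueue and its methods are hypothetical names, "pending" stands for operations already acknowledged to the application but not yet passed down, and "sent" records the order in which operations reach the next layer.

```python
from collections import deque


class WriteBehindQueue:
    """Toy model of a write-behind queue (hypothetical, for illustration)."""

    def __init__(self):
        self.pending = deque()  # acknowledged to the caller, not yet sent down
        self.sent = []          # order in which operations reach the next layer

    def write(self, data):
        # The write is acknowledged immediately and completed in the background.
        self.pending.append(("write", data))

    def flush_behind(self):
        # The flush also returns early and is completed in the background.
        self.pending.append(("flush", None))

    def drain(self):
        # Complete every background operation, in order.
        while self.pending:
            self.sent.append(self.pending.popleft())

    def lk(self, lock_type, barrier=True):
        # With the barrier, the lock request waits for all queued operations;
        # without it, the lock can overtake writes that are still pending.
        if barrier:
            self.drain()
        self.sent.append(("lk", lock_type))


if __name__ == "__main__":
    for barrier in (False, True):
        q = WriteBehindQueue()
        q.write(b"data")
        q.flush_behind()
        q.lk("F_WRLCK", barrier=barrier)
        print("barrier =", barrier, "->", [op for op, _ in q.sent])
```

Without the barrier the lock request reaches the lower layer while the write and flush are still queued; with it, the lock is ordered after them, which is the behaviour the proposed fix aims for.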
Du, can you please resend your patches with rebase, review comments? (for master branch only for now).
(In reply to comment #3)
> Du, can you please resend your patches with rebase, review comments? (for
> master branch only for now).

It's been done; the patch applies fine on 65c6e3706f529094717992 and passes tests.

regards,
Raghavendra.
http://review.gluster.org/2610 needs a rebase.
REVIEW: http://review.gluster.org/2610 (performance/write-behind: implement lk.) posted (#11) for review on master by Raghavendra G (raghavendra)
REVIEW: http://review.gluster.org/2610 (performance/write-behind: implement lk.) posted (#12) for review on master by Raghavendra G (raghavendra)
REVIEW: http://review.gluster.org/2610 (performance/write-behind: implement lk.) posted (#13) for review on master by Raghavendra G (raghavendra)
Because of the large number of bugs filed against it, the mainline version is ambiguous and about to be removed as a choice. If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.