Bug 1717824 - Fencing: tcmu-runner ALUA feature support added, but after one of the nodes is rebooted glfs_file_lock() gets stuck
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: locks
Version: mainline
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Susant Kumar Palai
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1740494
 
Reported: 2019-06-06 09:35 UTC by Xiubo Li
Modified: 2019-08-13 05:59 UTC
CC: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1740077 1740494 (view as bug list)
Environment:
Last Closed: 2019-08-12 09:29:58 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)
pstack on node rhel3 (9.18 KB, text/plain), 2019-06-06 09:39 UTC, Xiubo Li
pstack on node rhel1 (10.76 KB, text/plain), 2019-06-06 09:40 UTC, Xiubo Li


Links
System ID Private Priority Status Summary Last Updated
Gluster.org Gerrit 23088 0 None Open locks/fencing: Address hang while lock preemption 2019-08-02 14:27:14 UTC

Description Xiubo Li 2019-06-06 09:35:29 UTC
Description of problem:

GlusterFS now supports the fencing feature. With this we can support the ALUA feature in LIO/TCMU.

The fencing doc: https://review.gluster.org/#/c/glusterfs-specs/+/21925/6/accepted/fencing.md

The fencing test example: https://review.gluster.org/#/c/glusterfs/+/21496/12/tests/basic/fencing/fence-basic.c

The LIO/tcmu-runner PR of supporting the ALUA is : https://github.com/open-iscsi/tcmu-runner/pull/554.

But currently, when testing this with the above tcmu-runner PR by shutting down one of the HA nodes and starting it again after 2~3 minutes, on all the HA nodes we can see that glfs_file_lock() gets stuck. The following is from /var/log/tcmu-runner.log:

====
2019-06-06 13:50:15.755 1316 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block3: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:50:15.757 1316 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block3: lock call done. lock state 1
2019-06-06 13:50:55.845 1316 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block4: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:50:55.847 1316 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block4: lock call done. lock state 1
2019-06-06 13:57:50.102 1315 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block3: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:57:50.103 1315 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block3: lock call done. lock state 1
2019-06-06 13:57:50.121 1315 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block4: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:57:50.132 1315 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block4: lock call done. lock state 1
2019-06-06 14:09:03.654 1328 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block3: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 14:09:03.662 1328 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block3: lock call done. lock state 1
2019-06-06 14:09:06.700 1328 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block4: lock call state 2 retries 0. tag 65535 reopen 0
====

The lock operation never returns.

I am using the following glusterfs built by myself:

# rpm -qa|grep glusterfs
glusterfs-extra-xlators-7dev-0.0.el7.x86_64
glusterfs-api-devel-7dev-0.0.el7.x86_64
glusterfs-7dev-0.0.el7.x86_64
glusterfs-server-7dev-0.0.el7.x86_64
glusterfs-cloudsync-plugins-7dev-0.0.el7.x86_64
glusterfs-resource-agents-7dev-0.0.el7.noarch
glusterfs-api-7dev-0.0.el7.x86_64
glusterfs-devel-7dev-0.0.el7.x86_64
glusterfs-regression-tests-7dev-0.0.el7.x86_64
glusterfs-gnfs-7dev-0.0.el7.x86_64
glusterfs-client-xlators-7dev-0.0.el7.x86_64
glusterfs-geo-replication-7dev-0.0.el7.x86_64
glusterfs-debuginfo-7dev-0.0.el7.x86_64
glusterfs-fuse-7dev-0.0.el7.x86_64
glusterfs-events-7dev-0.0.el7.x86_64
glusterfs-libs-7dev-0.0.el7.x86_64
glusterfs-cli-7dev-0.0.el7.x86_64
glusterfs-rdma-7dev-0.0.el7.x86_64
 

How reproducible:
30%.

Steps to Reproduce:
1. Create one replicated volume (HA >= 2) with mandatory locking enabled.
2. Create one gluster-blockd target.
3. Log in and run fio on the client node.
4. Shut down one of the HA nodes, wait 2~3 minutes, and then start it again.

Actual results:
fio never recovers, the read/write bandwidth stays at 0 KB/s, and the /var/log/tcmu-runner.log file fills with logs like:

2019-06-06 15:01:06.641 1328 [DEBUG] alua_implicit_transition:561 glfs/block4: Lock acquisition operation is already in process.
2019-06-06 15:01:06.648 1328 [DEBUG_SCSI_CMD] tcmu_cdb_print_info:353 glfs/block4: 28 0 0 3 1f 80 0 0 8 0 
2019-06-06 15:01:06.648 1328 [DEBUG] alua_implicit_transition:561 glfs/block4: Lock acquisition operation is already in process.
2019-06-06 15:01:06.655 1328 [DEBUG_SCSI_CMD] tcmu_cdb_print_info:353 glfs/block4: 28 0 0 3 1f 80 0 0 8 0 
2019-06-06 15:01:06.655 1328 [DEBUG] alua_implicit_transition:561 glfs/block4: Lock acquisition operation is already in process.
2019-06-06 15:01:06.661 1328 [DEBUG_SCSI_CMD] tcmu_cdb_print_info:353 glfs/block4: 28 0 0 3 1f 80 0 0 8 0 
2019-06-06 15:01:06.662 1328 [DEBUG] alua_implicit_transition:561 glfs/block4: Lock acquisition operation is already in process.


Expected results:
fio should recover once the shut-down node comes back up.

Comment 1 Xiubo Li 2019-06-06 09:39:50 UTC
Created attachment 1577819 [details]
pstack on node rhel3

Comment 2 Xiubo Li 2019-06-06 09:40:16 UTC
Created attachment 1577820 [details]
pstack on node rhel1

Comment 3 Xiubo Li 2019-06-06 09:42:10 UTC
The bt output from gdb:

[root@rhel1 ~]# gdb -p 1325

(gdb) bt
#0  0x00007fc7761baf47 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007fc7773de468 in event_dispatch_epoll (event_pool=0x559f03d4b560) at event-epoll.c:847
#2  0x0000559f02419658 in main (argc=21, argv=0x7fff9c6722c8) at glusterfsd.c:2871
(gdb)


[root@rhel3 ~]# gdb -p 7669

(gdb) bt
#0  0x00007fac80bd9f47 in pthread_join () from /usr/lib64/libpthread.so.0
#1  0x00007fac81dfd468 in event_dispatch_epoll (event_pool=0x55de6f845560) at event-epoll.c:847
#2  0x000055de6f143658 in main (argc=21, argv=0x7ffcafc3eff8) at glusterfsd.c:2871
(gdb)

The pl_inode->fop_wind_count is:

(gdb) thread 2
[Switching to thread 2 (Thread 0x7fc742184700 (LWP 1829))]
#0  0x00007fc7761bd965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) frame 2
#2  0x00007fc76379c13b in pl_lk (frame=frame@entry=0x7fc750001128, this=this@entry=0x7fc75c0128f0, 
    fd=fd@entry=0x7fc73c0977d8, cmd=cmd@entry=6, flock=flock@entry=0x7fc73c076938, 
    xdata=xdata@entry=0x7fc73c071828) at posix.c:2637
2637	                ret = pl_lock_preempt(pl_inode, reqlock);
(gdb) p pl_inode->fop_wind_count
$1 = -30
(gdb)


For the pstack logs, please see the attachments.


Thanks.

BRs

Comment 4 Susant Kumar Palai 2019-06-10 07:10:33 UTC
Just a small update: there are cases where fop_wind_count can go negative. A basic fix would be to never decrement its value once it reaches zero. I will update more on this later as I am busy with a few other issues at the moment.


Susant

Comment 5 Xiubo Li 2019-07-17 08:15:51 UTC
Hi Susant,

Is there any new update about this ?

Thanks.

Comment 6 Susant Kumar Palai 2019-07-17 08:20:56 UTC
(In reply to Xiubo Li from comment #5)
> Hi Susant,
> 
> Is there any new update about this ?
> 
> Thanks.

Hey Xiubo, most likely will be sending a patch by end of day today.

Comment 7 Xiubo Li 2019-07-17 08:22:28 UTC
(In reply to Susant Kumar Palai from comment #6)
> (In reply to Xiubo Li from comment #5)
> > Hi Susant,
> > 
> > Is there any new update about this ?
> > 
> > Thanks.
> 
> Hey Xiubo, most likely will be sending a patch by end of day today.

Sure and take your time Susant please :-)
Thanks very much.
BRs

Comment 8 Susant Kumar Palai 2019-07-17 09:13:59 UTC
Moved to POST by mistake. Resetting.

Comment 9 Worker Ant 2019-07-22 09:59:10 UTC
REVIEW: https://review.gluster.org/23088 (locks/fencing: Address hang while lock preemption) posted (#1) for review on master by Susant Palai

Comment 10 Susant Kumar Palai 2019-07-22 10:02:07 UTC
Xiubo, if you could test it out, it would be great. (Make sure you enable fencing before you create any client)

Comment 11 Xiubo Li 2019-07-22 10:42:38 UTC
(In reply to Susant Kumar Palai from comment #10)
> Xiubo, if you could test it out, it would be great. (Make sure you enable
> fencing before you create any client)

@Susant,

Yeah, thanks very much for your work on this. And I will test it late today or tomorrow.

BRs
Xiubo

Comment 12 Xiubo Li 2019-07-23 02:44:00 UTC
@Susant,

Two issues were found:


1. In my testing, glfs_file_lock() sometimes takes around 43 seconds. Is that normal? And why?

 90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0: lxb--------------glfs_file_lock start
 ...
 319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0: lxb--------------glfs_file_lock end

2. After the lock is broken, all the fio callbacks return -1 with -EPERM, not the -EBUSY we discussed before. Has the return value changed? I only check for -EBUSY and -ENOTCONN, and only after seeing one of those is the lock state on the local tcmu node changed. Otherwise the local state in the tcmu-runner service stays LOCKED even though it has actually lost the lock and should be UNLOCKED, so all the I/Os fail.



Thanks,
BRs

Comment 13 Susant Kumar Palai 2019-07-23 05:45:01 UTC
(In reply to Xiubo Li from comment #12)
> @Susant,
> 
> There 2 issues are found:
> 
> 
> 1, From my test the glfs_file_lock() sometimes it will takes around 43
> seconds, is it normal ? And why ?
> 
>  90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> lxb--------------glfs_file_lock start
>  ...
>  319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> lxb--------------glfs_file_lock end

I wonder if it is related to draining of fops. Let me do some testing around this.



> 
> 2, After the lock is broke and all the FIOs callback will return -1, the
> -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> that the lock state in local tcmu node will be changed. Or the local state
> in tcmu-runner service will always in LOCKED state, but it actually already
> lost the lock and should be in UNLOCKED state, so all the IOs will fail.

This is interesting. Will get back after some code checking.

> 
> 
> 
> Thanks,
> BRs

Comment 14 Xiubo Li 2019-07-23 05:47:15 UTC
(In reply to Susant Kumar Palai from comment #13)
> (In reply to Xiubo Li from comment #12)
> > @Susant,
> > 
> > There 2 issues are found:
> > 
> > 
> > 1, From my test the glfs_file_lock() sometimes it will takes around 43
> > seconds, is it normal ? And why ?
> > 
> >  90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> > lxb--------------glfs_file_lock start
> >  ...
> >  319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> > lxb--------------glfs_file_lock end
> 
> I wonder if it is related to draining of fops. Let me do some testing around
> this.
> 

Sure.

> 
> 
> > 
> > 2, After the lock is broke and all the FIOs callback will return -1, the
> > -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> > return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> > that the lock state in local tcmu node will be changed. Or the local state
> > in tcmu-runner service will always in LOCKED state, but it actually already
> > lost the lock and should be in UNLOCKED state, so all the IOs will fail.
> 
> This is interesting. Will get back after some code checking.
> 

Please take your time @Susant.

Thanks,
BRs

Comment 15 Susant Kumar Palai 2019-07-23 07:44:22 UTC
(In reply to Xiubo Li from comment #12)
> @Susant,
> 
> There 2 issues are found:
> 
> 
> 1, From my test the glfs_file_lock() sometimes it will takes around 43
> seconds, is it normal ? And why ?
> 
>  90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> lxb--------------glfs_file_lock start
>  ...
>  319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> lxb--------------glfs_file_lock end

Checked the time taken for file_lock and it completes immediately for me.

    ret = glfs_fsetxattr(fd1, GF_ENFORCE_MANDATORY_LOCK, "set", 8, 0);
    if (ret < 0) {
        LOG_ERR("glfs_fsetxattr", errno);
        ret = -1;
        goto out;
    }

    time(&before);
    /* take a write mandatory lock */
    ret = glfs_file_lock(fd1, F_SETLKW, &lock, GLFS_LK_MANDATORY);
    if (ret) {
        LOG_ERR("glfs_file_lock", errno);
        goto out;
    }
    time(&after);
    diff = (unsigned long )after - before;
    fprintf(fp, "time %lu %lu %lu\n", diff, before, after);

time 0 1563867824 1563867824

Can you attach the brick log here when you run the test next time?

> 
> 2, After the lock is broke and all the FIOs callback will return -1, the
> -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> that the lock state in local tcmu node will be changed. Or the local state
> in tcmu-runner service will always in LOCKED state, but it actually already
> lost the lock and should be in UNLOCKED state, so all the IOs will fail.

Please attach the brick log after enabling trace logging. <gluster v set <vol> brick-log-level TRACE>
> 
> 
> 
> Thanks,
> BRs

Comment 16 Xiubo Li 2019-07-23 07:48:58 UTC
(In reply to Susant Kumar Palai from comment #15)
> (In reply to Xiubo Li from comment #12)
> > @Susant,
> > 
> > There 2 issues are found:
> > 
> > 
> > 1, From my test the glfs_file_lock() sometimes it will takes around 43
> > seconds, is it normal ? And why ?
> > 
> >  90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> > lxb--------------glfs_file_lock start
> >  ...
> >  319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> > lxb--------------glfs_file_lock end
> 
> Checked the time taken for file_lock and it completes immediately for me.
> 
>     ret = glfs_fsetxattr(fd1, GF_ENFORCE_MANDATORY_LOCK, "set", 8, 0);
>     if (ret < 0) {
>         LOG_ERR("glfs_fsetxattr", errno);
>         ret = -1;
>         goto out;
>     }
> 
>     time(&before);
>     /* take a write mandatory lock */
>     ret = glfs_file_lock(fd1, F_SETLKW, &lock, GLFS_LK_MANDATORY);
>     if (ret) {
>         LOG_ERR("glfs_file_lock", errno);
>         goto out;
>     }
>     time(&after);
>     diff = (unsigned long )after - before;
>     fprintf(fp, "time %lu %lu %lu\n", diff, before, after);
> 
> time 0 1563867824 1563867824
> 
> Can you attach the brick log here when you run the test next time?
> 
> > 
> > 2, After the lock is broke and all the FIOs callback will return -1, the
> > -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> > return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> > that the lock state in local tcmu node will be changed. Or the local state
> > in tcmu-runner service will always in LOCKED state, but it actually already
> > lost the lock and should be in UNLOCKED state, so all the IOs will fail.
> 
> Please attach the brick log after enabling trace logging. <gluster v set
> <vol> brick-log-level TRACE>
> > 
> > 
> > 

Sure, I will do that after finishing the work at hand, possibly late today or tomorrow morning.
Thanks
BRs

Comment 17 Susant Kumar Palai 2019-07-25 10:05:47 UTC
On the permission denied: 

I did not see any error related to EPERM but saw EBUSY in the brick logs.


[2019-07-24 08:15:22.236283] E [MSGID: 101191] [event-epoll.c:765:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2019-07-24 08:15:46.083306] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 29: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.088292] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 31: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.119463] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 33: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.124067] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 35: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.294554] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 37: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.298672] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 39: READV 0 (7db899f8-bf56-4b


Is it possible that the lower layer is converting the errnos to EPERM? Can you check gfapi logs and tcmu logs for corresponding error messages and confirm?

Comment 18 Xiubo Li 2019-07-25 10:39:23 UTC
(In reply to Susant Kumar Palai from comment #17)
> On the permission denied: 
> 
> I did not see any error related to EPERM but saw EBUSY in the brick logs.
> 
> 
> [2019-07-24 08:15:22.236283] E [MSGID: 101191]
> [event-epoll.c:765:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
> handler
> [2019-07-24 08:15:46.083306] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 29: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.088292] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 31: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.119463] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 33: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.124067] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 35: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.294554] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 37: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.298672] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 39: READV 0
> (7db899f8-bf56-4b
> 
> 
> Is it possible that the lower layer is converting the errnos to EPERM? Can
> you check gfapi logs and tcmu logs for corresponding error messages and
> confirm?

If so, maybe gfapi is doing this. I will send you the gfapi logs; the EPERM value comes directly from gfapi, and tcmu-runner does nothing with it.

Checked the gfapi log, it is also full of:

[2019-07-24 08:23:41.042339] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042381] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042556] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042574] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042655] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042671] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042709] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042722] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042784] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]

Checked the gfapi source code:

 677 out:
 678     if (rsp.op_ret == -1) {   
 679         gf_msg(this->name, GF_LOG_WARNING, gf_error_to_errno(rsp.op_errno),                                                                               
 680                PC_MSG_REMOTE_OP_FAILED, "remote operation failed");
 681     } else if (rsp.op_ret >= 0) {
 682         if (local->attempt_reopen)
 683             client_attempt_reopen(local->fd, this);
 684     }
 685     CLIENT_STACK_UNWIND(writev, frame, rsp.op_ret,
 686                         gf_error_to_errno(rsp.op_errno), &prestat, &poststat,
 687                         xdata);
 688   
 689     if (xdata)
 690         dict_unref(xdata);   


It seems the return value is converted.

Thanks,
BRs

Comment 19 Xiubo Li 2019-07-25 10:42:32 UTC
(In reply to Xiubo Li from comment #18)
> (In reply to Susant Kumar Palai from comment #17)
[...]
> 
> Checked the gfapi source code:
> 
>  677 out:
>  678     if (rsp.op_ret == -1) {   


It seems it is rsp.op_ret that is returned here to the callback:

static void glfs_async_cbk(glfs_fd_t *fd, ssize_t ret, void *data)

Not rsp.op_errno.


>  679         gf_msg(this->name, GF_LOG_WARNING,
> gf_error_to_errno(rsp.op_errno),                                            
> 
>  680                PC_MSG_REMOTE_OP_FAILED, "remote operation failed");
>  681     } else if (rsp.op_ret >= 0) {
>  682         if (local->attempt_reopen)
>  683             client_attempt_reopen(local->fd, this);
>  684     }
>  685     CLIENT_STACK_UNWIND(writev, frame, rsp.op_ret,
>  686                         gf_error_to_errno(rsp.op_errno), &prestat,
> &poststat,
>  687                         xdata);
>  688   
>  689     if (xdata)
>  690         dict_unref(xdata);   
> 
> 
> It seems the return valume is coverted.
> 
> Thanks,
> BRs

Comment 20 Xiubo Li 2019-08-01 01:04:52 UTC
Checking errno directly when ret == -1 works for me now.

But I can get both -EAGAIN and -EBUSY, while only -EBUSY is expected.

So the question is: why is there always an -EAGAIN every time before acquiring the lock?

Thanks
BRs

Comment 21 Worker Ant 2019-08-02 14:27:15 UTC
REVIEW: https://review.gluster.org/23088 (locks/fencing: Address hang while lock preemption) merged (#4) on master by Amar Tumballi

Comment 22 Xiubo Li 2019-08-04 13:03:30 UTC
@Susant,

Since the fencing patch went into release 6, this follow-up fix should be backported as well, right?

Thanks.
BRs

Comment 23 Susant Kumar Palai 2019-08-12 09:29:58 UTC
(In reply to Xiubo Li from comment #22)
> @Susant,
> 
> Since the Fencing patch has been into the release 6, so this fixing followed
> should be backported, right ?
> 
> Thanks.
> BRs

Will backport to release 6 and 7.

