Bug 1740494 - Fencing: Added the tcmu-runner ALUA feature support, but after one of the nodes is rebooted glfs_file_lock() gets stuck
Keywords:
Status: POST
Alias: None
Product: GlusterFS
Classification: Community
Component: locks
Version: 6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Susant Kumar Palai
QA Contact:
URL:
Whiteboard:
Depends On: 1717824
Blocks:
 
Reported: 2019-08-13 05:59 UTC by Susant Kumar Palai
Modified: 2019-08-31 18:03 UTC
CC: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1717824
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments


Links
System ID Priority Status Summary Last Updated
Gluster.org Gerrit 23217 None Open locks/fencing: Address hang while lock preemption 2019-08-13 06:14:07 UTC

Description Susant Kumar Palai 2019-08-13 05:59:47 UTC
+++ This bug was initially created as a clone of Bug #1717824 +++

Description of problem:

In GlusterFS we have added support for the fencing feature. With this we can now support the ALUA feature in LIO/TCMU.

The fencing doc: https://review.gluster.org/#/c/glusterfs-specs/+/21925/6/accepted/fencing.md

The fencing test example: https://review.gluster.org/#/c/glusterfs/+/21496/12/tests/basic/fencing/fence-basic.c

The LIO/tcmu-runner PR of supporting the ALUA is : https://github.com/open-iscsi/tcmu-runner/pull/554.

But currently, when testing this with the above tcmu-runner PR by shutting down one of the HA nodes and starting it again after 2~3 minutes, we can see on all the HA nodes that glfs_file_lock() gets stuck. The following is from /var/log/tcmu-runner.log:

====
2019-06-06 13:50:15.755 1316 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block3: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:50:15.757 1316 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block3: lock call done. lock state 1
2019-06-06 13:50:55.845 1316 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block4: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:50:55.847 1316 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block4: lock call done. lock state 1
2019-06-06 13:57:50.102 1315 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block3: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:57:50.103 1315 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block3: lock call done. lock state 1
2019-06-06 13:57:50.121 1315 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block4: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:57:50.132 1315 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block4: lock call done. lock state 1
2019-06-06 14:09:03.654 1328 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block3: lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 14:09:03.662 1328 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block3: lock call done. lock state 1
2019-06-06 14:09:06.700 1328 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block4: lock call state 2 retries 0. tag 65535 reopen 0
====

The lock operation never returns.

I am using the following glusterfs packages, built by myself:

# rpm -qa|grep glusterfs
glusterfs-extra-xlators-7dev-0.0.el7.x86_64
glusterfs-api-devel-7dev-0.0.el7.x86_64
glusterfs-7dev-0.0.el7.x86_64
glusterfs-server-7dev-0.0.el7.x86_64
glusterfs-cloudsync-plugins-7dev-0.0.el7.x86_64
glusterfs-resource-agents-7dev-0.0.el7.noarch
glusterfs-api-7dev-0.0.el7.x86_64
glusterfs-devel-7dev-0.0.el7.x86_64
glusterfs-regression-tests-7dev-0.0.el7.x86_64
glusterfs-gnfs-7dev-0.0.el7.x86_64
glusterfs-client-xlators-7dev-0.0.el7.x86_64
glusterfs-geo-replication-7dev-0.0.el7.x86_64
glusterfs-debuginfo-7dev-0.0.el7.x86_64
glusterfs-fuse-7dev-0.0.el7.x86_64
glusterfs-events-7dev-0.0.el7.x86_64
glusterfs-libs-7dev-0.0.el7.x86_64
glusterfs-cli-7dev-0.0.el7.x86_64
glusterfs-rdma-7dev-0.0.el7.x86_64
 

How reproducible:
30%.

Steps to Reproduce:
1. Create one replicated volume (HA >= 2) with mandatory locking enabled.
2. Create one gluster-blockd target.
3. Log in and run fio on the client node.
4. Shut down one of the HA nodes, wait 2~3 minutes, and then start it again.

Actual results:
The fio workload never recovers and the read/write bandwidth stays at 0 KB/s, and we can see tons of logs in the /var/log/tcmu-runner.log file:

2019-06-06 15:01:06.641 1328 [DEBUG] alua_implicit_transition:561 glfs/block4: Lock acquisition operation is already in process.
2019-06-06 15:01:06.648 1328 [DEBUG_SCSI_CMD] tcmu_cdb_print_info:353 glfs/block4: 28 0 0 3 1f 80 0 0 8 0 
2019-06-06 15:01:06.648 1328 [DEBUG] alua_implicit_transition:561 glfs/block4: Lock acquisition operation is already in process.
2019-06-06 15:01:06.655 1328 [DEBUG_SCSI_CMD] tcmu_cdb_print_info:353 glfs/block4: 28 0 0 3 1f 80 0 0 8 0 
2019-06-06 15:01:06.655 1328 [DEBUG] alua_implicit_transition:561 glfs/block4: Lock acquisition operation is already in process.
2019-06-06 15:01:06.661 1328 [DEBUG_SCSI_CMD] tcmu_cdb_print_info:353 glfs/block4: 28 0 0 3 1f 80 0 0 8 0 
2019-06-06 15:01:06.662 1328 [DEBUG] alua_implicit_transition:561 glfs/block4: Lock acquisition operation is already in process.


Expected results:
Once the shut-down node comes back up, the fio workload should recover.

--- Additional comment from Xiubo Li on 2019-06-06 14:39:50 MVT ---



--- Additional comment from Xiubo Li on 2019-06-06 14:40:16 MVT ---



--- Additional comment from Xiubo Li on 2019-06-06 14:42:10 MVT ---

The bt output from gdb:

[root@rhel1 ~]# gdb -p 1325

(gdb) bt
#0  0x00007fc7761baf47 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007fc7773de468 in event_dispatch_epoll (event_pool=0x559f03d4b560) at event-epoll.c:847
#2  0x0000559f02419658 in main (argc=21, argv=0x7fff9c6722c8) at glusterfsd.c:2871
(gdb)


[root@rhel3 ~]# gdb -p 7669

(gdb) bt
#0  0x00007fac80bd9f47 in pthread_join () from /usr/lib64/libpthread.so.0
#1  0x00007fac81dfd468 in event_dispatch_epoll (event_pool=0x55de6f845560) at event-epoll.c:847
#2  0x000055de6f143658 in main (argc=21, argv=0x7ffcafc3eff8) at glusterfsd.c:2871
(gdb)

The pl_inode->fop_wind_count is:

(gdb) thread 2
[Switching to thread 2 (Thread 0x7fc742184700 (LWP 1829))]
#0  0x00007fc7761bd965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) frame 2
#2  0x00007fc76379c13b in pl_lk (frame=frame@entry=0x7fc750001128, this=this@entry=0x7fc75c0128f0, 
    fd=fd@entry=0x7fc73c0977d8, cmd=cmd@entry=6, flock=flock@entry=0x7fc73c076938, 
    xdata=xdata@entry=0x7fc73c071828) at posix.c:2637
2637	                ret = pl_lock_preempt(pl_inode, reqlock);
(gdb) p pl_inode->fop_wind_count
$1 = -30
(gdb)


For the pstack logs, please see the attachments.


Thanks.

BRs

--- Additional comment from Susant Kumar Palai on 2019-06-10 12:10:33 MVT ---

Just a small update: there are cases where fop_wind_count can go negative. A basic fix would be to never decrement it when it is already zero. I will update more on this later, as I am busy with a few other issues at the moment.


Susant
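The guarded decrement Susant describes can be sketched in C as follows. This is a hypothetical illustration with names modeled on the locks translator's pl_inode, not the actual patch from Gerrit 23088: if an unwind races with preemption-driven draining, the counter must never go below zero, otherwise a waiter for fop_wind_count == 0 can never be woken.

```c
#include <pthread.h>

/* Illustrative stand-in for the locks translator's pl_inode; only the
 * fields needed for the counter logic are shown. */
struct pl_inode {
    pthread_mutex_t mutex;
    pthread_cond_t check_fop_wind_count; /* signalled when drained */
    int fop_wind_count;                  /* in-flight (wound) fops */
};

/* Called when a fop unwinds. The guard keeps the counter from going
 * negative when an unwind races with preemption-driven draining. */
static void
fop_unwind(struct pl_inode *pl_inode)
{
    pthread_mutex_lock(&pl_inode->mutex);
    if (pl_inode->fop_wind_count > 0)
        pl_inode->fop_wind_count--;
    if (pl_inode->fop_wind_count == 0)
        pthread_cond_broadcast(&pl_inode->check_fop_wind_count);
    pthread_mutex_unlock(&pl_inode->mutex);
}
```

With this guard, a duplicate or racing unwind leaves the counter at zero instead of driving it to values like the -30 observed in the gdb session above.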

--- Additional comment from Xiubo Li on 2019-07-17 13:15:51 MVT ---

Hi Susant,

Is there any new update about this ?

Thanks.

--- Additional comment from Susant Kumar Palai on 2019-07-17 13:20:56 MVT ---

(In reply to Xiubo Li from comment #5)
> Hi Susant,
> 
> Is there any new update about this ?
> 
> Thanks.

Hey Xiubo, most likely will be sending a patch by end of day today.

--- Additional comment from Xiubo Li on 2019-07-17 13:22:28 MVT ---

(In reply to Susant Kumar Palai from comment #6)
> (In reply to Xiubo Li from comment #5)
> > Hi Susant,
> > 
> > Is there any new update about this ?
> > 
> > Thanks.
> 
> Hey Xiubo, most likely will be sending a patch by end of day today.

Sure and take your time Susant please :-)
Thanks very much.
BRs

--- Additional comment from Susant Kumar Palai on 2019-07-17 14:13:59 MVT ---

Moved to POST by mistake. Resetting.

--- Additional comment from Worker Ant on 2019-07-22 14:59:10 MVT ---

REVIEW: https://review.gluster.org/23088 (locks/fencing: Address while lock preemption) posted (#1) for review on master by Susant Palai

--- Additional comment from Susant Kumar Palai on 2019-07-22 15:02:07 MVT ---

Xiubo, if you could test it out, it would be great. (Make sure you enable fencing before you create any client)

--- Additional comment from Xiubo Li on 2019-07-22 15:42:38 MVT ---

(In reply to Susant Kumar Palai from comment #10)
> Xiubo, if you could test it out, it would be great. (Make sure you enable
> fencing before you create any client)

@Susant,

Yeah, thanks very much for your work on this. And I will test it late today or tomorrow.

BRs
Xiubo

--- Additional comment from Xiubo Li on 2019-07-23 07:44:00 MVT ---

@Susant,

Two issues were found:


1. In my testing, glfs_file_lock() sometimes takes around 43 seconds. Is that normal? And why?

 90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0: lxb--------------glfs_file_lock start
 ...
 319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0: lxb--------------glfs_file_lock end

2. After the lock is broken, all the fio callbacks return -1 with EPERM, not the EBUSY we discussed before. Has the return value changed? I check only for EBUSY and ENOTCONN, and only then is the lock state on the local tcmu node changed. Otherwise the local state in the tcmu-runner service will stay LOCKED even though it has actually lost the lock and should be UNLOCKED, so all the I/Os will fail.



Thanks,
BRs

--- Additional comment from Susant Kumar Palai on 2019-07-23 10:45:01 MVT ---

(In reply to Xiubo Li from comment #12)
> @Susant,
> 
> There 2 issues are found:
> 
> 
> 1, From my test the glfs_file_lock() sometimes it will takes around 43
> seconds, is it normal ? And why ?
> 
>  90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> lxb--------------glfs_file_lock start
>  ...
>  319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> lxb--------------glfs_file_lock end

I wonder if it is related to draining of fops. Let me do some testing around this.



> 
> 2, After the lock is broke and all the FIOs callback will return -1, the
> -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> that the lock state in local tcmu node will be changed. Or the local state
> in tcmu-runner service will always in LOCKED state, but it actually already
> lost the lock and should be in UNLOCKED state, so all the IOs will fail.

This is interesting. Will get back after some code checking.

> 
> 
> 
> Thanks,
> BRs

--- Additional comment from Xiubo Li on 2019-07-23 10:47:15 MVT ---

(In reply to Susant Kumar Palai from comment #13)
> (In reply to Xiubo Li from comment #12)
> > @Susant,
> > 
> > There 2 issues are found:
> > 
> > 
> > 1, From my test the glfs_file_lock() sometimes it will takes around 43
> > seconds, is it normal ? And why ?
> > 
> >  90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> > lxb--------------glfs_file_lock start
> >  ...
> >  319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> > lxb--------------glfs_file_lock end
> 
> I wonder if it is related to draining of fops. Let me do some testing around
> this.
> 

Sure.

> 
> 
> > 
> > 2, After the lock is broke and all the FIOs callback will return -1, the
> > -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> > return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> > that the lock state in local tcmu node will be changed. Or the local state
> > in tcmu-runner service will always in LOCKED state, but it actually already
> > lost the lock and should be in UNLOCKED state, so all the IOs will fail.
> 
> This is interesting. Will get back after some code checking.
> 

Please take your time @Susant.

Thanks,
BRs

--- Additional comment from Susant Kumar Palai on 2019-07-23 12:44:22 MVT ---

(In reply to Xiubo Li from comment #12)
> @Susant,
> 
> There 2 issues are found:
> 
> 
> 1, From my test the glfs_file_lock() sometimes it will takes around 43
> seconds, is it normal ? And why ?
> 
>  90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> lxb--------------glfs_file_lock start
>  ...
>  319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> lxb--------------glfs_file_lock end

Checked the time taken for file_lock and it completes immediately for me.

    ret = glfs_fsetxattr(fd1, GF_ENFORCE_MANDATORY_LOCK, "set", 8, 0);
    if (ret < 0) {
        LOG_ERR("glfs_fsetxattr", errno);
        ret = -1;
        goto out;
    }

    time(&before);
    /* take a write mandatory lock */
    ret = glfs_file_lock(fd1, F_SETLKW, &lock, GLFS_LK_MANDATORY);
    if (ret) {
        LOG_ERR("glfs_file_lock", errno);
        goto out;
    }
    time(&after);
    diff = (unsigned long )after - before;
    fprintf(fp, "time %lu %lu %lu\n", diff, before, after);

time 0 1563867824 1563867824

Can you attach the brick log here when you run the test next time?

> 
> 2, After the lock is broke and all the FIOs callback will return -1, the
> -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> that the lock state in local tcmu node will be changed. Or the local state
> in tcmu-runner service will always in LOCKED state, but it actually already
> lost the lock and should be in UNLOCKED state, so all the IOs will fail.

Please attach the brick log after enabling trace logging. <gluster v set <vol> brick-log-level TRACE>
> 
> 
> 
> Thanks,
> BRs

--- Additional comment from Xiubo Li on 2019-07-23 12:48:58 MVT ---

(In reply to Susant Kumar Palai from comment #15)
> (In reply to Xiubo Li from comment #12)
> > @Susant,
> > 
> > There 2 issues are found:
> > 
> > 
> > 1, From my test the glfs_file_lock() sometimes it will takes around 43
> > seconds, is it normal ? And why ?
> > 
> >  90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> > lxb--------------glfs_file_lock start
> >  ...
> >  319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> > lxb--------------glfs_file_lock end
> 
> Checked the time taken for file_lock and it completes immediately for me.
> 
>     ret = glfs_fsetxattr(fd1, GF_ENFORCE_MANDATORY_LOCK, "set", 8, 0);
>     if (ret < 0) {
>         LOG_ERR("glfs_fsetxattr", errno);
>         ret = -1;
>         goto out;
>     }
> 
>     time(&before);
>     /* take a write mandatory lock */
>     ret = glfs_file_lock(fd1, F_SETLKW, &lock, GLFS_LK_MANDATORY);
>     if (ret) {
>         LOG_ERR("glfs_file_lock", errno);
>         goto out;
>     }
>     time(&after);
>     diff = (unsigned long )after - before;
>     fprintf(fp, "time %lu %lu %lu\n", diff, before, after);
> 
> time 0 1563867824 1563867824
> 
> Can you attach the brick log here when you run the test next time?
> 
> > 
> > 2, After the lock is broke and all the FIOs callback will return -1, the
> > -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> > return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> > that the lock state in local tcmu node will be changed. Or the local state
> > in tcmu-runner service will always in LOCKED state, but it actually already
> > lost the lock and should be in UNLOCKED state, so all the IOs will fail.
> 
> Please attach the brick log after enabling trace logging. <gluster v set
> <vol> brick-log-level TRACE>
> > 
> > 
> > 

Sure, I will do that after finishing the work at hand, possibly late today or tomorrow morning.
Thanks
BRs

--- Additional comment from Susant Kumar Palai on 2019-07-25 15:05:47 MVT ---

On the permission denied: 

I did not see any error related to EPERM but saw EBUSY in the brick logs.


[2019-07-24 08:15:22.236283] E [MSGID: 101191] [event-epoll.c:765:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2019-07-24 08:15:46.083306] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 29: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.088292] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 31: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.119463] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 33: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.124067] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 35: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.294554] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 37: READV 0 (7db899f8-bf56-4b89-a4c6-90235e8c720a), client: CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0, error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.298672] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 39: READV 0 (7db899f8-bf56-4b


Is it possible that the lower layer is converting the errnos to EPERM? Can you check gfapi logs and tcmu logs for corresponding error messages and confirm?

--- Additional comment from Xiubo Li on 2019-07-25 15:39:23 MVT ---

(In reply to Susant Kumar Palai from comment #17)
> On the permission denied: 
> 
> I did not see any error related to EPERM but saw EBUSY in the brick logs.
> 
> 
> [2019-07-24 08:15:22.236283] E [MSGID: 101191]
> [event-epoll.c:765:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
> handler
> [2019-07-24 08:15:46.083306] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 29: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.088292] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 31: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.119463] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 33: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.124067] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 35: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.294554] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 37: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.298672] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 39: READV 0
> (7db899f8-bf56-4b
> 
> 
> Is it possible that the lower layer is converting the errnos to EPERM? Can
> you check gfapi logs and tcmu logs for corresponding error messages and
> confirm?

If so, maybe gfapi is doing this. I will send you the gfapi logs; the EPERM value comes directly from gfapi, and tcmu-runner does nothing to it.

Checked the gfapi log; it is also full of:

[2019-07-24 08:23:41.042339] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042381] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042556] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042574] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042655] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042671] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042709] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042722] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote operation failed [Device or resource busy]
[2019-07-24 08:23:41.042784] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote operation failed [Device or resource busy]

Checked the gfapi source code:

 677 out:
 678     if (rsp.op_ret == -1) {   
 679         gf_msg(this->name, GF_LOG_WARNING, gf_error_to_errno(rsp.op_errno),                                                                               
 680                PC_MSG_REMOTE_OP_FAILED, "remote operation failed");
 681     } else if (rsp.op_ret >= 0) {
 682         if (local->attempt_reopen)
 683             client_attempt_reopen(local->fd, this);
 684     }
 685     CLIENT_STACK_UNWIND(writev, frame, rsp.op_ret,
 686                         gf_error_to_errno(rsp.op_errno), &prestat, &poststat,
 687                         xdata);
 688   
 689     if (xdata)
 690         dict_unref(xdata);   


It seems the return value is converted.

Thanks,
BRs

--- Additional comment from Xiubo Li on 2019-07-25 15:42:32 MVT ---

(In reply to Xiubo Li from comment #18)
> (In reply to Susant Kumar Palai from comment #17)
[...]
> 
> Checked the gfapi source code:
> 
>  677 out:
>  678     if (rsp.op_ret == -1) {   


It seems that it is rsp.op_ret that is returned here to the callback:

static void glfs_async_cbk(glfs_fd_t *fd, ssize_t ret, void *data)

not rsp.op_errno.


>  679         gf_msg(this->name, GF_LOG_WARNING,
> gf_error_to_errno(rsp.op_errno),                                            
> 
>  680                PC_MSG_REMOTE_OP_FAILED, "remote operation failed");
>  681     } else if (rsp.op_ret >= 0) {
>  682         if (local->attempt_reopen)
>  683             client_attempt_reopen(local->fd, this);
>  684     }
>  685     CLIENT_STACK_UNWIND(writev, frame, rsp.op_ret,
>  686                         gf_error_to_errno(rsp.op_errno), &prestat,
> &poststat,
>  687                         xdata);
>  688   
>  689     if (xdata)
>  690         dict_unref(xdata);   
> 
> 
> It seems the return valume is coverted.
> 
> Thanks,
> BRs

--- Additional comment from Xiubo Li on 2019-08-01 06:04:52 MVT ---

Checking errno directly when ret == -1 works for me now.

But I can get both EAGAIN and EBUSY, while only EBUSY is expected.

So the question is: why is there always an EAGAIN every time before the lock is acquired?

Thanks
BRs
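The callback-side handling Xiubo arrives at can be sketched like this (a minimal sketch with made-up names, not tcmu-runner's actual code): when an async gfapi callback reports ret == -1, inspect errno, and flip the local ALUA lock state to UNLOCKED only for EBUSY (lock preempted) or ENOTCONN (brick connection lost), treating EAGAIN as transient.

```c
#include <errno.h>
#include <sys/types.h>

enum lock_state { LOCK_STATE_LOCKED, LOCK_STATE_UNLOCKED };

/* Decide the new local lock state from an async I/O result: gfapi
 * callbacks report failure as ret == -1 with the cause in errno. */
static enum lock_state
lock_state_after_io(ssize_t ret, int err, enum lock_state cur)
{
    if (ret != -1)
        return cur;                    /* I/O succeeded: keep state */
    if (err == EBUSY || err == ENOTCONN)
        return LOCK_STATE_UNLOCKED;    /* lock preempted / brick gone */
    return cur;                        /* e.g. EAGAIN: transient, retry */
}
```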

--- Additional comment from Worker Ant on 2019-08-02 19:27:15 MVT ---

REVIEW: https://review.gluster.org/23088 (locks/fencing: Address hang while lock preemption) merged (#4) on master by Amar Tumballi

--- Additional comment from Xiubo Li on 2019-08-04 18:03:30 MVT ---

@Susant,

Since the fencing patch has made it into release 6, this follow-up fix should be backported too, right?

Thanks.
BRs

--- Additional comment from Susant Kumar Palai on 2019-08-12 14:29:58 MVT ---

(In reply to Xiubo Li from comment #22)
> @Susant,
> 
> Since the Fencing patch has been into the release 6, so this fixing followed
> should be backported, right ?
> 
> Thanks.
> BRs

Will backport to release 6 and 7.

Comment 1 Worker Ant 2019-08-13 06:14:08 UTC
REVIEW: https://review.gluster.org/23217 (locks/fencing: Address hang while lock preemption) posted (#2) for review on release-6 by Susant Palai

