Bug 1464421

Summary: tcmu-runner: protect glfs objects cache from race
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Prasanna Kumar Kalever <prasanna.kalever>
Component: tcmu-runner
Assignee: Prasanna Kumar Kalever <prasanna.kalever>
Status: CLOSED ERRATA
QA Contact: Sweta Anandpara <sanandpa>
Severity: unspecified
Priority: unspecified
Version: rhgs-3.3
CC: amukherj, prasanna.kalever, rhs-bugs, sanandpa, storage-qa-internal
Target Release: RHGS 3.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: 3.3.0-devel-freeze-exception
Fixed In Version: tcmu-runner-1.2.0-7.el7rhgs
Last Closed: 2017-09-21 04:20:54 UTC
Type: Bug
Bug Blocks: 1417151    

Description Prasanna Kumar Kalever 2017-06-23 11:56:34 UTC
Description of problem:
The cache implementation in tcmu-runner's glfs handler does not protect the cache with locks.

Without locking, concurrent accesses can race on the cache, causing unnecessary cache contention.

Comment 2 Prasanna Kumar Kalever 2017-06-23 11:57:31 UTC
Patch already in master:
https://github.com/open-iscsi/tcmu-runner/pull/171
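
For illustration, the general pattern such a fix follows is to serialize cache lookups behind a mutex. Below is a minimal hypothetical sketch of that pattern, not the actual code from the pull request; the entry layout, the cache size, and names such as cached_glfs_get and glfs_cache_lock are assumptions:

    /* Hypothetical sketch of a mutex-protected glfs object cache.
     * Names and layout are illustrative, not the code from PR 171. */
    #include <pthread.h>
    #include <string.h>
    #include <glusterfs/api/glfs.h>

    #define GLFS_CACHE_SIZE 5

    struct glfs_cache_entry {
            char volume[256];   /* key: gluster volume name */
            glfs_t *fs;         /* cached glfs object, NULL if slot empty */
    };

    static struct glfs_cache_entry glfs_cache[GLFS_CACHE_SIZE];
    static pthread_mutex_t glfs_cache_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Return the cached glfs object for a volume, or NULL on a miss.
     * The mutex prevents a concurrent insert or eviction from racing
     * with the lookup. */
    static glfs_t *cached_glfs_get(const char *volume)
    {
            glfs_t *fs = NULL;
            int i;

            pthread_mutex_lock(&glfs_cache_lock);
            for (i = 0; i < GLFS_CACHE_SIZE; i++) {
                    if (glfs_cache[i].fs &&
                        strcmp(glfs_cache[i].volume, volume) == 0) {
                            fs = glfs_cache[i].fs;
                            break;
                    }
            }
            pthread_mutex_unlock(&glfs_cache_lock);
            return fs;
    }

A single lock around the whole lookup keeps the critical section short and the pattern easy to audit; the actual patch may structure this differently.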

Comment 7 Sweta Anandpara 2017-07-17 07:41:22 UTC
Prasanna, please refer to comment 6.

Comment 9 Sweta Anandpara 2017-07-24 10:20:34 UTC
And how do I make sure that 'caching is working as expected'?

Should I execute multiple creates/deletes of blocks in the same volume? Should I try multiple block creates and deletes across different volumes? Should I scale it up to a certain level? Should I execute multiple block creates/deletes from different peer nodes at the same time?

I am unsure what the cache of tcmu-runner's glfs handler does.

Comment 12 Sweta Anandpara 2017-08-01 10:06:16 UTC
Discussed it with Pranithk. Apparently, 'gluster-block create' involves tcmu-runner and a few other components that are known not to show consistent timing. Hence, to validate that the glfs cache is working as expected, it is better to time 'gluster-block list' instead of 'gluster-block create'. The logs pasted below confirm that every second call of 'gluster-block list <volname>' takes much less time than the first call. After the command has been run on the (n+1)th volume, the very first volume takes longer again, as its entry would have been pushed out of the cache.

Moving this bug to verified in 3.3.0. Tested on glusterfs-3.8.4-35 and gluster-block-0.2.1-6.

[root@dhcp47-121 ~]# time gluster-block list vol1
bk1
bk2
bk3
bk4

real	0m4.083s
user	0m0.005s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol1
bk1
bk2
bk3
bk4

real	0m0.038s
user	0m0.002s
sys	0m0.011s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m4.907s
user	0m0.000s
sys	0m0.011s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m0.251s
user	0m0.003s
sys	0m0.002s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m3.096s
user	0m0.002s
sys	0m0.006s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m0.027s
user	0m0.001s
sys	0m0.008s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m4.106s
user	0m0.004s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m0.032s
user	0m0.004s
sys	0m0.003s
[root@dhcp47-121 ~]# time gluster-block list vol5
bk1
bk2

real	0m4.102s
user	0m0.002s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol5
bk1
bk2

real	0m0.030s
user	0m0.004s
sys	0m0.003s
[root@dhcp47-121 ~]# time gluster-block list vol6
bk1

real	0m4.891s
user	0m0.002s
sys	0m0.006s
[root@dhcp47-121 ~]# time gluster-block list vol1
bk1
bk2
bk3
bk4

real	0m6.488s
user	0m0.001s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m4.209s
user	0m0.001s
sys	0m0.014s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m4.870s
user	0m0.003s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m0.035s
user	0m0.002s
sys	0m0.008s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m0.034s
user	0m0.003s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m0.034s
user	0m0.003s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m4.956s
user	0m0.002s
sys	0m0.007s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m0.037s
user	0m0.002s
sys	0m0.012s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m0.319s
user	0m0.001s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m0.477s
user	0m0.003s
sys	0m0.002s
[root@dhcp47-121 ~]# time gluster-block list vol5
bk1
bk2

real	0m3.756s
user	0m0.000s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m0.029s
user	0m0.003s
sys	0m0.007s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m0.033s
user	0m0.003s
sys	0m0.002s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m0.019s
user	0m0.002s
sys	0m0.002s
[root@dhcp47-121 ~]# time gluster-block list vol5
bk1
bk2

real	0m0.031s
user	0m0.004s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol6
bk1

real	0m4.865s
user	0m0.003s
sys	0m0.002s
[root@dhcp47-121 ~]# time gluster-block list vol6
bk1

real	0m0.032s
user	0m0.001s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m0.038s
user	0m0.005s
sys	0m0.007s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m0.032s
user	0m0.001s
sys	0m0.006s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m0.031s
user	0m0.004s
sys	0m0.003s
[root@dhcp47-121 ~]# time gluster-block list vol5
bk1
bk2

real	0m0.032s
user	0m0.002s
sys	0m0.006s
[root@dhcp47-121 ~]# time gluster-block list vol6
bk1

real	0m0.032s
user	0m0.004s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol1
bk1
bk2
bk3
bk4

real	0m4.139s
user	0m0.005s
sys	0m0.003s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m4.992s
user	0m0.002s
sys	0m0.006s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m0.033s
user	0m0.001s
sys	0m0.007s
[root@dhcp47-121 ~]# time gluster-block list vol5
bk1
bk2

real	0m0.023s
user	0m0.003s
sys	0m0.002s
[root@dhcp47-121 ~]# time gluster-block list vol6
bk1

real	0m0.025s
user	0m0.003s
sys	0m0.003s
[root@dhcp47-121 ~]# time gluster-block list vol1
bk1
bk2
bk3
bk4

real	0m0.167s
user	0m0.002s
sys	0m0.003s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m0.025s
user	0m0.003s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m5.085s
user	0m0.004s
sys	0m0.006s
[root@dhcp47-121 ~]#
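
The eviction behaviour seen in these timings (a volume's first list becomes slow again after several other volumes have been listed) is consistent with a small fixed-capacity cache. Continuing the hypothetical sketch from comment 2, reusing its glfs_cache, glfs_cache_entry, glfs_cache_lock, and GLFS_CACHE_SIZE definitions; again an assumed illustration, not the shipped code:

    /* Hypothetical insert path: once the fixed-size cache is full,
     * the oldest entry is evicted, so listing an (n+1)th volume
     * pushes the first volume out, matching the timings above.
     * Taking the same mutex as the lookup path is what the fix for
     * this bug is about: without it, two threads could insert or
     * evict concurrently and corrupt the cache. */
    static unsigned int glfs_cache_cursor;  /* simple FIFO eviction cursor */

    static void cached_glfs_put(const char *volume, glfs_t *fs)
    {
            struct glfs_cache_entry *e;

            pthread_mutex_lock(&glfs_cache_lock);
            e = &glfs_cache[glfs_cache_cursor % GLFS_CACHE_SIZE];
            if (e->fs)
                    glfs_fini(e->fs);       /* release the evicted object */
            strncpy(e->volume, volume, sizeof(e->volume) - 1);
            e->volume[sizeof(e->volume) - 1] = '\0';
            e->fs = fs;
            glfs_cache_cursor++;
            pthread_mutex_unlock(&glfs_cache_lock);
    }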

Comment 15 errata-xmlrpc 2017-09-21 04:20:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2773