Bug 949879

Summary: lsmcli takes too much time when querying VOLUMES (LUNs) of NetApp ONTAP.
Product: Red Hat Enterprise Linux 7
Reporter: Gris Ge <fge>
Component: libstoragemgmt
Assignee: Tony Asleson <tasleson>
Status: CLOSED CURRENTRELEASE
QA Contact: Bruno Goncalves <bgoncalv>
Severity: medium
Priority: medium
Version: 7.0
CC: bgoncalv
Target Milestone: rc
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libstoragemgmt-0.0.19-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-13 10:32:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 754967

Description Gris Ge 2013-04-09 08:26:06 UTC
Description of problem:

On NetApp ONTAP filer, we have about 1000 LUNs.
"lsmcli -l VOLUMES" take 5 minutes to return.

For comparison, my own tool only takes 30 seconds:
'libsan_utils -c query_all_lun_info -t netapp -h st05'

It just executes the 'igroup show -v' and 'lun show -v' commands via an ssh session.


Version-Release number of selected component (if applicable):
libstoragemgmt-0.0.18-1.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. time lsmcli -l VOLUMES
  
Actual results:
lsmcli takes 5 minutes to return the results.

Expected results:
lsmcli takes at most 1 minute to return the results.

Additional info:
I haven't looked into the lsm code yet, but taking 5 minutes to query LUNs is not acceptable.

Comment 1 Tony Asleson 2013-04-09 17:07:39 UTC
Are you using the ontap plug-in or the smis plug-in with the NetApp filer?  

I just ran a test on the array in the Westford lab with 493 luns and it took 9.8 seconds.  I'm going to create another 500 to see if I can replicate what you are seeing.


Please provide your LSMCLI_URI.

Comment 2 Tony Asleson 2013-04-09 19:33:45 UTC
Repeated with 1002 luns, worst case ~15 seconds.

Without SSL

$ time ./lsmcli -l VOLUMES -t, | wc
   1002    1002  160191

real	0m10.922s
user	0m0.285s
sys	0m0.028s

With SSL

$ time ./lsmcli -l VOLUMES -t, | wc
   1002    1002  160191

real	0m15.150s
user	0m0.301s
sys	0m0.023s

Comment 3 Gris Ge 2013-04-14 00:59:22 UTC
[fge@Gris-Laptop ~]$ time lsmcli -l VOLUMES -u ontap://root@na -P | wc -l
Password:
691
real    2m49.333s
user    0m0.120s
sys     0m0.030s

I will mail you the IP address and login info so you can try it there.

Comment 4 Tony Asleson 2013-04-16 00:00:38 UTC
This issue is showing up because the existing plug-in code does the following:

1. Get the list of all aggregates
2. For each aggregate, retrieve the list of all volumes
3. For each volume, retrieve all the LUNs for that specific volume

Performance suffers on this system because the number of volumes is large, which causes a lot of RPC calls to retrieve all the LUN information.

I have a patch that does the following instead:

1. Get the list of all aggregates
2. For each aggregate, retrieve the list of all volumes, storing a mapping of which volume is located on which aggregate
3. Retrieve all LUNs for the entire array, then parse the volume path and use the mapping to quickly tag each LUN with the uuid of its aggregate

Note: The uuid of the aggregate is used as the POOL ID for the Volume, which is why steps 1 & 2 are required.

With this approach I am seeing retrieval times of ~8 seconds instead of 1m57s.
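The batched approach above can be sketched as follows. This is a minimal illustration of the call pattern only, not the actual patch: the Filer class and its method names are stand-ins (assumptions) for the real ONTAP API wrapper used by the plug-in.

```python
class Filer:
    """Toy stand-in for the ONTAP API wrapper; each method is one 'RPC'."""

    def __init__(self, aggregates, volumes, luns):
        self._aggregates = aggregates   # [{'name': ..., 'uuid': ...}]
        self._volumes = volumes         # {aggregate_name: [volume_name, ...]}
        self._luns = luns               # [{'path': '/vol/<volume>/<lun>'}]

    def aggregates(self):
        return self._aggregates

    def volumes(self, aggregate):
        return self._volumes[aggregate]

    def luns(self):
        # One array-wide call, instead of one call per volume.
        return self._luns


def list_luns_batched(filer):
    # Steps 1 & 2: one aggregates call, one volumes call per aggregate;
    # build a mapping of volume name -> aggregate uuid.
    vol_to_aggr = {}
    for aggr in filer.aggregates():
        for vol_name in filer.volumes(aggregate=aggr['name']):
            vol_to_aggr[vol_name] = aggr['uuid']

    # Step 3: one call for every LUN on the array; parse the volume out
    # of the LUN path ('/vol/<volume>/<lun>') and tag the LUN with the
    # uuid of its aggregate, which serves as the pool id.
    result = []
    for lun in filer.luns():
        vol_name = lun['path'].split('/')[2]
        result.append(dict(lun, pool_id=vol_to_aggr[vol_name]))
    return result


filer = Filer(
    aggregates=[{'name': 'aggr0', 'uuid': 'u-0'}],
    volumes={'aggr0': ['vol1', 'vol2']},
    luns=[{'path': '/vol/vol1/lun0'}, {'path': '/vol/vol2/lun0'}],
)
print(list_luns_batched(filer))
```

With V volumes, the RPC count drops from O(V) LUN queries to a single array-wide LUN query plus the per-aggregate volume listing, which is why the retrieval time no longer grows with the volume count.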

Good bug, thanks!

Comment 5 Bruno Goncalves 2013-08-02 10:37:57 UTC
Verified on libstoragemgmt-0.0.21-1.el7.

time lsmcli -u ontap://root@netapp -l VOLUMES -P | wc -l
Password: 
1197

real	0m2.190s
user	0m0.151s
sys	0m0.027s

Comment 6 Ludek Smid 2014-06-13 10:32:02 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.