Bug 1217401
| Field | Value |
|---|---|
| Summary: | [SCALE] Adding direct lun to a VM times out |
| Product: | Red Hat Enterprise Virtualization Manager |
| Component: | vdsm |
| Version: | 3.5.1 |
| Status: | CLOSED ERRATA |
| Severity: | urgent |
| Priority: | unspecified |
| Reporter: | akotov |
| Assignee: | Fred Rolland <frolland> |
| QA Contact: | mlehrer |
| Docs Contact: | |
| CC: | aefrat, ahino, amureini, bazulay, bhaubeck, dmoessne, eedri, frolland, jbuchta, lpeer, lsurette, mlehrer, nashok, obockows, rgroten, rhodain, rmcswain, tnisan, yeylon, ykaul, ylavi, zkabelac |
| Target Milestone: | ovirt-3.6.0-rc3 |
| Target Release: | 3.6.0 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Whiteboard: | |
| Fixed In Version: | 3.6.0-16 |
| Doc Type: | Bug Fix |
| Story Points: | --- |
| Clone Of: | |
| Environment: | |
| Last Closed: | 2016-03-09 19:38:42 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | Scale |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |
| Attachments: | |

Doc Text:

Previously, adding a direct LUN to a virtual machine sometimes timed out. This occurred because a physical volume (PV) create test is performed for each device when calling getDeviceList. Since a PV create test requires significant resources, it affects the response time of GetDeviceList in scale setups, sometimes causing timeouts on the Red Hat Enterprise Virtualization Manager. This has been fixed and the PV create test can now be skipped using a flag. If the PV test is needed to know the usage state of the device, it can be run on specific devices, therefore minimizing the impact and decreasing the user waiting time on the following operations:

1. UI - Add direct LUN disk
2. UI/REST - Add ISCSI/FC storage domain
3. UI/REST - Edit ISCSI/FC storage domain
4. UI/REST - Extend ISCSI/FC storage domain
5. REST - Add direct LUN disk (if <host> parameter is provided)
6. UI/REST - Import an iSCSI/FCP domain
Ala, as this week's QA contact, can you take a look please?

Alexandar, I am not able to reproduce the issue as described.

Please note that the lvm call you referred to as the root cause is *not* related to adding a LUN - 'lvm pvcreate' is invoked when creating a storage domain, not when adding LUNs.

When adding LUNs, I'd look for 'getDeviceList' calls.

Can you isolate the operation and make sure you are not creating storage domains while adding the LUN?

(In reply to Ala Hino from comment #3)
> Alexandar,
>
> I am not able to reproduce the issue as described.
>
> Please note that the lvm call you referred to as the root cause is *not*
> related to adding a LUN - 'lvm pvcreate' is invoked when creating a storage
> domain, not when adding LUNs.
>
> When adding LUNs, I'd look for 'getDeviceList' calls.
>
> Can you isolate the operation and make sure you are not creating storage
> domains while adding the LUN?

Hi,

are you sure? getDeviceList is called any time you need to list the devices available on the hypervisor. If you want to create a direct LUN, you have to call this so you know what LUNs are available.

lvm.testPVCreate is part of this call, as it is part of HSM._getDeviceList, which is called by HSM.getDeviceList.

Click on tab "Disks"
Click on "Add"
Click on "External (Direct Lun)"

The engine tries to retrieve the available LUNs and triggers:

Thread-72158::DEBUG::2015-05-21 14:13:44,227::BindingXMLRPC::318::vds::(wrapper) client [10.34.57.135]
Thread-72158::DEBUG::2015-05-21 14:13:44,227::task::595::Storage.TaskManager.Task::(_updateState) Task=`bc8d4989-cb21-4529-9385-b97c449ee6dc`::moving from state init -> state preparing
Thread-72158::INFO::2015-05-21 14:13:44,228::logUtils::44::dispatcher::(wrapper) Run and protect: getDeviceList(storageType=3, options={})
...
Thread-72158::DEBUG::2015-05-21 14:13:45,691::lvm::291::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm pvcreate --config ....

See the code snippet mentioned by Alexandar.

Roman

Zdenek,

We found out that testPVCreate takes almost 3 minutes. In this deployment, there are 183 devices: 141 failed in pvcreate and 42 succeeded in pvcreate. I attached the command that is executed. Is this reasonable?

Created attachment 1029508 [details]
Test PV create command
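
For reference, below is a minimal sketch of what such a test-mode PV-create probe amounts to. This is an illustration only, not the actual vdsm lvm.testPVCreate code; the command construction and output parsing here are assumptions.

    # Rough illustration -- not the actual vdsm implementation.
    # A single 'lvm pvcreate --test' is run over all candidate devices and
    # each device is classified as free or used from the result.
    import subprocess

    def test_pv_create(devices):
        """Dry-run pvcreate over all devices and report which look unused."""
        cmd = ["sudo", "-n", "/sbin/lvm", "pvcreate", "--test"] + list(devices)
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                universal_newlines=True)
        out, err = proc.communicate()
        # In test mode LVM prints 'Physical volume "<dev>" successfully
        # created' for devices it could initialize; treat the rest as used.
        text = out + err
        unused = [d for d in devices
                  if 'Physical volume "%s" successfully created' % d in text]
        used = [d for d in devices if d not in unused]
        return unused, used

The cost of this probe grows with the number of devices, which is why getDeviceList slows down so much on hosts that see a large number of LUNs.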
(In reply to Roman Hodain from comment #5)
> (In reply to Ala Hino from comment #3)
> > Alexandar,
> >
> > I am not able to reproduce the issue as described.
> >
> > Please note that the lvm call you referred to as the root cause is *not*
> > related to adding a LUN - 'lvm pvcreate' is invoked when creating a storage
> > domain, not when adding LUNs.
> >
> > When adding LUNs, I'd look for 'getDeviceList' calls.
> >
> > Can you isolate the operation and make sure you are not creating storage
> > domains while adding the LUN?
>
> Hi,
>
> are you sure? getDeviceList is called any time you need to list the devices
> available on the hypervisor. If you want to create a direct LUN, you have to
> call this so you know what LUNs are available.
>
> lvm.testPVCreate is part of this call, as it is part of HSM._getDeviceList,
> which is called by HSM.getDeviceList.
>
> Click on tab "Disks"
> Click on "Add"
> Click on "External (Direct Lun)"
>
> The engine tries to retrieve the available LUNs and triggers:
>
> Thread-72158::DEBUG::2015-05-21 14:13:44,227::BindingXMLRPC::318::vds::(wrapper) client [10.34.57.135]
> Thread-72158::DEBUG::2015-05-21 14:13:44,227::task::595::Storage.TaskManager.Task::(_updateState) Task=`bc8d4989-cb21-4529-9385-b97c449ee6dc`::moving from state init -> state preparing
> Thread-72158::INFO::2015-05-21 14:13:44,228::logUtils::44::dispatcher::(wrapper) Run and protect: getDeviceList(storageType=3, options={})
> ...
> Thread-72158::DEBUG::2015-05-21 14:13:45,691::lvm::291::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm pvcreate --config ....
>
> See the code snippet mentioned by Alexandar.
>
> Roman

Hi,

You are correct, testPVCreate is also invoked when performing getDeviceList. The thing is that in this deployment there are 183 devices and testPVCreate takes almost 3 minutes, hence the timeout seen in the engine.

The code mentioned in the description is required because it is used to mark which devices are used and which are not. This info is needed by the engine to warn the admin when trying to add a LUN that is already in use.

Sent a needinfo request to Zdenek to see what we can do here.

Which version of lvm2 is in use here? Is lvmetad in use?

Also, if you plan to 'pvcreate + vgcreate', it is much more efficient to pass the devices directly to vgcreate (and skip the extra pvcreate step). Would this help efficiency for now?

Alexandar,

Can you please reply to Zdenek's needinfo?

It was reported on the following versions so far:
lvm2-2.02.111-2.el6_6.1.x86_64
lvm2-2.02.111-2.el6_6.2.x86_64
lvmetad is not in use. It is not enabled by default on RHEL or on RHEV-H.
We can test that with lvmetad enabled.
I have run a couple of tests in my environment with iSCSI. There are 50 LUNs. The performance is not very high, as all of the LUNs are on a single disk exported via LIO (RHEL 7.1).
The parameter use_lvmetad=1 cannot simply be set, as we pass the options in the --config section of pvcreate, but I have tweaked the code to test that. Here are the results:
lvmetad enabled:
# time vdsClient -s 0 getDeviceList >/dev/null
real 0m29.070s
user 0m0.187s
sys 0m0.046s
# time vdsClient -s 0 getDeviceList >/dev/null
real 0m28.417s
user 0m0.198s
sys 0m0.042s
# time vdsClient -s 0 getDeviceList >/dev/null
real 0m30.982s
user 0m0.195s
sys 0m0.040s
lvmetad disabled:
# time vdsClient -s 0 getDeviceList >/dev/null
real 2m54.038s
user 0m0.201s
sys 0m0.060s
# time vdsClient -s 0 getDeviceList >/dev/null
real 2m55.566s
user 0m0.193s
sys 0m0.060s
# time vdsClient -s 0 getDeviceList >/dev/null
real 3m1.949s
user 0m0.196s
sys 0m0.062s
So there is a significant improvement, but we cannot use it as it is a significant change in the storage stack.
I have also run the pvcreate command manually without lvmetad, and it takes almost the same time as the vdsClient command. The interesting part is how the performance of the command degrades. I added a timestamp before each line reported by the pvcreate command (a rough sketch of one way to do this appears after the listing). All the PVs are on the command line:
$ grep 2015-06-13 pvcreate.out
2015-06-13 09:51:02 Physical volume "/dev/mapper/360014056628b5d6642d45f0a53c4415d" successfully created
2015-06-13 09:51:03 Physical volume "/dev/mapper/360014051845d840046e498992e9be252" successfully created
2015-06-13 09:51:03 Physical volume "/dev/mapper/36001405c36d824ad312400aa25c07f9b" successfully created
2015-06-13 09:51:04 Physical volume "/dev/mapper/36001405bc71ffe3507f4e9db5fa17e8b" successfully created
2015-06-13 09:51:05 Physical volume "/dev/mapper/36001405b6ac01b3cb1346448a86168b6" successfully created
2015-06-13 09:51:06 Physical volume "/dev/mapper/360014053f20c8b48bc74e9fa94abce3c" successfully created
2015-06-13 09:51:07 Physical volume "/dev/mapper/3600140524b0f150c17043b4a06006cf9" successfully created
2015-06-13 09:51:08 Physical volume "/dev/mapper/3600140547d444218c574c61bf2102920" successfully created
2015-06-13 09:51:09 Physical volume "/dev/mapper/360014058b44394769264c85b5092f717" successfully created
2015-06-13 09:51:10 Physical volume "/dev/mapper/3600140588e25e0d887e4b7eaa5bb0aa3" successfully created
2015-06-13 09:51:12 Physical volume "/dev/mapper/36001405925e187b37024b10a3cd30cfa" successfully created
2015-06-13 09:51:12 Physical volume "/dev/mapper/360014059340348152274828aa152c894" successfully created
2015-06-13 09:51:13 Physical volume "/dev/mapper/360014054959fd6b9d084be18826a025a" successfully created
2015-06-13 09:51:14 Physical volume "/dev/mapper/36001405445bb63ae57d403e973f09cf2" successfully created
2015-06-13 09:51:16 Physical volume "/dev/mapper/36001405946e227c6c424659b5d6c20ea" successfully created
2015-06-13 09:51:17 Physical volume "/dev/mapper/3600140544a62a65ee3f455f9c0d72374" successfully created
2015-06-13 09:51:18 Physical volume "/dev/mapper/360014050e40f9deb69d4d29918708173" successfully created
2015-06-13 09:51:21 Physical volume "/dev/mapper/360014057deb77fec0e04de6a65f4ddd5" successfully created
2015-06-13 09:51:22 Physical volume "/dev/mapper/36001405d3e3ebdab1544383b11391de6" successfully created
2015-06-13 09:51:24 Physical volume "/dev/mapper/36001405f66e726ac84e4d0798b646e25" successfully created
2015-06-13 09:51:27 Physical volume "/dev/mapper/36001405b76a1eef0c5f4c78a481fe501" successfully created
2015-06-13 09:51:32 Physical volume "/dev/mapper/36001405c61c377577d04b29838ec458c" successfully created
2015-06-13 09:51:36 Physical volume "/dev/mapper/36001405c14320db62c242088820cd568" successfully created
2015-06-13 09:51:40 Physical volume "/dev/mapper/360014055af6a25fce3b4c7e8af621253" successfully created
2015-06-13 09:51:44 Physical volume "/dev/mapper/3600140548ccb367d0494b24888334fb6" successfully created
2015-06-13 09:51:48 Physical volume "/dev/mapper/36001405c639f497b715425486a3dfaad" successfully created
2015-06-13 09:51:53 Physical volume "/dev/mapper/36001405cc6ea3749a524478b86d23626" successfully created
2015-06-13 09:51:58 Physical volume "/dev/mapper/3600140539e18caa79ee43069598b4f25" successfully created
2015-06-13 09:52:08 Physical volume "/dev/mapper/36001405839725329b92472fbf5911851" successfully created
2015-06-13 09:52:13 Physical volume "/dev/mapper/360014057c4ae88d66d3455b80babcbc4" successfully created
2015-06-13 09:52:19 Physical volume "/dev/mapper/360014050814d94d749747a1b3dac343f" successfully created
2015-06-13 09:52:47 Physical volume "/dev/mapper/3600140563afa5c20c1740aba1dccc608" successfully created
2015-06-13 09:53:08 Physical volume "/dev/mapper/360014057b5533bb930d4973b8a4664bf" successfully created
2015-06-13 09:53:15 Physical volume "/dev/mapper/360014054eb95ba99c04429f9b666e082" successfully created
2015-06-13 09:53:21 Physical volume "/dev/mapper/360014050f4642d00f904ddf8e79e3ab2" successfully created
2015-06-13 09:53:27 Physical volume "/dev/mapper/3600140569245e0a199344d0ae7464062" successfully created
2015-06-13 09:53:34 Physical volume "/dev/mapper/360014056cd8cd6a1ee44910a859f7391" successfully created
2015-06-13 09:53:40 Physical volume "/dev/mapper/360014055db686a56c51465ca8fa66412" successfully created
2015-06-13 09:53:47 Physical volume "/dev/mapper/360014053de8438cb7de4b6484b4d5db5" successfully created
2015-06-13 09:53:54 Physical volume "/dev/mapper/3600140554a49dcf28aa4d0a96c0ea57a" successfully created
You can see that the first couple of PVs finished much faster than the later PVs. This, I guess, should be explained by the LVM guys. How can we fix this (at the vdsm level or the LVM level)?
BTW, regarding Zdenek's question: we pass all of the LUNs to the command, as we are testing whether they are already in use or not.
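
For reference, one way to produce such a timestamped listing is sketched below. This is only an illustration; Roman's actual tweak may have been done differently (for example with awk or a shell loop).

    # Sketch: prefix each output line of a long-running command with a
    # timestamp, so the per-PV progress of 'pvcreate --test' can be measured.
    import subprocess
    import sys
    from datetime import datetime

    def run_with_timestamps(cmd):
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                universal_newlines=True)
        for line in proc.stdout:
            stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            sys.stdout.write("%s %s" % (stamp, line))
        return proc.wait()

    # Hypothetical invocation; the real device list comes from multipath:
    # run_with_timestamps(["sudo", "-n", "/sbin/lvm", "pvcreate", "--test",
    #                      "/dev/mapper/360014056628b5d66...", "..."])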
Created attachment 1038248 [details]
pvcreate command run with the timestamps
Any change here will be too substantial and/or risky for 3.5.4. Pushing out to 3.5.5 to reconsider.

Fred, this patch modifies getDeviceList's API, and should actually speed up several flows, not just adding a LUN disk. Could you add some doctext detailing where we expect to see a performance gain? Thanks!

Fred, if I'm reading the code correctly, I'd expect the following non-GUI flows to improve too:
- Add an iSCSI/FCP domain via REST (see BackendStorageDomainsResource)
- Import an iSCSI/FCP domain via GUI or REST (see GetUnregisteredBlockStorageDomainsQuery, BackendStorageDomainsResource)
- Adding a LUN disk via REST if a <host> parameter is passed (if it isn't, the LUN isn't validated, so there's no effect - see AddDiskCommand and DiskValidator)

Please confirm (and add to the doctext) or refute (and explain).

Fixing wrong fixed-in version/build_id.

Using:
RHEVM: 3.6.3.1-0.1.el6
VDSM: 3.6.3-2
with a host connected to storage with 183 possible luns.
Attaching the direct LUN to the VM and showing the LUN list were both successful.
Details:
25s total time from clicking "direct lun" to showing the New Virtual Disk window
2 min 4 s total time to attach a direct LUN with 183 LUNs available to the host
# time vdsClient -s 0 getDeviceList >/dev/null
real 1m32.897s
user 0m1.013s
sys 0m0.048s
# time vdsClient -s 0 getDeviceList >/dev/null
real 1m29.455s
user 0m1.043s
sys 0m0.056s
Please let me know if additional info is needed for this test case or if you would like to review the VDSM log from this test.
(In reply to mlehrer from comment #21)

In order to test the change introduced, you should use getDeviceList this way:

    getDeviceList ISCSI False
or
    getDeviceList FCP False

From vdsClient:

    getDeviceList [storageType] [checkStatus] [<devlist>]
        List of all block devices (optionally - matching storageType,
        optionally - check device status of each device listed).
        getDeviceList                          - return all devices
        getDeviceList FCP                      - return only FCP devices
        getDeviceList FCP True                 - return only FCP devices and perform the PV creation test
        getDeviceList ISCSI False guid1 guid2  - return info for guid1 and guid2, assuming ISCSI type, and skip the PV creation test

total luns available: 383
total paths: 1,532
The additional LUNs were added in order to increase the total number of paths, which was previously 732 and is now 1,532.
time vdsClient -s 0 getDeviceList ISCSI False > /dev/null
real 0m28.467s
user 0m1.973s
sys 0m0.047s
time vdsClient -s 0 getDeviceList FCP False > /dev/null
real 0m27.981s
user 0m0.110s
sys 0m0.040s
and just for comparison's sake:
time vdsClient -s 0 getDeviceList > /dev/null
real 9m37.496s
user 0m1.905s
sys 0m0.047s
UI:
39s to display "New Virtual Disk" Dialog for Direct lun Disk
34s to add disk of type direct lun to VM
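
For context, here is a minimal sketch of what the verified change amounts to on the vdsm side, based on the getDeviceList usage shown in the comment above and the HSM code excerpt quoted in the description below. The parameter names and internals here are assumptions, not the exact vdsm implementation.

    # Sketch only -- illustrates the shape of the fix, not the exact vdsm code.
    # getDeviceList gained a checkStatus flag: the expensive test-mode pvcreate
    # is only run when the caller needs the free/used status, and it can be
    # limited to an explicit list of GUIDs.
    def getDeviceList(self, storageType=None, checkStatus=True, guids=()):
        devices = self._getDeviceList(storageType=storageType, guids=guids)

        if checkStatus:
            # Expensive: one LVM test-mode pvcreate over all listed devices.
            # lvm_test_pv_create is a stand-in for vdsm's lvm.testPVCreate.
            unused, used = lvm_test_pv_create([d["GUID"] for d in devices])
            for dev in devices:
                dev["status"] = "free" if dev["GUID"] in unused else "used"
        else:
            # Fast path used by the UI flows above: no pvcreate is issued and
            # the status is left undetermined.
            for dev in devices:
                dev["status"] = "unknown"

        return devices

With checkStatus set to False, the flows listed in the doc text only pay for device discovery, which is why the dialog now opens in seconds on a host that sees hundreds of LUNs.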
I would say this is reasonable with so many devices present.

I think we can do better, but at least now it doesn't time out. Freddy, are there any flows where we still run "time vdsClient -s 0 getDeviceList > /dev/null" that we should test? Because this could create a lot of issues, since it's so heavy.

For DC level 3.6, the only place we still do the PV creation test when executing the GetDeviceList verb is through the REST API. When getting the LUNs list, we need to provide the "Status" of the LUN.

It was kept for backward compatibility. Note that in scale setups, this REST API call will time out as before.

(In reply to Fred Rolland from comment #26)
> For DC level 3.6, the only place we still do the PV creation test when
> executing the GetDeviceList verb is through the REST API.
> When getting the LUNs list, we need to provide the "Status" of the LUN.
>
> It was kept for backward compatibility. Note that in scale setups, this REST
> API call will time out as before.

Did we provide any alternative to this REST call that will work for scale?

Not currently. Created BZ #1308350 to track the API improvement.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html
Description of problem:

When adding a new direct LUN to a VM, the operation takes a long time and then times out.

Version-Release number of selected component (if applicable):

vdsm-4.14.18-4.el6ev.x86_64

How reproducible:

Select a VM, then Disks, then Add, then External, with settings virtio-scsi and Fibre Channel. After a while (some minutes) I get an empty list.

Steps to Reproduce:
1. Select the VM, then Disks, then Add
2. Select type virtio-scsi and Fibre Channel

Actual results:

After some minutes an empty list is shown.

Expected results:

The LUN list is displayed.

Additional info:

The delay is caused by an LVM operation:

Thread-76::DEBUG::2015-04-28 15:58:49,029::lvm::296::Storage.Misc.excCmd::(cmd) '/usr/bin/sudo -n /sbin/lvm pvcreate --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ \'a|/dev/mapper/3600507680181870c2000000000000085...--test....
...
Thread-76::DEBUG::2015-04-28 16:01:44,518::lvm::296::Storage.Misc.excCmd::(cmd) FAILED: <err> = ' TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.\n Found duplicate

It seems that the call is not used for anything:

vdsm/storage/hsm.py:

    1948     def _getDeviceList(self, storageType=None, guids=None):
    ...
    2002         # Look for devices that will probably fail if pvcreated.
    2003         devNamesToPVTest = tuple(dev["GUID"] for dev in devices)
    2004         unusedDevs, usedDevs = lvm.testPVCreate(
    2005             devNamesToPVTest, metadataSize=blockSD.VG_METADATASIZE)
    2006         # Assuming that unusables v unusables = None
    2007         free = tuple(os.path.basename(d) for d in unusedDevs)
    2008         used = tuple(os.path.basename(d) for d in usedDevs)
    2009         for dev in devices:
    2010             guid = dev['GUID']
    2011             if guid in free:
    2012                 dev['status'] = "free"
    2013             elif guid in used:
    2014                 dev['status'] = "used"
    2015             else:
    2016                 raise KeyError("pvcreate response foresight is "
    2017                                "can not be determined for %s", dev)
    2018
    2019         return devices

We can see that we only return the devices, but we do not return the free or used variables. We do not use the dev variable either. I guess that this part of the code can be removed.

Please confirm, and also offer a workaround until the issue is resolved.