A request to create 8 VMs from a template (thin provision) takes on average 10 minutes to complete. Looking at the vdsm logs, we can see that the tasks spend a lot of time waiting on the operation mutex, sometimes around 30 seconds per wait. Here is one example from a single task:

vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:13,985::lvm::414::OperationMutex::(_reloadlvs) Got the operational mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:22,021::lvm::414::OperationMutex::(_reloadlvs) Got the operational mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:32,539::lvm::488::OperationMutex::(_invalidatevgs) Operation 'lvm reload operation' is holding the operation mutex, waiting...
>> 7 sec
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:39,662::lvm::488::OperationMutex::(_invalidatevgs) Operation 'lvm invalidate operation' got the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:39,667::lvm::490::OperationMutex::(_invalidatevgs) Operation 'lvm invalidate operation' released the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:39,668::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' got the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:39,670::lvm::510::OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' released the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:40,803::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...
>> 3 sec
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:43,829::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' got the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:43,831::lvm::510::OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' released the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:43,837::lvm::414::OperationMutex::(_reloadlvs) Operation 'lvm reload operation' got the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:45:58,291::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:04,299::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:08,735::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...
>> 11 sec
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:11,953::lvm::498::OperationMutex::(_invalidatelvs) Got the operational mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:11,954::lvm::510::OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' released the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:11,960::lvm::414::OperationMutex::(_reloadlvs) Operation 'lvm reload operation' got the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:15,919::lvm::414::OperationMutex::(_reloadlvs) Got the operational mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:30,859::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:38,702::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...
>> 13 sec
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:43,186::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' got the operation mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:43,975::lvm::498::OperationMutex::(_invalidatelvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...
>> 5 sec
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:48,255::lvm::498::OperationMutex::(_invalidatelvs) Got the operational mutex
vdsm.log.1.xz:8be2e436-22d8-4f6e-bfa9-4569e4a7c8b2::DEBUG::2013-06-26 14:46:48,259::lvm::510::OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' released the operation mutex

Additional information:
Customer is using RHEV with EVM. EVM issues a request to spin up VMs from a template via the REST API. Because creating the VMs takes so long, EVM times out and fails the request. Result: EVM is unusable. Sev1 case.
Increasing the timeout on the EVM side may help, but according to the customer, each additional request takes even longer to get a reply.

RHEV setup:
Two datacenters in one RHEV-M setup: Default (FC) and Lab (iSCSI). Issuing a similar request to the iSCSI DC works 3 times faster; we would expect similar behaviour on the FC DC.
We checked the general performance of the host and storage, and it seems fine (https://bugzilla.redhat.com/show_bug.cgi?id=970179#c7). All the actual storage commands complete reasonably fast, so it looks like the bottleneck is the waiting on the mutex.
Another piece of information: per the customer, the issue started after upgrading his hypervisors to 20130501.0.el6_4 (not sure from which version, probably 20130318). I do not think this matters; the problem really starts once the system gets loaded.

Versions:
Hosts: 20130501.0.el6_4 (vdsm-4.10.2-1.13.el6ev)
RHEVM: rhevm-3.2.0-11.30.el6ev.noarch
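The ">> N sec" markers above were added by hand while reading the log. For anyone who wants to reproduce the measurement, here is a minimal sketch (not part of vdsm) that derives the same wait times from the OperationMutex lines; it assumes the log format shown in the excerpt above.

#!/usr/bin/env python
# Minimal sketch (not part of vdsm): compute how long each task waits on the
# OperationMutex by pairing a "waiting..." line with the next "got the
# operation(al) mutex" line for the same task UUID.  The log format is assumed
# to match the excerpt above (vdsm.log lines, possibly prefixed by a file name).
import re
import sys
from datetime import datetime

LINE_RE = re.compile(
    r'(?P<task>[0-9a-f]{8}-[0-9a-f-]{27})::DEBUG::'
    r'(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})'
    r'::lvm::\d+::OperationMutex::\S+ (?P<msg>.*)')

def parse_ts(text):
    return datetime.strptime(text, '%Y-%m-%d %H:%M:%S,%f')

def main(path):
    waiting_since = {}  # task uuid -> timestamp of its first "waiting..." line
    for line in open(path):
        m = LINE_RE.search(line)
        if not m:
            continue
        task, ts, msg = m.group('task'), parse_ts(m.group('ts')), m.group('msg')
        if msg.rstrip().endswith('waiting...'):
            waiting_since.setdefault(task, ts)
        elif 'got the operation' in msg.lower():
            started = waiting_since.pop(task, None)
            if started is not None:
                print('%s waited %.1f sec for the mutex (acquired at %s)'
                      % (task, (ts - started).total_seconds(), ts))

if __name__ == '__main__':
    main(sys.argv[1])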
Created attachment 766351 [details]
Engine.log. Search for the AddVmCommands coming from ajp.
Created attachment 766353 [details] vdsm.log
LVM operations are slow, probably because one or more storage domains are unreachable:

959484ad-7caa-42eb-b78f-76e53d278817::DEBUG::2013-06-26 15:01:59,868::misc::83::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> = ' /dev/sdba: read failed after 0 of 4096 at 0: Input/output error\n /dev/sdba: read failed after 0 of 4096 at 2199023190016: Input/output error\n /dev/sdba: read failed after 0 of 4096 at 2199023247360: Input/output error\n WARNING: Error counts reached a limit of 3. Device /dev/sdba was disabled\n /dev/sdw: read failed after 0 of 4096 at 0: Input/output error\n /dev/sdw: read failed after 0 of 4096 at 2199023190016: Input/output error\n /dev/sdw: read failed after 0 of 4096 at 2199023247360: Input/output error\n WARNING: Error counts reached a limit of 3. Device /dev/sdw was disabled\n /dev/sdbc: read failed after 0 of 4096 at 0: Input/output error\n /dev/sdbc: read failed after 0 of 4096 at 2199023190016: Input/output error\n /dev/sdbc: read failed after 0 of 4096 at 2199023247360: Input/output error\n WARNING: Error counts reached a limit of 3. Device /dev/sdbc was disabled\n /dev/sdbe: read failed after 0 of 4096 at 0: Input/output error\n /dev/sdbe: read failed after 0 of 4096 at 2199023190016: Input/output error\n /dev/sdbe: read failed after 0 of 4096 at 2199023247360: Input/output error\n WARNING: Error counts reached a limit of 3. Device /dev/sdbe was disabled\n'; <rc> = 0
(In reply to Federico Simoncelli from comment #4)
> LVM operations are slow probably because one or more storage domains are
> unreachable:

From talking to the storage team, I understand that the devices in question are not the active paths, and these errors are exactly what you get if you scan such paths while they are actually unavailable. For instance:

3600601605ab02c00620df4e048a0e211 dm-2 DGC,VRAID
size=2.0T features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| |- 1:0:1:5 sdn  8:208  active ready running
| |- 1:0:0:5 sdg  8:96   active ready running
| |- 2:0:1:5 sdas 66:192 active ready running
| `- 2:0:0:5 sdam 66:96  active ready running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 1:0:2:5 sdu  65:64  active ready running
  |- 1:0:3:5 sdab 65:176 active ready running
  |- 2:0:2:5 sday 67:32  active ready running
  `- 2:0:3:5 sdbe 67:128 active ready running   -> problematic

So the sdbe device should not be used: it is a passive path, not an active one, and we should not try writing to it.
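Side note for triage (this is an illustration, not something vdsm does): the passive paths can be listed programmatically by parsing the `multipath -ll` output in the format shown above. A minimal sketch, under that assumption:

#!/usr/bin/env python
# Minimal sketch (not part of vdsm): list, per multipath map, the sdX devices
# that sit in the lower-priority "status=enabled" path group, i.e. the passive
# paths we would not want LVM to scan or write to.  Assumes `multipath -ll`
# output formatted like the example above.
import re
import subprocess

def passive_paths():
    out = subprocess.check_output(['multipath', '-ll'])
    if isinstance(out, bytes):
        out = out.decode('utf-8', 'replace')
    passive = {}            # map wwid -> sdX devices in the "enabled" group
    current_map = None
    in_enabled_group = False
    for line in out.splitlines():
        if re.match(r'^\S+\s+dm-\d+', line):             # "3600...e211 dm-2 DGC,VRAID"
            current_map = line.split()[0]
            in_enabled_group = False
        elif 'policy=' in line:
            in_enabled_group = 'status=enabled' in line  # as opposed to status=active
        elif in_enabled_group and current_map:
            m = re.search(r'\d+:\d+:\d+:\d+\s+(sd\w+)', line)
            if m:
                passive.setdefault(current_map, []).append(m.group(1))
    return passive

if __name__ == '__main__':
    for wwid, devs in sorted(passive_paths().items()):
        print('%s: %s' % (wwid, ' '.join(devs)))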
After some testing this morning, we figured out that the filter created by vdsm should be modified to look at the relevant PVs only, accessing them by the full path (/dev/mapper/<wwid>). Compare the following three outputs.

[#1 with the specific filter]
# lvs -vvvv --config " devices { preferred_names = [\"^/dev/mapper/\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \"a|/dev/mapper/3600601605ab02c0028c71db4c076e211|/dev/mapper/3600601605ab02c005a0df4e048a0e211|/dev/mapper/3600601605ab02c005c0df4e048a0e211|/dev/mapper/3600601605ab02c005e0df4e048a0e211|/dev/mapper/3600601605ab02c00600df4e048a0e211|/dev/mapper/3600601605ab02c00620df4e048a0e211|/dev/mapper/3600601605ab02c00640df4e048a0e211|\", \"r|.*|\" ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 } backup { retain_min = 50 retain_days = 0 } " --noheadings --units b --nosuffix --separator '|' -o uuid,name,attr,size,tags,vg_mda_size,vg_mda_free f48e93bb-e3bb-4291-bd2f-8d622ef8019d 2>&1 | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush() }' > /tmp/lvs_f48e93bb_3.log

~~~
START: 2013-07-02 15:34:45 #libdm-config.c:863 Setting activation/monitoring to 1
END: 2013-07-02 15:34:46 #libdm-config.c:799 Setting log/verbose to 0
1 second
~~~

[#2 with the general filter that takes all multipath devices]
# lvs -vvvv --config " devices { preferred_names = [\"^/dev/mapper/\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \"a|/dev/mapper/.*|\", \"r|.*|\" ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 } backup { retain_min = 50 retain_days = 0 } " --noheadings --units b --nosuffix --separator '|' -o uuid,name,attr,size,tags,vg_mda_size,vg_mda_free f48e93bb-e3bb-4291-bd2f-8d622ef8019d 2>&1 | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush() }' > /tmp/lvs_f48e93bb.log

~~~
START: 2013-07-02 14:54:51 #libdm-config.c:863 Setting activation/monitoring to 1
END: 2013-07-02 14:55:02 #libdm-config.c:799 Setting log/verbose to 0
11 seconds
~~~

[#3 the original filter, which takes all the devices without the full paths]
# vgs --config " devices { preferred_names = [\"^/dev/mapper/\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ 'a\%3600601605ab02c0028c71db4c076e211|3600601605ab02c005a0df4e048a0e211|3600601605ab02c005c0df4e048a0e211|3600601605ab02c005e0df4e048a0e211|3600601605ab02c00600df4e048a0e211|3600601605ab02c00620df4e048a0e211|3600601605ab02c00640df4e048a0e211|36848f690eba0850018d3cd8a15d6a65c%\', 'r\%.*%\' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 } backup { retain_min = 50 retain_days = 0 } " --noheadings --units b --nosuffix --separator '|' -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free f48e93bb-e3bb-4291-bd2f-8d622ef8019d
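To repeat the comparison above without copying the full command lines around, a small helper like the following can be used (this is an assumption for illustration, not part of vdsm or the customer's tooling; the WWID in the "specific" filter is taken from the examples above and should be replaced with the environment's own PVs).

#!/usr/bin/env python
# Small helper (illustrative only): time an `lvs` scan of one VG with a given
# LVM device filter, to reproduce the specific-vs-broad filter comparison.
import subprocess
import time

def time_lvs(vg_name, lvm_filter):
    config = (
        'devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 '
        'write_cache_state=0 disable_after_error_count=3 %s } '
        'global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 }'
        % lvm_filter)
    start = time.time()
    subprocess.call(['lvs', '--config', config, '--noheadings', vg_name])
    return time.time() - start

if __name__ == '__main__':
    specific = 'filter = [ "a|/dev/mapper/3600601605ab02c00620df4e048a0e211|", "r|.*|" ]'
    broad = 'filter = [ "a|/dev/mapper/.*|", "r|.*|" ]'
    for label, flt in (('specific', specific), ('broad', broad)):
        elapsed = time_lvs('f48e93bb-e3bb-4291-bd2f-8d622ef8019d', flt)
        print('%s filter: %.1f sec' % (label, elapsed))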
I tested the following on my setup, and I am going to build a test package with this change now:

# diff lvm.py /var/tmp/lvm.py
137c137
< devs.append('/dev/mapper/' + strippedDev.replace(r'\x', r'\\x'))
---
> devs.append(strippedDev.replace(r'\x', r'\\x'))
140c140
< filt = "'a|" + filt + "|', "
---
> filt = "'a%" + filt + "%', "
142c142
< filt = "filter = [ " + filt + "'r|.*|' ]"
---
> filt = "filter = [ " + filt + "'r%.*%' ]"
-----------------------------------------------

This results in:

filter = [ \'a|/dev/mapper/360014052e1060300051d006000000000|/dev/mapper/360014052e1060300051d016000000000|/dev/mapper/360014052e1060300051d017000000000|\', \'r|.*|\' ]
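For readers following along, here is a minimal sketch of the filter construction that the diff above ends up producing; the function and variable names are illustrative only, not the exact vdsm internals.

# Minimal sketch (illustrative only, not the exact vdsm code): build the LVM
# filter so that every PV is addressed by its full /dev/mapper/<wwid> path and
# everything else is rejected, matching the "This results in:" line above.
def build_lvm_filter(wwids):
    devs = []
    for wwid in wwids:
        stripped = wwid.strip()
        # keep vdsm's escaping of hex-encoded characters in device names
        devs.append('/dev/mapper/' + stripped.replace(r'\x', r'\\x'))
    accept = "'a|" + '|'.join(devs) + "|', "
    return "filter = [ " + accept + "'r|.*|' ]"

# Example, using the WWIDs from the resulting filter above:
print(build_lvm_filter([
    '360014052e1060300051d006000000000',
    '360014052e1060300051d016000000000',
    '360014052e1060300051d017000000000',
]))
# -> prints the same filter string shown under "This results in:" above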
Created attachment 767987 [details] patch for lvm.py
Unfortunately, the fixed filter didn't help. It slightly improved the time, but EVM requests are still failing with a timeout. I am waiting for sos reports from the patched SPM to decide on the next steps.
I opened a new bug for the filter issue only: https://bugzilla.redhat.com/show_bug.cgi?id=981055. Here we should continue investigating the customer's original problem.
After narrowing down the issue, we see that the problem happens when Ruby waits more than 30 seconds for a response from the RHEV REST API on a single Create VM request and times out with ERROR. Looking at the RHEV side, for those timed-out requests we can see that most of the time is spent on GetImageInfoVDSCommand for the template we are trying to create the VM from (about two minutes for some requests).

Questions:
1. Is the API flow correct, i.e. must we wait until we create the task for vdsm/storage to CreateSnapshot from the template? (It probably is correct; otherwise, how else can we acknowledge that the request was accepted by RHEV?)
2. Why do we need to issue GetImageInfoVDSCommand for the template before creating the task for CreateSnapshot?
3. Why does it take the SPM host so long to reply to this command?

Additional information:
1. Log excerpts for this flow are available in the attached file ajp-127.0.0.1:8702-10.flow.
2. The SPM is running the patch that narrows the LVM filter, as described earlier in this bug and moved to https://bugzilla.redhat.com/show_bug.cgi?id=981055.
3. This problem occurs only on one specific storage domain, which has 8 PVs and multiple LVs in it. When we issue the same request to create a VM from a template on another DC, or on another SD under this DC, on the same RHEV setup, the problem does not occur -> this points to a storage performance issue specific to that domain.
4. Earlier in the case, the storage performance was reviewed by our storage expert and by an EMC representative, and no flaws were found. -> What should the next steps be here?
5. The customer has decided to switch to VMware because this case has not been resolved for a long time; we have a limited time frame to fix the issue and keep the customer with RH.
Created attachment 771381 [details] the flow from Ruby to RHEV
(In reply to Marina from comment #23) > Questions: > 1. Is the API flow correct and we must wait till we create the task for > vdsm/storage to CreateSnapshot from the template (probably it is correct, > otherwise how else we can acknowledge that the request was accepted by RHEV?) Yes, that's correct. > 2. Why do we need to issue GetImageInfoVDSCommand for the template before > creating the task for CreateSnapshot? We don't. It was removed in commit 2575a22, part of RHEV-M 3.3 (is1) > 3. Why does it take SPM host so long to reply to this command? Yeela?
(In reply to Allon Mureinik from comment #26)
> (In reply to Marina from comment #23)
> > Questions:
> > 1. Is the API flow correct and we must wait till we create the task for
> > vdsm/storage to CreateSnapshot from the template (probably it is correct,
> > otherwise how else we can acknowledge that the request was accepted by RHEV?)
> Yes, that's correct.
>
> > 2. Why do we need to issue GetImageInfoVDSCommand for the template before
> > creating the task for CreateSnapshot?
> We don't.
> It was removed in commit 2575a22, part of RHEV-M 3.3 (is1)

This can probably easily be cherry-picked to 3.2.z.
Can't guarantee it'll solve the issue, but it will definitely increase performance.

> > 3. Why does it take SPM host so long to reply to this command?
> Yeela?
> > > 2. Why do we need to issue GetImageInfoVDSCommand for the template before
> > > creating the task for CreateSnapshot?
> > We don't.
> > It was removed in commit 2575a22, part of RHEV-M 3.3 (is1)
> This can probably easily be cherry-picked to 3.2.z.
> Can't guarantee it'll solve the issue, but it will definitely increase
> performance.

Agreed. And after testing it, we will be able to reconsider the environment again and look for other, new factors.

/me is going to work on the test package for it.
(In reply to Allon Mureinik from comment #27)
> (In reply to Allon Mureinik from comment #26)
> > (In reply to Marina from comment #23)
> > > 2. Why do we need to issue GetImageInfoVDSCommand for the template before
> > > creating the task for CreateSnapshot?
> > We don't.
> > It was removed in commit 2575a22, part of RHEV-M 3.3 (is1)
> This can probably easily be cherry-picked to 3.2.z.
> Can't guarantee it'll solve the issue, but it will definitely increase
> performance.

OK, my mistake. This was ALREADY cherry-picked for 3.2.1 (build sf18.2).

> > > 3. Why does it take SPM host so long to reply to this command?
> > Yeela?
Created attachment 772887 [details] ps aux output from spm host
Marina,

Please attach the logs for the new package you supplied. We need logs that contain the flow of the original issue, which is the creation of VMs from a template (as described in the first comment). In particular, we need the engine, vdsm, and lvm logs.

Thanks!
Created attachment 775333 [details] commit b1a3d4fa4e33c55ba7613cec27ffd6aa98dc79d0
Created attachment 775334 [details] commit e72a9ce379640e7ec3b134ef7453ee3e25f714d9
Setting Target Milestone to Beta1.
Created attachment 787116 [details] summary of storage commands after apply patch3 to vdsm
Created attachment 787117 [details] processed output of a file from patch3
Created attachment 787118 [details] vdsm.log after patch3
Marina, I'm not sure this has anything to do with the new patches. Can you maybe attach the full vdsm log for this issue? Thanks.
Created attachment 791549 [details] all the patches provided up to 28 Aug
Verified using is17, following comment #5 on bug #1001031.
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag. Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information: * Cause: What actions or circumstances cause this bug to present. * Consequence: What happens when the bug presents. * Fix: What was done to fix the bug. * Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore') Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug. For further details on the Cause, Consequence, Fix, Result format please refer to: https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes Thanks in advance.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-0040.html