Bug 713823

Summary: A few of the concurrent createVolume calls fail when trying to hibernate multiple VMs
Product: Red Hat Enterprise Linux 6
Version: 6.1
Component: vdsm
Status: CLOSED ERRATA
Severity: low
Priority: medium
Reporter: Dafna Ron <dron>
Assignee: Igor Lvovsky <ilvovsky>
QA Contact: Dafna Ron <dron>
CC: abaron, bazulay, danken, iheim, lpeer, ykaul
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Fixed In Version: vdsm-4.9-80.el6
Doc Type: Bug Fix
Last Closed: 2011-12-06 07:22:58 UTC
Bug Depends On: 712829    
Bug Blocks: 686001    
Attachments:
logs

Description Dafna Ron 2011-06-16 15:14:58 UTC
Created attachment 505073 [details]
logs

Description of problem:

When trying to hibernate multiple VMs, 1-2 of them fail to hibernate.
After checking this issue with Igor, we found the following sequence in the log:

lvcreate is run for the first VM
lvchange --addtag is run for the first VM

the same happens for VMs 2 and 3

lvchange -a n (deactivate) is run for the first VM, and at the same time
lvcreate is run for the 4th VM

The first VM will fail to hibernate with the error:

Cannot deactivate Logical Volume
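
To make the interleaving concrete, here is a minimal standalone reproducer sketch (not vdsm code; it assumes root and a scratch volume group, here called test_vg, which are my own placeholders):

# Hypothetical reproducer, not vdsm code: replays the interleaving above
# against a scratch VG for four "VMs" at once.
import subprocess
import threading

TEST_VG = "test_vg"  # placeholder scratch volume group

def hibernate_lv(n):
    lv = "hib_%d" % n
    # lvcreate for this "VM"
    subprocess.call(["lvcreate", "-L", "128m", "-n", lv, TEST_VG])
    # lvchange --addtag for this "VM"
    subprocess.call(["lvchange", "--addtag", "hibernate",
                     "%s/%s" % (TEST_VG, lv)])
    # deactivate; with enough concurrency this is where
    # "in use: not deactivating" shows up
    rc = subprocess.call(["lvchange", "-a", "n", "%s/%s" % (TEST_VG, lv)])
    if rc != 0:
        print("VM %d: deactivation failed (rc=%d)" % (n, rc))

threads = [threading.Thread(target=hibernate_lv, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()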


We also tried adding an lsof call and a 6-second sleep; when both were added to the vdsm code, all 4 VMs were able to hibernate successfully.


Version-Release number of selected component (if applicable):

vdsm-4.9-75.el6.x86_64
libvirt-0.8.7-18.el6.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Hibernate 4 VMs at the same time.
  
Actual results:

1-2 of the VMs fail to hibernate.

Expected results:

All VMs should hibernate successfully.

Additional info: logs 



ae23525c-10eb-4726-b325-9ba053ba531d::DEBUG::2011-06-16 17:23:23,647::lvm::550::OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' released the operation mutex
ae23525c-10eb-4726-b325-9ba053ba531d::ERROR::2011-06-16 17:23:23,648::blockVolume::219::Storage.Volume::(create) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/blockVolume.py", line 211, in create
    lvm.deactivateLVs(sdUUID, volUUID)
  File "/usr/share/vdsm/storage/lvm.py", line 1006, in deactivateLVs
    _setLVAvailability(vgName, toDeactivate, "n")
  File "/usr/share/vdsm/storage/lvm.py", line 751, in _setLVAvailability
    raise error(str(e))
CannotDeactivateLogicalVolume: Cannot deactivate Logical Volume: ('General Storage Exception: ("5 [] [\'  LV bfd358ed-0ce4-472e-aba3-927232699567/d9a181be-8910-4d06-a1e2-8576afce5fe4 in use: not deactivating\']\\nbfd358ed-0ce4-472e-aba3-927232699567/[\'d9a181be-8910-4d06-a1e2-8576afce5fe4\']",)',)
ae23525c-10eb-4726-b325-9ba053ba531d::DEBUG::2011-06-16 17:23:23,649::resourceManager::517::ResourceManager::(releaseResource) Trying to release resource 'bfd358ed-0ce4-472e-aba3-927232699567_imageNS.2b11b4d5-c464-41e9-bab1-0d0b1c65d84b'
ae23525c-10eb-4726-b325-9ba053ba531d::DEBUG::2011-06-16 17:23:23,650::resourceManager::532::ResourceManager::(releaseResource) Released resource 'bfd358ed-0ce4-472e-aba3-927232699567_imageNS.2b11b4d5-c464-41e9-bab1-0d0b1c65d84b' (0 active users)
ae23525c-10eb-4726-b325-9ba053ba531d::DEBUG::2011-06-16 17:23:23,650::resourceManager::537::ResourceManager::(releaseResource) Resource 'bfd358ed-0ce4-472e-aba3-927232699567_imageNS.2b11b4d5-c464-41e9-bab1-0d0b1c65d84b' is free, finding out if anyone is waiting for it.
ae23525c-10eb-4726-b325-9ba053ba531d::DEBUG::2011-06-16 17:23:23,651::resourceManager::544::ResourceManager::(releaseResource) No one is waiting for resource 'bfd358ed-0ce4-472e-aba3-927232699567_imageNS.2b11b4d5-c464-41e9-bab1-0d0b1c65d84b', Clearing records.
ae23525c-10eb-4726-b325-9ba053ba531d::ERROR::2011-06-16 17:23:23,651::task::865::TaskManager.Task::(_setError) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 300, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/spm.py", line 115, in run
    return self.func(*args, **kwargs)
  File "/usr/share/vdsm/storage/spm.py", line 1007, in createVolume
    srcImgUUID=srcImgUUID, srcVolUUID=srcVolUUID)
  File "/usr/share/vdsm/storage/blockSD.py", line 394, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/blockVolume.py", line 211, in create
    lvm.deactivateLVs(sdUUID, volUUID)
  File "/usr/share/vdsm/storage/lvm.py", line 1006, in deactivateLVs
    _setLVAvailability(vgName, toDeactivate, "n")
  File "/usr/share/vdsm/storage/lvm.py", line 751, in _setLVAvailability
    raise error(str(e))
CannotDeactivateLogicalVolume: Cannot deactivate Logical Volume: ('General Storage Exception: ("5 [] [\'  LV bfd358ed-0ce4-472e-aba3-927232699567/d9a181be-8910-4d06-a1e2-8576afce5fe4 in use: not deactivating\']\\nbfd358ed-0ce4-472e-aba3-927232699567/[\'d9a181be-8910-4d06-a1e2-8576afce5fe4\']",)',)
ae23525c-10eb-4726-b325-9ba053ba531d::DEBUG::2011-06-16 17:23:23,652::task::492::TaskManager.Task::(_debug) Task ae23525c-10eb-4726-b325-9ba053ba531d: Task._run: ae23525c-10eb-4726-b325-9ba053ba531d () {} failed - stopping task
ae23525c-10eb-4726-b325-9ba053ba531d::DEBUG::2011-06-16 17:23:23,653::task::492::TaskManager.Task::(_debug) Task ae23525c-10eb-4726-b325-9ba053ba531d: stopping in state running (force False)
ae23525c-10eb-4726-b325-9ba053ba531d::DEBUG::2011-06-16 17:23:23,653::task::492::TaskManager.Task::(_debug) Task ae23525c-10eb-4726-b325-9ba053ba531d: ref 1 aborting True
ae23525c-10eb-4726-b325-9ba053ba531d::DEBUG::2011-06-16 17:23:23,654::task::915::TaskManager.Task::(_runJobs) aborting: Task is aborted: 'Cannot deactivate Logical Volume' - code 552

Comment 1 Dan Kenigsberg 2011-06-16 19:28:00 UTC
Before we start hibernating a VM, we create LVs to store its data. After creation, they are deactivated - but this deactivation fails with the "LV in use: not deactivating" error quoted above.

The situation reminds me of lvm bug 712829, which might be in effect here.

> when also tried adding lsof and sleep for 6 seconds and when both were added to
> vdsm code all 4 VM's were able to hibernate successfully.

So let's ship with an `lsof` call, just before deactivating the created lv!
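
For the record, a sketch of what that diagnostic could look like; the helper name log_lv_holders is made up, and the /dev/<vg>/<lv> path is the standard LVM device layout:

# Hypothetical diagnostic: log whatever still holds the new LV open,
# right before we try to deactivate it.
import subprocess

def log_lv_holders(vgName, lvName):
    dev = "/dev/%s/%s" % (vgName, lvName)
    p = subprocess.Popen(["lsof", dev], stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, universal_newlines=True)
    out, _ = p.communicate()
    if p.returncode == 0:
        print("open handles on %s:\n%s" % (dev, out))
    # lsof exits non-zero when nothing holds the device open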

Comment 2 Dan Kenigsberg 2011-06-19 09:01:10 UTC
Igor, shouldn't createVolume succeed even if deactivate fails?

Comment 3 Igor Lvovsky 2011-06-19 09:33:36 UTC
(In reply to comment #2)
> Igor, shouldn't createVolume succeed even if deactivate fails?

Yes, you are probably right. A deactivation failure isn't a reason for createVolume to fail.
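
A sketch of the shape such a change could take, using the function names from the traceback above (the wrapper name and the exception's import path are my assumptions, not the actual patch):

# Sketch only: make deactivation best-effort in blockVolume.create().
import logging

import lvm  # vdsm's storage/lvm.py, per the traceback
import storage_exception as se  # assumed import path for the exception

log = logging.getLogger("Storage.Volume")

def deactivate_best_effort(sdUUID, volUUID):
    # The volume itself was already created; a failed deactivation
    # should be logged, not propagated as a createVolume failure.
    try:
        lvm.deactivateLVs(sdUUID, volUUID)
    except se.CannotDeactivateLogicalVolume:
        log.warning("cannot deactivate new LV %s/%s, continuing anyway",
                    sdUUID, volUUID, exc_info=True)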

Comment 4 Igor Lvovsky 2011-06-19 09:42:45 UTC
(In reply to comment #1)
> Before we start hibernating a VM we create lvs to store its data on. After
> creation, they are deactivated - but this deactivation fails with
> 
> The situation reminds me of lvm bug 712829, which might be in effect here.
> 
> > when also tried adding lsof and sleep for 6 seconds and when both were added to
> > vdsm code all 4 VM's were able to hibernate successfully.
> 
> So let's ship with an `lsof` call, just before deactivating the created lv!

Funny.
Actually, it's not 'lsof' and 'sleep'.
It's 'lsof' or 'sleep'.  
But, as I told you before, it smells bad and I think it's some LVM issue (but I can't prove it yet).

Anyway, I don't think that it's a blocker or a regression.
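
Since either lsof or a sleep masks it, whatever briefly holds the LV open seems to let go within a few seconds, so a retry would likely paper over it too. A sketch only; the wrapper name, attempt count and interval are made up:

# Hypothetical workaround: retry deactivation for a few seconds instead
# of failing on the first "in use: not deactivating".
import time

import lvm  # vdsm's storage/lvm.py
import storage_exception as se  # assumed import path for the exception

def deactivate_with_retry(sdUUID, volUUID, attempts=3, interval=2):
    for i in range(attempts):
        try:
            lvm.deactivateLVs(sdUUID, volUUID)
            return
        except se.CannotDeactivateLogicalVolume:
            if i == attempts - 1:
                raise
            time.sleep(interval)  # give the transient holder time to let go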

Comment 6 Dan Kenigsberg 2011-06-19 12:03:46 UTC
As Igor said, this is no regression.

On Sun, Jun 19, 2011 at 02:03:37PM +0300, Dafna Ron wrote:
> Hi Dan,
> 
>  I installed the below packages and it worked - I was able to mass
> hibernate VMs (even checked 7 VMs instead of just 4 to make sure
> :)).

Keeping this issue open instead of closing it as a duplicate, since we would like createVolume not to fail upon a deactivation error. However, the severity is now much lower.

Comment 7 Igor Lvovsky 2011-06-28 13:26:31 UTC
http://gerrit.usersys.redhat.com/#change,627

Comment 10 Dafna Ron 2011-07-14 10:45:23 UTC
Verified on vdsm-4.9-81.el6.x86_64, ic130.

Comment 11 errata-xmlrpc 2011-12-06 07:22:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1782.html