Bug 864902

Summary: _lvmActivationNS resource not unlocked for a volume when no space is left on domain
Product: Red Hat Enterprise Linux 6
Reporter: Lee Yarwood <lyarwood>
Component: vdsm
Assignee: Ayal Baron <abaron>
Status: CLOSED DUPLICATE
QA Contact: Haim <hateya>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.3
CC: abaron, bazulay, iheim, lpeer, ybronhei, yeylon, ykaul
Target Milestone: rc
Target Release: ---
Hardware: All
OS: All
Whiteboard: infra, storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-10-21 09:07:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Lee Yarwood 2012-10-10 11:17:56 UTC
Description of problem:

Because of RHEV-M BZ#864892, VDSM can encounter a situation where it is unable to create the temp volume needed to perform a merge of a RAW base snapshot and a COW active volume.

When tearing down the volumes used during this process, the _lvmActivationNS resource for the COW volume is not unlocked. All subsequent attempts to merge or use the volume then hang waiting for the resource to become free, even though the controlling task has died.

For example, in the attached vdsm.log.1 file we can see the _lvmActivationNS resource for the active COW volume being granted to the first merge task.

# grep -rin '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.63d2f649-ab16-4e89-90c5-f13218359498' vdsm.log.1
59091:dede6b50-f759-4d94-8bab-4c48bc6d6719::DEBUG::2012-10-09 07:56:49,413::resourceManager::155::ResourceManager.Request::(__init__) ResName=`608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.63d2f649-ab16-4e89-90c5-f13218359498`ReqID=`9658cad8-0c8b-4528-bf7d-421d93d4ddc3`::Request was made in '/usr/share/vdsm/storage/volume.py' line '533' at 'prepare'
59092:dede6b50-f759-4d94-8bab-4c48bc6d6719::DEBUG::2012-10-09 07:56:49,414::resourceManager::463::ResourceManager::(registerResource) Trying to register resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.63d2f649-ab16-4e89-90c5-f13218359498' for lock type 'exclusive'
59093:dede6b50-f759-4d94-8bab-4c48bc6d6719::DEBUG::2012-10-09 07:56:49,416::resourceManager::505::ResourceManager::(registerResource) Resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.63d2f649-ab16-4e89-90c5-f13218359498' is free. Now locking as 'exclusive' (1 active user)
59095:dede6b50-f759-4d94-8bab-4c48bc6d6719::DEBUG::2012-10-09 07:56:49,420::resourceManager::192::ResourceManager.Request::(grant) ResName=`608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.63d2f649-ab16-4e89-90c5-f13218359498`ReqID=`9658cad8-0c8b-4528-bf7d-421d93d4ddc3`::Granted request

The customer then attempted to merge again. In this instance we see the task hang waiting for the resource to become free.

70786:dede6b50-f759-4d94-8bab-4c48bc6d6719::DEBUG::2012-10-09 08:18:28,757::resourceManager::515::ResourceManager::(releaseResource) Trying to release resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.63d2f649-ab16-4e89-90c5-f13218359498'
70803:dede6b50-f759-4d94-8bab-4c48bc6d6719::DEBUG::2012-10-09 08:18:29,346::resourceManager::530::ResourceManager::(releaseResource) Released resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.63d2f649-ab16-4e89-90c5-f13218359498' (0 active users)
70805:dede6b50-f759-4d94-8bab-4c48bc6d6719::DEBUG::2012-10-09 08:18:29,347::resourceManager::535::ResourceManager::(releaseResource) Resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.63d2f649-ab16-4e89-90c5-f13218359498' is free, finding out if anyone is waiting for it.
70816:dede6b50-f759-4d94-8bab-4c48bc6d6719::DEBUG::2012-10-09 08:18:29,898::resourceManager::542::ResourceManager::(releaseResource) No one is waiting for resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.63d2f649-ab16-4e89-90c5-f13218359498', Clearing records.

Version-Release number of selected component (if applicable):
113.3.el6_3

How reproducible:
Always.

Steps to Reproduce:
1. Follow the steps in BZ#864892.
2. Attempt to use the COW volume again.
  
Actual results:
VDSM does not unlock the COW volume's _lvmActivationNS resource during the teardown after the original issue.

Expected results:
VDSM should unlock the COW volume's _lvmActivationNS resource during the teardown after the original issue.
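
To illustrate the expected behaviour, here is a minimal sketch of the general pattern, not VDSM's actual code. ToyResourceManager, NoSpaceLeft, create_temp_volume, merge_leaky and merge_safe are all hypothetical names invented for this example. It only shows how a failure between acquiring and releasing an exclusive resource leaves it locked, and how a try/finally teardown avoids that.

```python
import threading


class NoSpaceLeft(Exception):
    """Hypothetical error for a failed temp volume creation."""


class ToyResourceManager:
    """Hypothetical stand-in for a resource manager; NOT VDSM's API."""

    def __init__(self):
        self._guard = threading.Lock()
        self._owners = {}  # resource name -> id of the task holding it

    def acquire(self, name, task_id):
        with self._guard:
            if name in self._owners:
                # A real manager would queue the request; we just refuse.
                return False
            self._owners[name] = task_id
            return True

    def release(self, name, task_id):
        with self._guard:
            if self._owners.get(name) == task_id:
                del self._owners[name]

    def is_locked(self, name):
        with self._guard:
            return name in self._owners


def create_temp_volume():
    # Assumption: stands in for the step that fails when the domain is full.
    raise NoSpaceLeft("no space left on domain")


def merge_leaky(manager, name, task_id):
    manager.acquire(name, task_id)
    create_temp_volume()             # raises, so the release is skipped
    manager.release(name, task_id)


def merge_safe(manager, name, task_id):
    manager.acquire(name, task_id)
    try:
        create_temp_volume()
    finally:
        manager.release(name, task_id)  # runs even when the merge fails
```

With merge_leaky, a second task asking for the same resource is refused (or, in a queueing manager, waits indefinitely) after the first task dies. With merge_safe, the resource is free again as soon as the failure propagates.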

Additional info:

Comment 3 Lee Yarwood 2012-10-10 12:42:42 UTC
I just noticed that I supplied the wrong example in comment #1. That is clearly a working example, where the same task locks and then unlocks the resource after completing the merge.

The correct example is below. It shows a failed merge task locking the resource but never unlocking it, after which a separate merge task attempts to lock the same resource.


# grep -rin '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.fdd80dcf-5b90-455a-8cba-75f74130a7bf' vdsm.log.1
60073:6bc892de-2b19-404a-9b31-fabdcc394a25::DEBUG::2012-10-09 07:57:07,522::resourceManager::155::ResourceManager.Request::(__init__) ResName=`608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.fdd80dcf-5b90-455a-8cba-75f74130a7bf`ReqID=`474302db-f1ab-4b1e-86a5-51d8aa0cc48e`::Request was made in '/usr/share/vdsm/storage/volume.py' line '533' at 'prepare'
60075:6bc892de-2b19-404a-9b31-fabdcc394a25::DEBUG::2012-10-09 07:57:07,524::resourceManager::463::ResourceManager::(registerResource) Trying to register resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.fdd80dcf-5b90-455a-8cba-75f74130a7bf' for lock type 'exclusive'
60089:6bc892de-2b19-404a-9b31-fabdcc394a25::DEBUG::2012-10-09 07:57:07,535::resourceManager::505::ResourceManager::(registerResource) Resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.fdd80dcf-5b90-455a-8cba-75f74130a7bf' is free. Now locking as 'exclusive' (1 active user)
60093:6bc892de-2b19-404a-9b31-fabdcc394a25::DEBUG::2012-10-09 07:57:07,536::resourceManager::192::ResourceManager.Request::(grant) ResName=`608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.fdd80dcf-5b90-455a-8cba-75f74130a7bf`ReqID=`474302db-f1ab-4b1e-86a5-51d8aa0cc48e`::Granted request
 
73114:39b07f28-02a1-4167-868a-3ff6b95ac2b2::DEBUG::2012-10-09 08:20:11,591::resourceManager::155::ResourceManager.Request::(__init__) ResName=`608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.fdd80dcf-5b90-455a-8cba-75f74130a7bf`ReqID=`f16c755e-c590-4596-aab6-b6b1d642a18f`::Request was made in '/usr/share/vdsm/storage/volume.py' line '533' at 'prepare'
73115:39b07f28-02a1-4167-868a-3ff6b95ac2b2::DEBUG::2012-10-09 08:20:11,592::resourceManager::463::ResourceManager::(registerResource) Trying to register resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.fdd80dcf-5b90-455a-8cba-75f74130a7bf' for lock type 'exclusive'
73116:39b07f28-02a1-4167-868a-3ff6b95ac2b2::DEBUG::2012-10-09 08:20:11,593::resourceManager::487::ResourceManager::(registerResource) Resource '608fea3a-824c-497a-9619-9546e4aab241_lvmActivationNS.fdd80dcf-5b90-455a-8cba-75f74130a7bf' is currently locked, Entering queue (1 in queue)
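
A rough way to spot this pattern across a whole log is sketched below. It is an illustration only: find_unreleased is a hypothetical helper, the regular expressions are based solely on the message formats visible in the excerpts in this report, and it assumes exclusive locks with a single user (shared locks with several active users are not handled).

```python
import re

# Patterns derived from the log excerpts in this report (an assumption:
# other VDSM versions may word these messages differently).
LOCKED = re.compile(r"Resource '([^']+)' is free\. Now locking as")
RELEASED = re.compile(r"Released resource '([^']+)'")


def find_unreleased(lines):
    """Return {resource name: line number of the lock} for resources that
    were locked but have no later 'Released resource' message."""
    held = {}
    for lineno, line in enumerate(lines, 1):
        match = LOCKED.search(line)
        if match:
            held[match.group(1)] = lineno
            continue
        match = RELEASED.search(line)
        if match:
            held.pop(match.group(1), None)
    return held


if __name__ == "__main__":
    import sys

    # Usage: python find_unreleased.py vdsm.log.1
    with open(sys.argv[1], errors="replace") as log:
        for name, lineno in sorted(find_unreleased(log).items()):
            print("%s locked at line %d and never released" % (name, lineno))
```

A resource that is still held at the end of the log is not necessarily leaked, since its task may simply still be running, so any hit needs checking against the task's state.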

Comment 9 Ayal Baron 2012-10-21 09:07:52 UTC

*** This bug has been marked as a duplicate of bug 815359 ***