Bug 781990

Summary: vdsm: we fail reconstruct when master domain consists of targets which are smaller than 10G
Product: [Retired] oVirt Reporter: Dafna Ron <dron>
Component: vdsm Assignee: Dan Kenigsberg <danken>
Status: CLOSED WONTFIX QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: unspecified CC: abaron, acathrow, amureini, bazulay, iheim, ykaul
Target Milestone: ---   
Target Release: 3.3.4   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-03-12 09:36:41 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
logs none

Description Dafna Ron 2012-01-16 09:26:41 UTC
Created attachment 555460
logs

Description of problem:

If the master domain consists of LUNs that are smaller than 10G and there are VMs on the domain, reconstruct fails after a host reboot with a StorageDomainDoesNotExist error.

Version-Release number of selected component (if applicable):

vdsm-4.9.2-0.65.gitf945dc2.fc16.x86_64
ovirt-engine-backend-3.0.0_0001-7.fc16.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Create 2 domains on 2 different storage servers: the first with 50G, the second consisting of several LUNs of different sizes (anywhere from 2-25G).
2. Attach both domains to the DC. The second domain should be the master.
3. Create a few VMs (with disks).
4. Reboot the host.
  
Actual results:

When the host comes back up, reconstruct fails with a vdsm error.

Expected results:

Reconstruct should not fail.

Additional info: full logs attached

Please note that this is not a high-priority bug, because reconstruct fails only when the LUNs are small and there are VMs on the domain.

Thread-14::ERROR::2012-01-16 10:51:44,161::task::855::TaskManager.Task::(_setError) Task=`c8fdbed8-d59a-4d0e-8b7b-77304ea6a4d6`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 863, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 433, in getSpmStatus
    pool = self.getPool(spUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 181, in getPool
    raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: ('9aefefdb-a54f-4645-9ee4-a0863ed1a3b5',)
Thread-14::DEBUG::2012-01-16 10:51:44,166::task::874::TaskManager.Task::(_run) Task=`c8fdbed8-d59a-4d0e-8b7b-77304ea6a4d6`::Task._run: c8fdbed8-d59a-4d0e-8b7b-77304ea6a4d6 ('9aefefdb-a54f-4645-9ee4-a0863ed1a3b5',) {} failed - stopping task
Thread-14::DEBUG::2012-01-16 10:51:44,167::task::1201::TaskManager.Task::(stop) Task=`c8fdbed8-d59a-4d0e-8b7b-77304ea6a4d6`::stopping in state preparing (force False)
Thread-14::DEBUG::2012-01-16 10:51:44,168::task::980::TaskManager.Task::(_decref) Task=`c8fdbed8-d59a-4d0e-8b7b-77304ea6a4d6`::ref 1 aborting True
Thread-14::INFO::2012-01-16 10:51:44,169::task::1159::TaskManager.Task::(prepare) Task=`c8fdbed8-d59a-4d0e-8b7b-77304ea6a4d6`::aborting: Task is aborted: 'Unknown pool id, pool not connected' - code 309
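
For anyone going through the attached logs, the excerpt above can be filtered by level with a small script. This is only a sketch: the field layout (thread, level, timestamp, module, line number, logger name, then the function name in parentheses followed by the message, all separated by `::`) is inferred from the lines quoted in this report and is not taken from vdsm's logging configuration. The names `LogRecord` and `parse_line` are hypothetical helpers and are not part of vdsm.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class LogRecord(NamedTuple):
    """One parsed log line (hypothetical helper type, not part of vdsm)."""
    thread: str
    level: str
    timestamp: datetime
    module: str
    lineno: int
    logger: str
    func: str
    message: str


# Trailing part of a line: "(funcName) free-form message".
_FUNC_RE = re.compile(r"^\((\w+)\) (.*)$", re.DOTALL)


def parse_line(line: str) -> Optional[LogRecord]:
    """Parse one log line in the layout assumed above.

    Returns None for lines that do not match, such as traceback
    continuation lines.
    """
    # Split only the first six separators; the message may contain "::".
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7:
        return None
    thread, level, stamp, module, lineno, logger, rest = parts
    match = _FUNC_RE.match(rest)
    if match is None or not lineno.isdigit():
        return None
    try:
        timestamp = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
    except ValueError:
        return None
    return LogRecord(thread, level, timestamp, module, int(lineno),
                     logger, match.group(1), match.group(2))


if __name__ == "__main__":
    # Two lines copied from the excerpt in this report.
    sample = [
        "Thread-14::ERROR::2012-01-16 10:51:44,161::task::855::"
        "TaskManager.Task::(_setError) "
        "Task=`c8fdbed8-d59a-4d0e-8b7b-77304ea6a4d6`::Unexpected error",
        "Traceback (most recent call last):",
    ]
    for raw in sample:
        record = parse_line(raw)
        if record is not None and record.level == "ERROR":
            print(record.timestamp, record.module, record.func, record.message)
```

Lines for which `parse_line` returns None (the traceback body, for instance) can be attached to the preceding record if the full error context is needed.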

Comment 1 Itamar Heim 2013-03-12 09:36:41 UTC
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.