Bug 1101009 - On createStorageDomain vdsm warns about an unfetched domain
Summary: On createStorageDomain vdsm warns about an unfetched domain
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ---
: 3.5.0
Assignee: Federico Simoncelli
QA Contact: Aharon Canan
URL:
Whiteboard: storage
: 1203296 1256036 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-05-25 13:33 UTC by Ori Gofen
Modified: 2016-05-26 01:48 UTC (History)
12 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-20 06:27:02 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
scohen: needinfo+


Attachments (Terms of Use)
vdsm+engine logs (613.69 KB, application/x-gzip)
2014-05-25 13:33 UTC, Ori Gofen
no flags Details
vdsm+engine logs (952.42 KB, application/x-tar)
2014-07-07 12:32 UTC, Ori Gofen
no flags Details

Description Ori Gofen 2014-05-25 13:33:41 UTC
Created attachment 899074 [details]
vdsm+engine logs

Description of problem:

Creating a template always prompts a failure concerning cache memory,
it seems that one of the threads clears the cache right before vdsm tries to find it in order to complete the operation.

Thread-76::ERROR::2014-05-25 15:07:35,030::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain ceade227-c2df-4f82-b77d-30ce8075dacc
Thread-76::ERROR::2014-05-25 15:07:35,030::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain ceade227-c2df-4f82-b77d-30ce8075dacc
Thread-76::DEBUG::2014-05-25 15:07:35,030::lvm::373::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' got the operation mutex
Thread-76::DEBUG::2014-05-25 15:07:35,031::lvm::296::Storage.Misc.excCmd::(cmd) '/usr/bin/sudo -n /sbin/lvm vgs --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ \'a|/dev/mapper/360060160f4a030007aeed85291dbe311|/dev/mapper/360060160f4a030007beed85291dbe311|/dev/mapper/360060160f4a030007ceed85291dbe311|/dev/mapper/360060160f4a030007deed85291dbe311|/dev/mapper/360060160f4a030007eeed85291dbe311|/dev/mapper/360060160f4a03000fa65675991dbe311|/dev/mapper/360060160f4a03000fb65675991dbe311|/dev/mapper/360060160f4a03000fc65675991dbe311|/dev/mapper/360060160f4a03000fd65675991dbe311|/dev/mapper/360060160f4a03000fe65675991dbe311|\', \'r|.*|\' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 }  backup {  retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix --separator | -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name ceade227-c2df-4f82-b77d-30ce8075dacc' (cwd None)
Thread-76::DEBUG::2014-05-25 15:07:35,454::lvm::296::Storage.Misc.excCmd::(cmd) FAILED: <err> = '  Volume group "ceade227-c2df-4f82-b77d-30ce8075dacc" not found\n  Skipping volume group ceade227-c2df-4f82-b77d-30ce8075dacc\n'; <rc> = 5
Thread-76::WARNING::2014-05-25 15:07:35,459::lvm::378::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "ceade227-c2df-4f82-b77d-30ce8075dacc" not found', '  Skipping volume group ceade227-c2df-4f82-b77d-30ce8075dacc']
Thread-76::DEBUG::2014-05-25 15:07:35,459::lvm::415::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex
Thread-76::ERROR::2014-05-25 15:07:35,466::sdc::143::Storage.StorageDomainCache::(_findDomain) domain ceade227-c2df-4f82-b77d-30ce8075dacc not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('ceade227-c2df-4f82-b77d-30ce8075dacc',)
Thread-76::ERROR::2014-05-25 15:07:35,474::task::866::TaskManager.Task::(_setError) Task=`efae12d0-1fcb-4d73-8e1e-c073a13fcd89`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3092, in getVolumeInfo
    sdUUID=sdUUID).produceVolume(imgUUID=imgUUID,
  File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('ceade227-c2df-4f82-b77d-30ce8075dacc',)
Thread-76::DEBUG::2014-05-25 15:07:35,476::task::885::TaskManager.Task::(_run) Task=`efae12d0-1fcb-4d73-8e1e-c073a13fcd89`::Task._run: efae12d0-1fcb-4d73-8e1e-c073a13fcd89 ('ceade227-c2df-4f82-b77d-30ce8075dacc', 'Error', 'using', 'command:') {} failed - stopping task


** in the end operation succeeds.**

Version-Release number of selected component (if applicable):

vdsm-4.14.7-3.el6ev.x86_64
rhevm-3.4.0-0.21.el6ev.noarch

How reproducible:

25% approximately 1 out of 4

Steps to Reproduce:
1.create a template 

Actual results:

An Error appears on logs, operation is a success 

Expected results:

The Error shouldn't appear

Additional info:

Comment 1 Allon Mureinik 2014-05-25 14:22:11 UTC
This seems like the SD cache oddity again.
Fede - can we finally get rid of this?

Comment 2 Federico Simoncelli 2014-07-02 20:57:00 UTC
Engine address was 10.35.163.66.

The request mentioned in the bug description came from the same ip of the vdsm host: 10.35.116.2

Thread-76::DEBUG::2014-05-25 15:07:35,026::BindingXMLRPC::251::vds::(wrapper) client [10.35.116.2]
Thread-76::DEBUG::2014-05-25 15:07:35,027::task::595::TaskManager.Task::(_updateState) Task=`efae12d0-1fcb-4d73-8e1e-c073a13fcd89`::moving from state init -> state preparing
Thread-76::INFO::2014-05-25 15:07:35,027::logUtils::44::dispatcher::(wrapper) Run and protect: getVolumeInfo(sdUUID='ceade227-c2df-4f82-b77d-30ce8075dacc', spUUID='Error', imgUUID='using', volUUID='command:', options=None)

the four parameters:

sdUUID='ceade227-c2df-4f82-b77d-30ce8075dacc'
spUUID='Error'
imgUUID='using'
volUUID='command:'

look really suspicious. Beside the obviously wrong spUUID, imgUUID, volUUID, also the sdUUID was wrong (in fact ceade227-c2df-4f82-b77d-30ce8075dacc is an image, not a storage domain), and that's what triggered the StorageDomainDoesNotExist error.

Were you running manual getVolumeInfo commands? Any automated script?

This seems completely unrelated to VDSM and engine.

Comment 3 Ori Gofen 2014-07-07 11:01:49 UTC
Federico Simoncelli,This is a whole different issue which was resolved at
BZ #1112722, sorry about this mix-up,this has nothing to do with the current bug

Comment 4 Allon Mureinik 2014-07-07 11:31:24 UTC
Ori, can we get CLEAN logs for this scenario please?

Comment 5 Ori Gofen 2014-07-07 12:32:07 UTC
Created attachment 916059 [details]
vdsm+engine logs

sure reproduced on av3.6

Thread-300::ERROR::2014-07-07 15:28:45,514::sdc::143::Storage.StorageDomainCache::(_findDomain) domain ce6b5c90-72f4-4a7e-b079-715b268bf706 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)

Comment 6 Federico Simoncelli 2014-07-14 09:37:31 UTC
(In reply to Ori from comment #5)
> Created attachment 916059 [details]
> vdsm+engine logs
> 
> sure reproduced on av3.6
> 
> Thread-300::ERROR::2014-07-07
> 15:28:45,514::sdc::143::Storage.StorageDomainCache::(_findDomain) domain
> ce6b5c90-72f4-4a7e-b079-715b268bf706 not found
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>     dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
>     raise se.StorageDomainDoesNotExist(sdUUID)

BZ title needed an update.

The warning is part of the createStorageDomain flow:

Thread-300::DEBUG::2014-07-07 15:28:44,481::BindingXMLRPC::251::vds::(wrapper) client [10.35.161.37] flowID [761f2c8d]
Thread-300::DEBUG::2014-07-07 15:28:44,482::task::595::TaskManager.Task::(_updateState) Task=`84a8cda5-de74-4502-b77a-00199b48f8d5`::moving from state init -> state preparing
Thread-300::INFO::2014-07-07 15:28:44,482::logUtils::44::dispatcher::(wrapper) Run and protect: createStorageDomain(storageType=1, sdUUID='ce6b5c90-72f4-4a7e-b079-715b268bf706', domainName='nfs_3', typeSpecificArg='10.35.160.108:/RHEV/ogofen/8', domClass=1, domVersion='3', options=None)
...
Thread-300::ERROR::2014-07-07 15:28:45,514::sdc::143::Storage.StorageDomainCache::(_findDomain) domain ce6b5c90-72f4-4a7e-b079-715b268bf706 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('ce6b5c90-72f4-4a7e-b079-715b268bf706',)


and it's making sure that the new domain we're creating doesn't exists already.

In general logging the lookup for a domain that doesn't exists is interesting.

Comment 7 Allon Mureinik 2014-07-14 19:14:26 UTC
(In reply to Federico Simoncelli from comment #6)
> In general logging the lookup for a domain that doesn't exists is
> interesting.
Sure, but why is this an ERROR?
Seems to me it will needlessly alarm customers.

Comment 8 Federico Simoncelli 2014-07-14 22:42:23 UTC
(In reply to Allon Mureinik from comment #7)
> (In reply to Federico Simoncelli from comment #6)
> > In general logging the lookup for a domain that doesn't exists is
> > interesting.
> 
> Sure, but why is this an ERROR?
> Seems to me it will needlessly alarm customers.

Looking for a domain that cannot be found (StorageDomainDoesNotExist) generates an exception, and that is an error.

The question is if it should be always logged. At the moment we are logging it in all cases because there's only one situation where we expect it to fail (storage domain creation).

Comment 9 Allon Mureinik 2014-07-17 12:11:25 UTC
(In reply to Federico Simoncelli from comment #8)
> (In reply to Allon Mureinik from comment #7)
> > (In reply to Federico Simoncelli from comment #6)
> > > In general logging the lookup for a domain that doesn't exists is
> > > interesting.
> > 
> > Sure, but why is this an ERROR?
> > Seems to me it will needlessly alarm customers.
> 
> Looking for a domain that cannot be found (StorageDomainDoesNotExist)
> generates an exception, and that is an error.
> 
> The question is if it should be always logged. At the moment we are logging
> it in all cases because there's only one situation where we expect it to
> fail (storage domain creation).

Sean - I suggest closing this as WONTFIX.
Any objection?

Comment 11 Ori Gofen 2014-07-30 11:25:09 UTC
(In reply to Federico Simoncelli from comment #8)
> (In reply to Allon Mureinik from comment #7)
> > (In reply to Federico Simoncelli from comment #6)
> > > In general logging the lookup for a domain that doesn't exists is
> > > interesting.
> > 
> > Sure, but why is this an ERROR?
> > Seems to me it will needlessly alarm customers.
> 
> Looking for a domain that cannot be found (StorageDomainDoesNotExist)
> generates an exception, and that is an error.
> 
> The question is if it should be always logged. At the moment we are logging
> it in all cases because there's only one situation where we expect it to
> fail (storage domain creation).

for some reason I have filtered this discussion,it is important to mark that this bug happens also when creating a TEMPLATE as described in the description,not only on domain creation.

Comment 12 Federico Simoncelli 2014-07-30 13:45:36 UTC
(In reply to Ori from comment #11)
> (In reply to Federico Simoncelli from comment #8)
> > (In reply to Allon Mureinik from comment #7)
> > > (In reply to Federico Simoncelli from comment #6)
> > > > In general logging the lookup for a domain that doesn't exists is
> > > > interesting.
> > > 
> > > Sure, but why is this an ERROR?
> > > Seems to me it will needlessly alarm customers.
> > 
> > Looking for a domain that cannot be found (StorageDomainDoesNotExist)
> > generates an exception, and that is an error.
> > 
> > The question is if it should be always logged. At the moment we are logging
> > it in all cases because there's only one situation where we expect it to
> > fail (storage domain creation).
> 
> for some reason I have filtered this discussion,it is important to mark that
> this bug happens also when creating a TEMPLATE as described in the
> description,not only on domain creation.

No, as per comment 2 and comment 3 the error during template creation was because of bug 1112722.

If not please attach more logs.

Comment 13 Allon Mureinik 2015-03-19 14:56:42 UTC
*** Bug 1203296 has been marked as a duplicate of this bug. ***

Comment 14 Allon Mureinik 2015-10-14 07:37:14 UTC
*** Bug 1256036 has been marked as a duplicate of this bug. ***


Note You need to log in before you can comment on or make changes to this bug.