Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1155912

Summary: Create new domain fails on vdsm logs
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ori Gofen <ogofen>
Component: vdsm
Assignee: Allon Mureinik <amureini>
Status: CLOSED WONTFIX
QA Contact: Aharon Canan <acanan>
Severity: medium
Priority: unspecified
Version: 3.5.0
CC: acanan, amureini, bazulay, ecohen, gklein, iheim, lpeer, lsurette, yeylon
Target Milestone: ---
Keywords: Regression, Reopened
Target Release: 3.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Doc Type: Bug Fix
Story Points: ---
Last Closed: 2014-10-30 14:05:01 UTC
Type: Bug
Regression: ---
oVirt Team: Storage
Embargoed:
Attachments:
logs (no flags)

Description Ori Gofen 2014-10-23 07:23:31 UTC
Created attachment 949686 [details]
logs

Description of problem:

This is a known issue that has regressed: on every domain creation, vdsm.log reports an error:

Thread-34728::ERROR::2014-10-23 10:09:29,453::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 90b422c0-ba59-4caa-9a48-b984e4703c65
Thread-34728::ERROR::2014-10-23 10:09:29,453::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 90b422c0-ba59-4caa-9a48-b984e4703c65
Thread-34728::DEBUG::2014-10-23 10:09:29,453::lvm::365::Storage.OperationMutex::(_reloadvgs) Operation 'lvm reload operation' got the operation mutex
Thread-34728::DEBUG::2014-10-23 10:09:29,457::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/360060160f4a030007aeed85291dbe311|/dev/mapper/360060160f4a030007beed85291dbe311|/dev/mapper/360060160f4a030007ceed85291dbe311|/dev/mapper/360060160f4a030007deed85291dbe311|/dev/mapper/360060160f4a030007eeed85291dbe311|/dev/mapper/360060160f4a03000fa65675991dbe311|/dev/mapper/360060160f4a03000fb65675991dbe311|/dev/mapper/360060160f4a03000fc65675991dbe311|/dev/mapper/360060160f4a03000fd65675991dbe311|/dev/mapper/360060160f4a03000fe65675991dbe311|'\'', '\''r|.*|'\'' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 }  backup {  retain_min = 50  retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 90b422c0-ba59-4caa-9a48-b984e4703c65 (cwd None)
Thread-34728::DEBUG::2014-10-23 10:09:29,899::lvm::288::Storage.Misc.excCmd::(cmd) FAILED: <err> = '  Volume group "90b422c0-ba59-4caa-9a48-b984e4703c65" not found\n  Skipping volume group 90b422c0-ba59-4caa-9a48-b984e4703c65\n'; <rc> = 5
Thread-34728::WARNING::2014-10-23 10:09:29,904::lvm::370::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "90b422c0-ba59-4caa-9a48-b984e4703c65" not found', '  Skipping volume group 90b422c0-ba59-4caa-9a48-b984e4703c65']
Thread-34728::DEBUG::2014-10-23 10:09:29,905::lvm::407::Storage.OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex
Thread-231::DEBUG::2014-10-23 10:09:29,910::__init__::232::IOProcess::(_processLogs) Receiving request...
Thread-231::DEBUG::2014-10-23 10:09:29,911::__init__::232::IOProcess::(_processLogs) Queuing request in the thread pool...
Thread-231::DEBUG::2014-10-23 10:09:29,911::__init__::232::IOProcess::(_processLogs) Extracting request information...
Thread-231::DEBUG::2014-10-23 10:09:29,911::__init__::232::IOProcess::(_processLogs) (36143) Got request for method 'glob'
Thread-231::DEBUG::2014-10-23 10:09:29,911::__init__::232::IOProcess::(_processLogs) (36143) Queuing response
Thread-231::DEBUG::2014-10-23 10:09:29,912::__init__::232::IOProcess::(_processLogs) Receiving request...
Thread-231::DEBUG::2014-10-23 10:09:29,913::__init__::232::IOProcess::(_processLogs) Queuing request in the thread pool...
Thread-231::DEBUG::2014-10-23 10:09:29,913::__init__::232::IOProcess::(_processLogs) Extracting request information...
Thread-231::DEBUG::2014-10-23 10:09:29,913::__init__::232::IOProcess::(_processLogs) (36144) Got request for method 'glob'
Thread-231::DEBUG::2014-10-23 10:09:29,914::__init__::232::IOProcess::(_processLogs) (36144) Queuing response
Thread-34728::ERROR::2014-10-23 10:09:29,915::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 90b422c0-ba59-4caa-9a48-b984e4703c65 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'90b422c0-ba59-4caa-9a48-b984e4703c65',)
Thread-34728::INFO::2014-10-23 10:09:29,915::nfsSD::69::Storage.StorageDomain::(create) sdUUID=90b422c0-ba59-4caa-9a48-b984e4703c65 domainName=nfs01 remotePath=10.35.160.108:/RHEV/ogofen/1 domClass=1
?ERROR
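For context, the excerpt above shows a lookup-before-create flow: the cache looks for the domain, logs an ERROR with a traceback when the lookup misses, and domain creation then proceeds normally. A minimal Python sketch of that pattern (class and method names here are illustrative, not vdsm's actual code):

```python
import logging

log = logging.getLogger("Storage.StorageDomainCache")


class StorageDomainDoesNotExist(Exception):
    pass


class DomainCache:
    """Illustrative lookup-before-create flow (hypothetical names)."""

    def __init__(self):
        self._domains = {}

    def _find_domain(self, sd_uuid):
        if sd_uuid not in self._domains:
            # This is the line that paints the log red even on success:
            # the miss is *expected* when the domain is about to be created.
            log.error("domain %s not found", sd_uuid)
            raise StorageDomainDoesNotExist(sd_uuid)
        return self._domains[sd_uuid]

    def create(self, sd_uuid, name):
        try:
            # Check whether the domain already exists.
            return self._find_domain(sd_uuid)
        except StorageDomainDoesNotExist:
            pass  # expected miss: fall through and create it
        self._domains[sd_uuid] = name
        return name
```

The create succeeds, yet the ERROR line (and, in the real code, the traceback) has already been emitted by the time it does.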

This bug makes QA automation sanity scripts fail every time. An error with a long traceback shouldn't be logged without a reason, not to mention the support engineers who will have to explain why it happens (if new storage engineers keep asking me about this bug, our customers surely will too).

Please don't disregard this; it is a regression.

Version-Release number of selected component (if applicable):
vt7

How reproducible:
100%

Steps to Reproduce:
1. Create a storage domain

Actual results:

The operation succeeds but logs errors, painting the logs red and making the software hard to test.

Expected results:
No errors or tracebacks should be logged for a successful create-domain operation.

Additional info:

Comment 1 Allon Mureinik 2014-10-23 13:04:15 UTC
(In reply to Ori from comment #0)
> This is a known issue that has regressed: on every domain creation,
> vdsm.log reports an error:
The operative word here is KNOWN ISSUE.
This has been discussed several times - automation scripts should not grep for errors.
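For what it's worth, one way an automation harness could follow this advice is to allow-list known-benign messages rather than fail on any ERROR line; a sketch (the patterns are assumptions based on the log excerpt in this bug, not an official list):

```python
import re

# Hypothetical allow-list of known-benign error patterns; the
# StorageDomainDoesNotExist miss during domain creation is one of them.
BENIGN_PATTERNS = [
    re.compile(r"looking for (unfetched )?domain [0-9a-f-]+"),
    re.compile(r"domain [0-9a-f-]+ not found"),
]


def unexpected_errors(log_lines):
    """Return ERROR lines that do not match a known-benign pattern."""
    errors = [line for line in log_lines if "::ERROR::" in line]
    return [line for line in errors
            if not any(p.search(line) for p in BENIGN_PATTERNS)]
```

A harness using this would only fail the sanity run on the lines `unexpected_errors` returns.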

Comment 2 Ori Gofen 2014-10-27 13:22:51 UTC
Yes Allon, a known issue, but it has regressed and is now coloring the vdsm logs red on every successful domain creation.

This is not only a QA problem but a support problem: our users may think the domain creation failed. Internal users already do, so there is no reason external users won't.

In addition, there is no reason for a successful operation to log failures; today it is the only operation that succeeds yet reports errors in the vdsm logs.

Comment 3 Gil Klein 2014-10-27 17:03:11 UTC
Allon, I must say I disagree with this approach. 
There is no reason we should report errors on successful operations. 

From QE point of view, this is not just an issue for automation. 
It actually looks bad from a customer perspective, and will cause a lot of confusion not only for QE but also for GSS. 

Can we please reconsider fixing this?

Comment 4 Allon Mureinik 2014-10-30 14:05:01 UTC
(In reply to Gil Klein from comment #3)
> Allon, I must say I disagree with this approach. 
> There is no reason we should report errors on successful operations. 
> 
> From QE point of view, this is not just an issue for automation. 
> It actually looks bad from a customer perspective, and will cause a lot of
> confusion not only for QE but also for GSS. 
> 
> Can we please reconsider fixing this?

This is an error from a low-level component that searches for the domain and does not find it, which is the expected behavior.
It's well documented, and has gone unnoticed for at least 3 versions. I don't intend to re-engineer working code to figure out whether it's being called from a domain-creation sequence or not, just to remove a log line.
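As a side note, the usual way to resolve this tension, when a rewrite is on the table, is to log the miss at DEBUG in the low-level lookup and let callers for whom a miss really is a failure emit the ERROR themselves. A purely illustrative sketch (function names are hypothetical, not a proposed vdsm patch):

```python
import logging

log = logging.getLogger("Storage.StorageDomainCache")


class StorageDomainDoesNotExist(Exception):
    pass


def find_domain(domains, sd_uuid):
    """Low-level lookup: a miss is not necessarily an error here,
    so log it at DEBUG and let the caller decide."""
    if sd_uuid not in domains:
        log.debug("domain %s not found", sd_uuid)
        raise StorageDomainDoesNotExist(sd_uuid)
    return domains[sd_uuid]


def produce_volume(domains, sd_uuid):
    """A caller for whom a missing domain IS an error: the ERROR
    (with its traceback) is emitted at this level, not in the cache."""
    try:
        return find_domain(domains, sd_uuid)
    except StorageDomainDoesNotExist:
        log.error("domain %s does not exist", sd_uuid)
        raise
```

With this shape, the creation path simply catches the exception and never triggers an ERROR line, while genuine failures still do.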