Created attachment 926742 [details]
vdsm+engine logs

Description of problem:
This bug is related to the same problem described in BZ #1101009, but the following scenario causes vdsm to loop through _findDomain errors and tracebacks, rather than throwing the error once (as described in BZ #1101009):

Thread-24::ERROR::2014-08-14 13:44:35,456::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 52d8ebbf-6a2e-4968-9a0a-11f46ddbc612 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('52d8ebbf-6a2e-4968-9a0a-11f46ddbc612',)
Thread-14::DEBUG::2014-08-14 13:44:35,456::__init__::225::IOProcess::(_processLogs) Queuing request...
Thread-24::ERROR::2014-08-14 13:44:35,457::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 52d8ebbf-6a2e-4968-9a0a-11f46ddbc612 monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 204, in _monitorDomain
    self.domain = sdCache.produce(self.sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('52d8ebbf-6a2e-4968-9a0a-11f46ddbc612',)

Setup:
Have two initialized setups (with host_1 on setup_1, host_2 on setup_2)

Steps to Reproduce 1:
1. Add host_1 to setup_2 (do not remove it from setup_1)
2. Remove host_1 from setup_2
3. Reinstall host_1 on setup_1

Actual results:
vdsm log gets flooded with errors and tracebacks

Steps to Reproduce 2:
1. Add storage domain

Actual results:
vdsm log throws several errors and a traceback

Version-Release number of selected component (if applicable):
rc1

How reproducible:
100%

Expected results:
vdsm should know to handle this situation

Additional info:
***** note ******
host_1 also hosted an NFS server. The engine logs several errors during the "Steps to Reproduce 1" flow:

2014-08-14 14:00:38,877 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (org.ovirt.thread.pool-8-thread-25) [2ea93c87] Command ConnectStorageServerVDSCommand(HostName = vdsb, HostId = a437569e-70a7-444e-99b9-f5e7b4b43bce, storagePoolId = 00000000-0000-0000-0000-000000000000, storageType = NFS, connectionList = [{ id: 6684d658-23f5-49a0-81f7-94eaeebcbc5a, connection: 10.35.102.78:/nfsshare, iqn: null, vfsType: null, mountOptions: null, nfsVersion: null, nfsRetrans: null, nfsTimeo: null };]) execution failed. Exception: VDSNetworkException: java.net.SocketException: Socket closed

I did not consider this to be a bug; I mention it only for reference when going over the engine log.
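To quantify how often the error repeats, a minimal sketch like the following could count the _findDomain "not found" errors per domain UUID in a vdsm log. The regular expression is derived only from the ERROR line quoted above, so other vdsm versions may format lines differently; the function name `count_not_found` and the `SAMPLE_LINES` list (the quoted ERROR line repeated, plus the quoted DEBUG line) are assumptions for illustration, not part of vdsm.

```python
import re
from collections import Counter

# Pattern modelled on the single ERROR line quoted in this report.
# Assumption: other vdsm versions may use a different line layout.
NOT_FOUND = re.compile(
    r"^(?P<thread>[^:]+)::ERROR::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"sdc::\d+::Storage\.StorageDomainCache::\(_findDomain\) "
    r"domain (?P<uuid>[0-9a-f-]{36}) not found"
)


def count_not_found(lines):
    """Return a Counter mapping domain UUID -> number of 'not found' errors."""
    counts = Counter()
    for line in lines:
        match = NOT_FOUND.match(line)
        if match:
            counts[match.group("uuid")] += 1
    return counts


# Illustrative input: the ERROR and DEBUG lines quoted in this report,
# with the ERROR line repeated to mimic the flooding.
ERROR_LINE = (
    "Thread-24::ERROR::2014-08-14 13:44:35,456::sdc::143::"
    "Storage.StorageDomainCache::(_findDomain) "
    "domain 52d8ebbf-6a2e-4968-9a0a-11f46ddbc612 not found"
)
DEBUG_LINE = (
    "Thread-14::DEBUG::2014-08-14 13:44:35,456::__init__::225::"
    "IOProcess::(_processLogs) Queuing request..."
)
SAMPLE_LINES = [ERROR_LINE, DEBUG_LINE, ERROR_LINE]

counts = count_not_found(SAMPLE_LINES)
for uuid, hits in counts.most_common():
    print(uuid, hits)
```

To run this against a real log, pass an open file object instead of `SAMPLE_LINES`, for example the host's vdsm log (assumed here to be /var/log/vdsm/vdsm.log). A count that keeps growing for the same UUID would distinguish the looping behaviour reported here from a single error.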
(In reply to Ori from comment #0)
> Created attachment 926742 [details]
> vdsm+engine logs
>
> Description of problem:
>

I don't see any description of a problem.
What is the problem?

>
> Setup:
> Have two initialized setups(with host_1 on setup_1,host_2 on setup_2)

What are setup_1 and setup_2? We cannot use this info to reproduce anything.

>
> Steps to Reproduce 1:
> 1.add host_1 to setup_2 (do not remove it from setup_1)

What do you mean by adding a host to another setup without removing it from the other?

> 2.remove host_1 from setup_2
> 3.reinstall host_1 on setup_1
> Actual results:
> vdsm log gets flooded with errors and tracebacks

This flow is very unclear - I cannot reproduce this according to this description.
You must give a much more detailed description.
Since we cannot do anything with this description, we will not handle this in this bug - please open another for this flow.

>
> Steps to Reproduce 2:
> 1.add storage domain
>
> Actual results:
> vdsm log throws several errors and a traceback

We will continue to handle this in this bug.

>
> Version-Release number of selected component (if applicable):
> rc1
>
> How reproducible:
> 100%
>
> Expected results:
> vdsm should know to handle this situation

This does not mean anything.
Ori, please create a clean vdsm log showing the errors when creating a storage domain.
(In reply to Nir Soffer from comment #1)
> (In reply to Ori from comment #0)
> > Created attachment 926742 [details]
> > vdsm+engine logs
> >
> > Description of problem:
> >
>
> I don't see any description of a problem.
>
> What is the problem?

As I mentioned in the description, BZ #1101009 provides a lot of information about this bug.
The main issue is that on several occasions vdsm searches for and monitors domains that either have not been created yet or do not "belong" to it.

> >
> > Setup:
> > Have two initialized setups(with host_1 on setup_1,host_2 on setup_2)
>
> What are setup_1 and setup_2? we cannot use this info to reproduce anything.

setup_1 and setup_2 are two different setups!

> >
> > Steps to Reproduce 1:
> > 1.add host_1 to setup_2 (do not remove it from setup_1)
>
> What do you mean by adding a host to another setup without removing it from
> the other?

Say you have one setup, "setup_1", with a host connected to it; call that host "host_1". You then add host_1 to a different setup, in our case "setup_2". Do not remove host_1 from setup_1 or put it into maintenance while doing so.

> > 2.remove host_1 from setup_2
> > 3.reinstall host_1 on setup_1
> > Actual results:
> > vdsm log gets flooded with errors and tracebacks
>
> This flow is very unclear - I cannot reproduce this according to this
> description.
> You must give much more detailed description.
> Since we cannot do anything with this description, we will not handle this in
> this bug - please open another for this flow.

Detailed steps to reproduce:

Before reproducing this bug, make sure you have:
- 2 machines (or VMs) with oVirt engine 3.5 rc1 installed on both
- 2 oVirt setups (one on each machine); to set up oVirt, run engine-setup
- 2 initialized dc's (one on each setup); to initialize a dc you need to add a host and create a storage domain

** make sure both hosts are "up" **

Now the steps:
1. Add one of the hosts (both of them are up, as a reminder) to the other setup. Once the host has been added and is in the "up" state, move to step 2.
2. You now have 2 setups: one with 1 host (which is now non-responsive) and one with 2 hosts (both up). Remove the host you just added from the setup that has two hosts.
3. Reinstall the non-responsive host on the first dc (dc = data center).

Hopefully this is clear enough.

> >
> > Steps to Reproduce 2:
> > 1.add storage domain
> >
> > Actual results:
> > vdsm log throws several errors and a traceback
>
> We will continue to handle this in this bug.
>
> >
> > Version-Release number of selected component (if applicable):
> > rc1
> >
> > How reproducible:
> > 100%
> >
> > Expected results:
> > vdsm should know to handle this situation
>
> This does not mean anything.

Haven't you read BZ #1101009?

> Ori, please create clean vdsm log showing the errors when creating a storage
> domain.

Haven't you read BZ #1101009?
Flow one is the bug! This bug is about those errors during vdsm monitoring. We don't want to clean them, we want to solve them.
However, the attachments you are asking for can be found at... BZ #1101009.
Allon, according to comment 3, it looks like Ori is using the system in a way which is not supported. I don't see any business value in this, and suggest closing this as WONTFIX.
(In reply to Nir Soffer from comment #4)
> Allon, according to comment 3, it looks like Ori is using the system in a
> way which is not supported. I don't see any business value in this, and
> suggest to close this as WONTFIX.

Agreed.