Created attachment 1166248 [details]
screenshot of events and logfiles

Description of problem:
The following error appears in the UI when trying to configure storage; everything seems to work fine otherwise:

VDSM HOST2 command failed: Cannot find master domain: u'spUUID=83958f66-8809-4260-a13f-9ba2a4619f2a, msdUUID=6e1cf774-8caf-4f5d-9491-f01e618cf656'

Version-Release number of selected component (if applicable):
3.6.7-4

How reproducible:
Tried twice, same result.

Steps to Reproduce:
1. Create a DC
2. Create 2 clusters
3. Create a host in each cluster
4. Attach NFS storage
5. See the error

Actual results:
Storage attached successfully, but with the error message displayed.

Expected results:
Same, without the error message.

Additional info:
This seems to happen during disk registration. Maor, can you take a look please?
Hi Dusan,

It looks like the logs are not synced: in the engine, the error is at 2016-06-08 19:13:09,230, but the VDSM log only starts at 2016-06-09 08:01:01,906. Can you please attach all the relevant vdsm logs?

Also, which VDSM version are you using?

Thanks
Created attachment 1166350 [details]
correct version of vdsm log file

Hi,

Sorry, I didn't notice it had been rewritten. The correct version (according to the timestamps) is attached. The VDSM version is 4.17.31.

Thanks
The issue here is that the uninitialized Data Center contained two Hosts. createStoragePool was run on Host1, while connectStoragePool was run on Host2, and that caused connectStoragePool to fail, since only Host1 knew the master domain:

2016-06-08 19:12:41,796 .... ConnectStorageServerVDSCommand] START, ConnectStorageServerVDSCommand(HostName = HOST1
2016-06-08 19:12:43,001 INFO ...CreateStoragePoolVDSCommand] START, CreateStoragePoolVDSCommand(HostName = HOST1
2016-06-08 19:13:07,791 INFO ....ConnectStoragePoolVDSCommand] START, ConnectStoragePoolVDSCommand(HostName = HOST2

When calling ConnectStoragePoolVDSCommand, the engine picks the Host via the method IrsProxyData#getPrioritizedVdsInPool. That method returns an arbitrary Host, which might not be the one that ran CreateStoragePoolVDSCommand and updated its storage domain cache with the master SD.
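The race above can be sketched as a toy model in Python (all class and function names here are hypothetical, for illustration only; the real logic lives in the engine's IrsProxyData and vdsm's storage pool code):

```python
import random

class Host:
    """Toy model of a vdsm host with its own storage-domain cache."""
    def __init__(self, name):
        self.name = name
        self.domain_cache = {}  # sdUUID -> domain metadata

    def create_storage_pool(self, sp_uuid, msd_uuid):
        # Creating the pool populates only THIS host's cache with the master SD.
        self.domain_cache[msd_uuid] = {"pool": sp_uuid}

    def connect_storage_pool(self, sp_uuid, msd_uuid):
        # A host that never saw the createStoragePool call has a cold cache.
        if msd_uuid not in self.domain_cache:
            raise RuntimeError(
                "Cannot find master domain: spUUID=%s, msdUUID=%s"
                % (sp_uuid, msd_uuid))

hosts = [Host("HOST1"), Host("HOST2")]
sp = "83958f66-8809-4260-a13f-9ba2a4619f2a"
msd = "6e1cf774-8caf-4f5d-9491-f01e618cf656"

creator = hosts[0]                    # engine ran CreateStoragePool on HOST1
creator.create_storage_pool(sp, msd)

connector = random.choice(hosts)      # models getPrioritizedVdsInPool: any host
try:
    connector.connect_storage_pool(sp, msd)
    print("connected on", connector.name)
except RuntimeError as err:
    print("failed on %s: %s" % (connector.name, err))
```

When the randomly chosen connector happens to be HOST2, the connect fails exactly as in the engine log above, even though the pool was created successfully.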
*** Bug 1330827 has been marked as a duplicate of this bug. ***
No error is shown in the events, but I checked the vdsm logs and I see the error.

Versions:
vdsm-4.18.5.1-1.el7ev.x86_64
rhevm-4.0.2-0.2.rc1.el7ev.noarch

1. With a DC, a cluster and two hosts connected to that cluster, add an NFS domain. I see an error on one of the hosts:

ioprocess communication (23127)::INFO::2016-07-11 13:07:26,080::__init__::447::IOProcess::(_processLogs) Starting ioprocess
ioprocess communication (23127)::INFO::2016-07-11 13:07:26,080::__init__::447::IOProcess::(_processLogs) Starting ioprocess
jsonrpc.Executor/6::ERROR::2016-07-11 13:07:26,081::sdc::146::Storage.StorageDomainCache::(_findDomain) domain 058eedf5-e92b-47a8-a3b7-a5c5db6baac1 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 144, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 174, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('058eedf5-e92b-47a8-a3b7-a5c5db6baac1',)
jsonrpc.Executor/6::INFO::2016-07-11 13:07:26,082::nfsSD::70::Storage.StorageDomain::(create) sdUUID=058eedf5-e92b-47a8-a3b7-a5c5db6baac1 domainName=test_nfs remotePath=10.35.64.11:/vol/RHEV/Storage/storage_jenkins_ge19_nfs_3 domClass=1

Maor, is this important? The operation seems fine.
I'm not sure whether it's related to this issue; does it happen consistently?
Created attachment 1178845 [details]
vdsm.log for both hosts and engine

Yes, I tested it twice and saw it both times. DC with a cluster and two hosts; add one NFS domain (choosing host_1), and the error shows on host_1:

jsonrpc.Executor/3::ERROR::2016-07-12 13:22:40,918::sdc::146::Storage.StorageDomainCache::(_findDomain) domain d388f7ee-ac92-4ce8-9a7a-81d4e69adb16 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 144, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 174, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('d388f7ee-ac92-4ce8-9a7a-81d4e69adb16',)
jsonrpc.Executor/3::INFO::2016-07-12 13:22:40,920::nfsSD::70::Storage.StorageDomain::(create) sdUUID=d388f7ee-ac92-4ce8-9a7a-81d4e69adb16 domainName=nfs_test_dc remotePath=10.35.64.11:/vol/RHEV/Storage/storage_jenkins_ge19_nfs_4 domClass=1
jsonrpc.Executor/3::DEBUG::2016-07-12 13:22:40,928::outOfProcess::69::Storage.oop::(getProcessPool) Creating ioprocess d388f7ee-ac92-4ce8-9a7a-81d4e69adb16
jsonrpc.Executor/3::INFO::2016-07-12 13:22:40,928::__init__::325::IOProcessClient::(__init__) Starting client ioprocess-3
jsonrpc.Executor/3::DEBUG::2016-07-12 13:22:40,928::__init__::334::IOProcessClient::(_run) Starting ioprocess for client ioprocess-3
This is an old logging issue we've had since oVirt 3.2, IIRC. I agree it's ugly, but there's no real effect. You can move the BZ to VERIFIED, thanks!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-1743.html