Created attachment 935464 [details]
vdsm log during failure
Description of problem:
Trying to force-select a newly installed host as SPM fails with InquireNotSupportedError.
Version-Release number of selected component (if applicable):
How reproducible: Seems intermittent; unable to nail down the exact scenario yet.
Steps to Reproduce:
1. Configure host with ovirt-3.5-snapshot repository
2. Install host from ovirt-engine
3. Click "Select as SPM" when newly installed host is selected.
Actual results:
Operation fails with InquireNotSupportedError in the logs.

Expected results:
Host should be selected as SPM.
Adam, I need the full log, from the time vdsm was started. Can you upload a new log starting from the line that says "I am the actual vdsm..."?
From the partial log, I learn that you are using clusterlock.SafeLease.
This is a legacy implementation used only by storage domain version 0, the one used by ovirt 3.0.
There is no reason to create a storage domain of this version with a current release.
I suspect a bug is leading to the selection of the wrong cluster lock class.
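Something like this dispatch is what I have in mind (an illustrative sketch only, not the actual vdsm code; the table and names are hypothetical):

# Illustrative sketch only -- not the actual vdsm code.
# The point: the cluster lock class is chosen from the domain
# metadata VERSION, so a domain created as V0 silently ends up
# with the legacy SafeLease instead of a lock that supports inquire.

class SafeLease(object):
    """Legacy lock used by V0 domains; no inquire support."""

class SANLock(object):
    """Sanlock-based lock used by V3 domains; supports inquire."""

# Hypothetical dispatch table keyed by domain version.
DOMAIN_LOCK_CLASSES = {
    0: SafeLease,
    2: SafeLease,
    3: SANLock,
}

def lock_class_for(version):
    return DOMAIN_LOCK_CLASSES[version]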
Are you using jsonrpc or xmlrpc? Can you reproduce this with both protocols?
Most likely a duplicate of bug 1118349.
Please create V3 storage domains if you are using a data center > 3.0.
(In reply to Federico Simoncelli from comment #3)
> Most likely a duplicate of bug 1118349.
I disagree. I've taken a look at that bug but my domains are all V3. The UI will not allow me to create anything older.
Created attachment 935730 [details]
Full vdsm log
Here is the full vdsm log.
I tried re-adding the host using JsonRPC and it was able to acquire the SPM role. So this may only impact the case where the current SPM is using JsonRPC and the candidate host is using XMLRPC.
Adam, as I suspected, both of your storage domains use format 0 (see VERSION=0):
Thread-13::DEBUG::2014-09-09 09:54:20,580::fileSD::152::Storage.StorageDomain::(__init__) Reading domain in path /rhev/data-center/mnt/192.168.2.1:_home_storage_data/caf02237-e8a4-49fe-ab6d-4d23843f0edf
Thread-13::DEBUG::2014-09-09 09:54:20,786::persistentDict::192::Storage.PersistentDict::(__init__) Created a persistent dict with FileMetadataRW backend
Thread-13::DEBUG::2014-09-09 09:54:20,792::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=data-nfs', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'MASTER_VERSION=1', 'POOL_DESCRIPTION=nfs', 'POOL_DOMAINS=e453999e-ec9f-420b-941c-a9f9cc4432bb:Active,caf02237-e8a4-49fe-ab6d-4d23843f0edf:Active', 'POOL_SPM_ID=1', 'POOL_SPM_LVER=0', 'POOL_UUID=5745e11d-81b4-4d82-8fed-c8a22ec23383', 'REMOTE_PATH=192.168.2.1:/home/storage/data', 'ROLE=Master', 'SDUUID=caf02237-e8a4-49fe-ab6d-4d23843f0edf', 'TYPE=NFS', 'VERSION=0', '_SHA_CKSUM=0b5682af506c2f8194f0e48e2f45adc239d2e2eb']
Thread-17::DEBUG::2014-09-09 09:54:20,845::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Iso', 'DESCRIPTION=iso', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'MASTER_VERSION=0', 'POOL_UUID=00000002-0002-0002-0002-000000000047,6d2495f3-efe4-4aa1-ab93-bdb0f6860cee,ff41f65c-09ac-40aa-8917-4b3ca8719aec,a69c4d84-3114-4937-aace-2079ce66b10b,7b90a9ec-ecb3-4105-b817-306ffc0b8245,be744655-2a19-4919-92f3-e7c20ac03193,0650ae6a-530a-4208-a13a-668025352382,528420e2-28d3-42c0-a243-a72c2dc0bd04,5fa7190c-833d-4b62-8524-a114f6ce5a79,4ed9221c-909e-4b58-998d-6140b9c241d0,00000002-0002-0002-0002-00000000034b,ca74573b-dc89-4785-92a8-5fdcc079f4df,00000002-0002-0002-0002-0000000001e1,aab45993-c5cf-436e-b55a-873bd83681d2,1ac20b5a-4433-4c04-bbc7-33d7eb3fd41f,d0089906-9093-4e45-91f4-2c73256f3dd3,9d8913f8-c9ee-4216-b161-a9ead1047fb9,f201c1d3-166d-42b0-ab69-0920a6dc279f,a1d01b58-6574-432d-b2df-2054ffca7eab,25a09b8a-177c-40af-865d-22436da5cc66,aed40d3c-15fc-45d2-bcba-da1398ae3d14,5745e11d-81b4-4d82-8fed-c8a22ec23383', 'REMOTE_PATH=192.168.2.1:/home/storage/iso', 'ROLE=Regular', 'SDUUID=e453999e-ec9f-420b-941c-a9f9cc4432bb', 'TYPE=NFS', 'VERSION=0', '_SHA_CKSUM=3ab7cad61661fbee975ab14bc6a1ca5540004e85']
When attaching a storage domain with format 0, we use the SafeLease cluster lock, which does not support the inquire method and therefore raises an exception.
So the failure is caused by the same issue as bug 1118349.
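To make the failure concrete, here is a minimal sketch of that behavior (the exception and class names are taken from the log; the body is illustrative, not the actual vdsm code):

class InquireNotSupportedError(Exception):
    """The cluster lock backend cannot report lease state."""

class SafeLease(object):
    def inquire(self):
        # The legacy V0 lease keeps no queryable state, so the
        # only possible answer is a refusal.
        raise InquireNotSupportedError()

# SPM selection inquires the lease, so a single V0 domain is
# enough to fail the whole operation:
SafeLease().inquire()  # raises InquireNotSupportedError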
Now let's understand why your domains are using version 0. I suspect that while the UI shows version 3, vdsm gets version 0, maybe because a parameter is missing or the order of the parameters is wrong (we have seen such errors in jsonrpc).
Please send engine.log (in debug level) and vdsm.log, describing this flow:
1. Add new data center version 3.5
2. Add new cluster version 3.5
3. Add host to the new cluster
4. Add new NFS storage domain (make sure V3 is selected)
Repeat this test with the host using xmlrpc and jsonrpc.
Please also attach the dom_md/metadata file from the domain; a quick way to check the version from that file is sketched below.
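A minimal sketch, assuming the plain KEY=VALUE line format shown in the log excerpts above:

def domain_version(metadata_path):
    """Return the VERSION value from a dom_md/metadata file,
    assuming the KEY=VALUE line format seen above."""
    with open(metadata_path) as f:
        for line in f:
            key, sep, value = line.strip().partition("=")
            if sep and key == "VERSION":
                return int(value)
    raise ValueError("no VERSION key in %s" % metadata_path)

# Example (path shortened):
# domain_version("/rhev/data-center/mnt/.../dom_md/metadata")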
This is caused by the engine improperly calling createStorageDomain, which results in the domain being created as version 0 instead of the requested version 3.
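To illustrate the class of bug (the signature, parameter names, and default below are hypothetical, not the real vdsm verb): when arguments are passed positionally and one is dropped or reordered, the version can silently fall back to a default.

# Hypothetical signature -- made up to illustrate the failure
# mode, not the real engine/vdsm API.
def createStorageDomain(storageType, sdUUID, domainName,
                        typeSpecificArg, domClass, domVersion=0):
    return domVersion

# The caller intends V3 but passes arguments positionally and
# omits the last one, so domVersion silently falls back to 0:
version = createStorageDomain(1, "caf02237-...", "data-nfs",
                              "192.168.2.1:/home/storage/data", 1)
assert version == 0  # domain ends up created as V0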
Created attachment 935878 [details]
engine.log when using jsonrpc
Created attachment 935879 [details]
vdsm.log when using jsonrpc
Created attachment 935880 [details]
storage domain metadata after using jsonrpc
Created attachment 935881 [details]
engine.log when using xmlrpc
Created attachment 935882 [details]
vdsm.log when using xmlrpc
Created attachment 935883 [details]
storage domain metadata after using xmlrpc
Added the files requested in comment #7.
Once bug 1139817 is solved, what else should be done here?
(In reply to Allon Mureinik from comment #16)
> Once bug 1139817 is solved, what else should be done here?
I guess we should check that selecting the SPM now works, and then we can close this bug.
Adding back needinfo for Adam.
As Nir says, this is really just another manifestation of bug 1139817, so once that is fixed this should be fixed automatically as well.
The InquireNotSupportedError is tracked in bug 1118349, and the root cause is solved in bug 1139817.
Since I don't see anything else to do with this bug, we will close it. Please reopen if you think there are additional issues to resolve.
*** This bug has been marked as a duplicate of bug 1118349 ***