Description of problem:
When creating a new RHEV environment (rhevm-cleanup, rhevm-setup, etc.), after adding two RHEL hosts and trying to add an NFS Data Domain, the domain gets added to the UI (and some files are created under the mount point), but it then fails. Trying to attach the SD to the DC fails too, and the rhevm logs (engine.log) show messages from vdsm trying to perform the operation.

Version-Release number of selected component (if applicable):
vdsm-4.10.2-1.13.el6ev.x86_64

How reproducible:

Steps to Reproduce:
1. Add a Data Domain (NFS)
2. It will create some files on the FS
3. It will fail afterwards with the message shown in the logs

Actual results:
No data domain, so no working RHEV environment

Expected results:
Data domain attached

Additional info:

### ENGINE.LOG ####

2013-05-02 12:55:52,872 INFO [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (ajp-/127.0.0.1:8702-16) Lock Acquired to object EngineLock [exclusiveLocks= key: b4ce5912-b1d0-11e2-9c85-080027c6913a value: POOL , sharedLocks= ]
2013-05-02 12:55:53,050 INFO [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (pool-3-thread-49) [ddcd2ff] Running command: AddStoragePoolWithStoragesCommand internal: false. Entities affected : ID: b4ce5912-b1d0-11e2-9c85-080027c6913a Type: StoragePool
2013-05-02 12:55:53,216 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ValidateStorageServerConnectionVDSCommand] (pool-3-thread-49) [7248988c] START, ValidateStorageServerConnectionVDSCommand(HostName = Virtualizador, HostId = 57062226-b1ec-11e2-b016-080027c6913a, storagePoolId = 00000000-0000-0000-0000-000000000000, storageType = NFS, connectionList = [{ id: ce655af9-4de6-4b76-a992-c6564b57e562, connection: 192.168.2.122:/mnt/data, iqn: null, vfsType: null, mountOptions: null, nfsVersion: AUTO, nfsRetrans: null, nfsTimeo: null };]), log id: 303ee650
2013-05-02 12:55:53,986 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ValidateStorageServerConnectionVDSCommand] (pool-3-thread-49) [7248988c] FINISH, ValidateStorageServerConnectionVDSCommand, return: {ce655af9-4de6-4b76-a992-c6564b57e562=0}, log id: 303ee650
2013-05-02 12:55:53,987 INFO [org.ovirt.engine.core.bll.storage.ConnectStorageToVdsCommand] (pool-3-thread-49) [7248988c] Running command: ConnectStorageToVdsCommand internal: true.
Entities affected : ID: aaa00000-0000-0000-0000-123456789aaa Type: System 2013-05-02 12:55:53,993 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-3-thread-49) [7248988c] START, ConnectStorageServerVDSCommand(HostName = Virtualizador, HostId = 57062226-b1ec-11e2-b016-080027c6913a, storagePoolId = 00000000-0000-0000-0000-000000000000, storageType = NFS, connectionList = [{ id: ce655af9-4de6-4b76-a992-c6564b57e562, connection: 192.168.2.122:/mnt/data, iqn: null, vfsType: null, mountOptions: null, nfsVersion: AUTO, nfsRetrans: null, nfsTimeo: null };]), log id: 19da0b66 2013-05-02 12:55:54,822 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-3-thread-49) [7248988c] FINISH, ConnectStorageServerVDSCommand, return: {ce655af9-4de6-4b76-a992-c6564b57e562=0}, log id: 19da0b66 2013-05-02 12:55:54,986 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (pool-3-thread-49) [7248988c] START, CreateStoragePoolVDSCommand(HostName = Virtualizador, HostId = 57062226-b1ec-11e2-b016-080027c6913a, storagePoolId=b4ce5912-b1d0-11e2-9c85-080027c6913a, storageType=NFS, storagePoolName=Default, masterDomainId=a459c008-9cee-4b00-9ff2-3189f1b595a4, domainsIdList=[a459c008-9cee-4b00-9ff2-3189f1b595a4], masterVersion=21), log id: 4db21a0a 2013-05-02 12:55:56,900 WARN [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-49) [7248988c] Weird return value: Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc mCode 661 mMessage Cannot acquire host id: ('a459c008-9cee-4b00-9ff2-3189f1b595a4', SanlockException(19, 'Sanlock lockspace add failure', 'No such device')) 2013-05-02 12:55:56,902 WARN [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-49) [7248988c] Weird return value: Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc mCode 661 mMessage Cannot acquire host id: ('a459c008-9cee-4b00-9ff2-3189f1b595a4', SanlockException(19, 'Sanlock lockspace add failure', 'No such device')) 2013-05-02 12:55:56,902 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-49) [7248988c] Failed in CreateStoragePoolVDS method 2013-05-02 12:55:56,904 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-49) [7248988c] Error code unexpected and error message VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: ('a459c008-9cee-4b00-9ff2-3189f1b595a4', SanlockException(19, 'Sanlock lockspace add failure', 'No such device')) 2013-05-02 12:55:56,906 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-49) [7248988c] Command org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand return value Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc mCode 661 mMessage Cannot acquire host id: ('a459c008-9cee-4b00-9ff2-3189f1b595a4', SanlockException(19, 'Sanlock lockspace add failure', 'No such device')) 2013-05-02 12:55:56,910 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-49) [7248988c] HostName = Virtualizador 2013-05-02 12:55:56,910 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-3-thread-49) [7248988c] Command CreateStoragePoolVDS execution failed. 
Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: ('a459c008-9cee-4b00-9ff2-3189f1b595a4', SanlockException(19, 'Sanlock lockspace add failure', 'No such device')) 2013-05-02 12:55:56,910 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (pool-3-thread-49) [7248988c] FINISH, CreateStoragePoolVDSCommand, log id: 4db21a0a 2013-05-02 12:55:56,910 ERROR [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (pool-3-thread-49) [7248988c] Command org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: ('a459c008-9cee-4b00-9ff2-3189f1b595a4', SanlockException(19, 'Sanlock lockspace add failure', 'No such device')) 2013-05-02 12:55:56,951 INFO [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (pool-3-thread-49) [7248988c] Command [id=9d602bf8-b72f-4bd1-8c03-10588655c885]: Compensating CHANGED_ENTITY of org.ovirt.engine.core.common.businessentities.storage_pool; snapshot: id=b4ce5912-b1d0-11e2-9c85-080027c6913a. 2013-05-02 12:55:56,964 INFO [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (pool-3-thread-49) [7248988c] Command [id=9d602bf8-b72f-4bd1-8c03-10588655c885]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.storage_pool_iso_map; snapshot: storagePoolId = b4ce5912-b1d0-11e2-9c85-080027c6913a, storageId = a459c008-9cee-4b00-9ff2-3189f1b595a4. 2013-05-02 12:55:56,988 INFO [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (pool-3-thread-49) [7248988c] Command [id=9d602bf8-b72f-4bd1-8c03-10588655c885]: Compensating CHANGED_ENTITY of org.ovirt.engine.core.common.businessentities.storage_domain_static; snapshot: id=a459c008-9cee-4b00-9ff2-3189f1b595a4. 
2013-05-02 12:55:57,095 INFO [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (pool-3-thread-49) [7248988c] Lock freed to object EngineLock [exclusiveLocks= key: b4ce5912-b1d0-11e2-9c85-080027c6913a value: POOL , sharedLocks= ] ############ VDSM.LOG ############## Thread-1312::DEBUG::2013-05-03 10:45:12,946::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1312::DEBUG::2013-05-03 10:45:12,947::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1312::DEBUG::2013-05-03 10:45:12,947::task::957::TaskManager.Task::(_decref) Task=`e4a7722b-5734-4bbf-956c-599f9222dfa4`::ref 0 aborting False Thread-1318::DEBUG::2013-05-03 10:45:28,238::task::568::TaskManager.Task::(_updateState) Task=`b9927342-8b91-4188-ba3a-1e2b697e08f9`::moving from state init -> state preparing Thread-1318::INFO::2013-05-03 10:45:28,238::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-1318::INFO::2013-05-03 10:45:28,240::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-1318::DEBUG::2013-05-03 10:45:28,240::task::1151::TaskManager.Task::(prepare) Task=`b9927342-8b91-4188-ba3a-1e2b697e08f9`::finished: {} Thread-1318::DEBUG::2013-05-03 10:45:28,240::task::568::TaskManager.Task::(_updateState) Task=`b9927342-8b91-4188-ba3a-1e2b697e08f9`::moving from state preparing -> state finished Thread-1318::DEBUG::2013-05-03 10:45:28,240::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1318::DEBUG::2013-05-03 10:45:28,240::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1318::DEBUG::2013-05-03 10:45:28,241::task::957::TaskManager.Task::(_decref) Task=`b9927342-8b91-4188-ba3a-1e2b697e08f9`::ref 0 aborting False Thread-1324::DEBUG::2013-05-03 10:45:43,409::task::568::TaskManager.Task::(_updateState) Task=`3e5e2e45-7b1a-4124-bc29-591fcda6d3a2`::moving from state init -> state preparing Thread-1324::INFO::2013-05-03 10:45:43,409::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-1324::INFO::2013-05-03 10:45:43,409::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-1324::DEBUG::2013-05-03 10:45:43,409::task::1151::TaskManager.Task::(prepare) Task=`3e5e2e45-7b1a-4124-bc29-591fcda6d3a2`::finished: {} Thread-1324::DEBUG::2013-05-03 10:45:43,409::task::568::TaskManager.Task::(_updateState) Task=`3e5e2e45-7b1a-4124-bc29-591fcda6d3a2`::moving from state preparing -> state finished Thread-1324::DEBUG::2013-05-03 10:45:43,410::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1324::DEBUG::2013-05-03 10:45:43,410::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1324::DEBUG::2013-05-03 10:45:43,410::task::957::TaskManager.Task::(_decref) Task=`3e5e2e45-7b1a-4124-bc29-591fcda6d3a2`::ref 0 aborting False Thread-1330::DEBUG::2013-05-03 10:45:58,873::task::568::TaskManager.Task::(_updateState) Task=`16893207-e770-42dc-8bce-bc4cf3aedc90`::moving from state init -> state preparing Thread-1330::INFO::2013-05-03 10:45:58,873::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-1330::INFO::2013-05-03 10:45:58,873::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-1330::DEBUG::2013-05-03 
10:45:58,874::task::1151::TaskManager.Task::(prepare) Task=`16893207-e770-42dc-8bce-bc4cf3aedc90`::finished: {} Thread-1330::DEBUG::2013-05-03 10:45:58,874::task::568::TaskManager.Task::(_updateState) Task=`16893207-e770-42dc-8bce-bc4cf3aedc90`::moving from state preparing -> state finished Thread-1330::DEBUG::2013-05-03 10:45:58,874::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1330::DEBUG::2013-05-03 10:45:58,874::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1330::DEBUG::2013-05-03 10:45:58,874::task::957::TaskManager.Task::(_decref) Task=`16893207-e770-42dc-8bce-bc4cf3aedc90`::ref 0 aborting False Thread-1336::DEBUG::2013-05-03 10:46:14,051::task::568::TaskManager.Task::(_updateState) Task=`26c57057-914e-48f7-8440-d184512b0760`::moving from state init -> state preparing Thread-1336::INFO::2013-05-03 10:46:14,051::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-1336::INFO::2013-05-03 10:46:14,053::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-1336::DEBUG::2013-05-03 10:46:14,053::task::1151::TaskManager.Task::(prepare) Task=`26c57057-914e-48f7-8440-d184512b0760`::finished: {} Thread-1336::DEBUG::2013-05-03 10:46:14,053::task::568::TaskManager.Task::(_updateState) Task=`26c57057-914e-48f7-8440-d184512b0760`::moving from state preparing -> state finished Thread-1336::DEBUG::2013-05-03 10:46:14,053::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1336::DEBUG::2013-05-03 10:46:14,054::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1336::DEBUG::2013-05-03 10:46:14,054::task::957::TaskManager.Task::(_decref) Task=`26c57057-914e-48f7-8440-d184512b0760`::ref 0 aborting False Thread-1342::DEBUG::2013-05-03 10:46:29,334::task::568::TaskManager.Task::(_updateState) Task=`bc22ed03-4f02-4fe9-9886-35a5bf34eff0`::moving from state init -> state preparing Thread-1342::INFO::2013-05-03 10:46:29,335::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-1342::INFO::2013-05-03 10:46:29,335::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-1342::DEBUG::2013-05-03 10:46:29,335::task::1151::TaskManager.Task::(prepare) Task=`bc22ed03-4f02-4fe9-9886-35a5bf34eff0`::finished: {} Thread-1342::DEBUG::2013-05-03 10:46:29,335::task::568::TaskManager.Task::(_updateState) Task=`bc22ed03-4f02-4fe9-9886-35a5bf34eff0`::moving from state preparing -> state finished Thread-1342::DEBUG::2013-05-03 10:46:29,335::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1342::DEBUG::2013-05-03 10:46:29,335::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1342::DEBUG::2013-05-03 10:46:29,335::task::957::TaskManager.Task::(_decref) Task=`bc22ed03-4f02-4fe9-9886-35a5bf34eff0`::ref 0 aborting False Thread-1348::DEBUG::2013-05-03 10:46:44,719::task::568::TaskManager.Task::(_updateState) Task=`e7d58dee-f349-49eb-b2c0-4d42c556ce60`::moving from state init -> state preparing Thread-1348::INFO::2013-05-03 10:46:44,719::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-1348::INFO::2013-05-03 10:46:44,719::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-1348::DEBUG::2013-05-03 
10:46:44,719::task::1151::TaskManager.Task::(prepare) Task=`e7d58dee-f349-49eb-b2c0-4d42c556ce60`::finished: {} Thread-1348::DEBUG::2013-05-03 10:46:44,720::task::568::TaskManager.Task::(_updateState) Task=`e7d58dee-f349-49eb-b2c0-4d42c556ce60`::moving from state preparing -> state finished Thread-1348::DEBUG::2013-05-03 10:46:44,720::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1348::DEBUG::2013-05-03 10:46:44,720::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1348::DEBUG::2013-05-03 10:46:44,720::task::957::TaskManager.Task::(_decref) Task=`e7d58dee-f349-49eb-b2c0-4d42c556ce60`::ref 0 aborting False Thread-1354::DEBUG::2013-05-03 10:46:59,911::task::568::TaskManager.Task::(_updateState) Task=`4128abb0-bd4c-43e2-ae87-c992ab025224`::moving from state init -> state preparing Thread-1354::INFO::2013-05-03 10:46:59,911::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-1354::INFO::2013-05-03 10:46:59,912::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-1354::DEBUG::2013-05-03 10:46:59,912::task::1151::TaskManager.Task::(prepare) Task=`4128abb0-bd4c-43e2-ae87-c992ab025224`::finished: {} Thread-1354::DEBUG::2013-05-03 10:46:59,912::task::568::TaskManager.Task::(_updateState) Task=`4128abb0-bd4c-43e2-ae87-c992ab025224`::moving from state preparing -> state finished Thread-1354::DEBUG::2013-05-03 10:46:59,912::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1354::DEBUG::2013-05-03 10:46:59,912::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1354::DEBUG::2013-05-03 10:46:59,912::task::957::TaskManager.Task::(_decref) Task=`4128abb0-bd4c-43e2-ae87-c992ab025224`::ref 0 aborting False Thread-1360::DEBUG::2013-05-03 10:47:15,276::task::568::TaskManager.Task::(_updateState) Task=`9a1e1ace-a461-43f7-a579-cb14bbf21903`::moving from state init -> state preparing Thread-1360::INFO::2013-05-03 10:47:15,276::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-1360::INFO::2013-05-03 10:47:15,278::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-1360::DEBUG::2013-05-03 10:47:15,278::task::1151::TaskManager.Task::(prepare) Task=`9a1e1ace-a461-43f7-a579-cb14bbf21903`::finished: {} Thread-1360::DEBUG::2013-05-03 10:47:15,278::task::568::TaskManager.Task::(_updateState) Task=`9a1e1ace-a461-43f7-a579-cb14bbf21903`::moving from state preparing -> state finished Thread-1360::DEBUG::2013-05-03 10:47:15,278::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1360::DEBUG::2013-05-03 10:47:15,278::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1360::DEBUG::2013-05-03 10:47:15,278::task::957::TaskManager.Task::(_decref) Task=`9a1e1ace-a461-43f7-a579-cb14bbf21903`::ref 0 aborting False Thread-1364::DEBUG::2013-05-03 10:47:23,196::BindingXMLRPC::161::vds::(wrapper) [25.97.225.61] Thread-1364::DEBUG::2013-05-03 10:47:23,196::task::568::TaskManager.Task::(_updateState) Task=`62156094-319e-4223-9cb0-a24ac391fb70`::moving from state init -> state preparing Thread-1364::INFO::2013-05-03 10:47:23,197::logUtils::37::dispatcher::(wrapper) Run and protect: validateStorageServerConnection(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'port': '', 
'connection': '192.168.2.122:/mnt/data', 'iqn': '', 'portal': '', 'user': '', 'protocol_version': 'auto', 'password': '******', 'id': 'ce655af9-4de6-4b76-a992-c6564b57e562'}], options=None) Thread-1364::INFO::2013-05-03 10:47:23,197::logUtils::39::dispatcher::(wrapper) Run and protect: validateStorageServerConnection, Return response: {'statuslist': [{'status': 0, 'id': 'ce655af9-4de6-4b76-a992-c6564b57e562'}]} Thread-1364::DEBUG::2013-05-03 10:47:23,197::task::1151::TaskManager.Task::(prepare) Task=`62156094-319e-4223-9cb0-a24ac391fb70`::finished: {'statuslist': [{'status': 0, 'id': 'ce655af9-4de6-4b76-a992-c6564b57e562'}]} Thread-1364::DEBUG::2013-05-03 10:47:23,197::task::568::TaskManager.Task::(_updateState) Task=`62156094-319e-4223-9cb0-a24ac391fb70`::moving from state preparing -> state finished Thread-1364::DEBUG::2013-05-03 10:47:23,198::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1364::DEBUG::2013-05-03 10:47:23,198::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1364::DEBUG::2013-05-03 10:47:23,198::task::957::TaskManager.Task::(_decref) Task=`62156094-319e-4223-9cb0-a24ac391fb70`::ref 0 aborting False Thread-1365::DEBUG::2013-05-03 10:47:23,997::BindingXMLRPC::161::vds::(wrapper) [25.97.225.61] Thread-1365::DEBUG::2013-05-03 10:47:23,998::task::568::TaskManager.Task::(_updateState) Task=`61e6d67d-54d9-4f99-abfd-c015e38f10ad`::moving from state init -> state preparing Thread-1365::INFO::2013-05-03 10:47:23,998::logUtils::37::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'port': '', 'connection': '192.168.2.122:/mnt/data', 'iqn': '', 'portal': '', 'user': '', 'protocol_version': 'auto', 'password': '******', 'id': 'ce655af9-4de6-4b76-a992-c6564b57e562'}], options=None) Thread-1365::INFO::2013-05-03 10:47:24,002::logUtils::39::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': 'ce655af9-4de6-4b76-a992-c6564b57e562'}]} Thread-1365::DEBUG::2013-05-03 10:47:24,002::task::1151::TaskManager.Task::(prepare) Task=`61e6d67d-54d9-4f99-abfd-c015e38f10ad`::finished: {'statuslist': [{'status': 0, 'id': 'ce655af9-4de6-4b76-a992-c6564b57e562'}]} Thread-1365::DEBUG::2013-05-03 10:47:24,002::task::568::TaskManager.Task::(_updateState) Task=`61e6d67d-54d9-4f99-abfd-c015e38f10ad`::moving from state preparing -> state finished Thread-1365::DEBUG::2013-05-03 10:47:24,002::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-1365::DEBUG::2013-05-03 10:47:24,002::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1365::DEBUG::2013-05-03 10:47:24,002::task::957::TaskManager.Task::(_decref) Task=`61e6d67d-54d9-4f99-abfd-c015e38f10ad`::ref 0 aborting False Thread-1366::DEBUG::2013-05-03 10:47:24,736::BindingXMLRPC::161::vds::(wrapper) [25.97.225.61] Thread-1366::DEBUG::2013-05-03 10:47:24,736::task::568::TaskManager.Task::(_updateState) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::moving from state init -> state preparing Thread-1366::INFO::2013-05-03 10:47:24,736::logUtils::37::dispatcher::(wrapper) Run and protect: createStoragePool(poolType=None, spUUID='b4ce5912-b1d0-11e2-9c85-080027c6913a', poolName='Default', masterDom='a459c008-9cee-4b00-9ff2-3189f1b595a4', domList=['a459c008-9cee-4b00-9ff2-3189f1b595a4'], masterVersion=22, lockPolicy=None, 
lockRenewalIntervalSec=5, leaseTimeSec=60, ioOpTimeoutSec=10, leaseRetries=3, options=None) Thread-1366::INFO::2013-05-03 10:47:24,737::fileSD::310::Storage.StorageDomain::(validate) sdUUID=a459c008-9cee-4b00-9ff2-3189f1b595a4 Thread-1366::DEBUG::2013-05-03 10:47:24,751::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=rpi-data', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.2.122:/mnt/data', 'ROLE=Regular', 'SDUUID=a459c008-9cee-4b00-9ff2-3189f1b595a4', 'TYPE=NFS', 'VERSION=3', '_SHA_CKSUM=96a09f31a467960cab663ef912b87f3fd6e0ac12'] Thread-1366::DEBUG::2013-05-03 10:47:24,752::resourceManager::175::ResourceManager.Request::(__init__) ResName=`Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a`ReqID=`46dbba87-bb13-461a-9d96-7d8fb2557f96`::Request was made in '/usr/share/vdsm/storage/resourceManager.py' line '174' at '__init__' Thread-1366::DEBUG::2013-05-03 10:47:24,752::resourceManager::486::ResourceManager::(registerResource) Trying to register resource 'Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a' for lock type 'exclusive' Thread-1366::DEBUG::2013-05-03 10:47:24,752::resourceManager::528::ResourceManager::(registerResource) Resource 'Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a' is free. Now locking as 'exclusive' (1 active user) Thread-1366::DEBUG::2013-05-03 10:47:24,752::resourceManager::212::ResourceManager.Request::(grant) ResName=`Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a`ReqID=`46dbba87-bb13-461a-9d96-7d8fb2557f96`::Granted request Thread-1366::DEBUG::2013-05-03 10:47:24,753::task::794::TaskManager.Task::(resourceAcquired) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::_resourcesAcquired: Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a (exclusive) Thread-1366::DEBUG::2013-05-03 10:47:24,753::task::957::TaskManager.Task::(_decref) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::ref 1 aborting False Thread-1366::DEBUG::2013-05-03 10:47:24,753::resourceManager::175::ResourceManager.Request::(__init__) ResName=`Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4`ReqID=`73ad75b9-8193-41ec-9a0a-8a2ebffcaca3`::Request was made in '/usr/share/vdsm/storage/resourceManager.py' line '174' at '__init__' Thread-1366::DEBUG::2013-05-03 10:47:24,753::resourceManager::486::ResourceManager::(registerResource) Trying to register resource 'Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4' for lock type 'exclusive' Thread-1366::DEBUG::2013-05-03 10:47:24,754::resourceManager::528::ResourceManager::(registerResource) Resource 'Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4' is free. 
Now locking as 'exclusive' (1 active user) Thread-1366::DEBUG::2013-05-03 10:47:24,754::resourceManager::212::ResourceManager.Request::(grant) ResName=`Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4`ReqID=`73ad75b9-8193-41ec-9a0a-8a2ebffcaca3`::Granted request Thread-1366::DEBUG::2013-05-03 10:47:24,754::task::794::TaskManager.Task::(resourceAcquired) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::_resourcesAcquired: Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4 (exclusive) Thread-1366::DEBUG::2013-05-03 10:47:24,754::task::957::TaskManager.Task::(_decref) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::ref 1 aborting False Thread-1366::INFO::2013-05-03 10:47:24,754::sp::543::Storage.StoragePool::(create) spUUID=b4ce5912-b1d0-11e2-9c85-080027c6913a poolName=Default master_sd=a459c008-9cee-4b00-9ff2-3189f1b595a4 domList=['a459c008-9cee-4b00-9ff2-3189f1b595a4'] masterVersion=22 {'LEASETIMESEC': 60, 'IOOPTIMEOUTSEC': 10, 'LEASERETRIES': 3, 'LOCKRENEWALINTERVALSEC': 5} Thread-1366::INFO::2013-05-03 10:47:24,755::fileSD::310::Storage.StorageDomain::(validate) sdUUID=a459c008-9cee-4b00-9ff2-3189f1b595a4 Thread-1366::DEBUG::2013-05-03 10:47:24,761::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=rpi-data', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.2.122:/mnt/data', 'ROLE=Regular', 'SDUUID=a459c008-9cee-4b00-9ff2-3189f1b595a4', 'TYPE=NFS', 'VERSION=3', '_SHA_CKSUM=96a09f31a467960cab663ef912b87f3fd6e0ac12'] Thread-1366::DEBUG::2013-05-03 10:47:24,767::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=rpi-data', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.2.122:/mnt/data', 'ROLE=Regular', 'SDUUID=a459c008-9cee-4b00-9ff2-3189f1b595a4', 'TYPE=NFS', 'VERSION=3', '_SHA_CKSUM=96a09f31a467960cab663ef912b87f3fd6e0ac12'] Thread-1366::WARNING::2013-05-03 10:47:24,767::fileUtils::184::Storage.fileUtils::(createdir) Dir /rhev/data-center/b4ce5912-b1d0-11e2-9c85-080027c6913a already exists Thread-1366::DEBUG::2013-05-03 10:47:24,768::persistentDict::167::Storage.PersistentDict::(transaction) Starting transaction Thread-1366::DEBUG::2013-05-03 10:47:24,768::persistentDict::175::Storage.PersistentDict::(transaction) Finished transaction Thread-1366::INFO::2013-05-03 10:47:24,768::clusterlock::172::SANLock::(acquireHostId) Acquiring host id for domain a459c008-9cee-4b00-9ff2-3189f1b595a4 (id: 250) Thread-1366::ERROR::2013-05-03 10:47:25,775::task::833::TaskManager.Task::(_setError) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::Unexpected error Traceback (most recent call last): File "/usr/share/vdsm/storage/task.py", line 840, in _run return fn(*args, **kargs) File "/usr/share/vdsm/logUtils.py", line 38, in wrapper res = f(*args, **kwargs) File "/usr/share/vdsm/storage/hsm.py", line 893, in createStoragePool masterVersion, leaseParams) File "/usr/share/vdsm/storage/sp.py", line 568, in create self._acquireTemporaryClusterLock(msdUUID, leaseParams) File "/usr/share/vdsm/storage/sp.py", line 510, in _acquireTemporaryClusterLock msd.acquireHostId(self.id) File "/usr/share/vdsm/storage/sd.py", line 445, in acquireHostId self._clusterLock.acquireHostId(hostId, async) File "/usr/share/vdsm/storage/clusterlock.py", line 187, in acquireHostId raise se.AcquireHostIdFailure(self._sdUUID, e) AcquireHostIdFailure: Cannot acquire 
host id: ('a459c008-9cee-4b00-9ff2-3189f1b595a4', SanlockException(19, 'Sanlock lockspace add failure', 'No such device')) Thread-1366::DEBUG::2013-05-03 10:47:25,781::task::852::TaskManager.Task::(_run) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::Task._run: af1a2711-317a-4b71-aea9-22d430e91b66 (None, 'b4ce5912-b1d0-11e2-9c85-080027c6913a', 'Default', 'a459c008-9cee-4b00-9ff2-3189f1b595a4', ['a459c008-9cee-4b00-9ff2-3189f1b595a4'], 22, None, 5, 60, 10, 3) {} failed - stopping task Thread-1366::DEBUG::2013-05-03 10:47:25,781::task::1177::TaskManager.Task::(stop) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::stopping in state preparing (force False) Thread-1366::DEBUG::2013-05-03 10:47:25,781::task::957::TaskManager.Task::(_decref) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::ref 1 aborting True Thread-1366::INFO::2013-05-03 10:47:25,781::task::1134::TaskManager.Task::(prepare) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::aborting: Task is aborted: 'Cannot acquire host id' - code 661 Thread-1366::DEBUG::2013-05-03 10:47:25,782::task::1139::TaskManager.Task::(prepare) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::Prepare: aborted: Cannot acquire host id Thread-1366::DEBUG::2013-05-03 10:47:25,782::task::957::TaskManager.Task::(_decref) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::ref 0 aborting True Thread-1366::DEBUG::2013-05-03 10:47:25,782::task::892::TaskManager.Task::(_doAbort) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::Task._doAbort: force False Thread-1366::DEBUG::2013-05-03 10:47:25,782::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1366::DEBUG::2013-05-03 10:47:25,782::task::568::TaskManager.Task::(_updateState) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::moving from state preparing -> state aborting Thread-1366::DEBUG::2013-05-03 10:47:25,782::task::523::TaskManager.Task::(__state_aborting) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::_aborting: recover policy none Thread-1366::DEBUG::2013-05-03 10:47:25,783::task::568::TaskManager.Task::(_updateState) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::moving from state aborting -> state failed Thread-1366::DEBUG::2013-05-03 10:47:25,783::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4': < ResourceRef 'Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4', isValid: 'True' obj: 'None'>, 'Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a': < ResourceRef 'Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a', isValid: 'True' obj: 'None'>} Thread-1366::DEBUG::2013-05-03 10:47:25,783::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-1366::DEBUG::2013-05-03 10:47:25,783::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4' Thread-1366::DEBUG::2013-05-03 10:47:25,783::resourceManager::553::ResourceManager::(releaseResource) Released resource 'Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4' (0 active users) Thread-1366::DEBUG::2013-05-03 10:47:25,783::resourceManager::558::ResourceManager::(releaseResource) Resource 'Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4' is free, finding out if anyone is waiting for it. Thread-1366::DEBUG::2013-05-03 10:47:25,784::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.a459c008-9cee-4b00-9ff2-3189f1b595a4', Clearing records. 
Thread-1366::DEBUG::2013-05-03 10:47:25,784::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a' Thread-1366::DEBUG::2013-05-03 10:47:25,784::resourceManager::553::ResourceManager::(releaseResource) Released resource 'Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a' (0 active users) Thread-1366::DEBUG::2013-05-03 10:47:25,784::resourceManager::558::ResourceManager::(releaseResource) Resource 'Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a' is free, finding out if anyone is waiting for it. Thread-1366::DEBUG::2013-05-03 10:47:25,784::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.b4ce5912-b1d0-11e2-9c85-080027c6913a', Clearing records. Thread-1366::ERROR::2013-05-03 10:47:25,785::dispatcher::66::Storage.Dispatcher.Protect::(run) {'status': {'message': "Cannot acquire host id: ('a459c008-9cee-4b00-9ff2-3189f1b595a4', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))", 'code': 661}} Thread-1369::DEBUG::2013-05-03 10:47:30,589::task::568::TaskManager.Task::(_updateState) Task=`5baf4299-619a-4cbc-80dd-f72a9a68aaed`::moving from state init -> state preparing Thread-1369::INFO::2013-05-03 10:47:30,589::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-1369::INFO::2013-05-03 10:47:30,589::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-1369::DEBUG::2013-05-03 10:47:30,589::task::1151::TaskManager.Task::(prepare) Task=`5baf4299-619a-4cbc-80dd-f72a9a68aaed`::finished: {} Thread-1369::DEBUG::2013-05-03 10:47:30,590::task::568::TaskManager.Task::(_updateState) Task=`5baf4299-619a-4cbc-80dd-f72a9a68aaed`::moving from state preparing -> state finished
Please attach vdsm logs (preferably an SOS report). Can you list the permissions on the NFS storage server (directory)? What type of NFS are you using (i.e., NFSv3 or NFSv4)?
Haim, NFS is in default mode (NFSv3); I can see the NFS share mounted when issuing 'mount' on the hypervisor.
Permissions are user and group 36:36 for the whole folder, and it creates some content (from the NFS server):

/mnt/data/a459c008-9cee-4b00-9ff2-3189f1b595a4
/mnt/data/a459c008-9cee-4b00-9ff2-3189f1b595a4/images
/mnt/data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md
/mnt/data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/outbox
/mnt/data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/inbox
/mnt/data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/metadata
/mnt/data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/ids
/mnt/data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/leases

I'm attaching SOS reports for both hypervisors in a few minutes.

Thanks,
Pablo
Created attachment 743127 [details] Sosreport from host 2
Created attachment 743128 [details] Sosreport from host 1
Fede, any idea?
WRT:

Thread-1366::ERROR::2013-05-03 10:47:25,775::task::833::TaskManager.Task::(_setError) Task=`af1a2711-317a-4b71-aea9-22d430e91b66`::Unexpected error
Traceback (most recent call last):
AcquireHostIdFailure: Cannot acquire host id: ('a459c008-9cee-4b00-9ff2-3189f1b595a4', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))

The sanlock error is (found in /var/log/messages):

May 3 10:01:08 Virtualizador sanlock[2041]: 2013-05-03 10:01:08+0200 1939 [3939]: open error -13 /rhev/data-center/mnt/192.168.2.122:_mnt_data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/ids
May 3 10:01:08 Virtualizador sanlock[2041]: 2013-05-03 10:01:08+0200 1939 [3939]: s4 open_disk /rhev/data-center/mnt/192.168.2.122:_mnt_data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/ids error -13

The file cannot be opened because of errno 13 (EACCES). This is most likely a problem with the file permissions. Please double-check (from the vdsm host) the permissions (mode/owner/group) of the ids file and the sanlock process:

# ls -l /rhev/data-center/mnt/192.168.2.122:_mnt_data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/ids
# ps aux | grep sanlock
# id sanlock

Also make sure that all the parent paths (/rhev, /rhev/data-center, etc...) are accessible by the sanlock user.
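A quick way to check that last point, as a sketch only (the path below is the one from this report and will differ on other setups): namei -l prints the owner/group/mode of every component on the way to the ids file, and the su line confirms the sanlock account can actually traverse the path.

# namei -l shows each path component with its permissions
namei -l /rhev/data-center/mnt/192.168.2.122:_mnt_data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/ids
# then verify the sanlock user can walk the path and stat the file
su -s /bin/sh sanlock -c 'stat /rhev/data-center/mnt/192.168.2.122:_mnt_data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md/ids'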
Tomorrow I'll try to mount it from the RHEV environment.
Regards,
Pablo

[iranzo@x201 d3382225-a3fd-44e0-9973-67d369c9b017]$ ls -lR
.:
total 24
drwxr-xr-x. 3 36 kvm 4096 may 3 09:59 data
drwxr-xr-x. 2 36 kvm 4096 abr 30 20:54 iso
drwx------. 2 root root 16384 abr 30 18:50 lost+found

./data:
total 4
drwxr-xr-x. 4 36 kvm 4096 may 3 09:59 a459c008-9cee-4b00-9ff2-3189f1b595a4

./data/a459c008-9cee-4b00-9ff2-3189f1b595a4:
total 8
drwxr-xr-x. 2 36 kvm 4096 may 3 09:59 dom_md
drwxr-xr-x. 2 36 kvm 4096 may 3 09:59 images

./data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md:
total 2052
-rw-rw----. 1 36 kvm 1048576 may 3 09:59 ids
-rw-rw----. 1 36 kvm 0 may 3 09:59 inbox
-rw-rw----. 1 36 kvm 2097152 may 3 09:59 leases
-rw-r--r--. 1 36 kvm 293 may 3 09:59 metadata
-rw-rw----. 1 36 kvm 0 may 3 09:59 outbox

./data/a459c008-9cee-4b00-9ff2-3189f1b595a4/images:
total 0

./iso:
total 0

ls: no se puede abrir el directorio ./lost+found: Permiso denegado

[iranzo@x201 d3382225-a3fd-44e0-9973-67d369c9b017]$ ls -lRZ
.:
drwxr-xr-x. 36 kvm system_u:object_r:file_t:s0 data
drwxr-xr-x. 36 kvm system_u:object_r:file_t:s0 iso
drwx------. root root system_u:object_r:file_t:s0 lost+found

./data:
drwxr-xr-x. 36 kvm system_u:object_r:file_t:s0 a459c008-9cee-4b00-9ff2-3189f1b595a4

./data/a459c008-9cee-4b00-9ff2-3189f1b595a4:
drwxr-xr-x. 36 kvm system_u:object_r:file_t:s0 dom_md
drwxr-xr-x. 36 kvm system_u:object_r:file_t:s0 images

./data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md:
-rw-rw----. 36 kvm system_u:object_r:file_t:s0 ids
-rw-rw----. 36 kvm system_u:object_r:file_t:s0 inbox
-rw-rw----. 36 kvm system_u:object_r:file_t:s0 leases
-rw-r--r--. 36 kvm system_u:object_r:file_t:s0 metadata
-rw-rw----. 36 kvm system_u:object_r:file_t:s0 outbox

./data/a459c008-9cee-4b00-9ff2-3189f1b595a4/images:

./iso:

ls: no se puede abrir el directorio ./lost+found: Permiso denegado
[root@Virtualizador2 ~]# id sanlock
uid=179(sanlock) gid=179(sanlock) grupos=179(sanlock),6(disk),36(kvm),107(qemu)
[root@Virtualizador2 ~]# ps aux|grep sanlock
root 2087 0.0 0.0 13548 3316 ? SLs 22:11 0:00 wdmd -G sanlock
sanlock 2142 0.0 0.4 263236 17516 ? SLsl 22:11 0:00 sanlock daemon -U sanlock -G sanlock
root 2144 0.0 0.0 23076 288 ? S 22:11 0:00 sanlock daemon -U sanlock -G sanlock
root 2738 0.0 0.0 105312 884 pts/0 S+ 22:17 0:00 grep sanlock
Try to access the ids file as the sanlock user:

# su -s /bin/bash sanlock
$ cd /rhev/data-center/mnt/192.168.2.122:_mnt_data/a459c008-9cee-4b00-9ff2-3189f1b595a4/dom_md
$ hexdump -C ids
bash-4.1$ id
uid=179(sanlock) gid=179(sanlock) groups=179(sanlock),6(disk),36(kvm),107(qemu) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
bash-4.1$ ls -l
total 2060
-rw-rw----. 1 vdsm kvm 1048576 2013-05-16 14:56 ids
-rw-rw----. 1 vdsm kvm 0 2013-05-16 14:56 inbox
-rw-rw----. 1 vdsm kvm 2097152 2013-05-16 14:56 leases
-rw-r--r--. 1 vdsm kvm 296 2013-05-16 14:56 metadata
-rw-rw----. 1 vdsm kvm 0 2013-05-16 14:56 outbox
bash-4.1$ hexdump -C ids
hexdump: ids: Permission denied
hexdump: ids: Bad file descriptor

The same issue occurs with root:

[root@revtest1 dom_md]# hexdump -C ids
hexdump: ids: Permission denied
hexdump: ids: Bad file descriptor
[sanlock@Virtualizador2 dom_md]$ hexdump -C ids
hexdump: ids: Permission denied
hexdump: ids: Bad file descriptor
[sanlock@Virtualizador2 dom_md]$ ls -l
total 2052
-rw-rw----. 1 vdsm kvm 1048576 May 3 09:59 ids
-rw-rw----. 1 vdsm kvm 0 May 3 09:59 inbox
-rw-rw----. 1 vdsm kvm 2097152 May 3 09:59 leases
-rw-r--r--. 1 vdsm kvm 293 May 3 09:59 metadata
-rw-rw----. 1 vdsm kvm 0 May 3 09:59 outbox
[sanlock@Virtualizador2 dom_md]$ whoami
sanlock
[sanlock@Virtualizador2 dom_md]$ id
uid=179(sanlock) gid=179(sanlock) groups=179(sanlock),6(disk),36(kvm),107(qemu) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[sanlock@Virtualizador2 dom_md]$
I forgot to mention that I had to manually mount it under /mnt as RHEV couldn't mount it directly because of the sanlock failure.
(In reply to comment #11)
> [sanlock@Virtualizador2 dom_md]$ hexdump -C ids
> hexdump: ids: Permission denied
> hexdump: ids: Bad file descriptor
> [sanlock@Virtualizador2 dom_md]$ ls -l
> total 2052
> -rw-rw----. 1 vdsm kvm 1048576 May 3 09:59 ids
> -rw-rw----. 1 vdsm kvm 0 May 3 09:59 inbox
> -rw-rw----. 1 vdsm kvm 2097152 May 3 09:59 leases
> -rw-r--r--. 1 vdsm kvm 293 May 3 09:59 metadata
> -rw-rw----. 1 vdsm kvm 0 May 3 09:59 outbox
> [sanlock@Virtualizador2 dom_md]$ whoami
> sanlock
> [sanlock@Virtualizador2 dom_md]$ id
> uid=179(sanlock) gid=179(sanlock) groups=179(sanlock),6(disk),36(kvm),107(qemu) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> [sanlock@Virtualizador2 dom_md]$

BTW, setting permissions to 777 for the whole export allowed me to mount it, which shouldn't be necessary in a production environment. After that, I was able to activate the data domain and create a new ISO domain and an export domain (setting them to 777 too). Even then, the files created for a VM get 660 permissions and can't be used to boot the VM unless I change the permissions again :/
The NFS server is a Raspberry Pi running Raspbian, exporting a USB-mounted HDD.
(In reply to Pablo Iranzo Gómez from comment #14)
> The NFS server is a Raspberry Pi running Raspbian, exporting a USB-mounted HDD.

This configuration is highly untested; I guess you will have to discover why your server isn't allowing qemu and sanlock to access the files.

Given:

# id qemu
uid=107(qemu) gid=107(qemu) groups=107(qemu),36(kvm)

# id sanlock
uid=179(sanlock) gid=179(sanlock) groups=179(sanlock),6(disk),36(kvm),107(qemu)

the server should allow the users qemu/sanlock to access the files:

-rw-rw----. 1 vdsm kvm 1048576 May 3 09:59 ids
-rw-rw----. 1 vdsm kvm 2097152 May 3 09:59 leases
... (disk images...)

If this doesn't happen, it means that you have to discover what is wrong with the NFS server provided with Raspbian.

Quick tests to verify that the server is fixed:

# su -s /bin/sh qemu
# hexdump -C ids
...

# su -s /bin/sh sanlock
# hexdump -C ids
...
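A compact version of the same check, as a sketch only (the mount path below is taken from this report and will differ on other setups): read one block from each lease file as both service accounts, run as root on the hypervisor.

DOM=/rhev/data-center/mnt/192.168.2.122:_mnt_data/a459c008-9cee-4b00-9ff2-3189f1b595a4
for u in qemu sanlock; do
    for f in dom_md/ids dom_md/leases; do
        # reading a single 4k block reproduces the EACCES that sanlock hits
        if su -s /bin/sh "$u" -c "dd if=$DOM/$f of=/dev/null bs=4k count=1" 2>/dev/null; then
            echo "OK:   $u can read $f"
        else
            echo "FAIL: $u cannot read $f"
        fi
    done
done

If both users pass here but sanlock still fails, the problem is elsewhere; if they fail, it is the NFS server that is refusing the access.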
Hello, I'm getting a similar error message when trying to add a brand new FC storage domain on RHEV 3.1. Find attached engine.log and vdsm.log from when storage domain GES-CP-000 was created.
Created attachment 758198 [details] engine.log when adding a FC SD
Created attachment 758199 [details] vdsm.log when adding a FC SD
Created attachment 758206 [details] additional vdsm.log when attaching a SD from rhvges2
Created attachment 758209 [details] correct vdsm.log
Created attachment 758220 [details] sanlock log from rhvges1
(In reply to Federico Simoncelli from comment #15)
> (In reply to Pablo Iranzo Gómez from comment #14)
> > The NFS server is a Raspberry Pi running Raspbian, exporting a USB-mounted HDD.
>
> This configuration is highly untested; I guess you will have to discover why
> your server isn't allowing qemu and sanlock to access the files.
>
> Given:
>
> # id qemu
> uid=107(qemu) gid=107(qemu) groups=107(qemu),36(kvm)
>
> # id sanlock
> uid=179(sanlock) gid=179(sanlock) groups=179(sanlock),6(disk),36(kvm),107(qemu)
>
> the server should allow the users qemu/sanlock to access the files:
>
> -rw-rw----. 1 vdsm kvm 1048576 May 3 09:59 ids
> -rw-rw----. 1 vdsm kvm 2097152 May 3 09:59 leases
> ... (disk images...)
>
> If this doesn't happen, it means that you have to discover what is wrong with
> the NFS server provided with Raspbian.
>
> Quick tests to verify that the server is fixed:
>
> # su -s /bin/sh qemu
> # hexdump -C ids
> ...
>
> # su -s /bin/sh sanlock
> # hexdump -C ids
> ...

[sanlock@Virtualizador data-center]$ cd mnt/192.168.2.122\:_mnt_data/
bash: cd: mnt/192.168.2.122:_mnt_data/: Permiso denegado
[sanlock@Virtualizador data-center]$ cd mnt
[sanlock@Virtualizador mnt]$ ls -l
total 8
drwxrwx---. 3 vdsm kvm 4096 may 3 09:59 192.168.2.122:_mnt_data
drwxr-xr-x. 2 vdsm kvm 4096 may 13 22:57 rhevm.example.com:_var_lib_exports_iso
[sanlock@Virtualizador mnt]$ id
uid=179(sanlock) gid=179(sanlock) grupos=179(sanlock),6(disk),36(kvm),107(qemu) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[sanlock@Virtualizador mnt]$ newgrp
[sanlock@Virtualizador mnt]$ man newgrp
No hay ninguna página sobre newgrp
[sanlock@Virtualizador mnt]$ newgrp kvm
[sanlock@Virtualizador mnt]$ id
uid=179(sanlock) gid=36(kvm) grupos=179(sanlock),6(disk),36(kvm),107(qemu) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[sanlock@Virtualizador mnt]$ cd 192.168.2.122\:_mnt_data/
[sanlock@Virtualizador 192.168.2.122:_mnt_data]$ ls
a459c008-9cee-4b00-9ff2-3189f1b595a4
[sanlock@Virtualizador 192.168.2.122:_mnt_data]$ cd a459c008-9cee-4b00-9ff2-3189f1b595a4/
[sanlock@Virtualizador a459c008-9cee-4b00-9ff2-3189f1b595a4]$ ls
dom_md images master
[sanlock@Virtualizador a459c008-9cee-4b00-9ff2-3189f1b595a4]$

After doing 'newgrp kvm' (which changes the primary group to kvm) I can access the files without further problems. Before this I had done a chmod -R 770 /mnt/data/ to revert the NFS server to its previous state.
(In reply to Karim Boumedhel from comment #16)
> Hello, I'm getting a similar error message when trying to add a brand new FC
> storage domain on RHEV 3.1. Find attached engine.log and vdsm.log from when
> storage domain GES-CP-000 was created.

In this case I've determined that sanlock was started before being added to the correct groups. Restarting the service fixed the issue.

# ps ax | grep sanlock
12402 ? SLsl 0:00 sanlock daemon -U sanlock -G sanlock

# cat /proc/12402/status
Name: sanlock
Uid: 179 179 179 179
Gid: 179 179 179 179
Groups: 0 6 179

# service sanlock restart

# ps ax | grep sanlock
15840 ? SLsl 0:00 sanlock daemon -U sanlock -G sanlock

# cat /proc/15840/status
Name: sanlock
Uid: 179 179 179 179
Gid: 179 179 179 179
Groups: 0 6 36 107 179

This happened because the packages were installed in an unusual order:

Jun 07 08:24:23 Installed: sanlock-2.6-2.el6.x86_64
...
Jun 07 10:28:54 Installed: vdsm-4.9-113.1.el6.x86_64

and the host has never been rebooted since. This is unrelated to the original bug.
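A small sketch for spotting this situation (it assumes a single sanlock daemon process owned by the sanlock user): compare the supplementary groups the running daemon actually holds with the ones currently configured for the account; a mismatch means the daemon predates the group changes and needs a restart.

# run as root on the hypervisor
pid=$(pgrep -u sanlock -x sanlock | head -n1)
echo "configured groups: $(id -G sanlock)"
echo "running daemon:    $(grep '^Groups:' /proc/"$pid"/status)"
# if 36 (kvm) and 107 (qemu) appear in the configured list but not in the
# daemon's Groups: line, restart it:
#   service sanlock restart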
As described in comment 15, this was a server-side issue; after logging in to the raspberry pi I noticed:

# ps aux | grep mountd
root 2426 0.0 0.8 3320 1528 ? Ss 16:05 0:00 /usr/sbin/rpc.mountd --manage-gids

I edited /etc/default/nfs-kernel-server, commenting out RPCMOUNTDOPTS:

# grep RPCMOUNTDOPTS /etc/default/nfs-kernel-server
#RPCMOUNTDOPTS=--manage-gids

and I restarted the server:

# /etc/init.d/nfs-kernel-server restart
# ps aux | grep mount
root 2912 0.0 0.6 3188 1200 ? Ss 17:03 0:00 /usr/sbin/rpc.mountd

After that it has been possible to use the NFS storage domain.

Some background on -g/--manage-gids (from the rpc.mountd man page):

"""
Accept requests from the kernel to map user id numbers into lists of group id numbers for use in access control. An NFS request will normally (except when using Kerberos or other cryptographic authentication) contains a user-id and a list of group-ids. Due to a limitation in the NFS protocol, at most 16 groups ids can be listed. If you use the -g flag, then the list of group ids received from the client will be replaced by a list of group ids determined by an appropriate lookup on the server. Note that the 'primary' group id is not affected so a newgroup command on the client will still be effective. This function requires a Linux Kernel with version at least 2.6.21.
"""
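To illustrate why --manage-gids bites here, a sketch to run on the NFS server (the uid/gid numbers are the ones from this report; whether the Raspbian box defines matching users and groups at all is exactly what is being tested): with that flag the server discards the group list sent by the client and resolves the groups locally, so uid 107 (qemu) and uid 179 (sanlock) would have to map to gid 36 (kvm) on the server itself for the 660 vdsm:kvm files to be readable.

for uid in 107 179; do
    name=$(getent passwd "$uid" | cut -d: -f1)
    if [ -n "$name" ]; then
        echo "uid $uid -> $name, groups: $(id -G "$name")"
    else
        echo "uid $uid is unknown on this server, so the server-side group lookup fails"
    fi
done
# does the server define gid 36 (kvm) at all?
getent group 36 || echo "gid 36 (kvm) not defined on the server"

Disabling --manage-gids (as above) sidesteps the server-side lookup entirely and goes back to trusting the group list sent by the client, which is limited to 16 supplementary groups by the NFS/AUTH_SYS protocol.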
This issue was a configuration problem on the server side (Raspberry Pi).