Created attachment 1151424 [details]
vdsm.log

Description of problem:
I can successfully create an ISO domain on my remote NFS server. However, I cannot attach it to a local storage DC (see log excerpt below).

Version-Release number of selected component (if applicable):
I'm already running a 3.6 snapshot release because of BZ 1329317.

All versions, engine:

rpm -qa | grep -e "vdsm\|ovirt"
ebay-cors-filter-1.0.1-0.1.ovirt.el7.noarch
ovirt-log-collector-3.6.0-1.el7.centos.noarch
vdsm-jsonrpc-java-1.1.9-1.el7.centos.noarch
ovirt-engine-lib-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.8-0.0.master.20160415165902.gitca9be7e.el7.noarch
ovirt-engine-websocket-proxy-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-host-deploy-java-1.4.2-0.0.master.20151122153544.gitfc808fc.el7.noarch
ovirt-engine-tools-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
ovirt-release36-002-2.noarch
ovirt-engine-wildfly-8.2.1-1.el7.x86_64
ovirt-engine-extension-aaa-ldap-setup-1.1.2-1.el7.centos.noarch
ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
ovirt-release36-snapshot-007-1.noarch
ovirt-engine-setup-base-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-setup-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-iso-uploader-3.6.1-0.0.master.20160414111648.gitd2aea1a.el7.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-cli-3.6.2.1-0.1.20160111.git696d8ea.el7.centos.noarch
ovirt-host-deploy-1.4.2-0.0.master.20151122153544.gitfc808fc.el7.noarch
ovirt-engine-restapi-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-userportal-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-backend-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-ldap-1.1.2-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.2.1-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-tools-backup-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-image-uploader-3.6.1-0.0.master.20151006154111.git95ce637.el7.centos.noarch
ovirt-engine-dbscripts-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch
ovirt-engine-3.6.6-0.0.master.20160421175934.git6e24235.el7.centos.noarch

vdsm host:

rpm -qa | grep -e "vdsm\|ovirt"
vdsm-4.17.23-0.el7.centos.noarch
vdsm-infra-4.17.23-0.el7.centos.noarch
vdsm-xmlrpc-4.17.23-0.el7.centos.noarch
vdsm-yajsonrpc-4.17.23-0.el7.centos.noarch
vdsm-hook-vmfex-dev-4.17.23-0.el7.centos.noarch
vdsm-cli-4.17.23-0.el7.centos.noarch
vdsm-python-4.17.23-0.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
vdsm-jsonrpc-4.17.23-0.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch

How reproducible:
always

Steps to Reproduce:
1.
2.
3.

Actual results:
can't attach ISO domain

Expected results:
can mount ISO domain

Additional info:
This is no configuration error on my side (imho). I can mount the NFS share fine manually on the vdsm host:

mount 10.210.1.6:/home/iso1 /tmp/testmount/
l /tmp/testmount/
total 16
drwxr-xr-x  4 nobody nobody 4096 2016-04-27 16:28 ./
drwxrwxrwt. 9 root   root   4096 2016-04-27 16:30 ../
drwxr-xr-x  4 nobody nobody 4096 2016-04-27 16:28 9ba22bfe-d760-48c9-b4a5-10149cb4ecb9/
-rwxr-xr-x  1 nobody nobody    0 2016-04-27 16:31 __DIRECT_IO_TEST__*
drwx------  2 root   root   4096 2015-01-20 14:58 lost+found/

Excerpt from engine.log:

2016-04-27 16:31:14,812 INFO  [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (default task-10) [5aa023a7] Lock Acquired to object 'EngineLock:{exclusiveLocks='[9ba22bfe-d760-48c9-b4a5-10149cb4ecb9=<STORAGE, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2016-04-27 16:31:14,943 INFO  [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] Running command: AttachStorageDomainToPoolCommand internal: false. Entities affected : ID: 9ba22bfe-d760-48c9-b4a5-10149cb4ecb9 Type: StorageAction group MANIPULATE_STORAGE_DOMAIN with role type ADMIN, ID: 76cd7003-cf29-461a-b4d0-ff278b680cb5 Type: StoragePoolAction group MANIPULATE_STORAGE_DOMAIN with role type ADMIN
2016-04-27 16:31:14,996 INFO  [org.ovirt.engine.core.bll.storage.ConnectStorageToVdsCommand] (org.ovirt.thread.pool-8-thread-44) [110400ae] Running command: ConnectStorageToVdsCommand internal: true. Entities affected : ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group CREATE_STORAGE_DOMAIN with role type ADMIN
2016-04-27 16:31:14,999 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (org.ovirt.thread.pool-8-thread-44) [110400ae] START, ConnectStorageServerVDSCommand(HostName = vrnode0015, StorageServerConnectionManagementVDSParameters:{runAsync='true', hostId='0dab6a20-e772-4cb4-8b7b-53d2d4757c1b', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='NFS', connectionList='[StorageServerConnections:{id='661332a4-a897-4e04-b6bc-91c88d71b3cf', connection='10.210.1.6:/home/iso1', iqn='null', vfsType='null', mountOptions='null', nfsVersion='V4', nfsRetrans='null', nfsTimeo='null', iface='null', netIfaceName='null'}]'}), log id: 183c6dbd
2016-04-27 16:31:15,016 INFO  [org.ovirt.engine.core.bll.aaa.LoginUserCommand] (default task-12) [] Running command: LoginUserCommand internal: false.
2016-04-27 16:31:15,020 INFO  [org.ovirt.engine.core.bll.aaa.LogoutSessionCommand] (default task-12) [5e3d9501] Running command: LogoutSessionCommand internal: false.
2016-04-27 16:31:15,031 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (org.ovirt.thread.pool-8-thread-44) [110400ae] FINISH, ConnectStorageServerVDSCommand, return: {661332a4-a897-4e04-b6bc-91c88d71b3cf=0}, log id: 183c6dbd
2016-04-27 16:31:15,032 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.AttachStorageDomainVDSCommand] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] START, AttachStorageDomainVDSCommand( AttachStorageDomainVDSCommandParameters:{runAsync='true', storagePoolId='76cd7003-cf29-461a-b4d0-ff278b680cb5', ignoreFailoverLimit='false', storageDomainId='9ba22bfe-d760-48c9-b4a5-10149cb4ecb9'}), log id: 48a0aa2e
2016-04-27 16:31:15,065 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-12) [5e3d9501] Correlation ID: 5e3d9501, Call Stack: null, Custom Event ID: -1, Message: User admin@internal logged out.
2016-04-27 16:31:15,251 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.AttachStorageDomainVDSCommand] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] Failed in 'AttachStorageDomainVDS' method
2016-04-27 16:31:15,278 INFO  [org.ovirt.engine.core.bll.aaa.LoginUserCommand] (default task-4) [] Running command: LoginUserCommand internal: false.
2016-04-27 16:31:15,288 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM command failed: Cannot obtain lock: u"id=9ba22bfe-d760-48c9-b4a5-10149cb4ecb9, rc=1, out=[], err=['sudo: a password is required']"
2016-04-27 16:31:15,288 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.AttachStorageDomainVDSCommand] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] Command 'AttachStorageDomainVDSCommand( AttachStorageDomainVDSCommandParameters:{runAsync='true', storagePoolId='76cd7003-cf29-461a-b4d0-ff278b680cb5', ignoreFailoverLimit='false', storageDomainId='9ba22bfe-d760-48c9-b4a5-10149cb4ecb9'})' execution failed: IRSGenericException: IRSErrorException: Failed to AttachStorageDomainVDS, error = Cannot obtain lock: u"id=9ba22bfe-d760-48c9-b4a5-10149cb4ecb9, rc=1, out=[], err=['sudo: a password is required']", code = 651
2016-04-27 16:31:15,288 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.AttachStorageDomainVDSCommand] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] FINISH, AttachStorageDomainVDSCommand, log id: 48a0aa2e
2016-04-27 16:31:15,288 ERROR [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] Command 'org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.irsbroker.IrsOperationFailedNoFailoverException: IRSGenericException: IRSErrorException: Failed to AttachStorageDomainVDS, error = Cannot obtain lock: u"id=9ba22bfe-d760-48c9-b4a5-10149cb4ecb9, rc=1, out=[], err=['sudo: a password is required']", code = 651 (Failed with error AcquireLockFailure and code 651)
2016-04-27 16:31:15,289 INFO  [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] Command [id=da717d90-6f46-4f7c-8711-c65d0abea948]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.StoragePoolIsoMap; snapshot: StoragePoolIsoMapId:{storagePoolId='76cd7003-cf29-461a-b4d0-ff278b680cb5', storageId='9ba22bfe-d760-48c9-b4a5-10149cb4ecb9'}.
2016-04-27 16:31:15,318 INFO  [org.ovirt.engine.core.bll.aaa.LogoutSessionCommand] (default task-4) [14a048ad] Running command: LogoutSessionCommand internal: false.
2016-04-27 16:31:15,329 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] Correlation ID: 5aa023a7, Job ID: f2c8a361-23ef-4750-b919-acac205d59e7, Call Stack: null, Custom Event ID: -1, Message: Failed to attach Storage Domain iso1 to Data Center vrd0015. (User: admin@internal)
2016-04-27 16:31:15,354 INFO  [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (org.ovirt.thread.pool-8-thread-31) [5aa023a7] Lock freed to object 'EngineLock:{exclusiveLocks='[9ba22bfe-d760-48c9-b4a5-10149cb4ecb9=<STORAGE, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'

I have attached both engine and vdsm logs.
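The engine.log lines for this flow all carry the correlation ID 5aa023a7. A quick way to isolate one flow when scanning a large log is to filter on that ID; a minimal sketch (the sample lines are abbreviated from the excerpt above, written to a temp file for illustration):

```shell
#!/bin/sh
# Sketch: filter an engine.log down to a single correlation ID.
log=$(mktemp)
cat > "$log" <<'EOF'
2016-04-27 16:31:14,812 INFO  [AttachStorageDomainToPoolCommand] [5aa023a7] Lock Acquired
2016-04-27 16:31:14,996 INFO  [ConnectStorageToVdsCommand] [110400ae] Running command
2016-04-27 16:31:15,251 ERROR [AttachStorageDomainVDSCommand] [5aa023a7] Failed in 'AttachStorageDomainVDS' method
EOF
# -F treats the bracketed ID as a fixed string, not a regex
grep -F '[5aa023a7]' "$log"
rm -f "$log"
```

On a real engine host the path would be /var/log/ovirt-engine/engine.log instead of the temp file.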
Created attachment 1151425 [details]
engine.log
I also ran nfs-check.py from the vdsm contrib directory, described here:
https://www.ovirt.org/documentation/how-to/troubleshooting/troubleshooting-nfs-storage-issues/

Result:

python ./nfs-check.py 10.210.1.6:/home/iso1
Current hostname: vrnode0015.vrootdev - IP addr 10.210.2.20
Trying to /bin/mount -t nfs 10.210.1.6:/home/iso1...
Executing NFS tests..
Removing vdsmTest file..
Status of tests [OK]
Disconnecting from NFS Server..
Done!

So I really don't know what the problem is. Maybe this is related to BZ 1320128? There seem to be some network issues between connections from the engine to the host (in engine and/or vdsm code, nothing related to the infrastructure).

kind regards
Sven
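For reference, the core probe behind this kind of NFS check is creating a small file with direct I/O, as vdsm does with the __DIRECT_IO_TEST__ file visible in the listing above. A minimal sketch of just the mechanics; the scratch directory here is an assumption standing in for a real mount point, and the mount line is shown but not executed:

```shell
#!/bin/sh
# Sketch: direct-I/O write probe, as an NFS storage domain requires.
mnt=$(mktemp -d)
# Real-world step (not run here): mount -t nfs 10.210.1.6:/home/iso1 "$mnt"
testfile="$mnt/vdsmTest"
# oflag=direct opens the file with O_DIRECT, bypassing the page cache
if dd if=/dev/zero of="$testfile" bs=4096 count=1 oflag=direct 2>/dev/null; then
    echo "direct I/O OK"
else
    echo "direct I/O not supported on this filesystem"
fi
rm -f "$testfile"
```

On some local filesystems (e.g. tmpfs) O_DIRECT is unsupported, so run this against the actual NFS mount to get a meaningful answer.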
Relevant error from VDSM:

jsonrpc.Executor/2::DEBUG::2016-04-27 16:28:55,378::clusterlock::140::Storage.Misc.excCmd::(acquire) /usr/bin/taskset --cpu-list 0-47 /usr/bin/sudo -n /usr/bin/setsid /usr/bin/ionice -c 1 -n 0 /usr/bin/su vdsm -s /bin/sh -c '/usr/libexec/vdsm/spmprotect.sh start 9ba22bfe-d760-48c9-b4a5-10149cb4ecb9 1 5 /rhev/data-center/mnt/10.210.1.6:_home_iso1/9ba22bfe-d760-48c9-b4a5-10149cb4ecb9/dom_md/leases 60000 10000 3 17433' (cwd /usr/libexec/vdsm)
jsonrpc.Executor/2::DEBUG::2016-04-27 16:28:55,387::clusterlock::140::Storage.Misc.excCmd::(acquire) FAILED: <err> = 'sudo: a password is required\n'; <rc> = 1
jsonrpc.Executor/2::ERROR::2016-04-27 16:28:55,387::task::866::Storage.TaskManager.Task::(_setError) Task=`9cf18f8d-c6a8-4e4a-ac8e-d9c4d786d3ae`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1210, in attachStorageDomain
    pool.attachSD(sdUUID)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 943, in attachSD
    dom.acquireClusterLock(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 565, in acquireClusterLock
    self._clusterLock.acquire(hostID)
  File "/usr/share/vdsm/storage/clusterlock.py", line 142, in acquire
    raise se.AcquireLockFailure(self._sdUUID, rc, out, err)
AcquireLockFailure: Cannot obtain lock: u"id=9ba22bfe-d760-48c9-b4a5-10149cb4ecb9, rc=1, out=[], err=['sudo: a password is required']"

Can you please share your /etc/sudoers.d/50_vdsm file?
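When scanning a large vdsm.log for these cluster-lock failures, the <err> and <rc> fields of the FAILED lines can be pulled out mechanically. A small sketch, using one FAILED line from above (reformatted without the trailing \n marker) as sample input:

```shell
#!/bin/sh
# Sketch: extract the error text and return code from a vdsm excCmd FAILED line.
line="jsonrpc.Executor/2::DEBUG::2016-04-27 16:28:55,387::clusterlock::140::Storage.Misc.excCmd::(acquire) FAILED: <err> = 'sudo: a password is required'; <rc> = 1"
# Capture everything between the quotes after <err>, and the digits after <rc>
err=$(printf '%s\n' "$line" | sed -n "s/.*<err> = '\([^']*\)'.*/\1/p")
rc=$(printf '%s\n' "$line" | sed -n 's/.*<rc> = \([0-9]*\).*/\1/p')
printf 'rc=%s err=%s\n' "$rc" "$err"
# prints: rc=1 err=sudo: a password is required
```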
And the /etc/sudoers file too please, actually.
Created attachment 1151722 [details]
/etc/sudoers from affected host running vdsm

This is the stock CentOS 7.2 sudoers file; I did not alter it.
Created attachment 1151735 [details]
50_vdsm sudoers file, actually altered by me

I added the Cmnd_Alias VDSM_NETWORK section because I run a hook which uses "tc" for advanced traffic control of VMs. I hope this isn't problematic; it works with older vdsm and ovirt-engine versions.
Seems to be in order [unfortunately :-(]. Nir - can you take a look?
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.
Sven,

In your vdsm log, we see that vdsm asks to run this command with sudo:

  /usr/bin/setsid /usr/bin/ionice -c 1 -n 0 /usr/bin/su vdsm -s /bin/sh -c '/usr/libexec/vdsm/spmprotect.sh ...

But your /etc/sudoers.d/50_vdsm will only allow:

  /usr/bin/setsid /usr/bin/ionice -c ? -n ? /bin/su vdsm -s /bin/sh -c /usr/libexec/vdsm/spmprotect.sh

Note "/bin/su" vs "/usr/bin/su".

Can you share the output of:

  grep 'EXT_SU\b' /usr/lib/python2.7/site-packages/vdsm/constants.py

Did you modify the 50_vdsm file installed by vdsm, or copy it from another machine?

I don't think modifying files installed by vdsm is a good idea. Your file will be overwritten once you upgrade vdsm.

I think you want to put your private sudoers configuration in another file like:

  /etc/sudoers.d/51_vdsm_hook_name

The contents would be something like this:

  vdsm ALL=(ALL) NOPASSWD: /sbin/tc *
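Sudoers command matching is textual, so "/bin/su" in a rule does not match an invocation of "/usr/bin/su" even when both resolve to the same binary. A small sketch of spotting the divergent su path, with the rule and the command copied from the log and sudoers excerpts above:

```shell
#!/bin/sh
# Sketch: compare the su path permitted by a sudoers rule against the one
# vdsm actually invokes. Both strings are taken from this bug's excerpts.
rule='/usr/bin/setsid /usr/bin/ionice -c ? -n ? /bin/su vdsm -s /bin/sh -c /usr/libexec/vdsm/spmprotect.sh'
cmd='/usr/bin/setsid /usr/bin/ionice -c 1 -n 0 /usr/bin/su vdsm -s /bin/sh -c /usr/libexec/vdsm/spmprotect.sh'

# Split on spaces and keep the token that ends in /su
su_rule=$(printf '%s\n' "$rule" | tr ' ' '\n' | grep '/su$')
su_cmd=$(printf '%s\n' "$cmd" | tr ' ' '\n' | grep '/su$')

if [ "$su_rule" != "$su_cmd" ]; then
    echo "MISMATCH: sudoers allows $su_rule but vdsm runs $su_cmd"
fi
# prints: MISMATCH: sudoers allows /bin/su but vdsm runs /usr/bin/su
```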
(In reply to Nir Soffer from comment #9)
> Sven,
>
> In your vdsm log, we see that vdsm asks to run this command with sudo:
>
>   /usr/bin/setsid /usr/bin/ionice -c 1 -n 0 /usr/bin/su vdsm -s /bin/sh -c '/usr/libexec/vdsm/spmprotect.sh ...
>
> But your /etc/sudoers.d/50_vdsm will only allow:
>
>   /usr/bin/setsid /usr/bin/ionice -c ? -n ? /bin/su vdsm -s /bin/sh -c /usr/libexec/vdsm/spmprotect.sh
>
> Note "/bin/su" vs "/usr/bin/su"
>
> Can you share the output of:
>
>   grep 'EXT_SU\b' /usr/lib/python2.7/site-packages/vdsm/constants.py
>
> Did you modify the 50_vdsm file installed by vdsm, or copied from another machine?

In fact I just checked the config management (I do basic setup with cobbler and later configuration via salt): it seems my colleagues found it a good idea to overwrite the whole vdsm config with an old one from vdsm-4.13.4-0 on EL6 machines, hence the error.

> I don't think modifying files installed by vdsm is a good idea. Your file will be overwritten once you upgrade vdsm.
>
> I think you want to put your private sudoers configuration in another file like:
>
>   /etc/sudoers.d/51_vdsm_hook_name
>
> The contents would be something like this:
>
>   vdsm ALL=(ALL) NOPASSWD: /sbin/tc *

I will adopt your suggestion, put my own config in my own file, and not fiddle with vdsm's sudoers file anymore. In fact, on EL7 su is located under /usr/bin/su and tc under /usr/sbin/tc, and the stock 50_vdsm sudoers file has the correct path to "su" in it; I just checked.

I just reinstalled the host with your suggested changes to my sudoers files (keeping the stock 50_vdsm sudoers file and putting my changes in a separate file) and I can confirm that attaching the ISO domain works again.

Thanks for your great help and support, it's really appreciated! :)

And sorry for all the noise, as this was apparently a configuration mistake on my side from the very beginning.

kind regards
Sven
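For anyone hitting the same problem, the separate drop-in approach can be sketched as below. The file name 51_vdsm_hook_tc is a hypothetical choice following Nir's suggestion, and /usr/sbin/tc is the EL7 path noted above; the block stages the file in a temp location because actually installing it needs root and a visudo syntax check first:

```shell
#!/bin/sh
# Sketch: stage a private sudoers drop-in for a vdsm tc hook, separate from
# the package-owned 50_vdsm so upgrades cannot overwrite it.
f=$(mktemp)
cat > "$f" <<'EOF'
vdsm ALL=(ALL) NOPASSWD: /usr/sbin/tc *
EOF
# Real-world steps (as root, not run here):
#   visudo -c -f "$f"                                  # syntax-check first
#   install -m 0440 "$f" /etc/sudoers.d/51_vdsm_hook_tc
echo "rule staged in $f"
```

Checking with visudo -c before copying into /etc/sudoers.d is important, since a syntactically broken drop-in can lock sudo out entirely.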