Bug 834486 - vdsm-gluster error while attempting to mount a POSIX-compliant FS on gluster.
Summary: vdsm-gluster error while attempting to mount a POSIX-compliant FS on gluster.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.1 RC
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
: 3.3.4
Assignee: Bala.FA
QA Contact:
URL:
Whiteboard: storage
Depends On:
Blocks:
 
Reported: 2012-06-22 02:32 UTC by Robert Middleswarth
Modified: 2016-02-10 16:47 UTC (History)
9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-03 12:25:04 UTC
oVirt Team: Storage
Embargoed:


Attachments

Description Robert Middleswarth 2012-06-22 02:32:01 UTC
Description of problem:
When attempting to use the new feature of mounting gluster storage through vdsm, I get an error.

Version-Release number of selected component (if applicable):
oVirt Engine Version: 3.1.0-3.8.el6. I have seen the same problem on F17 beta builds as well.

How reproducible:
Very. Looking at what gets mounted: the share is mounted as root, but it is then accessed as the KVM user.
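
A hedged manual check of the same access path (the mount command and target directory are copied from the VDSM log below; running the listing as the vdsm user is my assumption about how to reproduce the permission error by hand):

  # On the failing host, mount the volume the same way VDSM does:
  sudo -n /bin/mount -t glusterfs 192.168.101.172:/ov_02_03 /rhev/data-center/mnt/192.168.101.172:_ov__02__03
  # VDSM then validates access as its own non-root user; a root-owned export
  # makes this fail with a permission error, matching the traceback below:
  sudo -u vdsm ls /rhev/data-center/mnt/192.168.101.172:_ov__02__03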

Steps to Reproduce:
1. After you have created a volume, attempt to mount it; it errors out even though the volume is mounted read-only.
  
Actual results:
Nothing shows as mounted in the engine.

Expected results:
It mounts the volume on all nodes and shows in the interface.

Additional info: Below is the VDSM log.

Thread-22012::DEBUG::2012-06-21 21:56:26,401::task::588::TaskManager.Task::(_updateState) Task=`6311f6d6-6c0f-48c7-81c4-e51747d7c7f6`::moving from state init -> state preparing
Thread-22012::INFO::2012-06-21 21:56:26,401::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Thread-22012::INFO::2012-06-21 21:56:26,402::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {}
Thread-22012::DEBUG::2012-06-21 21:56:26,402::task::1172::TaskManager.Task::(prepare) Task=`6311f6d6-6c0f-48c7-81c4-e51747d7c7f6`::finished: {}
Thread-22012::DEBUG::2012-06-21 21:56:26,402::task::588::TaskManager.Task::(_updateState) Task=`6311f6d6-6c0f-48c7-81c4-e51747d7c7f6`::moving from state preparing -> state finished
Thread-22012::DEBUG::2012-06-21 21:56:26,402::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-22012::DEBUG::2012-06-21 21:56:26,402::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-22012::DEBUG::2012-06-21 21:56:26,403::task::978::TaskManager.Task::(_decref) Task=`6311f6d6-6c0f-48c7-81c4-e51747d7c7f6`::ref 0 aborting False
Thread-22016::DEBUG::2012-06-21 21:56:31,347::BindingXMLRPC::160::vds::(wrapper) [192.168.101.170]
Thread-22016::DEBUG::2012-06-21 21:56:31,348::task::588::TaskManager.Task::(_updateState) Task=`1baa92ad-b453-44f9-99b1-547becf1674d`::moving from state init -> state preparing
Thread-22016::INFO::2012-06-21 21:56:31,348::logUtils::37::dispatcher::(wrapper) Run and protect: validateStorageServerConnection(domType=6, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'port': '', 'connection': '192.168.101.172:/ov_02_03', 'iqn': '', 'portal': '', 'user': '', 'vfs_type': 'glusterfs', 'password': '******', 'id': '43b36f2f-86c2-4fd4-b2a4-5c7d46189675'}], options=None)
Thread-22016::INFO::2012-06-21 21:56:31,348::logUtils::39::dispatcher::(wrapper) Run and protect: validateStorageServerConnection, Return response: {'statuslist': [{'status': 0, 'id': '43b36f2f-86c2-4fd4-b2a4-5c7d46189675'}]}
Thread-22016::DEBUG::2012-06-21 21:56:31,349::task::1172::TaskManager.Task::(prepare) Task=`1baa92ad-b453-44f9-99b1-547becf1674d`::finished: {'statuslist': [{'status': 0, 'id': '43b36f2f-86c2-4fd4-b2a4-5c7d46189675'}]}
Thread-22016::DEBUG::2012-06-21 21:56:31,349::task::588::TaskManager.Task::(_updateState) Task=`1baa92ad-b453-44f9-99b1-547becf1674d`::moving from state preparing -> state finished
Thread-22016::DEBUG::2012-06-21 21:56:31,349::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-22016::DEBUG::2012-06-21 21:56:31,349::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-22016::DEBUG::2012-06-21 21:56:31,349::task::978::TaskManager.Task::(_decref) Task=`1baa92ad-b453-44f9-99b1-547becf1674d`::ref 0 aborting False
Thread-22017::DEBUG::2012-06-21 21:56:31,377::BindingXMLRPC::160::vds::(wrapper) [192.168.101.170]
Thread-22017::DEBUG::2012-06-21 21:56:31,378::task::588::TaskManager.Task::(_updateState) Task=`67d2a3f7-f928-4e8a-88b1-ccfe909f887a`::moving from state init -> state preparing
Thread-22017::INFO::2012-06-21 21:56:31,378::logUtils::37::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=6, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'port': '', 'connection': '192.168.101.172:/ov_02_03', 'iqn': '', 'portal': '', 'user': '', 'vfs_type': 'glusterfs', 'password': '******', 'id': '43b36f2f-86c2-4fd4-b2a4-5c7d46189675'}], options=None)
Thread-22017::DEBUG::2012-06-21 21:56:31,402::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /bin/mount -t glusterfs 192.168.101.172:/ov_02_03 /rhev/data-center/mnt/192.168.101.172:_ov__02__03' (cwd None)
Thread-22017::DEBUG::2012-06-21 21:56:35,598::lvm::459::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-22017::DEBUG::2012-06-21 21:56:35,598::lvm::461::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-22017::DEBUG::2012-06-21 21:56:35,598::lvm::471::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' got the operation mutex
Thread-22017::DEBUG::2012-06-21 21:56:35,598::lvm::473::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' released the operation mutex
Thread-22017::DEBUG::2012-06-21 21:56:35,599::lvm::492::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-22017::DEBUG::2012-06-21 21:56:35,599::lvm::494::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-22017::INFO::2012-06-21 21:56:35,599::logUtils::39::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '43b36f2f-86c2-4fd4-b2a4-5c7d46189675'}]}
Thread-22017::DEBUG::2012-06-21 21:56:35,600::task::1172::TaskManager.Task::(prepare) Task=`67d2a3f7-f928-4e8a-88b1-ccfe909f887a`::finished: {'statuslist': [{'status': 0, 'id': '43b36f2f-86c2-4fd4-b2a4-5c7d46189675'}]}
Thread-22017::DEBUG::2012-06-21 21:56:35,600::task::588::TaskManager.Task::(_updateState) Task=`67d2a3f7-f928-4e8a-88b1-ccfe909f887a`::moving from state preparing -> state finished
Thread-22017::DEBUG::2012-06-21 21:56:35,600::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-22017::DEBUG::2012-06-21 21:56:35,600::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-22017::DEBUG::2012-06-21 21:56:35,600::task::978::TaskManager.Task::(_decref) Task=`67d2a3f7-f928-4e8a-88b1-ccfe909f887a`::ref 0 aborting False
Thread-22020::DEBUG::2012-06-21 21:56:35,773::BindingXMLRPC::160::vds::(wrapper) [192.168.101.170]
Thread-22020::DEBUG::2012-06-21 21:56:35,774::task::588::TaskManager.Task::(_updateState) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::moving from state init -> state preparing
Thread-22020::INFO::2012-06-21 21:56:35,774::logUtils::37::dispatcher::(wrapper) Run and protect: createStorageDomain(storageType=6, sdUUID='edf700fd-5089-4bdb-89cf-61e4fc0da241', domainName='ov_02_03', typeSpecificArg='192.168.101.172:/ov_02_03', domClass=1, domVersion='0', options=None)
Thread-22020::DEBUG::2012-06-21 21:56:35,775::misc::1053::SamplingMethod::(__call__) Trying to enter sampling method (storage.sdc.refreshStorage)
Thread-22020::DEBUG::2012-06-21 21:56:35,775::misc::1055::SamplingMethod::(__call__) Got in to sampling method
Thread-22020::DEBUG::2012-06-21 21:56:35,775::misc::1053::SamplingMethod::(__call__) Trying to enter sampling method (storage.iscsi.rescan)
Thread-22020::DEBUG::2012-06-21 21:56:35,775::misc::1055::SamplingMethod::(__call__) Got in to sampling method
Thread-22020::DEBUG::2012-06-21 21:56:35,775::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/iscsiadm -m session -R' (cwd None)
Thread-22020::DEBUG::2012-06-21 21:56:35,792::__init__::1164::Storage.Misc.excCmd::(_log) FAILED: <err> = 'iscsiadm: No session found.\n'; <rc> = 21
Thread-22020::DEBUG::2012-06-21 21:56:35,793::misc::1063::SamplingMethod::(__call__) Returning last result
Thread-22020::DEBUG::2012-06-21 21:56:36,022::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/multipath' (cwd None)
Thread-22020::DEBUG::2012-06-21 21:56:36,092::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = ''; <rc> = 0
Thread-22020::DEBUG::2012-06-21 21:56:36,093::lvm::459::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-22020::DEBUG::2012-06-21 21:56:36,093::lvm::461::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-22020::DEBUG::2012-06-21 21:56:36,093::lvm::471::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' got the operation mutex
Thread-22020::DEBUG::2012-06-21 21:56:36,094::lvm::473::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' released the operation mutex
Thread-22020::DEBUG::2012-06-21 21:56:36,094::lvm::492::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-22020::DEBUG::2012-06-21 21:56:36,094::lvm::494::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-22020::DEBUG::2012-06-21 21:56:36,094::misc::1063::SamplingMethod::(__call__) Returning last result
Thread-22020::DEBUG::2012-06-21 21:56:36,095::lvm::351::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' got the operation mutex
Thread-22020::DEBUG::2012-06-21 21:56:36,096::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/lvm vgs --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\"r%.*%\\" ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix --separator | -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free edf700fd-5089-4bdb-89cf-61e4fc0da241' (cwd None)
Thread-22020::DEBUG::2012-06-21 21:56:36,275::__init__::1164::Storage.Misc.excCmd::(_log) FAILED: <err> = '  Volume group "edf700fd-5089-4bdb-89cf-61e4fc0da241" not found\n'; <rc> = 5
Thread-22020::WARNING::2012-06-21 21:56:36,276::lvm::355::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "edf700fd-5089-4bdb-89cf-61e4fc0da241" not found']
Thread-22020::DEBUG::2012-06-21 21:56:36,278::lvm::378::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex
Thread-22023::DEBUG::2012-06-21 21:56:36,550::task::588::TaskManager.Task::(_updateState) Task=`eb92eae0-2bb6-4c53-8954-ddb2b6e0170a`::moving from state init -> state preparing
Thread-22023::INFO::2012-06-21 21:56:36,556::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Thread-22023::INFO::2012-06-21 21:56:36,571::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {}
Thread-22023::DEBUG::2012-06-21 21:56:36,597::task::1172::TaskManager.Task::(prepare) Task=`eb92eae0-2bb6-4c53-8954-ddb2b6e0170a`::finished: {}
Thread-22023::DEBUG::2012-06-21 21:56:36,608::task::588::TaskManager.Task::(_updateState) Task=`eb92eae0-2bb6-4c53-8954-ddb2b6e0170a`::moving from state preparing -> state finished
Thread-22023::DEBUG::2012-06-21 21:56:36,619::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-22023::DEBUG::2012-06-21 21:56:36,641::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-22023::DEBUG::2012-06-21 21:56:36,675::task::978::TaskManager.Task::(_decref) Task=`eb92eae0-2bb6-4c53-8954-ddb2b6e0170a`::ref 0 aborting False
Thread-22020::INFO::2012-06-21 21:56:36,785::nfsSD::64::Storage.StorageDomain::(create) sdUUID=edf700fd-5089-4bdb-89cf-61e4fc0da241 domainName=ov_02_03 remotePath=192.168.101.172:/ov_02_03 domClass=1
Thread-22020::ERROR::2012-06-21 21:56:36,790::task::853::TaskManager.Task::(_setError) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2136, in createStorageDomain
    typeSpecificArg, storageType, domVersion)
  File "/usr/share/vdsm/storage/nfsSD.py", line 75, in create
    cls._preCreateValidation(sdUUID, mntPoint, remotePath, version)
  File "/usr/share/vdsm/storage/nfsSD.py", line 46, in _preCreateValidation
    fileSD.validateDirAccess(domPath)
  File "/usr/share/vdsm/storage/fileSD.py", line 51, in validateDirAccess
    getProcPool().fileUtils.validateAccess(dirPath)
  File "/usr/share/vdsm/storage/processPool.py", line 63, in wrapper
    return self.runExternally(func, *args, **kwds)
  File "/usr/share/vdsm/storage/processPool.py", line 74, in runExternally
    return self._procPool.runExternally(*args, **kwargs)
  File "/usr/share/vdsm/storage/processPool.py", line 170, in runExternally
    raise err
StorageServerAccessPermissionError: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/192.168.101.172:_ov__02__03'
Thread-22020::DEBUG::2012-06-21 21:56:36,823::task::872::TaskManager.Task::(_run) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::Task._run: d956cea1-6320-406b-ac48-4edf5089b4d4 (6, 'edf700fd-5089-4bdb-89cf-61e4fc0da241', 'ov_02_03', '192.168.101.172:/ov_02_03', 1, '0') {} failed - stopping task
Thread-22020::DEBUG::2012-06-21 21:56:36,823::task::1199::TaskManager.Task::(stop) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::stopping in state preparing (force False)
Thread-22020::DEBUG::2012-06-21 21:56:36,823::task::978::TaskManager.Task::(_decref) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::ref 1 aborting True
Thread-22020::INFO::2012-06-21 21:56:36,823::task::1157::TaskManager.Task::(prepare) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::aborting: Task is aborted: 'Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.' - code 469
Thread-22020::DEBUG::2012-06-21 21:56:36,824::task::1162::TaskManager.Task::(prepare) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::Prepare: aborted: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.
Thread-22020::DEBUG::2012-06-21 21:56:36,824::task::978::TaskManager.Task::(_decref) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::ref 0 aborting True
Thread-22020::DEBUG::2012-06-21 21:56:36,824::task::913::TaskManager.Task::(_doAbort) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::Task._doAbort: force False
Thread-22020::DEBUG::2012-06-21 21:56:36,824::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-22020::DEBUG::2012-06-21 21:56:36,824::task::588::TaskManager.Task::(_updateState) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::moving from state preparing -> state aborting
Thread-22020::DEBUG::2012-06-21 21:56:36,825::task::537::TaskManager.Task::(__state_aborting) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::_aborting: recover policy none
Thread-22020::DEBUG::2012-06-21 21:56:36,825::task::588::TaskManager.Task::(_updateState) Task=`d956cea1-6320-406b-ac48-4edf5089b4d4`::moving from state aborting -> state failed
Thread-22020::DEBUG::2012-06-21 21:56:36,825::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-22020::DEBUG::2012-06-21 21:56:36,825::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-22020::ERROR::2012-06-21 21:56:36,826::dispatcher::66::Storage.Dispatcher.Protect::(run) {'status': {'message': "Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/192.168.101.172:_ov__02__03'", 'code': 469}}
Thread-22030::DEBUG::2012-06-21 21:56:46,864::task::588::TaskManager.Task::(_updateState) Task=`e7c1e5e8-91e2-4cee-a9ae-4eb397bccd12`::moving from state init -> state preparing
Thread-22030::INFO::2012-06-21 21:56:46,865::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Thread-22030::INFO::2012-06-21 21:56:46,865::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {}
Thread-22030::DEBUG::2012-06-21 21:56:46,865::task::1172::TaskManager.Task::(prepare) Task=`e7c1e5e8-91e2-4cee-a9ae-4eb397bccd12`::finished: {}
Thread-22030::DEBUG::2012-06-21 21:56:46,865::task::588::TaskManager.Task::(_updateState) Task=`e7c1e5e8-91e2-4cee-a9ae-4eb397bccd12`::moving from state preparing -> state finished
Thread-22030::DEBUG::2012-06-21 21:56:46,865::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-22030::DEBUG::2012-06-21 21:56:46,865::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-22030::DEBUG::2012-06-21 21:56:46,866::task::978::TaskManager.Task::(_decref) Task=`e7c1e5e8-91e2-4cee-a9ae-4eb397bccd12`::ref 0 aborting False
^C

Comment 1 Robert Middleswarth 2012-06-22 04:23:00 UTC
Workaround

1. Attempt to create a new sharefs domain.
2. After it errors out.
3. SSH into the box that failed.
4. chown -R 36.36 /rhev/data-center/mnt/
5. Reattempt to create the same sharefs domain.
6. It mounts and shows up on all boxes (a condensed sketch of these steps follows below).
7. Enjoy.
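
A condensed sketch of the workaround above for one host (36:36 is the vdsm:kvm uid/gid pair from step 4, here written with the colon syntax; the mount root is the default /rhev/data-center/mnt path seen in the log):

  # On the host where domain creation failed:
  chown -R 36:36 /rhev/data-center/mnt/
  # Then retry creating the same PosixFS storage domain from the engine UI;
  # it should mount on all hosts afterwards.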

Thanks
Robert

Comment 2 Robert Middleswarth 2012-06-22 13:01:58 UTC
Well, it kind of fixed it in my case. The POSIX FS mounts, but my hosts keep looping through SPM elections. A host will finish the election and go active for a minute, then another host will start contending, and this repeats forever.

Thanks
Robert

Comment 3 Ayal Baron 2012-06-22 20:03:06 UTC
(In reply to comment #1)
> Workaround
> 
> 1. Attempt to create a new sharefs domain.
> 2. After it errors out.
> 3. SSH into the box that failed.
> 4. chown -R 36.36 /rhev/data-center/mnt/

This is only possible if you don't use root squashing.
What you should really do is change the ownership of the export on the gluster side to be 36:36

In any event this is not a bug.
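
For reference, a hedged sketch of changing the ownership on the gluster side as suggested above (the storage.owner-uid/gid volume options are one common way to do this; confirm they are available in your glusterfs version):

  # On a gluster node, for the volume used in this report:
  gluster volume set ov_02_03 storage.owner-uid 36
  gluster volume set ov_02_03 storage.owner-gid 36
  # Alternatively, mount the volume once and chown its root to 36:36
  # (vdsm:kvm) instead of touching the bricks directly.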


(In reply to comment #2)
> Well it kinda fixed it in my case.  Posix FS mounts but I keep looping
> though my hosts having SPM elections.  It will finish the election go active
> for a min then another host will start contending repeat forever.

Please attach logs (easiest is to use log collector and make sure you collect from host as well, otherwise - engine, vdsm and spm-lock.log)

Comment 4 Robert Middleswarth 2012-06-22 21:45:59 UTC
> This is only possible if you don't use root squashing.
> What you should really do is change the ownership of the export on the gluster
> side to be 36:36

How does one use root squashing in gluster? I haven't found any documentation on the process. I am not using NFS; I am using POSIX FS. Also, shouldn't a volume created by the oVirt engine on vdsm-set-up nodes default to that?

As for the SPM looping, I just rebuilt node one using F17 to make sure it wasn't a CentOS-build-versus-F17 issue, but it is still looping through the system. I will get the logs together and attach them.

Thanks
Robert

Comment 5 Itamar Heim 2013-02-03 12:25:04 UTC
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.

