Bug 922744
Summary: | [RFE][HC] – allow gluster mount with additional nodes; currently only one gluster host is mounted. | ||
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Rejy M Cyriac <rcyriac> |
Component: | vdsm | Assignee: | Ala Hino <ahino> |
Status: | CLOSED ERRATA | QA Contact: | Kevin Alon Goldblatt <kgoldbla> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 3.1.0 | CC: | acanan, ahino, amureini, asriram, bazulay, bsettle, chrisw, csaba, danken, fsimonce, grajaiya, hchiramm, iheim, kgoldbla, lpeer, rbalakri, rcyriac, Rhev-m-bugs, rhs-bugs, rwheeler, sankarshan, sbonazzo, scohen, smohan, tnisan, yeylon, ylavi |
Target Milestone: | ovirt-3.6.0-rc | Keywords: | FutureFeature, Improvement |
Target Release: | 3.6.0 | Flags: | scohen: Triaged+ |
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Release Note | |
Doc Text: |
With this change, the glusterfs-cli package must be installed. Note that the VDSM spec file declares no dependency on glusterfs-cli. This means that VDSM installation will succeed even if glusterfs-cli is not installed, and that the glusterfs-cli package must be installed manually.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2016-03-09 19:17:18 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1177776, 1177777 | ||
Bug Blocks: | 1175354 |
Description
Rejy M Cyriac
2013-03-18 12:17:11 UTC
Can you upload the logs (glusterfs specific)? We want to see the 'command-line' option in the logs to understand what's wrong.

Additional Info:

[root@rhs-client45 ~]# gluster peer status
Number of Peers: 3

Hostname: rhs-client37.lab.eng.blr.redhat.com
Port: 24007
Uuid: 1f13b836-1bf9-4df2-ba42-c5bdf12a3c54
State: Peer in Cluster (Connected)

Hostname: rhs-client15.lab.eng.blr.redhat.com
Port: 24007
Uuid: 6d82cb77-39cf-48c6-8c9d-dfee5cebf30a
State: Peer in Cluster (Disconnected)

Hostname: rhs-client10.lab.eng.blr.redhat.com
Port: 24007
Uuid: c1eb3946-771a-4347-b771-4722b2e4321d
State: Peer in Cluster (Connected)

[root@rhs-client45 ~]# gluster volume info

Volume Name: RHS_vmstore
Type: Distribute
Volume ID: a39f5dd4-104d-4812-bbd3-3f7d7dd40b92
Status: Started
Number of Bricks: 12
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/brick1
Brick2: rhs-client37.lab.eng.blr.redhat.com:/brick1
Brick3: rhs-client15.lab.eng.blr.redhat.com:/brick1
Brick4: rhs-client10.lab.eng.blr.redhat.com:/brick1
Brick5: rhs-client45.lab.eng.blr.redhat.com:/brick2
Brick6: rhs-client37.lab.eng.blr.redhat.com:/brick2
Brick7: rhs-client15.lab.eng.blr.redhat.com:/brick2
Brick8: rhs-client10.lab.eng.blr.redhat.com:/brick2
Brick9: rhs-client45.lab.eng.blr.redhat.com:/brick3
Brick10: rhs-client37.lab.eng.blr.redhat.com:/brick3
Brick11: rhs-client15.lab.eng.blr.redhat.com:/brick3
Brick12: rhs-client10.lab.eng.blr.redhat.com:/brick3
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: on
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

Please note: The volume was a 6x2 Distribute-Replicate volume when the issue was first noticed and reported. When I reproduced the issue again (the information given above), I used a 12-brick Distribute volume, so the issue occurs regardless of volume type.
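As an aside on the RFE itself: the set of servers that could back such a mount (the "additional nodes" this bug asks for) is recoverable from the brick list shown above. Below is a minimal sketch in Python that parses the plain-text `gluster volume info` layout pasted in this comment; the helper name is mine (not vdsm's), and real tooling would more likely consume the `--xml` output of the gluster CLI instead.

```python
# Sketch: extract the unique set of server hostnames from the brick list
# of a plain-text `gluster volume info` dump, preserving first-seen order.
# Assumes the "BrickN: host:/path" layout shown in the comment above.

def brick_hosts(volume_info_text):
    hosts = []
    for line in volume_info_text.splitlines():
        line = line.strip()
        if line.startswith("Brick") and ":" in line:
            # e.g. "Brick1: rhs-client45.lab.eng.blr.redhat.com:/brick1"
            _, value = line.split(":", 1)
            value = value.strip()
            if ":/" not in value:
                continue  # skip the bare "Bricks:" header line
            host = value.split(":/", 1)[0]
            if host not in hosts:
                hosts.append(host)
    return hosts

sample = """Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/brick1
Brick2: rhs-client37.lab.eng.blr.redhat.com:/brick1
Brick3: rhs-client15.lab.eng.blr.redhat.com:/brick1
Brick4: rhs-client10.lab.eng.blr.redhat.com:/brick1
Brick5: rhs-client45.lab.eng.blr.redhat.com:/brick2
"""

print(brick_hosts(sample))
```

Each of the four hosts appears once, in the order its first brick is listed.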
The logs for the failures in both volume types should be available in the attached Hypervisor sosreports, since the Hypervisors were not changed between the tests.

(In reply to comment #0)
> However, the mount option works as expected, when used manually on the hosts.

Rejy, I'm a little confused by this. Do you mean that when you use the 'backupvolfile-server' option with a mount command (from the CLI) on the same hosts it works, but when used from the RHEV UI it doesn't? Can you clarify this?

(In reply to comment #8)
> I'm a little confused by this. Do you mean that when you use the
> 'backupvolfile-server' option with a mount command (from the CLI) on the
> same hosts it works, but when used from the RHEV UI it doesn't?
> Can you clarify this?

Your statement is correct. From our discussion offline, there is a possibility that this is caused by an error in processing by RHEV-M, so I am going to raise this BZ issue on the rhev-gluster mailing list and see if we can get someone from RHEV to have a look at it. I will keep you in 'cc'. Cheers!

- rejy (rmc)

'/usr/bin/sudo -n /bin/mount -t glusterfs -o backupvolfile-server=rhs-client37.lab.eng.blr.redhat.com rhs-client45.lab.eng.blr.redhat.com:/RHS_vmstore /rhev/data-center/mnt/rhs-client45.lab.eng.blr.redhat.com:_RHS__vmstore' (cwd None)

If the above is the mount, vdsm/RHEV has respected it.
Thread-227::INFO::2013-03-18 20:32:57,933::logUtils::37::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=6, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'port': '', 'connection': 'rhs-client37.lab.eng.blr.redhat.com:/RHS_vmstore', 'mnt_options': 'backupvolfile-server=rhs-client45.lab.eng.blr.redhat.com', 'portal': '', 'user': '', 'iqn': '', 'vfs_type': 'glusterfs', 'password': '******', 'id': '00000000-0000-0000-0000-000000000000'}], options=None)
Thread-227::DEBUG::2013-03-18 20:32:57,953::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /bin/mount -t glusterfs -o backupvolfile-server=rhs-client45.lab.eng.blr.redhat.com rhs-client37.lab.eng.blr.redhat.com:/RHS_vmstore /rhev/data-center/mnt/rhs-client37.lab.eng.blr.redhat.com:_RHS__vmstore' (cwd None)
Thread-227::ERROR::2013-03-18 20:33:02,307::hsm::2241::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2237, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 208, in connect
    fileSD.validateDirAccess(self.getMountObj().getRecord().fs_file)
  File "/usr/share/vdsm/storage/mount.py", line 244, in getRecord
    (self.fs_spec, self.fs_file))
OSError: [Errno 2] Mount of `rhs-client37.lab.eng.blr.redhat.com:/RHS_vmstore` at `/rhev/data-center/mnt/rhs-client37.lab.eng.blr.redhat.com:_RHS__vmstore` does not exist

^^^^ So, the above is the story. Unfortunately, the code path involved is too vague to pin down the exact cause of the issue; however, I am looking into it further.

It needs further inspection, but this smells like a dup of bug 883877.

(In reply to comment #12)
> It needs further inspection, but this smells like a dup of bug 883877.

Indeed. The current error strings are not useful. We should have more detailed error strings, as mentioned in 883877 (http://gerrit.ovirt.org/#/c/12042/2/vdsm/storage/storageServer.py).
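One detail worth noting from the traceback: the local mount point under /rhev/data-center/mnt is derived from the connection string by an escaping scheme that can be inferred from the paths in the logs (existing underscores are doubled, then '/' becomes '_'). The sketch below is my reconstruction of that mapping from the log paths alone, not vdsm's actual code.

```python
# Reconstructed from the log lines above: the connection string
# "host:/RHS_vmstore" maps to the local mount-point name
# "host:_RHS__vmstore" under /rhev/data-center/mnt/.
# Escaping order matters: double existing underscores first, then
# replace '/' with '_', so the mapping remains unambiguous.

def mount_point(connection, base="/rhev/data-center/mnt"):
    escaped = connection.replace("_", "__").replace("/", "_")
    return "%s/%s" % (base, escaped)

print(mount_point("rhs-client37.lab.eng.blr.redhat.com:/RHS_vmstore"))
# -> the same path seen in the traceback:
# /rhev/data-center/mnt/rhs-client37.lab.eng.blr.redhat.com:_RHS__vmstore
```

This is why the error message shows `_RHS__vmstore` rather than `_RHS_vmstore`: the underscore already present in the volume name is doubled before the slash substitution.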
Does this env still exist? If yes, we could include the above-mentioned patch and get a more detailed error, which could help us find the root cause.

(In reply to comment #14)
> Does this env still exist? If yes, we could include the above-mentioned
> patch and get a more detailed error, which could help us find the root
> cause.

The exact set-up based on which the issue was reported has been dismantled. However, the issue is easily reproducible in my current set-up as well. I do not know how to apply the patch to get the detailed error, but I can provide access to my set-up if you think it will help debug the issue.

FYI - It appears that the other glusterfs mount options are also not being honoured before they reach the 'mount' call. I have opened another bug for it - BZ 927262.

Referring to Comment 12: It may be possible to get more information about the cause of the issue by using the patch given at https://bugzilla.redhat.com/show_bug.cgi?id=883877#c10

Moving the state to ON_QA as per comment 18

(In reply to comment #19)
> Moving the state to ON_QA as per comment 18

The patch provided by the RHEV team only provides more debugging info, and DOES NOT resolve the issue. I believe the Devel team needs to use the patch, diagnose the issue from the extra debugging info, and work to resolve it. Since the reported issue is still unresolved, moving back to ASSIGNED.

Rejy,
From the updates on the related bug (927262) you'd filed, it is clear that this is not a glusterfs/RHS issue. It would make sense to either close this bug as NOTABUG (this is not an RHS bug) or move the bug to the correct product. What do you think would be the better option?

- Kaushal

(In reply to comment #21)
> It would make sense to either close this bug as NOTABUG (this is not an RHS
> bug) or move the bug to the correct product.
> What do you think would be the better option?

Kaushal,

I think that moving this BZ to the correct product and component would be the best option. I will set the product and component similar to Bug 927262. I will also send out a mail about this BZ to the related mailing list, so that someone from the related group can have a look.

- rejy (rmc)

Updates at the related BZ - Bug 927262 - "mount options of glusterfs not being honoured, while adding POSIX compliant FS Storage Domain at RHEV-M" - suggest that the cause of the issue reported in this BZ may not be RHS-related but rather RHEVM-related. So moving this BZ to the relevant product and component, and moving the BZ state to 'NEW'.

I re-tested the issue on a set-up with the following components:

RHEVM 3.2 - 3.2.0-10.21.master.el6ev
Hypervisor - RHEL 6.4 with glusterfs-fuse-3.4.0.6rhs-1.el6.x86_64 and glusterfs-3.4.0.6rhs-1.el6.x86_64 installed
Red Hat Storage with glusterfs-server-3.4.0.6rhs-1.el6rhs.x86_64

The 'backupvolfile-server=<secondary RHS server>' mount option of glusterfs still works *only* on manual invocation of the mount command on the hypervisor.

--------------------------------------------------------
[root@rhs-gp-srv12 ~]# /bin/mount -t glusterfs -o backupvolfile-server=rhs-client45.lab.eng.blr.redhat.com rhs-client15.lab.eng.blr.redhat.com:/RHEV-BigBend /mnt
[root@rhs-gp-srv12 ~]# df -Th /mnt
Filesystem                                         Type            Size  Used Avail Use% Mounted on
rhs-client45.lab.eng.blr.redhat.com:/RHEV-BigBend  fuse.glusterfs  1.2T  206M  1.2T   1% /mnt
--------------------------------------------------------

When the option is given over RHEVM while adding a Storage Domain, the process still fails. The following is seen in the vdsm logs.
--------------------------------------------------------
Thread-26253::DEBUG::2013-05-14 12:47:40,213::task::579::TaskManager.Task::(_updateState) Task=`e5444dea-aeb3-435e-9b3b-dd4a278ab26b`::moving from state init -> state preparing
Thread-26253::INFO::2013-05-14 12:47:40,213::logUtils::40::dispatcher::(wrapper) Run and protect: validateStorageServerConnection(domType=6, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'port': '', 'connection': 'rhs-client15.lab.eng.blr.redhat.com:/RHEV-BigBend', 'mnt_options': 'backupvolfile-server=rhs-client45.lab.eng.blr.redhat.com', 'portal': '', 'user': '', 'iqn': '', 'vfs_type': 'glusterfs', 'password': '******', 'id': '00000000-0000-0000-0000-000000000000'}], options=None)
Thread-26253::INFO::2013-05-14 12:47:40,213::logUtils::42::dispatcher::(wrapper) Run and protect: validateStorageServerConnection, Return response: {'statuslist': [{'status': 0, 'id': '00000000-0000-0000-0000-000000000000'}]}
Thread-26253::DEBUG::2013-05-14 12:47:40,213::task::1168::TaskManager.Task::(prepare) Task=`e5444dea-aeb3-435e-9b3b-dd4a278ab26b`::finished: {'statuslist': [{'status': 0, 'id': '00000000-0000-0000-0000-000000000000'}]}
Thread-26253::DEBUG::2013-05-14 12:47:40,213::task::579::TaskManager.Task::(_updateState) Task=`e5444dea-aeb3-435e-9b3b-dd4a278ab26b`::moving from state preparing -> state finished
Thread-26253::DEBUG::2013-05-14 12:47:40,213::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-26253::DEBUG::2013-05-14 12:47:40,213::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-26253::DEBUG::2013-05-14 12:47:40,214::task::974::TaskManager.Task::(_decref) Task=`e5444dea-aeb3-435e-9b3b-dd4a278ab26b`::ref 0 aborting False
Thread-26254::DEBUG::2013-05-14 12:47:40,273::BindingXMLRPC::161::vds::(wrapper) [10.70.34.108]
Thread-26254::DEBUG::2013-05-14 12:47:40,273::task::579::TaskManager.Task::(_updateState) Task=`206c2ea5-dd67-4825-b544-c44e7566c974`::moving from state init -> state preparing
Thread-26254::INFO::2013-05-14 12:47:40,274::logUtils::40::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=6, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'port': '', 'connection': 'rhs-client15.lab.eng.blr.redhat.com:/RHEV-BigBend', 'mnt_options': 'backupvolfile-server=rhs-client45.lab.eng.blr.redhat.com', 'portal': '', 'user': '', 'iqn': '', 'vfs_type': 'glusterfs', 'password': '******', 'id': '00000000-0000-0000-0000-000000000000'}], options=None)
Thread-26254::DEBUG::2013-05-14 12:47:40,278::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /bin/mount -t glusterfs -o backupvolfile-server=rhs-client45.lab.eng.blr.redhat.com rhs-client15.lab.eng.blr.redhat.com:/RHEV-BigBend /rhev/data-center/mnt/rhs-client15.lab.eng.blr.redhat.com:_RHEV-BigBend' (cwd None)
Thread-26254::ERROR::2013-05-14 12:47:40,462::hsm::2300::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2297, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 208, in connect
    fileSD.validateDirAccess(self.getMountObj().getRecord().fs_file)
  File "/usr/share/vdsm/storage/mount.py", line 244, in getRecord
    (self.fs_spec, self.fs_file))
OSError: [Errno 2] Mount of `rhs-client15.lab.eng.blr.redhat.com:/RHEV-BigBend` at `/rhev/data-center/mnt/rhs-client15.lab.eng.blr.redhat.com:_RHEV-BigBend` does not exist
--------------------------------------------------------

Screen-shots from RHEVM, and vdsm logs from the hypervisor, are being attached to this BZ.

I believe the mount itself actually succeeded. However, vdsm assumes that the resulting mount will point to the server it asked for, and it validates directory access after mounting; that validation fails when the mount record points to the backup server instead.

Hi Rejy,

I don't have a glusterfs environment.
Do you think you can help with verifying the patch for this bug (http://gerrit.ovirt.org/#/c/16534/)?

Thanks,

Yeela

(In reply to Yeela Kaplan from comment #29)
> I don't have a glusterfs environment.
> Do you think you can help with verifying the patch for this bug
> (http://gerrit.ovirt.org/#/c/16534/)?

Yeela,

I would be glad to help you with verifying the patch, but I would need the patch incorporated into an rpm, with which I can update my RHEV environment and test it. And would I need to update just the Hypervisors for the test?

- rejy (rmc)

We do not have glusterFS; we need to configure or find one, and then I will ACK.

(In reply to Aharon Canan from comment #31)
> we do not have glusterFS; we need to configure or find one,
> and then I will ACK.

Please see comment 30. I am ready to help you guys :-)

- rejy (rmc)

(In reply to Rejy M Cyriac from comment #32)
> Please see comment 30.
> I am ready to help you guys :-)

Thanks. We need to test it using an official build, not only the patch. Let's wait for an official build and check then.

Note that in RHS 2.1 this option has been replaced by "backup-volfile-servers=<server-name1>:<server-name2>:..."

Usage: mount -t glusterfs -o backup-volfile-servers=<server2>:<server3>:...:<serverN> <server1>:/<volname> <mount_point>

Itamar,

Please be advised that the option 'backupvolfile-server' has now been changed to 'backup-volfile-servers'. The following documents this:

https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/sect-Administration_Guide-GlusterFS_Client-GlusterFS_Client-Mounting_Volumes.html

You might want to use the new option in the doc-text for this BZ.
- rejy (rmc)

(In reply to Rejy M Cyriac from comment #39)
> Please be advised that the option 'backupvolfile-server' has now been
> changed to 'backup-volfile-servers'.
> You might want to use the new option in the doc-text for this BZ.

It looks like backward compatibility is coming, to prevent a regression due to the change - BZ 1023950.

This should only be implemented in a glusterfs domain (as the abandoned patch suggests).

This option should be added to the gluster connection details.

*** Bug 1177777 has been marked as a duplicate of this bug. ***

Required change: Engine currently allows gluster mounts on only one node to provide images to the datacenter. This feature removes that limitation.

Ala, please provide some doctext for this feature. Thanks!

Verified with 3.6.0.20

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html
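To round out the 'backup-volfile-servers' usage quoted in comment 38: below is a hedged sketch of assembling such a mount command from a server list. The helper is illustrative only (it is not vdsm code) and follows the `backup-volfile-servers=<server2>:<server3>:...` syntax from the RHS 2.1 Administration Guide linked above.

```python
# Build a glusterfs mount command using the newer
# backup-volfile-servers=<s2>:<s3>:... syntax quoted in comment 38.
# The first server in the list is the primary volfile server; the
# rest become colon-separated backups. Helper name is illustrative.

def gluster_mount_cmd(servers, volume, mount_point):
    if not servers:
        raise ValueError("need at least one server")
    primary, backups = servers[0], servers[1:]
    cmd = ["/bin/mount", "-t", "glusterfs"]
    if backups:
        cmd += ["-o", "backup-volfile-servers=" + ":".join(backups)]
    cmd += ["%s:/%s" % (primary, volume), mount_point]
    return cmd

cmd = gluster_mount_cmd(
    ["rhs-client15.lab.eng.blr.redhat.com",
     "rhs-client45.lab.eng.blr.redhat.com"],
    "RHEV-BigBend", "/mnt")
print(" ".join(cmd))
```

With the two servers from the re-test in this bug, the result mirrors the manual mount that was shown to work, only with the renamed option.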