Description of problem:
Fail to add a gluster domain

Version-Release number of selected component (if applicable):
engine: ovirt-engine-3.6.0-0.0.master.20150726172446.git65db93d.el6.noarch
host (RHEL 7.1 7Server): glusterfs-3.7.3-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Try to add a glusterfs domain (replica 3)

Actual results:

Thread-7568::DEBUG::2015-07-29 16:22:23,450::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.connectStorageServer' in bridge with {u'connectionParams': [{u'id': u'00000000-0000-0000-0000-000000000000', u'connection': u'10.35.160.6:/StorageGE2_volume04', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'glusterfs', u'password': '********', u'port': u''}], u'storagepoolID': u'00000000-0000-0000-0000-000000000000', u'domainType': 7}
Thread-7568::DEBUG::2015-07-29 16:22:23,451::task::595::Storage.TaskManager.Task::(_updateState) Task=`0ee3da93-591c-4cf3-aa5c-70d3eade0810`::moving from state init -> state preparing
Thread-7568::INFO::2015-07-29 16:22:23,451::logUtils::48::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=7, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'id': u'00000000-0000-0000-0000-000000000000', u'connection': u'10.35.160.6:/StorageGE2_volume04', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'glusterfs', u'password': '********', u'port': u''}], options=None)
Thread-7568::ERROR::2015-07-29 16:22:23,482::hsm::2455::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2452, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 213, in connect
    self.validate()
  File "/usr/share/vdsm/storage/storageServer.py", line 315, in validate
    replicaCount = self.volinfo['replicaCount']
  File "/usr/share/vdsm/storage/storageServer.py", line 311, in volinfo
    self._volinfo = self._get_gluster_volinfo()
  File "/usr/share/vdsm/storage/storageServer.py", line 329, in _get_gluster_volinfo
    self._volfileserver)
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
    **kwargs)
  File "<string>", line 2, in glusterVolumeInfo
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
GlusterCmdExecFailedException: Command execution failed
error: Connection failed. Please check if gluster daemon is operational.
return code: 1
Thread-7568::DEBUG::2015-07-29 16:22:23,483::hsm::2474::Storage.HSM::(connectStorageServ:

Additional info:
I cannot even mount a gluster domain in the console:

[2015-07-29 14:03:39.460787] I [MSGID: 100030] [glusterfsd.c:2301:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.3 (args: /usr/sbin/glusterfs --volfile-server=10.35.160.6 --volfile-id=/StorageGE2_volume02 /mnt)
[2015-07-29 14:03:39.487540] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-07-29 14:03:39.496198] W [socket.c:642:__socket_rwv] 0-glusterfs: readv on 10.35.160.6:24007 failed (No data available)
[2015-07-29 14:03:39.497450] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f4764ba49e6] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f476496f9be] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f476496face] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f476497147c] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f4764971c38] ))))) 0-glusterfs: forced unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at 2015-07-29 14:03:39.487940 (xid=0x1)
[2015-07-29 14:03:39.497494] E [glusterfsd-mgmt.c:1604:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:/StorageGE2_volume02)
[2015-07-29 14:03:39.497546] W [glusterfsd.c:1219:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(saved_frames_unwind+0x205) [0x7f476496f9e5] -->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x430) [0x7f476507c230] -->/usr/sbin/glusterfs(cleanup_and_exit+0x69) [0x7f4765076609] ) 0-: received signum (0), shutting down

So I'm guessing it's a problem with the gluster version I have... the package is from EPEL:
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/$basearch/

In the previous version, with glusterfs-3.7.2-3.el7.x86_64, it worked. Marking as regression.
As stated above, this doesn't seem to happen with 3.7.2-3. I'm tempted to just close this and open the proper bug in the glusterfs project. What version is supposed to be tested with the last build?
oVirt just attempts to mount the directory, and as Carlos stated, this issue stems from an error inside gluster. Moving there in case the relevant stakeholders want/need to investigate.
Is glusterd running on 10.35.160.6? The error seems to indicate that it is not:
": Command execution failed error: Connection failed. Please check if gluster daemon is operational."

Could you also attach the glusterfs logs, as well as the gluster mount log from the client?
Created attachment 1057990 [details]
Failed attempt to mount a volume; also includes the mnt.log
Created attachment 1057991 [details]
Mounting the volume passes with 3.7.2-3
Hi Sahina,
I attached the logs from two different clients, with 3.7.2-3 and 3.7.3-1; the first one works but not the latest. I also added the package versions.
The only log where I can see anything on the server is etc-glusterfs-glusterd.vol.log. When I try to mount from the client I get:

[2015-07-31 08:55:06.846132] E [rpcsvc.c:638:rpcsvc_handle_rpc_call] 0-rpc-service: Request received from non-privileged port. Failing request

and a lot of these:

[2015-07-31 08:55:07.881943] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5274336107d113fcd6650a75a4695d2b.socket failed (Invalid argument)
[2015-07-31 08:55:10.893913] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5274336107d113fcd6650a75a4695d2b.socket failed (Invalid argument)
[2015-07-31 08:55:13.895145] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5274336107d113fcd6650a75a4695d2b.socket failed (Invalid argument)
Hi Sahina, any update on this?
Carlos,

1. Check whether glusterd is running on the node, in your case 10.35.160.6.
2. Could you provide the volume information? (output of the 'gluster volume info' command)
3. Could you also attach the glusterd log file from the node 10.35.160.6?

(A quick sketch of these checks follows this comment.)

We will try to help you at the earliest.
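For reference, a minimal sketch of these checks from a shell, assuming a systemd-based host such as the RHEL 7 machines in this report (the commands are standard gluster/systemd ones; the log path is the one given in the follow-up comment):

# 1. Check that glusterd is running on the node (10.35.160.6 here)
systemctl status glusterd

# 2. Capture the volume information
gluster volume info

# 3. Inspect the glusterd log on that node
less /var/log/glusterfs/etc-glusterfs-glusterd.vol.log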
(In reply to SATHEESARAN from comment #11)
> Carlos,
>
> 1. Check whether glusterd is running on the node, in your case
> 10.35.160.6.
> 2. Could you provide the volume information? (output of the 'gluster volume
> info' command)
> 3. Could you also attach the glusterd log file from the node 10.35.160.6?

I missed mentioning the location - /var/log/glusterfs/etc-glusterfs-glusterd.vol.log

> We will try to help you at the earliest.
Client with 3.7.2-3 and 3.7.3-1:

[root@storage-ge5-vdsm1 ~]# rpm -qa | grep glusterfs
glusterfs-3.7.2-3.el7.x86_64
glusterfs-fuse-3.7.2-3.el7.x86_64
glusterfs-libs-3.7.2-3.el7.x86_64
glusterfs-debuginfo-3.7.2-3.el7.x86_64
glusterfs-client-xlators-3.7.2-3.el7.x86_64
glusterfs-cli-3.7.2-3.el7.x86_64
glusterfs-server-3.7.2-3.el7.x86_64
glusterfs-api-devel-3.7.2-3.el7.x86_64
glusterfs-rdma-3.7.2-3.el7.x86_64
glusterfs-api-3.7.2-3.el7.x86_64
glusterfs-extra-xlators-3.7.2-3.el7.x86_64
glusterfs-devel-3.7.2-3.el7.x86_64
[root@storage-ge5-vdsm1 ~]# date; mount -t glusterfs 10.35.160.6:/StorageGE5_volume01 /mnt/
Tue Aug 4 09:45:37 IDT 2015
[root@storage-ge5-vdsm1 ~]# umount /mnt/

ENABLE:
[ovirt-master-glusterfs-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/$basearch/
enabled=1
skip_if_unavailable=1
gpgcheck=1
gpgkey=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/pub.key

[ovirt-master-glusterfs-noarch-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/noarch
enabled=1
skip_if_unavailable=1
gpgcheck=1
gpgkey=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/pub.key

[root@storage-ge5-vdsm1 ~]# rpm -qa | grep glusterfs
glusterfs-libs-3.7.3-1.el7.x86_64
glusterfs-client-xlators-3.7.3-1.el7.x86_64
glusterfs-cli-3.7.3-1.el7.x86_64
glusterfs-devel-3.7.3-1.el7.x86_64
glusterfs-rdma-3.7.3-1.el7.x86_64
glusterfs-api-3.7.3-1.el7.x86_64
glusterfs-fuse-3.7.3-1.el7.x86_64
glusterfs-server-3.7.3-1.el7.x86_64
glusterfs-api-devel-3.7.3-1.el7.x86_64
glusterfs-debuginfo-3.7.3-1.el7.x86_64
glusterfs-3.7.3-1.el7.x86_64
glusterfs-extra-xlators-3.7.3-1.el7.x86_64
[root@storage-ge5-vdsm1 ~]# date; mount -t glusterfs 10.35.160.6:/StorageGE5_volume01 /mnt/
Tue Aug 4 09:55:03 IDT 2015
Mount failed. Please check the log file for more details.
server:

[root@gluster-storage-03 ~]# ps xau | grep StorageGE2_volume01
root 2180 0.0 0.5 904304 21060 ? Ssl 12:27 0:00 /usr/sbin/glusterfsd -s 10.35.160.6 --volfile-id StorageGE2_volume01.10.35.160.6.gluster_volumes-StorageGE2_volume01 -p /var/lib/glusterd/vols/StorageGE2_volume01/run/10.35.160.6-gluster_volumes-StorageGE2_volume01.pid -S /var/run/gluster/fbd825d57281527e79d845def2ecc5af.socket --brick-name /gluster_volumes/StorageGE2_volume01 -l /var/log/glusterfs/bricks/gluster_volumes-StorageGE2_volume01.log --xlator-option *-posix.glusterd-uuid=50891c26-f41d-4f71-9a2d-32998ed1e3a0 --brick-port 49353 --xlator-option StorageGE2_volume01-server.listen-port=49353

[root@gluster-storage-03 ~]# gluster volume info StorageGE5_volume01

Volume Name: StorageGE5_volume01
Type: Replicate
Volume ID: ac5c201d-9eb0-4ecd-bce2-16d41dba3c30
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.35.160.202:/gluster_volumes/StorageGE5_volume01
Brick2: 10.35.160.203:/gluster_volumes/StorageGE5_volume01
Brick3: 10.35.160.6:/gluster_volumes/StorageGE5_volume01
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-gid: 36
storage.owner-uid: 36

[root@gluster-storage-03 ~]# rpm -qa | grep gluster
glusterfs-fuse-3.7.1-11.el7rhgs.x86_64
glusterfs-libs-3.7.1-11.el7rhgs.x86_64
glusterfs-3.7.1-11.el7rhgs.x86_64
glusterfs-api-3.7.1-11.el7rhgs.x86_64
glusterfs-server-3.7.1-11.el7rhgs.x86_64
glusterfs-client-xlators-3.7.1-11.el7rhgs.x86_64
glusterfs-cli-3.7.1-11.el7rhgs.x86_64
Created attachment 1059000 [details] glusterd.log
This is happening because bind-insecure is enabled by default in the rpc library in 3.7.3. The 3.7.3 clients attempt to connect to the 3.7.1 servers using insecure ports. The older servers still expect incoming connections to use secure ports (unless bind-insecure is turned on), and reject the connections from 3.7.3 clients.

Some solutions I can think of for this include (see the sketch after this list):
- Turn bind-insecure off for clients by setting `client.bind-insecure` to `off` for the volumes. This will force 3.7.3 clients to use secure ports to establish the connection.
- Turn bind-insecure on for the bricks by setting `server.allow-insecure` to `on` for the volumes. This will enable the older servers to accept connections coming from insecure ports.

This should have been documented in the release notes for 3.7.3, but I missed this commit when preparing the notes.
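To make the two options concrete, here is a minimal sketch as CLI commands, assuming a hypothetical volume named "myvol" (the option names are exactly the ones described above):

# Option 1: force newer (3.7.3) clients back onto secure ports
gluster volume set myvol client.bind-insecure off

# Option 2: allow older (3.7.1) bricks to accept connections from insecure ports
gluster volume set myvol server.allow-insecure on

As the later comments in this bug show, the initial volfile fetch from glusterd is governed separately, so a glusterd-level setting may also be required (see below).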
Hi Kaushal,
I tried the solution you suggested - configuring both the server and the client to work in the same mode.

I tried to configure both as secure:
gluster volume set elad2 server.allow-insecure off
gluster volume set elad2 client.bind-insecure off

And I also tried to configure both as insecure:
gluster volume set elad2 server.allow-insecure on
gluster volume set elad2 client.bind-insecure on

Before each configuration, I stopped the volume and started it again afterwards. Neither configuration solved the mount failure.
Can you please verify I did it the right way?
Thanks
The root cause and workaround for this issue are described in this mail thread [1].

Elad, Carlos, I hope the workaround mentioned in the mail thread [1] helps. Is this still a problem for you?

[1] http://www.gluster.org/pipermail/gluster-devel/2015-August/046361.html
Hi,
I've tried both approaches with one of the volumes, and even stopped the volume and started it again just in case, with the same result:

# gluster volume info StorageGE2_volume01

Volume Name: StorageGE2_volume01
Type: Replicate
Volume ID: 8347806f-5075-4431-b143-984692370d8d
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.35.160.202:/gluster_volumes/StorageGE2_volume01
Brick2: 10.35.160.203:/gluster_volumes/StorageGE2_volume01
Brick3: 10.35.160.6:/gluster_volumes/StorageGE2_volume01
Options Reconfigured:
storage.owner-uid: 36
storage.owner-gid: 36
performance.readdir-ahead: on
server.allow-insecure: on
client.bind-insecure: off

From the log:
[2015-08-10 10:26:30.619489] E [rpcsvc.c:638:rpcsvc_handle_rpc_call] 0-rpc-service: Request received from non-privileged port. Failing request

The mail states "on all volumes", but I have only tried it on one because we have quite a lot of them and I just wanted to test quickly. Am I doing something wrong?
Hi Carlos,
Could you also try setting `option rpc-auth-allow-insecure on` in /etc/glusterfs/glusterd.vol? You will need to restart GlusterD after making this change.

This additional step is needed for both of the workarounds mentioned in the mail thread. The clients need to connect to glusterd to fetch the volfile, and there is currently no way to specify the connection parameters for this initial connection. The new 3.7 clients will use the default insecure parameters to set up the connection, and the older glusterds will reject these incoming connections.
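To make this step concrete, a minimal sketch of the change, assuming the stock glusterd.vol layout and a systemd-based host (only the rpc-auth-allow-insecure line is new; the surrounding lines are illustrative):

# /etc/glusterfs/glusterd.vol
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    # ... keep the existing options as they are ...
    # accept volfile requests from non-privileged (insecure) ports
    option rpc-auth-allow-insecure on
end-volume

Then restart the daemon so the change takes effect:

systemctl restart glusterd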
After adding "option rpc-auth-allow-insecure on" to /etc/glusterfs/glusterd.vol and a service restart, it works fine now.

[root@camel-vdsb ~]# mount -t glusterfs 10.35.160.6:/acanan02 /mnt/
[root@camel-vdsb ~]# rpm -qa | grep gluster
glusterfs-libs-3.7.3-1.el7.x86_64
glusterfs-fuse-3.7.3-1.el7.x86_64
glusterfs-client-xlators-3.7.3-1.el7.x86_64
glusterfs-api-3.7.3-1.el7.x86_64
glusterfs-3.7.3-1.el7.x86_64

Volume Name: acanan02
Type: Distribute
Volume ID: 8bb98061-eb50-44f7-bd03-cc011c31ac2c
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.35.160.6:/gluster_volumes/acanan02
Options Reconfigured:
performance.readdir-ahead: on
server.allow-insecure: on
client.bind-insecure: off
(In reply to Aharon Canan from comment #21)
> after adding "option rpc-auth-allow-insecure on" to
> /etc/glusterfs/glusterd.vol and service restart it works fine now.

Note that adding "option rpc-auth-allow-insecure on" to /etc/glusterfs/glusterd.vol is mandatory for Hosted Engine external glusterfs support:
http://www.ovirt.org/Features/Self_Hosted_Engine_Gluster_Support

So dropping bug #1083025 from blocked bugs.
This bug is being closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check whether it still exists in newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen it against the newer release.