Created attachment 974100 [details]
logs

Description of problem:
Creating a glusterfs domain fails if the target host is running RHEL 7.0. Tried most glusterfs package versions (3.4.0 - 3.6); RHEL 7.1 and RHEL 6.5 work fine.

The error in the UI:
Error while executing action Add Storage Connection: Problem while trying to mount target

Mounting the target manually from the host succeeds:

root@dhcp-2-53 ~ # mount gluster-storage-01.scl.lab.tlv.redhat.com:ogofen5 /mnt
root@dhcp-2-53 ~ # cd /mnt
root@dhcp-2-53 /mnt # ls
9d4ba905-858b-465a-a1e1-59441d62ecec  __DIRECT_IO_TEST__

Version-Release number of selected component (if applicable):
vdsm-xmlrpc-4.16.8.1-4.el7ev.noarch
vdsm-jsonrpc-4.16.8.1-4.el7ev.noarch
vdsm-python-zombiereaper-4.16.8.1-4.el7ev.noarch
vdsm-cli-4.16.8.1-4.el7ev.noarch
vdsm-4.16.8.1-4.el7ev.x86_64
vdsm-python-4.16.8.1-4.el7ev.noarch
vdsm-yajsonrpc-4.16.8.1-4.el7ev.noarch
glusterfs-api-devel-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-fuse-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-rdma-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-libs-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-debuginfo-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-devel-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-api-3.4.0.65rhs-1.el7_0.x86_64

*note* have tried all glusterfs versions

How reproducible:
100%

Steps to Reproduce:
1. Create a glusterfs domain on a DC with one host running RHEL 7

Actual results:
oVirt prompts an error window explaining there was a problem mounting the target, while there is in fact no mounting problem at all.

Expected results:
Operation succeeds.

Additional info:
As seen in the VDSM log, VDSM issues the correct mount command; however, the command fails:

Thread-640::DEBUG::2014-12-29 18:20:27,473::mount::227::Storage.Misc.excCmd::(_runcmd) /usr/bin/sudo -n /usr/bin/mount -t glusterfs 10.35.160.6:ogofen4 /rhev/data-center/mnt/glusterSD/10.35.160.6:ogofen4 (cwd None)
Thread-640::ERROR::2014-12-29 18:20:27,784::storageServer::211::Storage.StorageServer.MountConnection::(connect) Mount failed: (1, 'Mount failed. Please check the log file for more details.\n;')
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storageServer.py", line 209, in connect
    self._mount.mount(self.options, self._vfsType)
  File "/usr/share/vdsm/storage/mount.py", line 223, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 239, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (1, 'Mount failed. Please check the log file for more details.\n;')
Thread-640::ERROR::2014-12-29 18:20:27,785::hsm::2424::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2421, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 217, in connect
    raise e
MountError: (1, 'Mount failed. Please check the log file for more details.\n;')

In the Gluster logs (rhev-data-center-mnt-glusterSD-10.35.160.6:ogofen4.log) there is a permission denied error:

[2014-12-29 16:20:27.756326] I [rpc-clnt.c:1690:rpc_clnt_reconfig] 0-ogofen4-client-0: changing port to 49259 (from 0)
[2014-12-29 16:20:27.759098] E [socket.c:2793:socket_connect] 0-ogofen4-client-0: connection attempt on 10.35.160.203:24007 failed, (Permission denied)
[2014-12-29 16:20:27.759530] I [rpc-clnt.c:1690:rpc_clnt_reconfig] 0-ogofen4-client-1: changing port to 49315 (from 0)
[2014-12-29 16:20:27.763127] E [socket.c:2793:socket_connect] 0-ogofen4-client-1: connection attempt on 10.35.160.6:24007 failed, (Permission denied)

Not sure if this is an environment problem or an actual Gluster issue; moving to the Gluster team for further inspection. Sahina, can someone from your team please have a look?
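For context, the VDSM code path in the traceback above is simply: run the mount command, and raise MountError on a nonzero exit status. A minimal sketch of that path (not the actual mount.py source; `run_mount` is an illustrative name -- only `MountError` and the `(rc, "out;err")` shape come from the log above):

```python
import subprocess

class MountError(Exception):
    """Raised when the external mount command exits nonzero."""

def run_mount(cmd):
    # Run the command, capture stdout/stderr, and on failure raise
    # MountError(rc, "out;err") -- the same (1, 'Mount failed. ...;')
    # tuple shape seen in the VDSM log above.
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, universal_newlines=True)
    out, err = p.communicate()
    if p.returncode != 0:
        raise MountError(p.returncode, ";".join((out, err)))
    return out
```

So the MountError only relays mount.glusterfs's exit status; the real failure reason has to be dug out of the Gluster client log, which is what the next paragraph does.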
Looks like the volume is not started, as per the log. Can you confirm whether it's running with:

# gluster volume status <volume-name>
Also, does the volume have the "server.allow-insecure on" option set, and is the option "rpc-auth-allow-insecure on" present in glusterd.vol?
Created attachment 975918 [details]
logs

(In reply to Bala.FA from comment #3)
> Looks like the volume is not started as per the log. Can you confirm
> whether its running by
>
> # gluster volume status <volume-name>
>
> ?

[root@gluster-storage-01 ~]# gluster volume info ogofen

Volume Name: ogofen
Type: Distribute
Volume ID: 8e65ff13-552f-46d7-853b-d46e43d25b37
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: 10.35.160.6:/export/ogofen
Brick2: 10.35.160.203:/export/ogofen
Brick3: 10.35.160.202:/export/ogofen
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36

The volume is started. I have also successfully created a domain with a host running the 7.1 OS (10.35.102.86), and failed with the host running 7.0 (10.35.2.51).
(In reply to Sahina Bose from comment #4)
> Also, does the volume have the "server.allow-insecure on" option set and the
> option "rpc-auth-allow-insecure on" in glusterd.vol

I have attempted to create a glusterfs domain with the flags you suggested (even though we have never used them before and it worked fine):

gluster volume set ogofen1 server.allow-insecure on
gluster volume set ogofen1 rpc-auth-allow on

with the same results as above.
KP, could you help? Thanks!
Does /etc/glusterfs/glusterd.vol on all nodes contain "option rpc-auth-allow-insecure on"? If not, please add this line and restart glusterd.
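Spelled out, the change requested here is the following (the layout below follows a stock glusterd.vol; only the rpc-auth-allow-insecure line is the addition):

```
# /etc/glusterfs/glusterd.vol -- on every gluster node
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option rpc-auth-allow-insecure on
end-volume
```

followed by restarting glusterd on each node (service glusterd restart), plus -- for the volume-level half from comment 4 -- gluster volume set <volume-name> server.allow-insecure on.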
Created attachment 977219 [details]
logs

Reproduced again after rechecking that the whole flow follows the documentation and the comments here.

Verified the volume is started:

[root@gluster-storage-03 ~]# gluster volume info ogofen4

Volume Name: ogofen4
Type: Distribute
Volume ID: 8e65ff13-552f-46d7-853b-d46e43d25b37
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: 10.35.160.6:/export/ogofen4
Brick2: 10.35.160.203:/export/ogofen4
Brick3: 10.35.160.202:/export/ogofen4
Options Reconfigured:
server.allow-insecure: on
storage.owner-gid: 36
storage.owner-uid: 36

Verified the glusterd configuration on all servers:

[root@gluster-storage-01 ogofen4]# cat /etc/glusterfs/glusterd.vol
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option rpc-auth-allow-insecure on
#   option base-port 49152
end-volume

[root@gluster-storage-02 ogofen4]# cat /etc/glusterfs/glusterd.vol
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option rpc-auth-allow-insecure on
#   option base-port 49152
end-volume

[root@gluster-storage-03 ogofen4]# cat /etc/glusterfs/glusterd.vol
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option rpc-auth-allow-insecure on
#   option base-port 49152
end-volume

Attempted to create a glusterfs domain twice: first using a RHEL 7.0 host (echoed "Add glusterfs domain with option rpc-auth-allow-insecure on [RHEL7]" to the engine's log), then using a RHEL 6.6 host (echoed "Add glusterfs domain with option rpc-auth-allow-insecure on [RHEL7]" to the engine's log).

The results are the same: while the RHEL 6.6 host, running the same vdsm version, succeeds in creating the domain, the RHEL 7 host fails.
(In reply to Ori Gofen from comment #9)
> Have attempted to create glusterfs Domain twice, first time using a Rhel7.0
> host (echoed "Add glusterfs domain with option rpc-auth-allow-insecure on
> [RHEL7]" to engine's log), second time using a Rhel6.6 host (echoed "Add
> glusterfs domain with option rpc-auth-allow-insecure on [RHEL7]" to engine's
> log).
>
> the results are the same, while Rhel6.6 running same vdsm version succeed to
> create domain, Rhel7 host fails.

I still see this in the log file:

[2015-01-07 11:01:47.440336] E [socket.c:2903:socket_connect] 0-ogofen4-client-0: connection attempt on 10.35.160.203:24007 failed, (Permission denied)
[2015-01-07 11:01:47.440398] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-ogofen4-client-1: changing port to 49315 (from 0)
[2015-01-07 11:01:47.444724] E [socket.c:2903:socket_connect] 0-ogofen4-client-1: connection attempt on 10.35.160.6:24007 failed, (Permission denied)

Can we check whether telnet 10.35.160.203:24007 from the RHEL 7 host goes through? If not, it might be good to check the firewall rules on the RHEL 7 host.
(In reply to Vijay Bellur from comment #10)
> I still see this in the log file:
>
> [2015-01-07 11:01:47.440336] E [socket.c:2903:socket_connect]
> 0-ogofen4-client-0: connection attempt on 10.35.160.203:24007 failed,
> (Permission denied)
> [2015-01-07 11:01:47.440398] I [rpc-clnt.c:1729:rpc_clnt_reconfig]
> 0-ogofen4-client-1: changing port to 49315 (from 0)
> [2015-01-07 11:01:47.444724] E [socket.c:2903:socket_connect]
> 0-ogofen4-client-1: connection attempt on 10.35.160.6:24007 failed,
> (Permission denied)
>
> Can we check if telnet 10.35.160.203:24007 from the RHEL 7 host to see
> if it goes through? If not, it might be good to check the firewall rules on
> the RHEL 7 host.

The telnet result is similar on both the RHEL 7 and RHEL 6.6 hosts:

root@purple-vds1 ~ # telnet 10.35.160.203:24007    <-- RHEL7
telnet: 10.35.160.203:24007: Name or service not known
10.35.160.203:24007: Unknown host

root@adder ~ # telnet 10.35.160.203:24007    <-- RHEL6
telnet: 10.35.160.203:24007: Name or service not known
10.35.160.203:24007: Unknown host

In addition, I have removed all iptables exceptions (from the hypervisor and the glusterfs servers) in order to rule out any firewall issues.
(In reply to Ori Gofen from comment #11)
> (In reply to Vijay Bellur from comment #10)
> > [...]
> > Can we check if telnet 10.35.160.203:24007 from the RHEL 7 host to see
> > if it goes through? If not, it might be good to check the firewall rules on
> > the RHEL 7 host.
>
> the telnet result is similar on both RHEL7, RHEL6.6 hosts
>
> root@purple-vds1 ~ # telnet 10.35.160.203:24007 <-- RHEL7
> telnet: 10.35.160.203:24007: Name or service not known
> 10.35.160.203:24007: Unknown host
>
> root@adder ~ # telnet 10.35.160.203:24007 <-- RHEL6
> telnet: 10.35.160.203:24007: Name or service not known
> 10.35.160.203:24007: Unknown host

Please use a space between hostname and port instead of ':'. In addition, are there any selinux denials observed in RHEL 7?
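The reachability check asked for above can also be scripted instead of using telnet; a small sketch (the helper name is made up; the host and port values in the example are the ones from this report):

```python
import socket

def port_reachable(host, port, timeout=3.0):
    # Returns True if a TCP connection to host:port succeeds within the
    # timeout, False on refusal, timeout, or denial (any OSError).
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_reachable("10.35.160.203", 24007) from the RHEL 7 host;
# glusterd listens on 24007, and the bricks on 49152 and above.
```

If the port is reachable yet the mount still fails with "Permission denied", running `ausearch -m avc -ts recent` on the client is a quick way to surface SELinux denials.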
(In reply to Vijay Bellur from comment #12)
> [...]
> Please use a space between hostname and port instead of ':'. In addition,
> are there any selinux denials observed in RHEL 7?

You are right about the selinux denials; this issue has been moved to the RHEL 7 bug #1181111.
Setting this bug as blocked by bz #1181111; as far as I know, it could also be closed.
Bug 1181111 will (probably) be solved by building a new selinux-policy rpm. This bug will be used to track the need for a patch to vdsm's spec file to require it.
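That spec-file change would amount to a one-line dependency; a hypothetical sketch (the actual selinux-policy version is whichever build fixes bz #1181111 -- the X.Y-Z below is a placeholder, not a real version):

```
# vdsm.spec (sketch): pull in the selinux-policy build that permits the
# glusterfs client to connect to glusterd's port 24007 on RHEL 7
%if 0%{?rhel} >= 7
Requires: selinux-policy >= X.Y-Z
%endif
```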
*** Bug 1165215 has been marked as a duplicate of this bug. ***
Allon, can we move it to 3.5.0-1 for now, since we're not going to be respinning for GA even if the selinux-policy rpm is ready?
(In reply to Eyal Edri from comment #16)
> allon, can we move it to 3.5.0-1 for now, since we're not going to be
> respining for GA even if the selinux-policy rpm will be ready.

Agreed. Tentatively targeting 3.5.0-1 with the hope of getting the selinux-policy RPM.

BTW, IMHO this should not be a blocker for 3.5.0-1 either, as there's an easy workaround (see doctext).
Still blocked on selinux; this doesn't seem to be converging for 3.5.0-1, so moving to z-stream.
3.5.1 is already full of bugs (over 80), and since none of these bugs were marked as urgent for the 3.5.1 release in the tracker bug, moving to 3.5.2.
Andrew - the enclosed doctext is perfect for RHEV 3.5.0, but should be amended looking forward. In RHEV 3.6.0 (and in some 3.5.z z-stream, not sure which yet - probably 3.5.3), "yum install vdsm" or "yum upgrade vdsm" will pull the appropriate selinux dependency for the RHEL channel and turn this bug into a moot point. Not quite sure how this should be handled process-wise - please advise.
Hi Allon,

Thank you for letting me know about this bug. Perhaps we can ensure it is added as a known issue to the release notes for now, and track its resolution in future releases? When it is resolved, we can remove the release note.

Let me know what you think.

Kind regards,

Andrew
(In reply to Andrew Dahms from comment #22)
> Hi Allon,
>
> Thank you for letting me know about this bug. Perhaps we can ensure it is
> added as a known issue to the release notes for now, and track its
> resolution in future releases? When it is resolved, we can remove the
> release note.

Sounds good to me, thanks. How do you want to track this? Currently, RHEV engineering has two bugs for the issue:

1. bug 1177651 (this issue) for 3.6.0 - by then the issue will already be fixed (note the bug's status is MODIFIED)
2. bug 1205583 for 3.5.3 - when 3.5.3 is released, the issue will be fixed there too. Unless it is moved back to 3.5.1 (unlikely), we'll need the release note you mentioned for 3.5.1.

Are these two enough, or do we need a separate docs bug for 3.5.1 in case 1205583 is only solved in 3.5.3?
Hi Allon,

The two bugs should be enough. I will keep track of the two bugs and their status, and will also make sure to speak with you about whether we can remove the note in further z-stream releases as well. Does that sound ok?

For the time being, I will add the doc text to the release notes so that the known issue is covered there.

Kind regards,

Andrew
Verified.
Hi Allon,

I am tracking this bug for the 3.6 beta release notes. As your conversation with Andrew in comment 22 and comment 23 suggests, should I remove the doc text for this bug now that it is verified? Is there anything here that users need to know that we should include in the release notes?

Kind regards,

Lucy
Hi Lucy,

Sorry for the late reply - I was on PTO for the local holidays.

In RHEV 3.6.0 there's nothing to document for this bug. Simply installing VDSM will pull the relevant gluster and selinux libraries, so creating a gluster storage domain is possible. In short, this release note should be removed in 3.6.0 [beta].

Thanks!
Thanks, Allon. I'll set the 'requires_release_note' flag back to blank (I can't seem to set it to '-'), and remove the text now.