Bug 1165215 - [vdsm] connectStorageServer fails on a RHEL7 host while trying to connect to a gluster domain which was created using a RHEL6 host
Summary: [vdsm] connectStorageServer fails on a RHEL7 host while trying to connect to ...
Keywords:
Status: CLOSED DUPLICATE of bug 1177651
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.0
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Adam Litke
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Duplicates: 1165243 (view as bug list)
Depends On:
Blocks: 1165243
 
Reported: 2014-11-18 15:17 UTC by Elad
Modified: 2016-02-10 19:12 UTC
CC List: 14 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1165243 (view as bug list)
Environment:
Last Closed: 2015-01-15 10:12:46 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
amureini: needinfo-


Attachments
/var/log from both hosts (9.00 MB, application/x-gzip)
2014-11-18 15:17 UTC, Elad
sosreport (11.70 MB, application/x-gzip)
2014-12-02 11:37 UTC, Elad
debug level logs (13.02 MB, application/x-gzip)
2014-12-04 14:38 UTC, Elad
gluster logs (439.75 KB, application/x-gzip)
2014-12-07 09:21 UTC, Elad

Description Elad 2014-11-18 15:17:21 UTC
Created attachment 958622 [details]
/var/log from both hosts

Description of problem:
I tried to add a new RHEL7 host to a DC that already had a host with RHEL6.6 installed. The DC contained a gluster domain which had been created using the RHEL6.6 host. The new RHEL7 host was unable to connect to the storage pool due to a mount failure for the gluster domain.

Version-Release number of selected component (if applicable):

rhev 3.5 vt10

On the RHEL7 host:
Red Hat Enterprise Linux Server release 7.0 (Maipo)
vdsm-4.16.7.4-1.el7ev.x86_64
scsi-target-utils-gluster-1.0.46-3.el7.x86_64
glusterfs-api-devel-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-api-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-3.4.0.65rhs-1.el7_0.x86_64
samba-vfs-glusterfs-4.1.1-37.el7_0.x86_64
glusterfs-rdma-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-debuginfo-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-libs-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-fuse-3.4.0.65rhs-1.el7_0.x86_64
glusterfs-devel-3.4.0.65rhs-1.el7_0.x86_64
libvirt-daemon-1.1.1-29.el7_0.3.x86_64
qemu-kvm-rhev-1.5.3-60.el7_0.10.x86_64
sanlock-3.1.0-2.el7.x86_64

On the RHEL6.6 host:
Red Hat Enterprise Linux Server release 6.6 (Santiago)
vdsm-4.16.7.4-1.el6ev.x86_64
glusterfs-api-devel-3.6.0.28-2.el6.x86_64
glusterfs-cli-3.6.0.28-2.el6.x86_64
glusterfs-devel-3.6.0.28-2.el6.x86_64
samba-glusterfs-3.6.23-12.el6.x86_64
glusterfs-rdma-3.6.0.28-2.el6.x86_64
glusterfs-debuginfo-3.4.0.57rhs-1.el6_5.x86_64
glusterfs-libs-3.6.0.28-2.el6.x86_64
glusterfs-api-3.6.0.28-2.el6.x86_64
glusterfs-3.6.0.28-2.el6.x86_64
glusterfs-fuse-3.6.0.28-2.el6.x86_64
libvirt-0.10.2-46.el6_6.1.x86_64
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
sanlock-2.8-1.el6.x86_64

rhevm-3.5.0-0.20.el6ev.noarch


How reproducible:
Always

Steps to Reproduce:
Have 2 hosts: one with RHEL6.6 and one with RHEL7. Install all the gluster packages mentioned above.
1. Create a gluster domain using a RHEL6.6 host 

Thread-52::DEBUG::2014-11-18 15:40:15,924::fileUtils::142::Storage.fileUtils::(createdir) Creating directory: /rhev/data-center/mnt/glusterSD/10.35.160.202:_elad1
Thread-52::DEBUG::2014-11-18 15:40:15,925::mount::227::Storage.Misc.excCmd::(_runcmd) /usr/bin/sudo -n /bin/mount -t glusterfs 10.35.160.202:/elad1 /rhev/data-center/mnt/glusterSD/10.35.160.202:_elad1 (cwd None)

2. Add a new cluster to the DC and add a RHEL7 host to it

Actual results:
Host is unable to connect to the storage pool:

vdsm.log:

Thread-342::DEBUG::2014-11-18 17:00:04,106::mount::227::Storage.Misc.excCmd::(_runcmd) /usr/bin/sudo -n /usr/bin/mount -t glusterfs 10.35.160.202:/elad1 /rhev/data-center/mnt/glusterSD/10.35.160.202:_elad1 (cwd None)
Thread-342::ERROR::2014-11-18 17:00:04,189::storageServer::211::Storage.StorageServer.MountConnection::(connect) Mount failed: (1, 'Mount failed. Please check the log file for more details.\n;')
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storageServer.py", line 209, in connect
    self._mount.mount(self.options, self._vfsType)
  File "/usr/share/vdsm/storage/mount.py", line 223, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 239, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (1, 'Mount failed. Please check the log file for more details.\n;')
Thread-342::ERROR::2014-11-18 17:00:04,189::hsm::2433::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2430, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 217, in connect
    raise e
MountError: (1, 'Mount failed. Please check the log file for more details.\n;')
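
Note: the generic 'Mount failed. Please check the log file for more details.' message comes from the mount.glusterfs helper; the log it points to is the GlusterFS client mount log on the host, not vdsm.log. Assuming the default client log directory (/var/log/glusterfs) and the file naming derived from the VDSM mount point (see also comment 8), it can be located with something like:

# ls /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log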


engine.log:

2014-11-18 16:50:03,570 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-37) [3a4fb9aa] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to connect Host green-vdsc to the Storage Domains gluster1.
2014-11-18 16:50:03,571 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (org.ovirt.thread.pool-7-thread-37) [3a4fb9aa] FINISH, ConnectStorageServerVDSCommand, return: {a768d625-9c70-4440-b81b-18a1d9494221=477}, log id: b513854
2014-11-18 16:50:03,588 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-37) [3a4fb9aa] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: The error message for connection 10.35.160.202:/elad1 returned by VDSM was: Problem while trying to mount target
2014-11-18 16:50:03,612 ERROR [org.ovirt.engine.core.bll.storage.GLUSTERFSStorageHelper] (org.ovirt.thread.pool-7-thread-37) [3a4fb9aa] The connection with details 10.35.160.202:/elad1 failed because of error code 477 and error message is: problem while trying to mount target



Tried to add a new host with RHEL6 installed to the DC. It succeeded.

The problem occurs only on RHEL7 hosts.
Mounting the gluster volume manually from the host succeeds:


[root@green-vdsc ~]# mount -t glusterfs 10.35.160.202:/elad1 /mnt
[root@green-vdsc ~]# mount
10.35.160.202:/elad1 on /mnt type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
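
Note that the manual mount above is not identical to the way VDSM runs it; per the vdsm.log line quoted earlier, VDSM invokes the mount through sudo into the domain's mount point. Reproducing that exact invocation (paths and address copied from the log, shown here only for illustration) would look roughly like:

[root@green-vdsc ~]# mkdir -p /rhev/data-center/mnt/glusterSD/10.35.160.202:_elad1
[root@green-vdsc ~]# sudo -n /usr/bin/mount -t glusterfs 10.35.160.202:/elad1 /rhev/data-center/mnt/glusterSD/10.35.160.202:_elad1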

Expected results:
Connecting a RHEL7 host to a gluster domain should succeed even if the domain was created by a RHEL6 host.

Additional info: 
/var/log from both hosts
engine.log

Comment 1 Allon Mureinik 2014-11-24 17:23:25 UTC
*** Bug 1165243 has been marked as a duplicate of this bug. ***

Comment 2 Allon Mureinik 2014-11-30 13:50:46 UTC
Seems like an inconsistency in glusterfs - can be handled async if needed.

Comment 3 Timothy Asir 2014-12-02 11:04:53 UTC
Can you also provide all the glusterd and brick logs?
You can run sosreport to get the logs.
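
For example, a minimal non-interactive run (exact options may vary between sosreport versions):

# sosreport --batch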

Comment 4 Elad 2014-12-02 11:37:55 UTC
Created attachment 963654 [details]
sosreport

A sosreport from the Gluster storage server is attached.
vdsm.log is also attached.

The relevant volume is named 'elad'. It has bricks in /export/elad on the peers.

Comment 5 Elad 2014-12-02 11:39:28 UTC
A reference from vdsm.log:

Thread-5539::ERROR::2014-12-02 13:28:57,096::hsm::2531::Storage.HSM::(disconnectStorageServer) Could not disconnect from storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2527, in disconnectStorageServer
    conObj.disconnect()
  File "/usr/share/vdsm/storage/storageServer.py", line 235, in disconnect
    self._mount.umount(True, True)
  File "/usr/share/vdsm/storage/mount.py", line 254, in umount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 239, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ';umount: /rhev/data-center/mnt/glusterSD/10.35.160.6:_elad: not mounted\n')

Comment 6 Timothy Asir 2014-12-04 07:24:58 UTC
Could you please set the client log level to DEBUG before recreating the issue
and attach the log?

To set client log level to DEBUG for a volume,
#gluster volume set VOLNAME diagnostics.client-log-level DEBUG
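
For example, assuming the volume from comment 4 ('elad'), and resetting the option back to its default after the debug logs have been collected:

# gluster volume set elad diagnostics.client-log-level DEBUG
# gluster volume info elad
# gluster volume reset elad diagnostics.client-log-level

While the option is set, 'gluster volume info' should list diagnostics.client-log-level under "Options Reconfigured".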

Comment 7 Elad 2014-12-04 14:38:50 UTC
Created attachment 964694 [details]
debug level logs

(In reply to Timothy Asir from comment #6)
> Could you please set the client log level to DEBUG before recreating the issue
> and attach the log?
> 
> To set client log level to DEBUG for a volume,
> #gluster volume set VOLNAME diagnostics.client-log-level DEBUG

Reproduced with the DEBUG log level set on the volume. The volume is 'elad2'.

Comment 8 Timothy Asir 2014-12-05 07:45:04 UTC
Please attach the latest rhev mount log from the el7 host
(rhev-data-center-mnt-glusterSD-10.35.160.202:_elad2.log).

Comment 9 Elad 2014-12-07 09:21:05 UTC
Created attachment 965528 [details]
gluster logs

Comment 10 Elad 2014-12-08 10:36:33 UTC
I think the issue here occurs due to SELinux on the RHEL7 host.
I checked it again with SELinux in permissive mode and the problem does not reproduce; the domain is created successfully.
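
For reference, the permissive-mode check and confirmation of AVC denials can be done with the standard SELinux tooling, e.g.:

[root@puma20 ~]# getenforce
[root@puma20 ~]# setenforce 0                # permissive until the next reboot
[root@puma20 ~]# ausearch -m avc -ts recent  # recent AVC denials (requires auditd)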

These are the SELinux RPMs installed on my RHEL7 host:

[root@puma20 ~]# rpm -qa |grep selinux
libselinux-2.2.2-6.el7.x86_64
libselinux-ruby-2.2.2-6.el7.x86_64
libselinux-utils-2.2.2-6.el7.x86_64
libselinux-python-2.2.2-6.el7.x86_64
selinux-policy-3.12.1-153.el7_0.12.noarch
selinux-policy-targeted-3.12.1-153.el7_0.12.noarch

[root@puma20 ~]# rpm -qa |grep gluster
glusterfs-api-3.5.3-1.el7.x86_64
glusterfs-rdma-3.5.3-1.el7.x86_64
glusterfs-debuginfo-3.5.3-1.el7.x86_64
glusterfs-3.5.3-1.el7.x86_64
glusterfs-extra-xlators-3.5.3-1.el7.x86_64
glusterfs-api-devel-3.5.3-1.el7.x86_64
glusterfs-libs-3.5.3-1.el7.x86_64
glusterfs-devel-3.5.3-1.el7.x86_64
glusterfs-fuse-3.5.3-1.el7.x86_64


vdsm-4.16.8.1-2.el7ev.x86_64


Adam, can you take a look?

Comment 11 Yaniv Lavi 2015-01-14 15:33:00 UTC
Is this a dup of BZ #1177651?
That was also an SELinux issue.

Comment 12 Elad 2015-01-15 07:54:58 UTC
Seems like an SELinux issue in both BZs.

Comment 13 Yaniv Lavi 2015-01-15 09:04:42 UTC
(In reply to Elad from comment #12)
> Seems like an SELinux issue in both BZs.

Can you try to recreate using 7.1 hosts?

Comment 14 Elad 2015-01-15 09:58:11 UTC
This was already tested by Ori as stated here https://bugzilla.redhat.com/show_bug.cgi?id=1181111#c10.

Comment 15 Yaniv Lavi 2015-01-15 10:12:46 UTC

*** This bug has been marked as a duplicate of bug 1177651 ***

