Bug 1169691 - [rhev-upgrade] > hosts losing connectivity with nfs and iscsi domains in a mixed environment, and can't reconnect
Summary: [rhev-upgrade] > hosts losing connectivity with nfs and iscsi domains in a m...
Keywords:
Status: CLOSED DUPLICATE of bug 1160653
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Maor
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Depends On:
Blocks: rhev35rcblocker rhev35gablocker
Reported: 2014-12-02 08:36 UTC by Michael Burman
Modified: 2016-02-10 20:45 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-12-07 09:06:47 UTC
oVirt Team: Storage


Attachments
host and engine logs (1.39 MB, application/x-gzip)
2014-12-02 08:36 UTC, Michael Burman
host and engine logs2 (1.96 MB, application/x-gzip)
2014-12-02 08:38 UTC, Michael Burman

Description Michael Burman 2014-12-02 08:36:59 UTC
Created attachment 963590 [details]
host and engine logs

Description of problem:
[rhev-upgrade] > hosts losing connectivity with nfs and iscsi domains in a mixed environment, and can't reconnect.
As part of the rhev-upgrade setup, we have a big mixed environment with mixed DCs, clusters, and hosts.
Several hosts keep losing connectivity and are unable to reconnect to the storage domains (mixed nfs and iscsi).
From the network side everything looks good, so I can't see why connectivity is lost every time.
Attaching all relevant logs for both the hosts and the engine.
Please ssh to the relevant hosts and the engine. This is a really bad issue: the hosts are in a non-operational state and can't reconnect to the storage domains.

hosts:
leopard02.qa.lab.tlv.redhat.com
alma03.qa.lab.tlv.redhat.com

engine:
10.35.161.37

Version-Release number of selected component (if applicable):
3.5.0-0.22.el6ev

Comment 1 Michael Burman 2014-12-02 08:38:02 UTC
Created attachment 963592 [details]
host and engine logs2

Comment 3 Allon Mureinik 2014-12-04 13:42:02 UTC
(In reply to Michael Burman from comment #0)
> Created attachment 963590 [details]
> host and engine logs
> 
> Description of problem:
> [rhev-upgrade] > hosts losing connectivity with nfs and iscsi domains in a
> mixed environment, and can't reconnect.
What do you mean "mixed"?
Different host versions?
Different architectures?
Different storage types?

Comment 4 Michael Burman 2014-12-07 07:19:14 UTC
Different storage types (nfs, iscsi)
Mixed DC (intel + amd clusters)
Several DCs, clusters and hosts in the same environment. (10.37.161.37)

Comment 5 Allon Mureinik 2014-12-07 09:06:47 UTC
The host fails to mount a gluster domain:

2014-12-02 09:26:52,721 ERROR [org.ovirt.engine.core.bll.storage.GLUSTERFSStorageHelper] (DefaultQuartzScheduler_Worker-17) [65ce5ff5] The connection with details 10.35.160.202:/ogofen1 failed because of error code 477 and error message is: problem while trying to mount target

Thread-13::DEBUG::2014-12-02 09:26:54,753::task::595::Storage.TaskManager.Task::(_updateState) Task=`af1433bc-5b9a-47f1-8265-f275912a3228`::moving from state init -> state preparing
Thread-13::INFO::2014-12-02 09:26:54,754::logUtils::44::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=7, spUUID='ba5d5f70-b014-4b33-bc81-de7df2f88574', conList=[{'port': '', 'connection': '10.35.160.202:/ogofen1', 'iqn': '', 'user': '', 'tpgt': '1', 'vfs_type': 'glusterfs', 'password': '******', 'id': 'ef9e98e6-fe20-4599-955e-2d288ba14de2'}], options=None)
Thread-13::DEBUG::2014-12-02 09:26:54,756::fileUtils::142::Storage.fileUtils::(createdir) Creating directory: /rhev/data-center/mnt/glusterSD/10.35.160.202:_ogofen1
Thread-13::DEBUG::2014-12-02 09:26:54,777::mount::227::Storage.Misc.excCmd::(_runcmd) /usr/bin/sudo -n /bin/mount -t glusterfs 10.35.160.202:/ogofen1 /rhev/data-center/mnt/glusterSD/10.35.160.202:_ogofen1 (cwd None)
Thread-13::ERROR::2014-12-02 09:26:54,813::storageServer::211::Storage.StorageServer.MountConnection::(connect) Mount failed: (32, ";mount: unknown filesystem type 'glusterfs'\n")
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storageServer.py", line 209, in connect
    self._mount.mount(self.options, self._vfsType)
  File "/usr/share/vdsm/storage/mount.py", line 223, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 239, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ";mount: unknown filesystem type 'glusterfs'\n")
Thread-13::ERROR::2014-12-02 09:26:54,819::hsm::2433::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2430, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 217, in connect
    raise e
MountError: (32, ";mount: unknown filesystem type 'glusterfs'\n")

The mount fails because GlusterFS is not installed on the host.
Please install it properly and retry.
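
For anyone hitting the same error, a quick way to confirm the diagnosis on a host is to check whether a mount helper for the filesystem type exists, since mount(8) delegates types such as glusterfs to an external /sbin/mount.<type> program. The snippet below is only a minimal Python 3 sketch under that assumption. The function names mount_helper_available and format_mount_error are hypothetical and are not part of vdsm; the second one simply mirrors the ";".join((out, err)) call visible in the traceback, which is why the logged message starts with ";" when stdout is empty.

```python
import os
import shutil


def mount_helper_available(vfs_type, search_dirs=("/sbin", "/usr/sbin")):
    """Return the path of the mount.<vfs_type> helper, or None if it is absent.

    Hypothetical helper for illustration only; it is not part of vdsm.
    """
    helper = "mount." + vfs_type
    for directory in search_dirs:
        candidate = os.path.join(directory, helper)
        # The helper must exist and be executable to be usable by mount(8).
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Fall back to a PATH lookup in case the helper lives elsewhere.
    return shutil.which(helper)


def format_mount_error(out, err):
    """Join stdout and stderr the same way the traceback above does."""
    return ";".join((out, err))


if __name__ == "__main__":
    path = mount_helper_available("glusterfs")
    if path is None:
        print("mount.glusterfs not found: GlusterFS client is probably not installed")
    else:
        print("found mount helper at", path)
```

If the helper is reported missing on a host that should mount a gluster domain, that matches the "unknown filesystem type 'glusterfs'" failure in the log above.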

*** This bug has been marked as a duplicate of bug 1160653 ***

