Created attachment 963590 [details]
host and engine logs

Description of problem:
[rhev-upgrade] > hosts losing connectivity with nfs and iscsi domains in a mixed environment, and can't reconnect.

As part of the rhev-upgrade setup, we have a large mixed environment with multiple DCs, clusters, and hosts. Several hosts keep losing connectivity and are unable to reconnect to the storage domains (mixed NFS and iSCSI). Everything looks fine on the network side, so the reason for the repeated loss of connectivity is unclear.

All relevant logs for both hosts and the engine are attached. Please ssh to the relevant hosts and the engine. This is a severe issue: the hosts are in a non-operational state and cannot reconnect to the storage domains.

hosts:
leopard02.qa.lab.tlv.redhat.com
alma03.qa.lab.tlv.redhat.com

engine:
10.35.161.37

Version-Release number of selected component (if applicable):
3.5.0-0.22.el6ev
Created attachment 963592 [details] host and engine logs2
(In reply to Michael Burman from comment #0)
> Created attachment 963590 [details]
> host and engine logs
>
> Description of problem:
> [rhev-upgrade] > hosts losing connectivity with nfs and iscsi domains in a
> mixed environment, and can't reconnect.

What do you mean by "mixed"? Different host versions? Different architectures? Different storage types?
Different storage types (NFS, iSCSI).
Mixed DC (Intel + AMD clusters).
Several DCs, clusters, and hosts in the same environment (10.37.161.37).
The host fails to mount a gluster domain:

2014-12-02 09:26:52,721 ERROR [org.ovirt.engine.core.bll.storage.GLUSTERFSStorageHelper] (DefaultQuartzScheduler_Worker-17) [65ce5ff5] The connection with details 10.35.160.202:/ogofen1 failed because of error code 477 and error message is: problem while trying to mount target

Thread-13::DEBUG::2014-12-02 09:26:54,753::task::595::Storage.TaskManager.Task::(_updateState) Task=`af1433bc-5b9a-47f1-8265-f275912a3228`::moving from state init -> state preparing
Thread-13::INFO::2014-12-02 09:26:54,754::logUtils::44::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=7, spUUID='ba5d5f70-b014-4b33-bc81-de7df2f88574', conList=[{'port': '', 'connection': '10.35.160.202:/ogofen1', 'iqn': '', 'user': '', 'tpgt': '1', 'vfs_type': 'glusterfs', 'password': '******', 'id': 'ef9e98e6-fe20-4599-955e-2d288ba14de2'}], options=None)
Thread-13::DEBUG::2014-12-02 09:26:54,756::fileUtils::142::Storage.fileUtils::(createdir) Creating directory: /rhev/data-center/mnt/glusterSD/10.35.160.202:_ogofen1
Thread-13::DEBUG::2014-12-02 09:26:54,777::mount::227::Storage.Misc.excCmd::(_runcmd) /usr/bin/sudo -n /bin/mount -t glusterfs 10.35.160.202:/ogofen1 /rhev/data-center/mnt/glusterSD/10.35.160.202:_ogofen1 (cwd None)
Thread-13::ERROR::2014-12-02 09:26:54,813::storageServer::211::Storage.StorageServer.MountConnection::(connect) Mount failed: (32, ";mount: unknown filesystem type 'glusterfs'\n")
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storageServer.py", line 209, in connect
    self._mount.mount(self.options, self._vfsType)
  File "/usr/share/vdsm/storage/mount.py", line 223, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 239, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ";mount: unknown filesystem type 'glusterfs'\n")
Thread-13::ERROR::2014-12-02 09:26:54,819::hsm::2433::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2430, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 217, in connect
    raise e
MountError: (32, ";mount: unknown filesystem type 'glusterfs'\n")

Because GlusterFS is not installed. Please install it properly and retry.

*** This bug has been marked as a duplicate of bug 1160653 ***
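
For anyone hitting the same MountError, a quick check on the host may help confirm the diagnosis before retrying. The following is only a minimal sketch, not part of vdsm: the function names and the default search directories are assumptions made for illustration. It extracts the rejected filesystem type from the mount output shown in the log and looks for a matching mount.<type> helper, which mount(8) relies on for filesystems such as glusterfs.

```python
import os
import re

# Error text emitted by mount(8), as seen in the vdsm log in this comment.
_UNKNOWN_FS = re.compile(r"unknown filesystem type '([^']+)'")


def unknown_fs_type(mount_output):
    """Return the filesystem type that mount(8) rejected, or None."""
    match = _UNKNOWN_FS.search(mount_output)
    return match.group(1) if match else None


def find_mount_helper(vfs_type, search_dirs=("/sbin", "/usr/sbin")):
    """Return the path of an executable mount.<vfs_type> helper, or None.

    search_dirs is a hypothetical default chosen for this sketch; it is
    not taken from vdsm or from the logs.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, "mount." + vfs_type)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # stderr portion of the MountError from the log
    err = ";mount: unknown filesystem type 'glusterfs'\n"
    fs = unknown_fs_type(err)
    if fs and find_mount_helper(fs) is None:
        print("no mount.%s helper found; is the client package installed?" % fs)
```

If no helper is found for glusterfs, the GlusterFS client is most likely missing on that host, which matches the conclusion above.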