Bug 857632

Summary: When removing gluster storage, it always fails on timeout (POSIXFS)
Product: Red Hat Enterprise Virtualization Manager Reporter: Petr Dufek <pdufek>
Component: ovirt-engine Assignee: Tal Nisan <tnisan>
Status: CLOSED CURRENTRELEASE QA Contact: RedHat Israel QE <bugzilla-qe-tlv>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.1.0 CC: abaron, adarazs, amureini, bazulay, bugzilla-qe-tlv, danken, dyasny, hateya, iheim, lpeer, Rhev-m-bugs, sgrinber, yeylon, ykaul
Target Milestone: --- Keywords: TestBlocker
Target Release: 3.1.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: SI21 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-12-04 20:04:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description  Flags
vdsm log     none
logs         none

Description Petr Dufek 2012-09-15 10:29:50 UTC
Created attachment 613229 [details]
vdsm log

Description of problem:
When removing gluster storage, it always fails on timeout.


How reproducible:
- attach gluster storage (storage type: POSIX compliant FS, VFS type: glusterfs)
- creating a VM with a disk on this storage works (can be seen in the log)
- detach the storage
- remove the storage: the timeout always happens


Additional info:

Host:
-----
vdsm-4.9.6-34.0.el6_3.x86_64 (or tested also with: vdsm-4.9.6-31.0.el6_3.x86_64)
glusterfs-debuginfo-3.3.0qa45-1.el6.x86_64
glusterfs-3.3.0qa45-1.el6.x86_64
glusterfs-devel-3.3.0qa45-1.el6.x86_64
glusterfs-fuse-3.3.0qa45-1.el6.x86_64

- vdsm log is attached.

Comment 3 Petr Dufek 2012-09-17 10:46:52 UTC
command which fails: connectStorageServer (Could not connect to storageServer, MountError: (32, ';mount.nfs: Connection timed out\n'))

workflow of commands is as follows:
- validateStorageServerConnection
- connectStorageServer
- createStorageDomain
- getStorageDomainStats
- validateStorageServerConnection
- connectStorageServer
- createStoragePool
- validateStorageServerConnection
- connectStorageServer
- connectStoragePool
- activateStorageDomain
- connectStoragePool
- createVolume
- updateVM
- removeVM
- deleteImage
- deactivateStorageDomain
- disconnectStoragePool
- disconnectStorageServer
- validateStorageServerConnection
- connectStorageServer
- connectStoragePool
- destroyStoragePool
- disconnectStorageServer
- disconnectStoragePool
- validateStorageServerConnection
- connectStorageServer

the last 'connectStorageServer' always returns:
Thread-281661::ERROR::2012-09-17 12:09:26,194::hsm::1971::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 1968, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 179, in connect
    self._mount.mount(self.options, self._vfsType)
  File "/usr/share/vdsm/storage/mount.py", line 198, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 214, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ';mount.nfs: Connection timed out\n')
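
To find every failing call in a long vdsm log, the header lines can be parsed mechanically. The following is only a minimal sketch: the field layout (thread, level, timestamp, module, line number, logger, function, message, separated by '::') is inferred from the log lines quoted in this bug, and the names VdsmLogLine, parse_line and failed_calls are hypothetical helpers, not part of vdsm.

```python
import re
from typing import NamedTuple, Optional


class VdsmLogLine(NamedTuple):
    thread: str
    level: str
    timestamp: str
    module: str
    lineno: int
    logger: str
    func: str
    message: str


# Field layout inferred only from the log lines quoted in this bug (assumption).
_LINE_RE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+)::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)::"
    r"\((?P<func>[^)]*)\) (?P<message>.*)$"
)


def parse_line(line: str) -> Optional[VdsmLogLine]:
    """Parse one log header line; return None for traceback/continuation lines."""
    match = _LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    fields = match.groupdict()
    fields["lineno"] = int(fields["lineno"])
    return VdsmLogLine(**fields)


def failed_calls(lines):
    """Yield (timestamp, func) for every ERROR header line."""
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and parsed.level == "ERROR":
            yield parsed.timestamp, parsed.func
```

Feeding the attached vdsm log through failed_calls(open(path)) should list the connectStorageServer failures with their timestamps, which makes it easier to match them against the engine log.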

Comment 4 Ayal Baron 2012-09-23 07:20:35 UTC
(In reply to comment #0)
> Created attachment 613229 [details]
> vdsm log
> 
> Description of problem:
> When removing gluster storage, it always fails on timeout.

What do you mean by removing?
What do you actually do? Remove how?

Please attach the engine log as well (always).

Comment 5 Petr Dufek 2012-09-25 11:50:48 UTC
Testing scenario:
- add gluster storage in webadmin (storage type: POSIX compliant FS, VFS type: glusterfs) and attach it to a data center
- create a VM with a disk on this storage in webadmin (can be seen in the log)
- move the storage to maintenance in webadmin
- remove the data center in webadmin
- remove the storage in webadmin: the timeout always happens

vdsm & engine logs are attached

Comment 6 Petr Dufek 2012-09-25 11:51:22 UTC
Created attachment 616994 [details]
logs

Comment 7 Attila Darazs 2012-09-27 12:15:01 UTC
This issue blocks us from adding "Posix FS over Gluster" run mode to our basic Storage Sanity test.

Comment 9 Ayal Baron 2012-09-29 14:22:03 UTC
After the detach, the engine did not pass the vfs_type to vdsm, so the connect failed (note that conList carries no vfs_type and that the error comes from mount.nfs):

Thread-1304::INFO::2012-09-25 13:30:05,137::logUtils::37::dispatcher::(wrapper) Run and protect: validateStorageServerConnection(domType=6, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': 'filer01.qa.lab.tlv.redhat.com:/pdufekdd', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '00928e25-79ca-4ca2-bc39-7e398c01b47f', 'port': ''}], options=None)

Thread-1305::ERROR::2012-09-25 13:32:17,212::hsm::1971::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 1968, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 179, in connect
    self._mount.mount(self.options, self._vfsType)
  File "/usr/share/vdsm/storage/mount.py", line 198, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 214, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ';mount.nfs: Connection timed out\n')
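
The effect of the missing vfs_type can be illustrated with a small sketch. This is not vdsm's actual code: build_mount_cmd and connection_vfs_type are hypothetical helpers, the mount point /mnt/example is made up, and the connection dict is copied from the log line above. The assumption is that, without a -t argument, mount(8) treats a host:/path source as NFS, which is consistent with the mount.nfs error in the traceback.

```python
from typing import Dict, List, Optional


def connection_vfs_type(con: Dict[str, str]) -> Optional[str]:
    """Return the vfs_type of a connection dict, or None if absent or empty."""
    return con.get("vfs_type") or None


def build_mount_cmd(spec: str, mountpoint: str,
                    vfs_type: Optional[str] = None,
                    options: Optional[str] = None) -> List[str]:
    """Build a mount(8) argument list; -t is added only when vfs_type is known."""
    cmd = ["mount"]
    if vfs_type:
        cmd += ["-t", vfs_type]
    if options:
        cmd += ["-o", options]
    cmd += [spec, mountpoint]
    return cmd


# Connection parameters as sent by the engine after the detach (from the log).
con_from_log = {
    "connection": "filer01.qa.lab.tlv.redhat.com:/pdufekdd",
    "iqn": "", "portal": "", "user": "", "password": "******",
    "id": "00928e25-79ca-4ca2-bc39-7e398c01b47f", "port": "",
}

# No vfs_type key, so no "-t glusterfs" ends up on the command line.
broken_cmd = build_mount_cmd(con_from_log["connection"], "/mnt/example",
                             connection_vfs_type(con_from_log))

# What the request would need to carry for a gluster mount (illustrative).
fixed_con = dict(con_from_log, vfs_type="glusterfs")
fixed_cmd = build_mount_cmd(fixed_con["connection"], "/mnt/example",
                            connection_vfs_type(fixed_con))
```

Under these assumptions, broken_cmd has no filesystem type at all, while fixed_cmd contains "-t glusterfs"; the difference is only in what the engine sends, which is where the fix was made.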

Comment 11 Tal Nisan 2012-10-15 17:36:26 UTC
http://gerrit.ovirt.org/#/c/8576/

Comment 13 Petr Dufek 2012-10-19 09:50:30 UTC
Verified in SI21.