Bug 799918

Summary: [error handling] Temporarily unavailable NFS storage is not restored and can not be activated
Product: [Retired] oVirt
Reporter: Rami Vaknin <rvaknin>
Component: vdsm
Assignee: Dan Kenigsberg <danken>
Status: CLOSED WONTFIX
QA Contact:
Severity: high
Docs Contact:
Priority: unspecified
Version: unspecified
CC: abaron, acathrow, amureini, bazulay, iheim, mgoldboi, yeylon, ykaul
Target Milestone: ---
Target Release: 3.3.4
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-03-12 09:36:30 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
VDSM Logs (flags: none)

Description Rami Vaknin 2012-03-05 12:15:40 UTC
Created attachment 567581
VDSM Logs

Version:
vdsm-4.9.4-0.59.gitf01998b.fc16.x86_64

My scenario:
1. Attach NFS export storage domain to iSCSI data center
2. Import template from that export domain
3. Disconnect networking between the hosts and the export domain's NFS server

Result:
The export domain is down and cannot be activated

Reason:
disconnectStorageServer fails to umount the export storage domain's mount point (as expected), so the export domain stays mounted. connectStorageServer then tries to mount the already-mounted export domain again, gets an error, and fails.
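
To illustrate the analysis above, here is a minimal sketch of a pre-mount check that would let a connect step skip the mount when the target is already mounted. It is not vdsm code: find_mount and needs_mount are hypothetical helper names, and the sketch assumes the mount table is passed in as text in the /proc/mounts format.

```python
import re


def _unescape(field):
    # /proc/mounts encodes characters such as space as octal escapes (\040).
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def find_mount(mountpoint, mounts_text):
    """Return (source, fstype) mounted at mountpoint, or None if not mounted.

    Hypothetical helper: mounts_text is the content of /proc/mounts.
    """
    target = mountpoint.rstrip("/") or "/"
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        if _unescape(fields[1]) == target:
            # Later entries shadow earlier ones, so the last match wins.
            found = (_unescape(fields[0]), fields[2])
    return found


def needs_mount(remote, mountpoint, mounts_text):
    """Hypothetical pre-check: True if a mount is needed, False if the
    same remote is already mounted at mountpoint."""
    current = find_mount(mountpoint, mounts_text)
    if current is None:
        return True
    if current[0] == remote:
        return False
    raise RuntimeError(
        "%s is already mounted from %s" % (mountpoint, current[0]))
```

A caller would read the table with open("/proc/mounts").read() and only issue the mount command when needs_mount returns True. Whether skipping the mount is safe for a stale NFS mount is a separate question that this sketch does not address.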


The disconnectStorageServer:
----------------------------

Thread-64103::INFO::2012-03-05 12:37:42,957::logUtils::37::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': 'qanashead.qa.lab.tlv.redhat.com:/export/rami/export_backup', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': 'c188db30-2434-4d17-9296-f12ffa3bb591', 'port': ''}], options=None)
Thread-64103::INFO::2012-03-05 12:37:42,958::storage_connection::167::Storage.ServerConnection::(disconnect) Request to disconnect NFS storage server
Thread-64103::WARNING::2012-03-05 12:37:42,959::storage_connection::382::Storage.ServerConnection::(__disconnectNFSServer) Cannot remove mountpoint after umount()
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storage_connection.py", line 378, in __disconnectNFSServer
    os.rmdir(mnt.fs_file)
OSError: [Errno 16] Device or resource busy: '/rhev/data-center/mnt/qanashead.qa.lab.tlv.redhat.com:_export_rami_export__backup'

The connectStorageServer:
-------------------------

Thread-66813::INFO::2012-03-05 13:48:22,016::logUtils::37::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': 'qanashead.qa.lab.tlv.redhat.com:/export/rami/export_backup', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': 'c188db30-2434-4d17-9296-f12ffa3bb591', 'port': ''}], options=None)
Thread-66813::INFO::2012-03-05 13:48:22,016::storage_connection::146::Storage.ServerConnection::(connect) Request to connect NFS storage server
Thread-66813::DEBUG::2012-03-05 13:48:22,020::mount::117::Storage.Misc.excCmd::(_runcmd) '/usr/bin/sudo -n /bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6 qanashead.qa.lab.tlv.redhat.com:/export/rami/export_backup /rhev/data-center/mnt/qanashead.qa.lab.tlv.redhat.com:_export_rami_export__backup' (cwd None)
Thread-66813::ERROR::2012-03-05 13:48:22,132::storage_connection::255::Storage.ServerConnection::(__connectNFSServer) Error during storage connection
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storage_connection.py", line 253, in __connectNFSServer
    mnt.mount(getNfsOptions(con), mount.VFS_NFS, timeout=CON_TIMEOUT)
  File "/usr/share/vdsm/storage/mount.py", line 113, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 128, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ';mount.nfs: mounting qanashead.qa.lab.tlv.redhat.com:/export/rami/export_backup failed, reason given by server:\n  No such file or directory\n')
Thread-66813::DEBUG::2012-03-05 13:48:22,137::lvm::457::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-66813::DEBUG::2012-03-05 13:48:22,137::lvm::459::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-66813::DEBUG::2012-03-05 13:48:22,137::lvm::469::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' got the operation mutex
Thread-66813::DEBUG::2012-03-05 13:48:22,138::lvm::471::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' released the operation mutex
Thread-66813::DEBUG::2012-03-05 13:48:22,138::lvm::490::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-66813::DEBUG::2012-03-05 13:48:22,138::lvm::492::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-66813::INFO::2012-03-05 13:48:22,139::logUtils::39::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 451, 'id': 'c188db30-2434-4d17-9296-f12ffa3bb591'}]}

Comment 1 Rami Vaknin 2012-03-05 12:21:10 UTC
Forgot to mention - the workaround for that situation is to manually umount the export domain and activate it again via webadmin.
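
A rough sketch of the manual umount step, for anyone scripting it. The path mapping is inferred only from the logs in this report ("/" appears as "_" and "_" as "__" under /rhev/data-center/mnt), so treat mountpoint_for as an assumption and not as vdsm's actual implementation. The helper names are hypothetical. The -l (lazy) option is a standard umount(8) flag that may help when the NFS server is unreachable.

```python
import subprocess

# Base directory as seen in the logs of this report.
MNT_BASE = "/rhev/data-center/mnt"


def mountpoint_for(connection, base=MNT_BASE):
    """Guess the local mount point for an NFS connection string.

    Assumption inferred from the logs: "_" is doubled first, then "/"
    is replaced by "_".
    """
    name = connection.replace("_", "__").replace("/", "_")
    return base + "/" + name


def umount_command(mountpoint, lazy=False):
    """Build the umount command line; lazy=True adds the -l flag."""
    cmd = ["umount"]
    if lazy:
        cmd.append("-l")
    cmd.append(mountpoint)
    return cmd


def umount(mountpoint, lazy=False):
    # Needs root privileges; raises CalledProcessError if umount fails.
    subprocess.check_call(umount_command(mountpoint, lazy))


if __name__ == "__main__":
    conn = "qanashead.qa.lab.tlv.redhat.com:/export/rami/export_backup"
    print(" ".join(umount_command(mountpoint_for(conn), lazy=True)))
```

After the umount succeeds, the domain is activated again via webadmin as described above.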

Comment 2 Itamar Heim 2013-03-12 09:36:30 UTC
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.