Bug 1018364 - can't contact vdsm after storage issues
Summary: can't contact vdsm after storage issues
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.2.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.5.0
Assignee: Sergey Gotliv
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Depends On:
Blocks:
Reported: 2013-10-11 18:55 UTC by Amador Pahim
Modified: 2018-12-03 20:16 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-04-07 07:20:05 UTC
oVirt Team: Storage
Target Upstream Version:



Description Amador Pahim 2013-10-11 18:55:39 UTC
Description of problem:
After some errors contacting storage, RHEV Manager could not contact the hypervisor until the vdsm service was restarted. The hypervisor became Unresponsive.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 6.4 (20130709.0.el6_4)
vdsm-4.10.2-23.0.el6ev.x86_64

How reproducible:
Not reproduced. Seems like a one-time issue.
Possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=871355 (but no defunct processes were created).

Additional info:

Relevant logs at the issue time:

Thread-641739::ERROR::2013-09-18 21:51:32,875::domainMonitor::225::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain e310a1af-5aa2-4371-b3c6-dbf36d6cbc50 monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 201, in _monitorDomain
  File "/usr/share/vdsm/storage/sdc.py", line 49, in __getattr__
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
  File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
  File "/usr/share/vdsm/storage/nfsSD.py", line 127, in findDomain
  File "/usr/share/vdsm/storage/nfsSD.py", line 117, in findDomainPath
StorageDomainDoesNotExist: Storage domain does not exist: (u'e310a1af-5aa2-4371-b3c6-dbf36d6cbc50',)

...

BindingXMLRPC::ERROR::2013-09-18 21:54:35,915::BindingXMLRPC::72::vds::(threaded_start) xml-rpc handler exception
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 68, in threaded_start
  File "/usr/lib64/python2.6/SocketServer.py", line 268, in handle_request
  File "/usr/lib64/python2.6/SocketServer.py", line 278, in _handle_request_noblock
  File "/usr/lib64/python2.6/SocketServer.py", line 446, in get_request
  File "/usr/lib64/python2.6/site-packages/vdsm/SecureXMLRPCServer.py", line 116, in accept
  File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 167, in accept
  File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 156, in accept_ssl
SSLError: (110, 'Connection timed out')
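
For anyone triaging a longer vdsm.log around the same window, the record headers quoted above can be pulled apart with a small script. This is only a sketch: the field layout (thread, level, timestamp, module, line number, logger, function, message) is inferred from the two records in this report, not taken from vdsm documentation, and the names RECORD, parse_record and error_records are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout, inferred only from the two records quoted in this report:
#   thread::LEVEL::timestamp::module::lineno::logger::(function) message
RECORD = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)::"
    r"\((?P<function>[^)]+)\) (?P<message>.*)$"
)


def parse_record(line):
    """Return a dict of fields for a record header line, or None otherwise."""
    match = RECORD.match(line.rstrip("\n"))
    if match is None:
        return None  # traceback lines, "..." and blank lines do not match
    fields = match.groupdict()
    fields["lineno"] = int(fields["lineno"])
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%Y-%m-%d %H:%M:%S,%f"
    )
    return fields


def error_records(lines):
    """Keep only the parsed records whose level is ERROR."""
    parsed = (parse_record(line) for line in lines)
    return [rec for rec in parsed if rec is not None and rec["level"] == "ERROR"]


if __name__ == "__main__":
    sample = [
        "BindingXMLRPC::ERROR::2013-09-18 21:54:35,915::BindingXMLRPC::72"
        "::vds::(threaded_start) xml-rpc handler exception",
        "Traceback (most recent call last):",
    ]
    for rec in error_records(sample):
        print(rec["timestamp"], rec["logger"], rec["function"], rec["message"])
```

Lines that are not record headers (the traceback body) are skipped rather than attached to the preceding record, so this only helps to locate the ERROR entries and their timestamps.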

Comment 3 Ayal Baron 2013-12-18 09:22:19 UTC
Sergey, any update on this one?

