Description of problem:
After some errors contacting storage, RHEV Manager could not contact the hypervisor until the vdsm service was restarted. The hypervisor became Unresponsive.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 6.4 (20130709.0.el6_4)
vdsm-4.10.2-23.0.el6ev.x86_64

How reproducible:
Not reproduced. Seems to be a one-time issue. Possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=871355 (but no defunct processes were created).

Additional info:
Relevant logs from the time of the issue:

Thread-641739::ERROR::2013-09-18 21:51:32,875::domainMonitor::225::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain e310a1af-5aa2-4371-b3c6-dbf36d6cbc50 monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 201, in _monitorDomain
  File "/usr/share/vdsm/storage/sdc.py", line 49, in __getattr__
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
  File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
  File "/usr/share/vdsm/storage/nfsSD.py", line 127, in findDomain
  File "/usr/share/vdsm/storage/nfsSD.py", line 117, in findDomainPath
StorageDomainDoesNotExist: Storage domain does not exist: (u'e310a1af-5aa2-4371-b3c6-dbf36d6cbc50',)
...
BindingXMLRPC::ERROR::2013-09-18 21:54:35,915::BindingXMLRPC::72::vds::(threaded_start) xml-rpc handler exception
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 68, in threaded_start
  File "/usr/lib64/python2.6/SocketServer.py", line 268, in handle_request
  File "/usr/lib64/python2.6/SocketServer.py", line 278, in _handle_request_noblock
  File "/usr/lib64/python2.6/SocketServer.py", line 446, in get_request
  File "/usr/lib64/python2.6/site-packages/vdsm/SecureXMLRPCServer.py", line 116, in accept
  File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 167, in accept
  File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 156, in accept_ssl
SSLError: (110, 'Connection timed out')
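
For anyone correlating the storage errors with the later XML-RPC failure, a small script can pull the header lines out of vdsm.log and compare timestamps. The following is only a rough sketch: the field layout is inferred from the two header lines quoted in this report rather than from the vdsm logging configuration, and `LogEntry`, `parse_line` and the field names are hypothetical labels chosen for illustration.

```python
import re
from collections import namedtuple
from datetime import datetime

# Field names are illustrative labels, not vdsm terminology.
LogEntry = namedtuple(
    "LogEntry",
    "thread level timestamp module lineno logger func message",
)

# The last "::"-separated field looks like "(function) free-form message".
_TAIL = re.compile(r"\((?P<func>[^)]*)\)\s*(?P<message>.*)", re.DOTALL)


def parse_line(line):
    """Parse one log header line; return None for anything else.

    Traceback lines and other continuation lines do not have seven
    "::"-separated fields, so they are skipped.
    """
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7:
        return None
    thread, level, stamp, module, lineno, logger, tail = parts
    match = _TAIL.match(tail)
    if match is None:
        return None
    try:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
        number = int(lineno)
    except ValueError:
        return None
    return LogEntry(thread, level, when, module, number, logger,
                    match.group("func"), match.group("message"))


if __name__ == "__main__":
    # The two header lines quoted in this report.
    sample = [
        "Thread-641739::ERROR::2013-09-18 21:51:32,875::domainMonitor::225::"
        "Storage.DomainMonitorThread::(_monitorDomain) Error while collecting "
        "domain e310a1af-5aa2-4371-b3c6-dbf36d6cbc50 monitoring information",
        "Traceback (most recent call last):",
        "BindingXMLRPC::ERROR::2013-09-18 21:54:35,915::BindingXMLRPC::72::"
        "vds::(threaded_start) xml-rpc handler exception",
    ]
    errors = [e for e in map(parse_line, sample) if e and e.level == "ERROR"]
    for entry in errors:
        print(entry.timestamp, entry.logger, entry.func, entry.message)
    # Time between the storage error and the XML-RPC handler exception.
    print(errors[-1].timestamp - errors[0].timestamp)
```

Run against the full log, filtering on `level == "ERROR"` and sorting by `timestamp` should make it easier to see whether the domain monitor errors consistently precede the `accept_ssl` timeout.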
Sergey, any update on this one?