Created attachment 489630 [details]
vdsm log

Description of problem:
When no VMs are running, vdsm is not aware of libvirtd's status. If libvirtd is down and you try to run a VM, vdsm fails to communicate with libvirtd; prepareForShutdown starts and returns 'OK', but the process does not restart and hangs forever.

Version-Release number of selected component (if applicable):
vdsm-4.9-57.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Stop libvirtd.
2. Try to run a VM.

Actual results:
vdsm hangs forever.

Expected results:
vdsm should restart.

Full vdsm log attached.
*** Bug 693165 has been marked as a duplicate of this bug. ***
Created attachment 497751 [details]
vdsm log

Still reproducible in vdsm-4.9-64.el6.x86_64, vdsm hangs.

Thread-170::INFO::2011-05-09 05:34:16,499::dispatcher::100::Storage.Dispatcher.Protect::(run) Run and protect: prepareForShutdown, Return response: {'status': {'message': 'OK', 'code': 0}}
Thread-170::ERROR::2011-05-09 05:34:16,502::vm::618::vm.Vm::(_startUnderlyingVm) vmId=`e7436a97-d9ab-41d1-bbcd-048fb103edde`::Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 588, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 906, in _run
    if self._dom.UUIDString() != self.id:
  File "/usr/share/vdsm/libvirtvm.py", line 262, in __getattr__
    attr = getattr(self._dom, name)
AttributeError: 'NoneType' object has no attribute 'UUIDString'
Thread-170::DEBUG::2011-05-09 05:34:16,505::vm::1760::vm.Vm::(setDownStatus) vmId=`e7436a97-d9ab-41d1-bbcd-048fb103edde`::Changed state to Down: 'NoneType' object has no attribute 'UUIDString'
Actually, AttributeError: 'NoneType' object has no attribute 'UUIDString' is an artifact of bug 695244. Please recheck with vdsm-4.9-65, which fixed bug 702275.
Tested on vdsm-4.9-65 and the bug still occurs; the scenario is a bit different:
1) host (SPM) with running VMs on it
2) perform storage actions, such as detach/attach
3) restart libvirtd

Thread-954::ERROR::2011-05-12 15:51:28,145::utils::526::vm.Vm::(collect) vmId=`0c5efaeb-ba63-421e-8a59-1fe0f4ee7662`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x1acaa08>
Traceback (most recent call last):
  File "/usr/share/vdsm/utils.py", line 523, in collect
    statsFunction()
  File "/usr/share/vdsm/utils.py", line 407, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 62, in _highWrite
    dCap, dAlloc, dPhys = self._vm._dom.blockInfo(vmDrive.path, 0)
  File "/usr/share/vdsm/libvirtvm.py", line 306, in f
    ret = attr(*args, **kwargs)
  File "/usr/share/vdsm/libvirtconnection.py", line 59, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 806, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: cannot send data: Broken pipe
Thread-946::ERROR::2011-05-12 15:51:28,148::libvirtconnection::69::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-946::DEBUG::2011-05-12 15:51:28,149::clientIF::141::vds::(prepareForShutdown) cannot run prepareForShutdown twice
Thread-946::ERROR::2011-05-12 15:51:28,149::utils::526::vm.Vm::(collect) vmId=`074d712e-6f3d-4e1e-bffd-ab40a56c9067`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x1acaa08>
Traceback (most recent call last):
  File "/usr/share/vdsm/utils.py", line 523, in collect
    statsFunction()
  File "/usr/share/vdsm/utils.py", line 407, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 62, in _highWrite
    dCap, dAlloc, dPhys = self._vm._dom.blockInfo(vmDrive.path, 0)
  File "/usr/share/vdsm/libvirtvm.py", line 306, in f
    ret = attr(*args, **kwargs)
  File "/usr/share/vdsm/libvirtconnection.py", line 59, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 806, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: cannot send data: Broken pipe
Thread-1007::ERROR::2011-05-12 15:51:28,183::libvirtconnection::69::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-1007::DEBUG::2011-05-12 15:51:28,183::clientIF::141::vds::(prepareForShutdown) cannot run prepareForShutdown twice
Thread-1007::ERROR::2011-05-12 15:51:28,184::utils::526::vm.Vm::(collect) vmId=`7ccf8808-d464-42ea-8a09-5ec4f46de527`::Stats function failed: <AdvancedStatsFun
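The pattern visible in the log above (several threads hit a broken pipe, each one asks for shutdown, and only the first request is honoured) can be sketched roughly as follows. This is a minimal illustration, not vdsm's actual code: `BrokenConnectionError`, `ShutdownGuard` and `guard_connection` are hypothetical names, and the exception class merely stands in for a libvirtError raised when libvirtd is gone.

```python
import functools
import threading


class BrokenConnectionError(Exception):
    """Hypothetical stand-in for a libvirtError caused by a dead libvirtd."""


class ShutdownGuard:
    """Runs a shutdown callback at most once, however many threads ask."""

    def __init__(self, callback):
        self._callback = callback
        self._lock = threading.Lock()
        self._started = False
        self.skipped = 0  # how many repeated requests were ignored

    def trigger(self):
        with self._lock:
            if self._started:
                # Comparable to "cannot run prepareForShutdown twice".
                self.skipped += 1
                return False
            self._started = True
        # Run the callback outside the lock so it cannot block other callers.
        self._callback()
        return True


def guard_connection(guard):
    """Decorator: on a broken connection, request shutdown, then re-raise."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except BrokenConnectionError:
                guard.trigger()
                raise
        return wrapper
    return decorator
```

Note that the guard only ensures the shutdown routine is entered once; it says nothing about whether the process actually exits afterwards, which is the part that hangs in this report.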
The bug didn't reproduce using a ruth test implementing the steps described in comment #5: http://gerrit.usersys.redhat.com/419 Maybe we can get rid of the AdvancedStatsFunction exceptions stopping the AdvancedStatsThread when a libvirt broken pipe is raised. Can you check that this is still reproducible on vdsm-4.9-66.1?
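One way to read the suggestion about the AdvancedStatsFunction exceptions is to stop a sampling pass as soon as the connection is known to be lost, instead of logging one traceback per VM. The sketch below assumes that reading; `ConnectionLost` and `collect_stats` are hypothetical names and do not reflect the real AdvancedStatsThread interface.

```python
class ConnectionLost(Exception):
    """Hypothetical marker for 'cannot send data: Broken pipe' style failures."""


def collect_stats(stat_functions, log):
    """Run each stats function in order.

    Returns (completed, keep_running). An ordinary failure is logged and
    the pass continues; a lost connection is logged once and ends the pass.
    """
    completed = 0
    for func in stat_functions:
        try:
            func()
        except ConnectionLost as exc:
            log.append("stats stopped: %s" % exc)
            return completed, False
        except Exception as exc:
            log.append("stats function failed: %s" % exc)
            continue
        completed += 1
    return completed, True
```

The caller would use the second return value to decide whether to schedule another sampling pass.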
(In reply to comment #6)
> The bug didn't reproduce using a ruth test implementing the steps described in
> comment #5:
>
> http://gerrit.usersys.redhat.com/419
>
> Maybe we can get rid of the AdvancedStatsFunction exceptions stopping the
> AdvancedStatsThread when a libvirt broken pipe is raised.
>
> Can you check that this is still reproducible on vdsm-4.9-66.1?

Please retest with vdsm-4.9-67
Still not good. Using the steps from the description (comment 0), vdsm hangs on:

Thread-1648::INFO::2011-05-18 14:03:34,128::vm::625::vm.Vm::(_startUnderlyingVm) vmId=`a4c6fec2-f16e-4092-8aab-bed9d202d2b2`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 592, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 949, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/share/vdsm/libvirtconnection.py", line 59, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1353, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: cannot send data: Broken pipe
Thread-1648::DEBUG::2011-05-18 14:03:34,139::vm::1764::vm.Vm::(setDownStatus) vmId=`a4c6fec2-f16e-4092-8aab-bed9d202d2b2`::Changed state to Down: cannot send data: Broken pipe

This is the last message that appeared in the log.
Created attachment 499586 [details] vdsm + backend logs Vdsm and backend logs attached
Created attachment 499598 [details]
Backend vdsm logs

My apologies, I tarred symlinks. These logs should contain the info.
After some more tests and a discussion with Federico, I am moving this bug to verified using vdsm-4.9-70.el6.x86_64. There is still a problem: if libvirt is stopped and not running, vdsm does not operate and does not respond to the backend. However, according to bug 678084, libvirt should be able to revive itself, provided it is started by upstart rather than by an init script. Please correct or refine my statements if I'm wrong. Still, after host installation to rhevm, libvirt is started by the System V initscript (bug 694026) in ic119.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2011-1782.html