Description of problem:

- hsm host running some vms
- block connection from host to all storage domains
- at some point, vdsm prints that the connection to libvirt is broken, and calls prepareForShutdown
- it calls it several times:

Thread-153::ERROR::2012-01-27 16:36:28,334::libvirtconnection::89::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-137::ERROR::2012-01-27 16:36:28,336::libvirtconnection::89::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-48::ERROR::2012-01-27 16:36:28,335::libvirtconnection::89::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-23::ERROR::2012-01-27 16:36:28,337::libvirtconnection::89::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-169::ERROR::2012-01-27 16:36:28,338::libvirtconnection::89::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-84::ERROR::2012-01-27 16:36:28,339::libvirtconnection::89::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-137::DEBUG::2012-01-27 16:36:28,340::clientIF::178::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
Thread-153::DEBUG::2012-01-27 16:36:28,340::task::588::TaskManager.Task::(_updateState) Task=`938d8853-798e-4d46-9a50-12155adc7181`::moving from state init -> state preparing
Thread-48::DEBUG::2012-01-27 16:36:28,341::clientIF::178::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
Thread-72::ERROR::2012-01-27 16:36:28,342::libvirtconnection::89::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-100::ERROR::2012-01-27 16:36:28,343::libvirtconnection::89::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-169::DEBUG::2012-01-27 16:36:28,343::clientIF::178::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
Thread-84::DEBUG::2012-01-27 16:36:28,344::clientIF::178::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
Thread-137::ERROR::2012-01-27 16:36:28,345::utils::393::vm.Vm::(collect) vmId=`84066f51-2b3d-4afb-b16f-7d576f2d6235`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x17fbf60>
Traceback (most recent call last):
  File "/usr/share/vdsm/utils.py", line 389, in collect
    statsFunction()
  File "/usr/share/vdsm/utils.py", line 266, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 79, in _highWrite
    self._vm._dom.blockInfo(vmDrive.path, 0)
  File "/usr/share/vdsm/libvirtvm.py", line 483, in f
    ret = attr(*args, **kwargs)
  File "/usr/share/vdsm/libvirtconnection.py", line 79, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1465, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: End of file while reading data: Input/output error

- Then we see the following request to stop the HSM_MailMonitor thread:

Thread-210::INFO::2012-01-27 16:36:29,108::storage_mailbox::444::Storage.MailBox.HsmMailMonitor::(run) HSM_MailboxMonitor - Incoming mail monitoring thread stopped, clearing outgoing mail
Thread-210::INFO::2012-01-27 16:36:29,109::storage_mailbox::340::Storage.MailBox.HsmMailMonitor::(_sendMail) HSM_MailMonitor sending mail to SPM - ['dd', 'of=/rhev/data-center/61751252-99fe-45bc-bf24-dbf248172475/mastersd/dom_md/inbox', 'iflag=fullblock', 'oflag=direct', 'conv=notrunc', 'bs=512', 'seek=8']
Thread-210::DEBUG::2012-01-27 16:36:29,109::storage_mailbox::344::Storage.Misc.excCmd::(_sendMail) 'dd of=/rhev/data-center/61751252-99fe-45bc-bf24-dbf248172475/mastersd/dom_md/inbox iflag=fullblock oflag=direct conv=notrunc bs=512 seek=8' (cwd None)
Thread-196::ERROR::2012-01-27 16:37:29,796::domainMonitor::120::Storage.DomainMonitor::(_monitorDomain) Error while collecting domain `b466b318-6a31-4670-bcaa-cd9ccf5cf1de` monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 105, in _monitorDomain
    nextStatus.readDelay = domain.getReadDelay()
  File "/usr/share/vdsm/storage/blockSD.py", line 423, in getReadDelay
    f.read(4096)
  File "/usr/share/vdsm/storage/fileUtils.py", line 287, in read
    raise OSError(err, msg)
OSError: [Errno 5] Input/output error
Thread-196::DEBUG::2012-01-27 16:37:29,817::domainMonitor::130::Storage.DomainMonitor::(_monitorDomain) Domain `b466b318-6a31-4670-bcaa-cd9ccf5cf1de` changed its status to Invalid
Thread-201::ERROR::2012-01-27 16:37:30,810::domainMonitor::120::Storage.DomainMonitor::(_monitorDomain) Error while collecting domain `3f5edaa4-699b-4b62-bd34-3f6b04a29070` monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 105, in _monitorDomain
    nextStatus.readDelay = domain.getReadDelay()
  File "/usr/share/vdsm/storage/blockSD.py", line 423, in getReadDelay
    f.read(4096)
  File "/usr/share/vdsm/storage/fileUtils.py", line 287, in read
    raise OSError(err, msg)
OSError: [Errno 5] Input/output error
Thread-201::DEBUG::2012-01-27 16:37:30,811::domainMonitor::130::Storage.DomainMonitor::(_monitorDomain) Domain `3f5edaa4-699b-4b62-bd34-3f6b04a29070` changed its status to Invalid
Thread-210::DEBUG::2012-01-27 16:37:30,911::storage_mailbox::344::Storage.Misc.excCmd::(_sendMail) FAILED: <err> = "dd: writing `/rhev/data-center/61751252-99fe-45bc-bf24-dbf248172475/mastersd/dom_md/inbox': Input/output error\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 61.756 s, 0.0 kB/s\n"; <rc> = 1
Created attachment 558342: vdsm logs
Does it in fact cause any issue? The "cannot run prepareForShutdown concurrently" message is a simple protection measure; as long as the first thread shuts vdsm down successfully, the other threads are insignificant.
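For readers unfamiliar with this kind of protection measure, here is a minimal sketch of a "first caller wins" guard built on a non-blocking lock. It is an illustration only, not vdsm's actual code: the class name ShutdownGuard, the method prepare_for_shutdown, the skipped counter, and the do_shutdown callback are all hypothetical.

```python
import threading


class ShutdownGuard(object):
    """Hypothetical guard: only the first caller performs the shutdown."""

    def __init__(self, do_shutdown):
        self._lock = threading.Lock()        # held by the first caller, never released
        self._count_lock = threading.Lock()  # protects the skipped counter
        self._do_shutdown = do_shutdown      # assumed callback doing the real work
        self.skipped = 0                     # callers that found a shutdown in progress

    def prepare_for_shutdown(self):
        # Non-blocking acquire: returns False at once if another thread
        # already holds the lock, instead of waiting for it.
        if not self._lock.acquire(False):
            with self._count_lock:
                self.skipped += 1
            return False
        # The lock is intentionally kept, so the shutdown runs at most once.
        self._do_shutdown()
        return True
```

One property of this pattern is worth keeping in mind: if the first caller blocks inside do_shutdown, every later caller still returns immediately, so no other thread will complete the shutdown on its behalf.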
(In reply to comment #2)
> does it cause any issue in fact? The "cannot run prepareForShutdown
> concurrently" is a simple protection measure, as long as the first thread
> does shutdown vdsm successfully those other threads are nonsignificant

Deadlock? vdsm doesn't get restarted and moves to non-responsive (getVdsCaps is not answered).
Closing old bugs. If this issue is still relevant or important in the current version, please re-open the bug.