Bug 783920 - [ovirt] [vdsm] service restart is not eminent in case there are running vms
Summary: [ovirt] [vdsm] service restart is not eminent in case there are running vms
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Dan Kenigsberg
QA Contact:
URL:
Whiteboard: infra
Depends On:
Blocks:
Reported: 2012-01-23 09:05 UTC by Haim
Modified: 2014-01-13 00:50 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-03-12 15:55:28 UTC
oVirt Team: ---



Description Haim 2012-01-23 09:05:25 UTC
Description of problem:

case: 

- host runs vms 
- restart vdsm service 

The restart takes time (depending on the number of VMs) until the service is back up (until I see the "I am" entry in the log), and during that pre-restart period nothing is written to vdsm.log.

This is what I see in logs when 40 vms are running: 

Thread-1630::DEBUG::2012-01-23 03:47:47,921::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-1630::DEBUG::2012-01-23 03:47:47,923::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'Storage.b1fe6ae5-d7e7-4847-96a5-7119f0cde67a'
Thread-1630::DEBUG::2012-01-23 03:47:47,923::resourceManager::553::ResourceManager::(releaseResource) Released resource 'Storage.b1fe6ae5-d7e7-4847-96a5-7119f0cde67a' (6 active users)
Thread-1630::DEBUG::2012-01-23 03:47:47,923::task::980::TaskManager.Task::(_decref) Task=`2c9ee487-e161-4196-9fcb-a90a0e5a2a33`::ref 0 aborting False
Thread-1630::ERROR::2012-01-23 03:47:47,965::utils::399::vm.Vm::(collect) vmId=`80303818-82a5-4b46-9a0d-33134235c7ad`::Stats function failed: <AdvancedStatsFunction _sampleDiskLatency at 0x12563e8>
Traceback (most recent call last):
  File "/usr/share/vdsm/utils.py", line 395, in collect
    statsFunction()
  File "/usr/share/vdsm/utils.py", line 272, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 155, in _sampleDiskLatency
    stats = _blockstatsParses(out)
  File "/usr/share/vdsm/libvirtvm.py", line 142, in _blockstatsParses
    'flush_op':devStats['flush_operations'],
KeyError: 'flush_operations'
MainThread::INFO::2012-01-23 03:49:20,795::vdsm::71::vds::(run) I am the actual vdsm 4.9-0

You can see a gap of about a minute and a half (03:47:47 to 03:49:20) during which nothing is written to the log.
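To measure that silent window on other hosts, here is a minimal sketch that scans log lines and reports the longest gap between consecutive timestamped entries. It assumes the line layout seen in the excerpt above (a `YYYY-MM-DD HH:MM:SS,mmm` field delimited by `::`); the function names `parse_timestamp` and `largest_gap` are hypothetical helpers, not part of vdsm.

```python
import re
from datetime import datetime

# Matches the timestamp field of a vdsm.log line, e.g. "::2012-01-23 03:47:47,965::".
# Assumption: the layout is the one shown in the excerpt above.
TIMESTAMP_RE = re.compile(r"::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::")


def parse_timestamp(line):
    """Return the datetime embedded in a log line, or None (e.g. traceback lines)."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")


def largest_gap(lines):
    """Return (seconds, earlier_line, later_line) for the longest silence, or None."""
    best = None
    previous = None  # (timestamp, line) of the last timestamped line seen
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is None:
            continue  # skip lines without a timestamp, such as traceback frames
        if previous is not None:
            seconds = (stamp - previous[0]).total_seconds()
            if best is None or seconds > best[0]:
                best = (seconds, previous[1], line)
        previous = (stamp, line)
    return best
```

Run against the excerpt above, the longest gap falls between the 03:47:47,965 ERROR line and the 03:49:20,795 "I am the actual vdsm" line, roughly 93 seconds.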

Note that a vdsm restart with 0 VMs does not show this delay; it happens immediately.

git commit hash: 5a0b2c912fb0ea5a305f191e9b558385ef249caa

Comment 1 Haim 2012-01-23 09:16:21 UTC
Please ignore the KeyError when evaluating this bug; it is a known, separate issue with our sampling method.

Comment 2 Itamar Heim 2013-03-12 15:55:28 UTC
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.

