Bug 1119051

Summary: Vdsm reports "Disk sda stats not available" when a VM has more than one disk interface and one of the interfaces is virtio
Product: [Retired] oVirt Reporter: Ori Gofen <ogofen>
Component: vdsm Assignee: Francesco Romani <fromani>
Status: CLOSED CURRENTRELEASE QA Contact: Gil Klein <gklein>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.5 CC: acanan, amureini, bazulay, gklein, iheim, mgoldboi, ofrenkel, rbalakri, yeylon
Target Milestone: ---   
Target Release: 3.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: virt
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-10-17 12:26:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachments:
Description Flags
vdsm+engine logs none

Description Ori Gofen 2014-07-13 15:51:02 UTC
Created attachment 917633
vdsm+engine logs

Description of problem:

When a running VM has one virtio disk and one or more disks with a different interface, vdsm logs _getDiskStats and _getDiskLatency errors every few seconds.

GuestMonitor-vm_1::ERROR::2014-07-13 18:37:32,025::vm::533::vm.Vm::(_getDiskLatency) vmId=`8be16854-2bc9-49dd-a5c5-8ea38536be99`::Disk sda latency not available
  Traceback (most recent call last):
    File "/usr/share/vdsm/virt/vm.py", line 531, in _getDiskLatency
      dLatency = _avgLatencyCalc(sInfo[dName], eInfo[dName])
  KeyError: u'sda'
GuestMonitor-vm_1::DEBUG::2014-07-13 18:37:32,025::vm::423::vm.Vm::(_getUserCpuTuneInfo) vmId=`8be16854-2bc9-49dd-a5c5-8ea38536be99`::Domain Metadata is not set
GuestMonitor-vm_1::ERROR::2014-07-13 18:37:32,026::vm::491::vm.Vm::(_getDiskStats) vmId=`8be16854-2bc9-49dd-a5c5-8ea38536be99`::Disk sda stats not available
  Traceback (most recent call last):
    File "/usr/share/vdsm/virt/vm.py", line 487, in _getDiskStats
      (eInfo[dName][1] - sInfo[dName][1]) / sampleInterval)
  KeyError: u'sda'


Version-Release number of selected component (if applicable):
beta

How reproducible:
100%

Steps to Reproduce:
1. Add a VM with a virtio disk and a virtio-iscsi disk
2. Run the VM

Actual results:
vdsm repeatedly fails to compute the disk stats and floods its logs with errors

Expected results:
No errors should be reported

Additional info:

Comment 1 Francesco Romani 2014-07-14 11:48:20 UTC
The issue seems to be triggered by hotplugging a disk.

Comment 2 Francesco Romani 2014-07-14 13:45:53 UTC
Reproduced on today's VDSM master.

Steps to reproduce:
1. Boot a VM
2. Attach a virtIO disk (this triggers the VDSM verb hotplugDisk)

What happens here is that the new disk is added to the list of VM drives.
When stats are requested, VDSM iterates over that list and looks up the disk samples
in order to build the stats.

However, the stats are collected (by default) every 60s, and VDSM considers only the oldest and the newest samples. So, until the oldest sample collected has values for the new disk, we'll see this behaviour.

We have a vulnerability window of up to (sampling_window * sampling_interval)
in the worst case. With the default values this is 2 * 60s = 120s.
After that, everything should go back to normal: it worked here, stats for the new disk appear and the errors go away.
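
The timing above can be illustrated with a small simulation. This is only a sketch, not VDSM code: it assumes a sliding window that keeps the last sampling_window samples, taken every sampling_interval seconds, and all names in it (SAMPLING_INTERVAL, SAMPLING_WINDOW, error_window) are hypothetical.

```python
from collections import deque

SAMPLING_INTERVAL = 60  # seconds between samples (default quoted above)
SAMPLING_WINDOW = 2     # number of samples kept (default quoted above)


def error_window(hotplug_time):
    """Seconds from hotplug until both the oldest and the newest sample
    in the window contain the new disk (hypothetical model, not VDSM code)."""
    window = deque(maxlen=SAMPLING_WINDOW)
    t = 0
    while True:
        # Each sample records which disks were visible when it was taken.
        disks = {"vda"}
        if t >= hotplug_time:
            disks.add("sda")
        window.append(disks)
        full = len(window) == SAMPLING_WINDOW
        if full and "sda" in window[0] and "sda" in window[-1]:
            return t - hotplug_time
        t += SAMPLING_INTERVAL


# Worst case: the disk is plugged just after a sample was taken.
worst = max(error_window(h) for h in range(1, SAMPLING_INTERVAL + 1))
```

In this model the worst case stays just below sampling_window * sampling_interval, which matches the 120s bound given above.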

I believe the best way to fix this is just to ignore missing samples while building disk stats.
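
A minimal sketch of that idea, assuming each sample is a dict mapping a disk name to a tuple of counters. The function name disk_rates, the tuple layout and the result keys are hypothetical and not taken from the actual patch.

```python
def disk_rates(start_sample, end_sample, disk_names, interval):
    """Build per-disk rates, skipping disks missing from either sample.

    Assumed (hypothetical) sample layout: {disk_name: (read_bytes, write_bytes)}.
    """
    stats = {}
    for name in disk_names:
        if name not in start_sample or name not in end_sample:
            # For example, a hotplugged disk not yet present in the
            # oldest sample: skip it instead of raising KeyError.
            continue
        start, end = start_sample[name], end_sample[name]
        stats[name] = {
            "readRate": (end[0] - start[0]) / interval,
            "writeRate": (end[1] - start[1]) / interval,
        }
    return stats


# 'sda' was hotplugged after the oldest sample was taken.
oldest = {"vda": (0, 0)}
newest = {"vda": (600, 1200), "sda": (10, 20)}
rates = disk_rates(oldest, newest, ["vda", "sda"], 60)
```

With this approach the new disk simply has no stats until it appears in both samples, rather than producing an error on every stats request.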

Comment 3 Francesco Romani 2014-08-27 11:24:33 UTC
A patch is available, and the issue is self-resolving once VDSM gathers enough stats, so I am decreasing the severity.

Comment 4 Sandro Bonazzola 2014-10-17 12:26:14 UTC
oVirt 3.5 has been released and should include the fix for this issue.