Bug 883858 - vdsm: failure to move disk with 'truesize' error in vdsm will show the same exit message in event log
Summary: vdsm: failure to move disk with 'truesize' error in vdsm will show the same e...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.1.0
Hardware: x86_64
OS: Linux
high
high
Target Milestone: ---
: 3.3.0
Assignee: Federico Simoncelli
QA Contact: Katarzyna Jachim
URL:
Whiteboard: storage
: 892280 (view as bug list)
Depends On:
Blocks: 985556
TreeView+ depends on / blocked
 
Reported: 2012-12-05 12:59 UTC by Dafna Ron
Modified: 2018-12-02 17:32 UTC (History)
22 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
A misleading python traceback related to corrupted or missing images has been converted into a more meaningful error message ("Volume is corrupted or missing").
Clone Of:
: 985556 (view as bug list)
Environment:
Last Closed: 2014-01-21 16:03:47 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
scohen: needinfo+
scohen: Triaged+


Attachments (Terms of Use)
logs (860.29 KB, application/x-gzip)
2012-12-05 12:59 UTC, Dafna Ron
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:0040 0 normal SHIPPED_LIVE vdsm bug fix and enhancement update 2014-01-21 20:26:21 UTC
oVirt gerrit 13529 0 None None None Never

Description Dafna Ron 2012-12-05 12:59:13 UTC
Created attachment 658169 [details]
logs

Description of problem:

if we have a failure on move disk with truesize as exit error the following is shown in event log: 

VM RHEL3 is down. Exit message: 'truesize'.


Version-Release number of selected component (if applicable):

si24.5

How reproducible:

100%

Steps to Reproduce:
1. shut down a vm during move image
2. try to start it again
3.
  
Actual results:

VM RHEL3 is down. Exit message: 'truesize'.

Expected results:

exit message should be fixed to human readable. 

Additional info: logs

Comment 1 Ayal Baron 2012-12-26 23:52:10 UTC
This is not a text bug.

Thread-54906::ERROR::2012-12-05 14:50:18,843::dispatcher::69::Storage.Dispatcher.Protect::(run) [Errno 2] No such file or directory: '/rhev/data-center/2d5f297c-185a-470b-8600-208fc3c9b235/0348d4c6-85e5-4ab1-8226-4002f2441037/images/bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27/b894b7fa-e438-4c40-b0ea-e04afea1bc8b'
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 61, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 1164, in prepare
    raise self.error
OSError: [Errno 2] No such file or directory: '/rhev/data-center/2d5f297c-185a-470b-8600-208fc3c9b235/0348d4c6-85e5-4ab1-8226-4002f2441037/images/bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27/b894b7fa-e438-4c40-b0ea-e04afea1bc8b'
Thread-54908::DEBUG::2012-12-05 14:50:18,844::task::588::TaskManager.Task::(_updateState) Task=`db84e33f-e984-4800-a4b8-ca59dc4d427d`::moving from state init -> state preparing
Thread-54906::DEBUG::2012-12-05 14:50:18,844::vm::607::vm.Vm::(_startUnderlyingVm) vmId=`8491798b-38f5-4e4b-a918-7f3f4533fdf2`::_ongoingCreations released
Thread-54908::INFO::2012-12-05 14:50:18,845::logUtils::37::dispatcher::(wrapper) Run and protect: clearTask(taskID='53200579-8156-4ac2-8484-01174249520d', spUUID=None, options=None)
Thread-54906::ERROR::2012-12-05 14:50:18,845::vm::631::vm.Vm::(_startUnderlyingVm) vmId=`8491798b-38f5-4e4b-a918-7f3f4533fdf2`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 597, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1344, in _run
    devices = self.buildConfDevices()
  File "/usr/share/vdsm/vm.py", line 454, in buildConfDevices
    self._normalizeVdsmImg(drv)
  File "/usr/share/vdsm/vm.py", line 381, in _normalizeVdsmImg
    drv['truesize'] = res['truesize']
KeyError: 'truesize'

Comment 2 Ayal Baron 2013-01-03 14:18:19 UTC
This is all that vdsm is passing, engine has no more information to display.

Comment 4 Ayal Baron 2013-01-07 13:58:04 UTC
*** Bug 892280 has been marked as a duplicate of this bug. ***

Comment 6 Federico Simoncelli 2013-04-05 15:00:57 UTC
A deleteImage:

Thread-54782::INFO::2012-12-05 14:49:46,933::logUtils::37::dispatcher::(wrapper) Run and protect: deleteImage(sdUUID='0348d4c6-85e5-4ab1-8226-4002f2441037', spUUID='2d5f297c-185a-470b-8600-208fc3c9b235', imgUUID='bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27', postZero='false', force='false')

preceded the createVm command:

Thread-54905::DEBUG::2012-12-05 14:50:18,772::BindingXMLRPC::894::vds::(wrapper) client [10.35.97.65]::call vmCreate with ({...
{'iface': 'virtio', 'format': 'cow', 'type': 'disk', 'poolID': '2d5f297c-185a-470b-8600-208fc3c9b235', 'volumeID': 'b894b7fa-e438-4c40-b0ea-e04afea1bc8b', 'imageID': 'bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27', 'specParams': {}, 'readonly': 'false', 'domainID': '0348d4c6-85e5-4ab1-8226-4002f2441037', 'deviceId': 'bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27', 'address': {'bus': '0x00', ' slot': '0x09', ' domain': '0x0000', ' type': 'pci', ' function': '0x0'}, 'device': 'disk', 'shared': 'false', 'propagateErrors': 'off', 'optional': 'false'}, ... },) {} flowID [41346f55]

This resulted in:

OSError: [Errno 2] No such file or directory: '/rhev/data-center/2d5f297c-185a-470b-8600-208fc3c9b235/0348d4c6-85e5-4ab1-8226-4002f2441037/images/bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27/b894b7fa-e438-4c40-b0ea-e04afea1bc8b'

Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 597, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1344, in _run
    devices = self.buildConfDevices()
  File "/usr/share/vdsm/vm.py", line 454, in buildConfDevices
    self._normalizeVdsmImg(drv)
  File "/usr/share/vdsm/vm.py", line 381, in _normalizeVdsmImg
    drv['truesize'] = res['truesize']
KeyError: 'truesize'


If we want to improve the VDSM error reporting we can merge:

http://gerrit.ovirt.org/#/c/13529

even though this obviously won't solve the problem in the engine (the bug should be moved probably).

Comment 7 Ayal Baron 2013-04-07 07:54:05 UTC
(In reply to comment #6)
> A deleteImage:
> 
> Thread-54782::INFO::2012-12-05
> 14:49:46,933::logUtils::37::dispatcher::(wrapper) Run and protect:
> deleteImage(sdUUID='0348d4c6-85e5-4ab1-8226-4002f2441037',
> spUUID='2d5f297c-185a-470b-8600-208fc3c9b235',
> imgUUID='bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27', postZero='false',
> force='false')
> 
> preceded the createVm command:
> 
> Thread-54905::DEBUG::2012-12-05
> 14:50:18,772::BindingXMLRPC::894::vds::(wrapper) client [10.35.97.65]::call
> vmCreate with ({...
> {'iface': 'virtio', 'format': 'cow', 'type': 'disk', 'poolID':
> '2d5f297c-185a-470b-8600-208fc3c9b235', 'volumeID':
> 'b894b7fa-e438-4c40-b0ea-e04afea1bc8b', 'imageID':
> 'bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27', 'specParams': {}, 'readonly':
> 'false', 'domainID': '0348d4c6-85e5-4ab1-8226-4002f2441037', 'deviceId':
> 'bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27', 'address': {'bus': '0x00', ' slot':
> '0x09', ' domain': '0x0000', ' type': 'pci', ' function': '0x0'}, 'device':
> 'disk', 'shared': 'false', 'propagateErrors': 'off', 'optional': 'false'},
> ... },) {} flowID [41346f55]
> 
> This resulted in:
> 
> OSError: [Errno 2] No such file or directory:
> '/rhev/data-center/2d5f297c-185a-470b-8600-208fc3c9b235/0348d4c6-85e5-4ab1-
> 8226-4002f2441037/images/bc37bc9e-e431-4b1a-8a7b-c6fd9866ad27/b894b7fa-e438-
> 4c40-b0ea-e04afea1bc8b'
> 
> Traceback (most recent call last):
>   File "/usr/share/vdsm/vm.py", line 597, in _startUnderlyingVm
>     self._run()
>   File "/usr/share/vdsm/libvirtvm.py", line 1344, in _run
>     devices = self.buildConfDevices()
>   File "/usr/share/vdsm/vm.py", line 454, in buildConfDevices
>     self._normalizeVdsmImg(drv)
>   File "/usr/share/vdsm/vm.py", line 381, in _normalizeVdsmImg
>     drv['truesize'] = res['truesize']
> KeyError: 'truesize'
> 
> 
> If we want to improve the VDSM error reporting we can merge:
> 
> http://gerrit.ovirt.org/#/c/13529
> 
> even though this obviously won't solve the problem in the engine (the bug
> should be moved probably).

I'm pretty sure the engine side issue has been dealt with (would be good to verify though).
And if drive doesn't exist when running the vm it would indeed be a lot nicer to report that instead of KeyError

Comment 14 Marina Kalinin 2013-05-28 01:02:09 UTC
Anyway, I opened this bug: https://bugzilla.redhat.com/show_bug.cgi?id=967676
Since it seems to be a separate issue.

Comment 15 Dafna Ron 2013-05-28 13:03:10 UTC
actually it sounds to me more like a copy of this bug: https://bugzilla.redhat.com/show_bug.cgi?id=960952

I encountered the same issue today after LSM failed during the createVolume step and when I tried re-running the vm it failed to run or migrate on one of the hosts because of wrong linking. 

Thread-262::ERROR::2013-05-28 15:55:38,012::vm::704::vm.Vm::(_startUnderlyingVm) vmId=`0780cbb2-fc29-48b1-88d9-0838f6920ec3`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 664, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1458, in _run
    self.preparePaths(devices[vm.DISK_DEVICES])
  File "/usr/share/vdsm/vm.py", line 725, in preparePaths
    drive['path'] = self.cif.prepareVolumePath(drive, self.id)
  File "/usr/share/vdsm/clientIF.py", line 275, in prepareVolumePath
    raise vm.VolumeError(drive)

this is the link that's sent to one host: 

[root@cougar02 ~]# ls -l /rhev/data-center/7fd33b43-a9f4-4eb7-a885-e9583a929ceb/81ef11d0-4c0c-47b4-8953-d61a6af442d8/images/a6d6df52-a2d9-4598-a28d-0d22fe7e02a5/b5038832-5c63-4f9d-a11b-d7aeb93edbb1
ls: cannot access /rhev/data-center/7fd33b43-a9f4-4eb7-a885-e9583a929ceb/81ef11d0-4c0c-47b4-8953-d61a6af442d8/images/a6d6df52-a2d9-4598-a28d-0d22fe7e02a5/b5038832-5c63-4f9d-a11b-d7aeb93edbb1: No such file or directory


this is the correct link: 

[root@cougar02 ~]# ls -l /rhev/data-center/7fd33b43-a9f4-4eb7-a885-e9583a929ceb/81ef11d0-4c0c-47b4-8953-d61a6af442d8/images/a6d6df52-a2d9-4598-a28d-0d22fe7e02a5/7a3202f1-a98e-4f6e-b1b8-d8cf02aa61c5 
lrwxrwxrwx. 1 vdsm kvm 78 May 28 15:51 /rhev/data-center/7fd33b43-a9f4-4eb7-a885-e9583a929ceb/81ef11d0-4c0c-47b4-8953-d61a6af442d8/images/a6d6df52-a2d9-4598-a28d-0d22fe7e02a5/7a3202f1-a98e-4f6e-b1b8-d8cf02aa61c5 -> /dev/81ef11d0-4c0c-47b4-8953-d61a6af442d8/7a3202f1-a98e-4f6e-b1b8-d8cf02aa61c5

Comment 20 Charlie 2013-11-28 00:29:39 UTC
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to 
minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 21 errata-xmlrpc 2014-01-21 16:03:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html


Note You need to log in before you can comment on or make changes to this bug.