Bug 1082941 - [vdsm] vmSnapshot fails on 'IOError: [Errno 22] Invalid argument'
Summary: [vdsm] vmSnapshot fails on 'IOError: [Errno 22] Invalid argument'
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.4.0
Hardware: x86_64
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.4.0
Assignee: Arik
QA Contact: Gadi Ickowicz
URL:
Whiteboard: virt
Depends On:
Blocks: 1052318 1083560
 
Reported: 2014-04-01 07:06 UTC by Elad
Modified: 2014-08-22 01:42 UTC
CC List: 13 users

Fixed In Version: vdsm-4.14.7-0.1.beta3.el6ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-09 13:30:00 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
engine, vdsm, libvirt, qemu and sanlock logs (1.28 MB, application/x-gzip)
2014-04-01 07:06 UTC, Elad


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:0504 0 normal SHIPPED_LIVE vdsm 3.4.0 bug fix and enhancement update 2014-06-09 17:21:35 UTC
oVirt gerrit 26407 0 None None None Never
oVirt gerrit 26594 0 None None None Never

Description Elad 2014-04-01 07:06:32 UTC
Created attachment 881229 [details]
engine, vdsm, libvirt, qemu and sanlock logs

Description of problem:
Live snapshot fails with the following error in vdsm.log:

Thread-32890::ERROR::2014-04-01 09:44:39,005::BindingXMLRPC::1081::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 1065, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 372, in vmSnapshot
    return vm.snapshot(snapDrives, snapMemVolHandle)
  File "/usr/share/vdsm/API.py", line 679, in snapshot
    return v.snapshot(snapDrives, memoryParams)
  File "/usr/share/vdsm/vm.py", line 4014, in snapshot
    _padMemoryVolume(memoryVolPath, spType, sdUUID)
  File "/usr/share/vdsm/vm.py", line 3881, in _padMemoryVolume
    padToBlockSize(memoryVolPath)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 297, in callCrabRPCFunction
    *args, **kwargs)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 199, in callCrabRPCFunction
    raise err
IOError: [Errno 22] Invalid argument


Version-Release number of selected component (if applicable):
RHEV-3.4-AV5
vdsm-4.14.2-0.2.el6ev.x86_64
libvirt-0.10.2-29.el6_5.5.x86_64
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.7.x86_64
sanlock-2.8-1.el6.x86_64
rhevm-3.4.0-0.12.beta2.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a live snapshot



Additional info: engine, vdsm, libvirt, qemu and sanlock logs

Comment 2 Elad 2014-04-01 08:04:31 UTC
Also checked with vdsm-4.14.5-0.2.beta2.el6ev.x86_64 (av5)

Comment 3 Allon Mureinik 2014-04-01 10:02:43 UTC
Daniel, please take a look - is it possible that the engine is sending the wrong parameters due to the single disk snapshots refactoring?

Comment 4 Daniel Erez 2014-04-02 09:34:02 UTC
(In reply to Allon Mureinik from comment #3)
> Daniel, please take a look - is it possible that the engine is sending the
> wrong parameters due to the single disk snapshots refactoring?

Doesn't seem so. The issue has been reproduced a few times, only on this specific environment. It looks like an IOError during the creation of the memory volume, so I'm not sure there's anything else we can do in the vdsm layer.

Comment 5 Tal Nisan 2014-04-02 13:04:23 UTC
(In reply to Daniel Erez from comment #4)
> (In reply to Allon Mureinik from comment #3)
> > Daniel, please take a look - is it possible that the engine is sending the
> > wrong parameters due to the single disk snapshots refactoring?
> 
> Doesn't seem so. The issue has been reproduced a few times, only on this
> specific environment. It looks like an IOError during the creation of the
> memory volume, so I'm not sure there's anything else we can do in the vdsm
> layer.

This seems like a virt issue with memory padding. Michal, can one of your guys take a look, please?
Note that Daniel could not reproduce it consistently.

Comment 6 Arik 2014-04-02 14:33:09 UTC
We are not supposed to pad volumes on block devices.
There's a check in vm.py that the storage domain type is file-based, and only in that case do we do the padding.
The memory volumes resided on a block device, so we should figure out why vdsm thinks the domain is file-based.
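
For context on the Errno 22: padToBlockSize() presumably rounds the volume size up to the next 512-byte boundary by truncating the file, and on Linux ftruncate() on a block device node fails with EINVAL. A minimal sketch of that failure mode (a standalone illustration, not the vdsm code; the device path is hypothetical):

    import os

    def pad_to_block_size(path, block_size=512):
        # Round the current size up to the next block boundary.
        # On a regular file this extends it with zeros; on a block
        # device node, ftruncate() fails with EINVAL (errno 22).
        fd = os.open(path, os.O_WRONLY)
        try:
            size = os.fstat(fd).st_size
            new_size = ((size + block_size - 1) // block_size) * block_size
            os.ftruncate(fd, new_size)
        finally:
            os.close(fd)

    pad_to_block_size('/dev/vg0/memory_volume')  # -> [Errno 22] Invalid argument

That would match the traceback above, where the error surfaces through remoteFileHandler.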

Comment 7 Arik 2014-04-02 14:34:07 UTC
(sorry, I removed the 'blocks' field by mistake)

Comment 8 Michal Skrivanek 2014-04-02 14:38:56 UTC
The type of the storage pool as returned by getStoragePoolInfo is NFS. Tal, were there changes made as part of the local/shared SD feature that I'm not aware of?
How is the domain type supposed to be checked now?
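
(A quick way to see the mismatch on a host, with placeholder UUIDs and illustrative output:

    # vdsClient -s 0 getStoragePoolInfo <spUUID> | grep type
        type = NFS
    # vdsClient -s 0 getStorageDomainInfo <sdUUID> | grep type
        type = ISCSI

i.e. the pool reports the type of its master domain, while the domain that actually holds the memory volume is block-based.)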

Comment 9 Michal Skrivanek 2014-04-02 14:53:30 UTC
If so, we need a better/new way to check whether the domain used for the memory volume is file-based or not.

Comment 10 Allon Mureinik 2014-04-02 15:56:34 UTC
Some insight from Gadi: when examining the memory padding function, the decision on how to pad is made based on the STORAGE POOL type (e.g.):

        def _padMemoryVolume(memoryVolPath, spType, sdUUID):
            if spType == sd.NFS_DOMAIN:
                oop.getProcessPool(sdUUID).fileUtils. \
                    padToBlockSize(memoryVolPath)
            else:
                fileUtils.padToBlockSize(memoryVolPath)

Now that we have introduced mixed storage domain types, this assumption is wrong. Instead, we should inspect the type of the domain itself, presumably by producing it.
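
A minimal sketch of that direction, reusing the names from the snippet above and assuming sdCache.produce() and sd.FILE_DOMAIN_TYPES are reachable from this context (the actual change is in the gerrit patches linked above and may be plumbed differently):

    from storage import sd
    from storage.sdc import sdCache

    def _padMemoryVolume(memoryVolPath, sdUUID):
        # Decide by the type of the DOMAIN that holds the memory volume,
        # not by the pool type, so a mixed pool (NFS master domain,
        # iSCSI data domain) takes the correct branch.
        domType = sdCache.produce(sdUUID).getStorageType()
        if domType == sd.NFS_DOMAIN:
            oop.getProcessPool(sdUUID).fileUtils. \
                padToBlockSize(memoryVolPath)
        elif domType in sd.FILE_DOMAIN_TYPES:
            fileUtils.padToBlockSize(memoryVolPath)
        # block domains must not be padded at all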

Comment 13 Gadi Ickowicz 2014-04-28 12:00:32 UTC
Verified on av7.

Able to create a live snapshot with memory for a VM when the pool type was NFS (the master domain was NFS) and the disk was on an iSCSI domain.

Comment 14 Allon Mureinik 2014-05-09 10:55:42 UTC
lbopf - this is a regression that was introduced during the development cycle of 3.4.0 and fixed within it. There is no publicly available version with this problem. Why should we supply doc text for it?

Comment 15 Lucy Bopf 2014-05-12 00:07:47 UTC
Hi Allon,

At this stage, I am supplying doc text for all errata in my queue, since no flags have been set to the contrary. I am also quite new to this process. Am I to understand that all bugs marked as 'regression' usually do not require doc text? If you can confirm that this is the case, I will clear the text.

Thanks.

Comment 16 Lucy Bopf 2014-05-12 04:50:58 UTC
Further to my last message: If a bug is attached to an erratum, then the assumption is that it will require doc text. If it doesn't require doc text, then please set the "requires_doc_text" flag to -. I will set the flag in this instance.

Thanks.

Comment 17 Allon Mureinik 2014-05-14 12:28:11 UTC
(In reply to lbopf from comment #15)
> Hi Allon,
> 
> At this stage, I am supplying doc text for all errata in my queue, since no
> flags have been set to the contrary. I am also quite new to this process. Am
> I to understand that all bugs marked as 'regression' usually do not require
> doc text? If you can confirm that this is the case, I will clear the text.
Yes, this is usually true. Marking a bug as a regression usually means that "version X had some functionality, an early build of X+1 broke it, and a later build un-broke it".
From the perspective of a customer who only consumes official releases, this is a moot point - they have version X with working functionality, and after upgrading to X+1 they observe that the functionality is still available.
Obviously, this is not true in 100% of cases, and sometimes caveats exist - but if they do, it's up to the developer who solved the bug to raise them with a "requires_doc_text?" flag.

(In reply to lbopf from comment #16)
> Further to my last message: If a bug is attached to an erratum, then the
> assumption is that it will require doc text. If it doesn't require doc text,
> then please set the "requires_doc_text" flag to -. I will set the flag in
> this instance.
Duly noted. 
I'll instruct my team to pay more attention to this.

Comment 18 errata-xmlrpc 2014-06-09 13:30:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0504.html

