Bug 1082941
| Field | Value |
| --- | --- |
| Summary | [vdsm] vmSnapshot fails on 'IOError: [Errno 22] Invalid argument' |
| Product | Red Hat Enterprise Virtualization Manager |
| Component | vdsm |
| Version | 3.4.0 |
| Hardware | x86_64 |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | urgent |
| Keywords | Regression |
| Whiteboard | virt |
| Reporter | Elad <ebenahar> |
| Assignee | Arik <ahadas> |
| QA Contact | Gadi Ickowicz <gickowic> |
| CC | ahadas, amureini, bazulay, gklein, iheim, knesenko, lbopf, lpeer, michal.skrivanek, nlevinki, scohen, tnisan, yeylon |
| Target Milestone | --- |
| Target Release | 3.4.0 |
| Fixed In Version | vdsm-4.14.7-0.1.beta3.el6ev |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2014-06-09 13:30:00 UTC |
| Bug Blocks | 1052318, 1083560 |
| Attachments | engine, vdsm, libvirt, qemu and sanlock logs (attachment 881229) |
Checked also with vdsm-4.14.5-0.2.beta2.el6ev.x86_64 (av5).

Daniel, please take a look - is it possible that the engine is sending the wrong parameters due to the single disk snapshots refactoring?

(In reply to Allon Mureinik from comment #3)
> Daniel, please take a look - is it possible that the engine is sending the
> wrong parameters due to the single disk snapshots refactoring?

Doesn't seem so. The issue has been reproduced only a few times, and only on the specific environment. It looks like an IOError during the creation of the memory volume, so I'm not sure there's anything else we can do in the vdsm layer.

(In reply to Daniel Erez from comment #4)
> (In reply to Allon Mureinik from comment #3)
> > Daniel, please take a look - is it possible that the engine is sending the
> > wrong parameters due to the single disk snapshots refactoring?
>
> Doesn't seem so. The issue has been reproduced only a few times, and only on
> the specific environment. It looks like an IOError during the creation of the
> memory volume, so I'm not sure there's anything else we can do in the vdsm
> layer.

Seems like a virt issue with memory padding. Michal, can one of your guys please take a look? Note that Daniel could not consistently reproduce it.

We are not supposed to pad volumes on block devices. There's a check in vm.py that the storage domain type is file-based, and only in that case do we do the padding. The memory volumes resided on a block device, so we should figure out why vdsm thinks it is file-based.

(Sorry, I removed the 'blocks' field by mistake.)

The type of the storage pool, as returned by getStoragePoolInfo, is NFS. Tal, were changes made as part of the local/shared SD feature that I'm not aware of? How is this supposed to be checked now? If so, we need a better/new way to check whether the domain used for the memory volume is file-based or not.

Some insight from Gadi: when examining the memory padding function, the decision on how to pad is made by the STORAGE POOL type, e.g.:

```python
def _padMemoryVolume(memoryVolPath, spType, sdUUID):
    if spType == sd.NFS_DOMAIN:
        oop.getProcessPool(sdUUID).fileUtils. \
            padToBlockSize(memoryVolPath)
    else:
        fileUtils.padToBlockSize(memoryVolPath)
```

Now that we have introduced mixed storage types, this assumption is wrong. Instead, we should inspect the type of the domain itself, presumably by producing it (see the sketch after the comment thread below).

Verified on av7. Able to create a live snapshot with memory for a VM when the pool type was NFS (the master domain was NFS) and the disk was on an iSCSI domain.

lbopf - this is a regression that was introduced in the development cycle of 3.4.0 and fixed within it. There is no publicly available version with this problem. Why should we supply doc text for it?

Hi Allon,

At this stage, I am supplying doc text for all errata in my queue, since no flags have been set to the contrary. I am also quite new to this process. Am I to understand that all bugs marked as 'regression' usually do not require doc text? If you can confirm that this is the case, I will clear the text. Thanks.

Further to my last message: if a bug is attached to an erratum, then the assumption is that it will require doc text. If it doesn't require doc text, then please set the "requires_doc_text" flag to -. I will set the flag in this instance. Thanks.

(In reply to lbopf from comment #15)
> Hi Allon,
>
> At this stage, I am supplying doc text for all errata in my queue, since no
> flags have been set to the contrary. I am also quite new to this process.
> Am I to understand that all bugs marked as 'regression' usually do not
> require doc text? If you can confirm that this is the case, I will clear
> the text.
Yes, this is usually true. Marking a bug as a regression usually means that "version X had some functionality, an early build of X+1 broke it, and a later build un-broke it". From the perspective of a customer, who only consumes official releases, this is a moot point - they have version X with working functionality, they upgrade to X+1, and they observe that the functionality is still available. Obviously, this is not true for 100% of the cases, and sometimes caveats exist - but if they do, it's up to the developer who solved the bug to raise them with a "requires_doc_text?" flag.

(In reply to lbopf from comment #16)
> Further to my last message: if a bug is attached to an erratum, then the
> assumption is that it will require doc text. If it doesn't require doc text,
> then please set the "requires_doc_text" flag to -. I will set the flag in
> this instance.

Duly noted. I'll instruct my team to pay more attention to this.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0504.html
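To make Gadi's point above concrete: with mixed-type pools, the padding decision has to be keyed off the storage domain that actually holds the memory volume, not off the pool. The sketch below shows that direction only. It is not the actual patch shipped in vdsm-4.14.7-0.1.beta3.el6ev; the imports and the `sdCache.produce()` / `getStorageType()` lookups are assumptions about the surrounding vdsm code.

```python
# Sketch only - not the actual vdsm fix. Assumes sdCache.produce()
# returns a domain object exposing getStorageType(), and that
# sd.FILE_DOMAIN_TYPES lists the file-based domain types.
from storage import fileUtils, sd
from storage import outOfProcess as oop
from storage.sdc import sdCache


def _padMemoryVolume(memoryVolPath, sdUUID):
    # Decide based on the domain that actually holds the memory volume,
    # not on the pool type: with mixed-type pools the pool may report
    # NFS while the memory volume sits on an iSCSI (block) domain.
    dom = sdCache.produce(sdUUID)
    sdType = dom.getStorageType()
    if sdType in sd.FILE_DOMAIN_TYPES:
        if sdType == sd.NFS_DOMAIN:
            # NFS access goes through the out-of-process helper so a
            # stuck mount cannot block the main vdsm process.
            oop.getProcessPool(sdUUID).fileUtils.padToBlockSize(
                memoryVolPath)
        else:
            fileUtils.padToBlockSize(memoryVolPath)
    # Block domains need no padding: the LV is already block-aligned.
```

Producing the domain costs more than reading the cached pool type, but once a single pool can mix file and block domains, the domain itself is the only reliable source of truth.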
Created attachment 881229 [details]
engine, vdsm, libvirt, qemu and sanlock logs

Description of problem:
Live snapshot fails with the following error in vdsm.log:

```
Thread-32890::ERROR::2014-04-01 09:44:39,005::BindingXMLRPC::1081::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 1065, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 372, in vmSnapshot
    return vm.snapshot(snapDrives, snapMemVolHandle)
  File "/usr/share/vdsm/API.py", line 679, in snapshot
    return v.snapshot(snapDrives, memoryParams)
  File "/usr/share/vdsm/vm.py", line 4014, in snapshot
    _padMemoryVolume(memoryVolPath, spType, sdUUID)
  File "/usr/share/vdsm/vm.py", line 3881, in _padMemoryVolume
    padToBlockSize(memoryVolPath)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 297, in callCrabRPCFunction
    *args, **kwargs)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 199, in callCrabRPCFunction
    raise err
IOError: [Errno 22] Invalid argument
```

Version-Release number of selected component (if applicable):
RHEV-3.4-AV5
vdsm-4.14.2-0.2.el6ev.x86_64
libvirt-0.10.2-29.el6_5.5.x86_64
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.7.x86_64
sanlock-2.8-1.el6.x86_64
rhevm-3.4.0-0.12.beta2.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a live snapshot

Additional info:
engine, vdsm, libvirt, qemu and sanlock logs
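For readers of the traceback above: `IOError: [Errno 22] Invalid argument` is consistent with a pad-to-block-size helper, written for regular files, being handed a path that resolves to a block device. A minimal sketch of what such a helper typically does (illustrative, not vdsm's actual `padToBlockSize` implementation):

```python
import os

BLOCK_SIZE = 512  # assumed sector size; illustrative


def pad_to_block_size(path):
    # Round the file size up to the next multiple of BLOCK_SIZE by
    # extending it with truncate(). This is valid for regular files,
    # but ftruncate() on a block-device node (e.g. an LV under /dev/)
    # fails with EINVAL -> IOError: [Errno 22] Invalid argument.
    with open(path, "ab") as f:
        size = os.fstat(f.fileno()).st_size
        padded = ((size + BLOCK_SIZE - 1) // BLOCK_SIZE) * BLOCK_SIZE
        if padded != size:
            f.truncate(padded)
```

On a regular NFS-backed file the `truncate()` call simply extends the file with zeros; on a block device the kernel rejects `ftruncate()` with EINVAL, which would surface through the remote file handler as the Errno 22 seen here.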