851858 – qemu-img: cannot resume a vm that was paused due to EIO on NFS storage although storage is available

Keywords:
Status:	CLOSED DUPLICATE of bug 740509
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	qemu-kvm
Sub Component:
Version:	6.5
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Asias He
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2012-08-26 15:23 UTC by Dafna Ron
Modified:	2013-01-10 01:07 UTC
CC List:	12 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2012-10-15 16:07:16 UTC
Target Upstream Version:
Embargoed:

Attachments
logs (649.40 KB, application/x-gzip) 2012-08-26 15:23 UTC, Dafna Ron

Description Dafna Ron 2012-08-26 15:23:57 UTC

Created attachment 607072 [details]
logs

Description of problem:

In a two-host cluster with NFS storage, I blocked the storage from the host using iptables.
After the VMs paused, I removed the block and activated the hosts, and once the storage was active I selected all 10 of my VMs and ran them.
One of the VMs refuses to start due to EIO, even though the storage domain is available and all the other VMs have started.
I've reproduced this several times with the same VMs, and each time a different VM pauses and refuses to start.

Version-Release number of selected component (if applicable):

qemu-img-rhev-0.12.1.2-2.298.el6_3.x86_64
libvirt-0.9.10-21.el6.x86_64
vdsm-4.9.6-30.0.el6_3.x86_64

How reproducible:

100%

Steps to Reproduce:
1. In a two-host cluster with NFS storage, run VMs that have XP installed and are writing to disk.
2. Block connectivity to the storage domain from both hosts.
3. When the VMs pause, remove the iptables rule, then select all the VMs and run them.
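
Steps 2 and 3 could be scripted roughly as follows. This is only a sketch: the report does not show the exact iptables rule that was used, so a plain DROP rule on outgoing traffic to the NFS server is an assumption, and the address 192.0.2.10 and the function names are hypothetical. By default the helpers only print the commands; set DRY_RUN=0 (as root) to actually run them.

```shell
#!/bin/sh
# Hypothetical NFS server address; not taken from the report.
NFS_SERVER="${NFS_SERVER:-192.0.2.10}"
# DRY_RUN=1 (default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 2: drop all outgoing traffic from this host to the NFS server.
block_storage() {
    run iptables -A OUTPUT -d "$NFS_SERVER" -j DROP
}

# Step 3: delete the same rule to restore connectivity.
unblock_storage() {
    run iptables -D OUTPUT -d "$NFS_SERVER" -j DROP
}
```

Run block_storage on both hosts, wait for the VMs to pause, then run unblock_storage before resuming them.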
  
Actual results:

All but one of the VMs are resumed.
When I try to run the remaining VM again, it keeps getting EIO errors and pauses.

Expected results:

We should be able to resume all VMs.

Additional info: libvirt, vdsm, and two VM logs are attached (one for XP-6, which has the issue, and one for XP-10, which ran).

[root@gold-vdsd tmp]# vdsClient -s 0 continue 553cd58e-2295-4995-a4ee-71724f63ee49
	code = 0
	message = Done
[root@gold-vdsd tmp]# vdsClient -s 0 list table
63205116-5547-4e7c-b89f-c6cf8502f09d  23829  XP-10                Up                                       
2c78a0af-9e68-4e3b-a8f7-93346d19c3c9  24042  XP-8                 Up                                       
29ce48d2-966c-447b-809e-ae26303be112  25350  XP-5                 Up                                       
553cd58e-2295-4995-a4ee-71724f63ee49  23984  XP-6                 Paused                                   
50737895-2cee-42aa-8aaf-734e7891a99b  25423  XP-9                 Up                                       
985d5a5b-41ed-4b51-8f02-6886a4e3b223  24082  XP-7                 Up                                       
68640442-defe-4186-a67a-974fa33dfcf5  23488  XP-3                 Up                                       
7c4ee4f9-31bf-4dcd-8ca3-57d3988a1bbf  23787  XP-4                 Up                                       
1845bf08-b103-421a-aeb6-127d22486e30  23684  XP-2                 Up                                       
fc0643e6-dddc-4662-b3a3-a8b3b27924fd  23189  XP-1                 Up       

[root@gold-vdsd tmp]# virsh -r list
 Id    Name                           State
----------------------------------------------------
 71    XP-1                           running
 72    XP-3                           running
 73    XP-2                           running
 74    XP-4                           running
 75    XP-10                          running
 76    XP-6                           paused
 77    XP-8                           running
 78    XP-7                           running
 79    XP-5                           running
 80    XP-9                           running
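
To pick the stuck guest out of a listing like the one above, the output can be filtered on the State column. A minimal sketch, assuming the three-column `virsh list` layout shown here and guest names without spaces; `paused_domains` is a hypothetical helper name, not part of virsh.

```shell
# Print the names of domains whose State column is "paused".
# Reads `virsh list` output on stdin and skips the two header lines.
paused_domains() {
    awk 'NR > 2 && $NF == "paused" { print $2 }'
}

# Example: virsh -r list | paused_domains
```

For the listing above this would print only XP-6.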


-bash-4.1$ qemu-img info  /rhev/data-center/f2b5703d-6449-461d-a837-2bfd9dcf0201/2045e517-a65b-437d-8b2b-45018a5aaa23/images/7c68816e-51bf-4c98-bb18-2eb775f763c2/33ec3754-1617-454a-8ed6-6fdfdb5967a0
image: /rhev/data-center/f2b5703d-6449-461d-a837-2bfd9dcf0201/2045e517-a65b-437d-8b2b-45018a5aaa23/images/7c68816e-51bf-4c98-bb18-2eb775f763c2/33ec3754-1617-454a-8ed6-6fdfdb5967a0
file format: qcow2
virtual size: 15G (16106127360 bytes)
disk size: 334M
cluster_size: 65536
backing file: ../7c68816e-51bf-4c98-bb18-2eb775f763c2/3e03e69e-4e92-4ba3-ace5-2b02bae9e929 (actual path: /rhev/data-center/f2b5703d-6449-461d-a837-2bfd9dcf0201/2045e517-a65b-437d-8b2b-45018a5aaa23/images/7c68816e-51bf-4c98-bb18-2eb775f763c2/../7c68816e-51bf-4c98-bb18-2eb775f763c2/3e03e69e-4e92-4ba3-ace5-2b02bae9e929)

bash-4.1$ qemu-img check  /rhev/data-center/f2b5703d-6449-461d-a837-2bfd9dcf0201/2045e517-a65b-437d-8b2b-45018a5aaa23/images/7c68816e-51bf-4c98-bb18-2eb775f763c2/33ec3754-1617-454a-8ed6-6fdfdb5967a0
No errors were found on the image.

Comment 2 Chao Yang 2012-08-27 07:19:51 UTC

FYI:
Bug 740509 - cannot resume vm's that were paused due to disconnection to SD in NFS storage type

Comment 3 Dor Laor 2012-09-02 14:58:00 UTC

(In reply to comment #2)
> FYI:
> Bug 740509 - cannot resume vm's that were paused due to disconnection to SD
> in NFS storage type

Thanks for finding the exact source of the same bug!
Dafna, do you agree to close this as a duplicate?
Since it's only about Windows XP + IDE + a rare case for the storage, I'd rather keep on postponing (closing) this case too.
Dor

Comment 4 Dafna Ron 2012-09-02 15:05:07 UTC

sure. it's your call Dor :)

Comment 5 Ademar Reis 2012-10-15 16:07:16 UTC

(In reply to comment #4)
> sure. it's your call Dor :)

Done, thanks.

*** This bug has been marked as a duplicate of bug 740509 ***
