Bug 720903

Summary:	The guest cannot be resumed without any error info when there is overcommit to storage
Product:	Red Hat Enterprise Linux 6	Reporter:	dyuan
Component:	libvirt	Assignee:	Libvirt Maintainers <libvirt-maint>
Status:	CLOSED NOTABUG	QA Contact:	Virtualization Bugs <virt-bugs>
Severity:	high	Docs Contact:
Priority:	high
Version:	6.2	CC:	dallan, mzhan, nzhang, rwu, weizhan, zpeng
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2011-07-13 13:47:50 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description dyuan 2011-07-13 08:04:59 UTC

Description of problem:
The guest is paused automatically when the storage is overcommitted, and it cannot be resumed. No error information is shown.

No error information appears in /var/log/messages or /var/log/libvirt/libvirtd.log.

The error is reported in qemu/$guest.log when the guest is paused:
# cat /var/log/libvirt/qemu/guest.log
block I/O error in device 'drive-virtio-disk0': No space left on device (28)

When I try to resume the guest, the same error appears in qemu/$guest.log again.

Version-Release number of selected component (if applicable):
libvirt-0.9.3-2.el6.x86_64
qemu-kvm-0.12.1.2-2.169.el6.x86_64
kernel-2.6.32-167.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Prepare a small partition for this testing.

# fdisk -l /dev/sda11

Disk /dev/sda11: 11 MB, 11517952 bytes
64 heads, 32 sectors/track, 10 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

# mkfs.ext3 /dev/sda11

2. Define, build and start the pool

<pool type='fs'>
  <name>mypool</name>
  <source>
    <device path='/dev/sda11'/>
    <format type='auto'/>
  </source>
  <target>
    <path>/var/lib/libvirt/images/mypool</path>
  </target>
</pool>

# virsh pool-define mypool.xml

# virsh pool-build mypool

# virsh pool-start mypool

3. Check that the pool is working.

# df -h
/dev/sda11              11M  1.1M  9.0M  11% /var/lib/libvirt/images/mypool

4. Prepare the following XML and create a volume in the pool from it.

# cat vol-disk-template.xml
<volume>
  <name>disk1.img</name>
  <capacity unit='M'>100</capacity>
  <allocation unit='M'>0</allocation>
  <target>
    <path>/var/lib/libvirt/images/mypool/disk1.img</path>
    <format type='raw'/>
  </target>
</volume>

# virsh vol-create mypool vol-disk-template.xml

5. Attach the volume to an existing guest as its second disk, then start the guest.

6. In the guest, write some data (larger than 50M) to the second disk.

# fdisk /dev/vdb

# mkfs.ext3 /dev/vdb1

# mount /dev/vdb1 /mnt

# dd if=/dev/zero of=/mnt/test.img bs=1M count=50

7. Check the guest state.

# virsh list --all
 Id Name                 State
----------------------------------
  6 guest                paused
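
The overcommit set up in steps 3 and 4 (a 100M volume with no allocation in a pool with 9.0M available) can be detected before the guest runs out of space. The following is only a sketch: would_overcommit and image_overcommitted are hypothetical helper names, not libvirt commands, and GNU stat is assumed.

```shell
# Succeed if an image's capacity exceeds what its filesystem can still hold.
# $1: capacity in bytes, $2: bytes already allocated, $3: bytes free
would_overcommit() {
    [ "$1" -gt $(( $2 + $3 )) ]
}

# Gather the three numbers for a raw image file (GNU stat assumed).
# $1: path to the image, e.g. /var/lib/libvirt/images/mypool/disk1.img
image_overcommitted() {
    capacity=$(stat -c %s "$1")                               # apparent size
    allocated=$(( $(stat -c %b "$1") * $(stat -c %B "$1") ))  # blocks * block size
    free=$(( $(df -Pk "$1" | awk 'NR==2 {print $4}') * 1024 ))
    would_overcommit "$capacity" "$allocated" "$free"
}
```

With the figures from this report, would_overcommit 104857600 0 9437184 succeeds, which means the volume can grow past the space the pool has left.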


Expected result:
A message should tell the user that no space is available, and it should be possible to resume the guest successfully.

Actual result:
The guest is paused automatically when the storage is overcommitted, and it cannot be resumed. No error information is shown.

Additional info:

Comment 2 Dave Allan 2011-07-13 13:47:50 UTC

(In reply to comment #0)

Thank you for the detailed bug report--it's very helpful.

> and get the error report in qemu/$guest.log when the guest is paused
> # cat /var/log/libvirt/qemu/guest.log
> block I/O error in device 'drive-virtio-disk0': No space left on device (28)

That's where the error is intended to be reported.  An event is also emitted that carries the same information as the guest.log, namely that the VM was stopped because of an I/O error: event VIR_DOMAIN_EVENT_SUSPENDED, detail VIR_DOMAIN_EVENT_SUSPENDED_IOERROR
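
As a rough illustration, the per-guest log can be checked for this condition from a script. The helper names below are hypothetical, and the match strings are taken from the log line quoted above, so other QEMU versions may word the message differently.

```shell
# Print the block I/O errors QEMU recorded for a guest.
# $1: path to the per-guest log, e.g. /var/log/libvirt/qemu/guest.log
guest_io_errors() {
    grep -F "block I/O error in device" "$1"
}

# Succeed if any recorded block I/O error is "No space left on device".
guest_hit_enospc() {
    guest_io_errors "$1" | grep -qF "No space left on device"
}
```

For example, guest_hit_enospc /var/log/libvirt/qemu/guest.log && echo "guest ran out of disk space" would flag the situation described in this report.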

> Expected result:
> A message should tell the user that no space is available, and it should be
> possible to resume the guest successfully.

The guest has been configured to pause on i/o error, which is why it does not resume, or rather it resumes and then immediately suspends again.  It can't continue because it's out of disk space.  If you extend the underlying storage, the guest will stay running.
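
A minimal sketch of that sequence, assuming the image lives on a host filesystem: check the free space first and only then resume. The helper names and the threshold argument are illustrative. On libvirt versions that support it, virsh domstate --reason should also show why the guest is paused.

```shell
# Free space, in KiB, on the filesystem that holds path $1.
free_kib() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Resume guest $1 only if the filesystem holding image $2 has at least
# $3 KiB free; otherwise the guest would just pause again.
resume_if_space() {
    guest=$1 image=$2 need_kib=$3
    if [ "$(free_kib "$image")" -ge "$need_kib" ]; then
        virsh resume "$guest"
    else
        echo "not resuming $guest: less than $need_kib KiB free for $image" >&2
        return 1
    fi
}
```

After extending the storage, something like resume_if_space guest /var/lib/libvirt/images/mypool/disk1.img 102400 would resume the guest only once about 100M is free.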

Comment 3 Dave Allan 2011-07-13 14:01:26 UTC

BTW, to control the behavior on i/o error, see:

http://libvirt.org/formatdomain.html#elementsDisks

In particular:

The optional error_policy attribute controls how the hypervisor will behave on an error, possible values are "stop", "ignore", and "enospace".
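
As a sketch of how that attribute is set, the helper below prints a disk element whose driver carries the chosen error_policy. The function name disk_xml is hypothetical, and the driver name, type and bus values are assumptions that match the raw virtio disk in this report.

```shell
# Emit a <disk> element with the given error_policy on its <driver>.
# $1: one of stop, ignore, enospace (the values quoted above)
# $2: image path on the host; $3: target device name in the guest
disk_xml() {
    case $1 in
        stop|ignore|enospace) ;;
        *) echo "unknown error_policy: $1" >&2; return 1 ;;
    esac
    cat <<EOF
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' error_policy='$1'/>
  <source file='$2'/>
  <target dev='$3' bus='virtio'/>
</disk>
EOF
}

# Example (not run here):
#   disk_xml enospace /var/lib/libvirt/images/mypool/disk1.img vdb > disk.xml
#   virsh attach-device guest disk.xml
```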