1281520 – [ppc64le] Qemu crashes after writing to a resized disk from the vm.

Summary: [ppc64le] Qemu crashes after writing to a resized disk from the vm.

Keywords:
Status:	CLOSED DUPLICATE of bug 1277922
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	qemu-kvm-rhev
Sub Component:
Version:	7.2
Hardware:	ppc64le
OS:	Unspecified
Priority:	unspecified
Severity:	urgent
Target Milestone:	rc
Target Release:	---
Assignee:	Thomas Huth
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:	1277922
Blocks:	1279052 RHEV4.0PPC RHV4.1PPC

Reported:	2015-11-12 16:26 UTC by Carlos Mestre González
Modified:	2016-07-25 14:18 UTC
CC List:	16 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-11-23 04:59:06 UTC
Target Upstream Version:
Embargoed:

Description Carlos Mestre González 2015-11-12 16:26:15 UTC

Description of problem:
As the summary says, I'm opening a new BZ to investigate the QEMU crash seen in bug https://bugzilla.redhat.com/show_bug.cgi?id=1279052. There is a separate issue with vdsm in that bug, so this one tracks only the QEMU problem.

Version-Release number of selected component (if applicable):
qemu-kvm-common-rhev-2.3.0-31.el7_2.1.ppc64le
qemu-img-rhev-2.3.0-31.el7_2.1.ppc64le
libvirt-daemon-driver-qemu-1.2.17-13.el7.ppc64le
ipxe-roms-qemu-20130517-7.gitc4bce43.el7.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.1.ppc64le
qemu-kvm-tools-rhev-2.3.0-31.el7_2.1.ppc64le


How reproducible:
100%

Steps to Reproduce:
Please see https://bugzilla.redhat.com/show_bug.cgi?id=1279052#c17 (and rest of the bug)

Actual results:
ERROR:qom/object.c:716:object_unref: assertion failed: (obj->ref > 0)

Thus the QEMU process exited because it hit an assert() statement - something called object_unref() with an object that was not referenced anymore.
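
As a toy illustration of what that assertion guards against, here is a small shell model of a reference count. This is not QEMU's code; the names ref, obj_ref and obj_unref are invented for this sketch. Dropping a reference is only valid while the count is above zero, so a second unref on an object whose count has already reached zero trips the check.

```shell
#!/bin/sh
# Toy model of a reference-counted object (NOT QEMU code).
# "ref" plays the role of obj->ref from the assertion message.
ref=1

# Take an additional reference.
obj_ref() {
    ref=$(( ref + 1 ))
}

# Drop a reference. Fails (instead of aborting) when the count is
# already zero, which is the situation the assertion reports.
obj_unref() {
    if [ "$ref" -le 0 ]; then
        echo "assertion failed: (ref > 0)" >&2
        return 1
    fi
    ref=$(( ref - 1 ))
}

obj_unref && echo "first unref ok, ref=$ref"
obj_unref || echo "second unref rejected, ref=$ref"
```

In QEMU the failed check aborts the whole process rather than returning an error, which is why the VM disappears instead of pausing.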

QEMU crashes; see https://bugzilla.redhat.com/show_bug.cgi?id=1279052#c19 and #c22 for the core dump.

Additional info:

Comment 2 Qunfang Zhang 2015-11-13 10:54:26 UTC

Hi, David

Is it possible that this bug is the same issue as bug 1277922? Bug 1279052 comment 22 and bug 1277922 comment 18 look similar.

Comment 3 Laurent Vivier 2015-11-13 13:22:08 UTC

(In reply to Qunfang Zhang from comment #2)
> Hi, David
> 
> Is it possible that this bug is the same issue as bug 1277922? Since bug
> 1279052 comment 22 and bug 1277922 comment 18 looks similar.

It looks like it. The crash happens when the VM is stopped on an I/O error (not enough space) and is resumed after the problem has been fixed.

Comment 4 Shuang Yu 2015-11-13 16:17:38 UTC

I tried to reproduce this issue with "qemu-kvm-rhev-2.3.0-31.el7_2.1.ppc64le". Following the steps below, I did not hit the crash, only this problem:
(qemu) info status
VM status: paused (io-error)

Host version:
qemu-kvm-rhev-2.3.0-31.el7_2.1.ppc64le
kernel-3.10.0-330.el7.ppc64le
SLOF-20150313-5.gitc89b0df.el7.noarch

Steps:

1. On the iSCSI server, create a LUN for the iSCSI client to use.

# qemu-img create -f qcow2 /home/test 1G

# targetcli
..
/backstores/fileio> create file0 /home/test
Created fileio file0 with size 1073741824
/> /iscsi/iqn.2015-10.com.test:server1/tpg1/luns/ create /backstores/fileio/file0


2. On the iSCSI client:

# iscsiadm --mode node --targetname iqn.2015-10.com.test:server1 --portal 10.16.67.19:3260 --login

# fdisk -l
Disk /dev/sdg: 1073 MB, 1073741824 bytes, 2097152 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 8388608 bytes
Disk label type: dos
Disk identifier: 0xc5672dcd

# fdisk /dev/sdg
# mkfs.ext4 /dev/sdg1
# mount /dev/sdg1 /mnt/tmp

3. Create an LVM logical volume on /dev/sdg1

# losetup /dev/loop0 /dev/sdg1

# pvcreate /dev/loop0
  Physical volume "/dev/loop0" successfully created

# vgcreate test /dev/loop0
  Volume group "test" successfully created

# vgchange -ay test
  0 logical volume(s) in volume group "test" now active

# lvcreate -L 500M -n mylvm test
  Logical volume "mylvm" created.

# lvdisplay 
   
  --- Logical volume ---
  LV Path                /dev/test/mylvm
  LV Name                mylvm
  VG Name                test
  LV UUID                eN75cd-80I1-Zrdy-a59F-2ewE-iEEV-xeVENo
  LV Write Access        read/write
  LV Creation host, time ibm-p8-rhevm-16.lab4.eng.bos.redhat.com, 2015-11-13 09:29:37 -0500
  LV Status              available
  # open                 0
  LV Size                500.00 MiB
  Current LE             125
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:3

# qemu-img info /dev/test/mylvm
image: /dev/test/mylvm
file format: raw
virtual size: 500M (524288000 bytes)
disk size: 0

4. Create a qcow2 image on /dev/test/mylvm

# qemu-img create -f qcow2 /dev/test/mylvm 2G
Formatting '/dev/test/mylvm', fmt=qcow2 size=2147483648 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

# qemu-img info /dev/test/mylvm 
image: /dev/test/mylvm
file format: qcow2
virtual size: 2.0G (2147483648 bytes)
disk size: 0
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

5. Boot the guest with /dev/test/mylvm as a data disk

#  /usr/libexec/qemu-kvm -name Bug-reverify -machine pseries,accel=kvm,usb=off -m 4G -smp 8,sockets=2,cores=1,threads=4 -uuid 8aeab7e2-f341-4f8c-80e8-59e2968d85c2 -realtime mlock=off -nodefaults -monitor stdio -rtc base=utc -msg timestamp=on -usb -device usb-tablet,id=tablet1  -vga std -qmp tcp:0:4666,server,nowait -netdev tap,id=hostnet1,script=/etc/qemu-ifup,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:54:5a:52:5f:5c -vnc :10 -device virtio-scsi-pci,id=scsi0,addr=0x6 -drive file=RHEL-7.2-20151030.0-Server-ppc64le.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none -device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,bootindex=1,id=scsi0-0-0-0 -drive file=/dev/test/mylvm,format=qcow2,if=none,id=drive-scsi1,cache=none -device scsi-hd,bus=scsi0.0,drive=drive-scsi1,id=scsi1 

6. Extend the logical volume

# lvextend -L +500M /dev/test/mylvm
  Size of logical volume test/mylvm changed from 500.00 MiB (125 extents) to 1000.00 MiB (250 extents).
  Logical volume mylvm successfully resized.

7. In the guest:

# fdisk /dev/sdb
# mkfs.ext4 /dev/sdb1
# mount /dev/sdb1 /mnt/sdb1

8. In the guest:
# dd if=/dev/urandom of=/mnt/sdb1/file bs=1M count=2048

Actual result:
(qemu) info status
VM status: paused (io-error)
(qemu) info status
VM status: paused (io-error)
(qemu) cont
(qemu) info status
VM status: paused (io-error)
(qemu)
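
For reference, a rough size check of the steps above, written as a small shell sketch. The sizes come from the lvcreate, lvextend and dd commands in this comment; the helper name shortfall_mib is made up for illustration, and qcow2 and ext4 metadata overhead is ignored. If the numbers are read correctly, the volume is still too small for the dd write even after the single lvextend, which would be consistent with the guest staying in "paused (io-error)" after cont.

```shell
#!/bin/sh
# Sizes in MiB, taken from the steps in this comment.
lv_initial=500   # lvcreate -L 500M
lv_extend=500    # lvextend -L +500M
dd_total=2048    # dd bs=1M count=2048

# shortfall_mib LV_SIZE WRITE_SIZE
# Prints how many MiB of the write do not fit in the LV (0 if it fits).
# Metadata overhead is ignored, so the real shortfall is a bit larger.
shortfall_mib() {
    _lv=$1 _want=$2
    if [ "$_want" -gt "$_lv" ]; then
        echo $(( _want - _lv ))
    else
        echo 0
    fi
}

lv_now=$(( lv_initial + lv_extend ))
echo "LV size after lvextend: ${lv_now} MiB"
echo "Shortfall for the dd write: $(shortfall_mib "$lv_now" "$dd_total") MiB"
```

Note that in these steps the lvextend happens before the guest writes anything, so this is not the same sequence as "pause on I/O error, extend, then resume".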

Comment 5 Thomas Huth 2015-11-13 16:50:24 UTC

(In reply to Qunfang Zhang from comment #2)
> Is it possible that this bug is the same issue as bug 1277922? Since bug
> 1279052 comment 22 and bug 1277922 comment 18 looks similar.

I agree with Laurent, it looks similar! I'll do a build with the fix that has been suggested in that bug, then we can (hopefully) check whether it fixes this issue, too.

Comment 9 Carlos Mestre González 2015-11-20 15:19:09 UTC

Hi,

I tested the scenario with the new build and it seems to work, in that the VM *doesn't crash*. On the other hand, in the same scenario the VM pauses with a storage space error, which is a bug too (it could be related to other components), so I'm updating bug 1279052 to see whether there is anything else to do in qemu-kvm-rhev.

Thanks for your build.

Comment 10 David Gibson 2015-11-23 01:41:15 UTC

The whole reproducer is deliberately set up to trigger a pause due to a storage space error, so seeing that initially is not a bug.  If expanding the LV then resuming qemu isn't enough to fix the storage space error and let the guest continue, then there is a problem.

Note that depending on exactly how much disk space you allocate at each stage, it is possible that you could get one storage space pause, lvextend, resume the guest and then it will run for a while before hitting another storage space pause which would require another lvextend.
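
To illustrate the point about repeated pauses, here is a minimal sketch that counts how many pause / lvextend / resume cycles a single large write would need. It is a toy model only: count_pauses is a hypothetical helper, it assumes the guest pauses exactly when the LV is full and that every lvextend adds the same amount, and it ignores image format metadata.

```shell
#!/bin/sh
# count_pauses LV_SIZE EXTEND_STEP TOTAL_WRITE   (all in MiB)
# Prints the number of storage-space pauses hit before the LV is large
# enough to hold TOTAL_WRITE, extending by EXTEND_STEP after each pause.
count_pauses() {
    _lv=$1 _step=$2 _total=$3 _pauses=0
    if [ "$_step" -le 0 ]; then
        echo "extend step must be positive" >&2
        return 1
    fi
    while [ "$_lv" -lt "$_total" ]; do
        _pauses=$(( _pauses + 1 ))   # guest pauses: out of space
        _lv=$(( _lv + _step ))       # admin runs lvextend, then resumes
    done
    echo "$_pauses"
}

# Example using the sizes from comment 4.
echo "pauses needed: $(count_pauses 500 500 2048)"
```

With a 500 MiB LV, 500 MiB added per extend and 2048 MiB written, this toy model needs 4 cycles, so a single lvextend followed by a resume would not be expected to let the write finish.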

Comment 11 David Gibson 2015-11-23 01:44:50 UTC

Sorry, comment 10 above was written with regard to the reproducer for bug 1277922.

I don't know vdsm well enough to be certain, but I'd expect a storage space error along the way for this bug as well. However, I'd expect vdsm to resolve that error (by expanding the space and resuming the guest) without manual intervention.

It does sound to me like this was a duplicate of bug 1277922 as expected.

Comment 12 Thomas Huth 2015-11-23 04:59:06 UTC

I agree with David, this bug here (the QEMU crash) was a duplicate of 1277922, so I'm closing this ticket accordingly. The remaining issue with the VM pause can be tracked in BZ 1279052 instead.

*** This bug has been marked as a duplicate of bug 1277922 ***
