Bug 537075 - qcow2: infinite recursion on grow_refcount_table() error handling
Summary: qcow2: infinite recursion on grow_refcount_table() error handling
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Assignee: Kevin Wolf
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 549747 551387
Depends On:
Blocks: 552159
 
Reported: 2009-11-12 11:57 UTC by Eduardo Habkost
Modified: 2013-01-09 22:00 UTC (History)

Fixed In Version: kvm-83-136.el5
Doc Type: Bug Fix
Doc Text:
Clone Of: 520693
Environment:
Last Closed: 2010-03-30 07:52:34 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2010:0271 | 0 | normal | SHIPPED_LIVE | Important: kvm security, bug fix and enhancement update | 2010-03-29 13:19:48 UTC

Description Eduardo Habkost 2009-11-12 11:57:43 UTC
This bug is for one of the issues found while testing Bug #520693: the infinite recursion on grow_refcount_table() error handling.

+++ This bug was initially created as a clone of Bug #520693 +++
[...]

--- Additional comment from rsibley on 2009-10-12 13:00:48 EDT ---

VM:

Testing with IOmeter on a VM: w2k3 r2 sp2 with all of the latest updates, 1024MB memory, 1 CPU, VIRTIO drivers.

Physical Disk 1: size:20GB  Actual size:12GB  Format:COW  Allocation:Sparse

IOmeter 2006.07.27, Cycle # Outstanding I/Os -- run step outstanding I/Os on all disks at a time,  Exponential Stepping start @1 end @64   Power of 2.
Sequential Writes transfer request size @ 32KB and 64KB.
Maximum Disk Size 0
Starting Disk Sector 0

Rhevm:

RHEV Manager 2.1.0.35992, perf26 2.1.0.35993 sp209

HOST:

CPU Name:       64-bit Intel with NX  
CPU Type:       Genuine Intel(R) CPU    @2.40GHz
Number of CPUs: 16
Memory:         32166 MB 

MSA 1000, 10 disk RAID 0 LUN.
QLA2xxx

Linux perf26.lab.bos.redhat.com 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

etherboot-zroms-kvm-5.4.4-11.el5
kmod-kvm-83-125.el5
kvm-qemu-img-83-125.el5
kvm-debuginfo-83-125.el5
kvm-tools-83-125.el5
kvm-83-125.el5

Error:

The VM crashes or hangs with the following information.

Using IOmeter Sequential Writes @ 32K, 64K transfer request size.

On RHEV-M I get the following errors:

Oct 12, 9:42 VM w2k3-io-125 is down. Exit message Lost connection with kvm process.
Oct 12, 9:39 VM w2k3-io-125 has paused due to no Storage space error.

vdsm.log:
http://10.16.18.203/rhevm/iometer_sequential_wrts_error_vdsm.log

--- Additional comment from rsibley on 2009-10-12 15:08:10 EDT ---

Added information for IOmeter Sequential Writes @ 32K/64K transfer request size
error:


dmesg | grep kvm
kvm: virtualization flags detected on this hardware: vmx tpr_shadow vnmi
flexpriority
loaded kvm module (kvm-83-125.el5)
kvm: emulating exchange as write
qemu-kvm[8932]: segfault at 00000000414c4ff0 rip 00000000004195e0 rsp
00000000414c5008 error 6
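
As a reading aid for the segfault line above, here is a minimal sketch that decodes the reported values. It assumes the standard x86 page-fault error code bits and 4 KiB pages; the helper names (decode_pf_error, bytes_below_sp, on_different_pages) are made up for this illustration and come from neither the kernel nor qemu.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits, as printed after "error" in the log line. */
#define PF_PROT  0x1u   /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2u   /* 0: read access,      1: write access         */
#define PF_USER  0x4u   /* 0: kernel mode,      1: user mode            */

/* Assumption: 4 KiB pages, the usual base page size on x86_64. */
#define PAGE_SIZE_BYTES 4096u

struct fault_info {
    bool page_present;
    bool is_write;
    bool user_mode;
};

/* Split the numeric error code into its three low flag bits. */
struct fault_info decode_pf_error(unsigned code)
{
    struct fault_info fi;
    fi.page_present = (code & PF_PROT) != 0;
    fi.is_write     = (code & PF_WRITE) != 0;
    fi.user_mode    = (code & PF_USER) != 0;
    return fi;
}

/* Distance in bytes from the faulting address up to the stack pointer. */
int64_t bytes_below_sp(uint64_t fault_addr, uint64_t sp)
{
    return (int64_t)(sp - fault_addr);
}

/* True if the faulting address and the stack pointer lie on different pages. */
bool on_different_pages(uint64_t fault_addr, uint64_t sp)
{
    return fault_addr / PAGE_SIZE_BYTES != sp / PAGE_SIZE_BYTES;
}
```

With the values from the log (fault address 0x414c4ff0, rsp 0x414c5008, error 6), this gives a user-mode write to a page that is not present, 24 bytes below rsp and on the next lower page. That pattern is consistent with a thread running out of stack, although the log line alone does not prove it.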

--- Additional comment from ykaul on 2009-10-12 15:17:42 EDT ---

(In reply to comment #28)
> Added information for IOmeter Sequential Writes @ 32K/64K transfer request size
> error:
> 
> 
> dmesg | grep kvm
> kvm: virtualization flags detected on this hardware: vmx tpr_shadow vnmi
> flexpriority
> loaded kvm module (kvm-83-125.el5)
> kvm: emulating exchange as write
> qemu-kvm[8932]: segfault at 00000000414c4ff0 rip 00000000004195e0 rsp
> 00000000414c5008 error 6  

Isn't there a core dump available? /var/log/core/ ?

[...]

--- Additional comment from lihuang on 2009-10-14 21:24:01 EDT ---

I was unable to reproduce the segfault.

bt from the core file :
core.8870.1255374060.dump.1

Core was generated by `/usr/libexec/qemu-kvm -no-hpet -no-kvm-pit-reinjection -usbdevice tablet -rtc-t'.
Program terminated with signal 11, Segmentation fault.
[New process 8932]
[New process 10156]
[New process 10149]
[New process 10148]
[New process 9937]
[New process 9931]
[New process 9721]
[New process 9523]
[New process 9369]
[New process 8934]
[New process 8906]
[New process 8870]
#0  raw_pwrite_aligned (bs=0x162a27a0, offset=14002786304, buf=0x162a3200 "", 
    count=16384) at block-raw-posix.c:250
250     {
(gdb) bt
#0  raw_pwrite_aligned (bs=0x162a27a0, offset=14002786304, buf=0x162a3200 "", 
    count=16384) at block-raw-posix.c:250
#1  0x0000000000419711 in raw_pwrite (bs=0x162a27a0, offset=14002786304, 
    buf=0x2aaac3bd61a0 "", count=20480) at block-raw-posix.c:405
#2  0x0000000000462551 in bdrv_pwrite (bs=0x162a27a0, offset=14002786304, 
    buf1=0x2aaac3bd61a0, count1=20480) at block.c:825
#3  0x00000000004977a1 in update_refcount (bs=0x162a1c60, 
    offset=<value optimized out>, length=<value optimized out>, addend=-1)
    at block-qcow2.c:2541
#4  0x0000000000497811 in update_refcount (bs=0x162a1c60, 
    offset=<value optimized out>, length=<value optimized out>, addend=-1)
    at block-qcow2.c:2563
#5  0x0000000000497811 in update_refcount (bs=0x162a1c60, 
    offset=<value optimized out>, length=<value optimized out>, addend=-1)
    at block-qcow2.c:2563
#6  0x0000000000497811 in update_refcount (bs=0x162a1c60, 
    offset=<value optimized out>, length=<value optimized out>, addend=-1)
    at block-qcow2.c:2563
#7  0x0000000000497811 in update_refcount (bs=0x162a1c60, 
    offset=<value optimized out>, length=<value optimized out>, addend=-1)
    at block-qcow2.c:2563
#8  0x0000000000497811 in update_refcount (bs=0x162a1c60, 
    offset=<value optimized out>, length=<value optimized out>, addend=-1)



core.30972.1255354893.dump.1
Core was generated by `/usr/libexec/qemu-kvm -no-hpet -no-kvm-pit-reinjection -usbdevice tablet -rtc-t'.
Program terminated with signal 11, Segmentation fault.
[New process 31035]
[New process 32628]
[New process 32618]
[New process 32617]
[New process 32257]
[New process 32229]
[New process 32027]
[New process 31868]
[New process 31839]
[New process 31037]
[New process 30973]
[New process 30972]
#0  raw_pwrite_aligned (bs=0xaa9d7a0, offset=14002806784, buf=0xaa9e200 "", count=16384)
    at block-raw-posix.c:250
250     {
(gdb) bt
#0  raw_pwrite_aligned (bs=0xaa9d7a0, offset=14002806784, buf=0xaa9e200 "", count=16384)
    at block-raw-posix.c:250
#1  0x0000000000419711 in raw_pwrite (bs=0xaa9d7a0, offset=14002806784, buf=0x2aaab865e020 "", count=20480)
    at block-raw-posix.c:405
#2  0x0000000000462551 in bdrv_pwrite (bs=0xaa9d7a0, offset=14002806784, buf1=0x2aaab865e020, count1=20480)
    at block.c:825
#3  0x00000000004977a1 in update_refcount (bs=0xaa9cc60, offset=<value optimized out>, 
    length=<value optimized out>, addend=-1) at block-qcow2.c:2541
#4  0x0000000000497811 in update_refcount (bs=0xaa9cc60, offset=<value optimized out>, 
    length=<value optimized out>, addend=-1) at block-qcow2.c:2563
#5  0x0000000000497811 in update_refcount (bs=0xaa9cc60, offset=<value optimized out>, 
    length=<value optimized out>, addend=-1) at block-qcow2.c:2563
#6  0x0000000000497811 in update_refcount (bs=0xaa9cc60, offset=<value optimized out>, 
    length=<value optimized out>, addend=-1) at block-qcow2.c:2563
#7  0x0000000000497811 in update_refcount (bs=0xaa9cc60, offset=<value optimized out>, 
    length=<value optimized out>, addend=-1) at block-qcow2.c:2563
#8  0x0000000000497811 in update_refcount (bs=0xaa9cc60, offset=<value optimized out>, 
    length=<value optimized out>, addend=-1) at block-qcow2.c:2563
#9  0x0000000000497811 in update_refcount (bs=0xaa9cc60, offset=<value optimized out>, 
    length=<value optimized out>, addend=-1) at block-qcow2.c:2563
#10 0x0000000000497811 in update_refcount (bs=0xaa9cc60, offset=<value optimized out>, 
    length=<value optimized out>, addend=-1) at block-qcow2.c:2563
#11 0x0000000000497811 in update_refcount (bs=0xaa9cc60, offset=<value optimized out>,
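
Both backtraces show update_refcount() re-entering itself from block-qcow2.c:2563 with addend=-1 until the stack runs out. The sketch below illustrates that general failure pattern only: an error path that cleans up by calling back into the function that just failed. It is not the actual block-qcow2.c code or the shipped patch. All names (fake_metadata_write, update_unsafe, update_guarded, DEMO_MAX_DEPTH) are hypothetical, and the depth cap exists only so the demo terminates.

```c
#include <errno.h>
#include <stdbool.h>

/* Demo-only cap so the unsafe variant stops instead of overflowing the stack. */
#define DEMO_MAX_DEPTH 1000

/* Deepest recursion level reached by the last call, for inspection. */
int max_depth_seen;

/* Hypothetical stand-in for a metadata write that keeps failing (disk full). */
int fake_metadata_write(void)
{
    return -ENOSPC;
}

/* Unsafe pattern: on failure, the cleanup re-enters the same function,
 * which hits the same failing write and tries to clean up again. */
int update_unsafe(int addend, int depth)
{
    if (depth > max_depth_seen)
        max_depth_seen = depth;
    if (depth >= DEMO_MAX_DEPTH)
        return -ELOOP;                      /* demo-only bail-out */

    int ret = fake_metadata_write();
    if (ret < 0) {
        update_unsafe(addend, depth + 1);   /* cleanup recurses without bound */
        return ret;
    }
    return 0;
}

/* Guarded pattern: a cleanup call never starts another cleanup,
 * so the nesting is limited to one extra level. */
int update_guarded(int addend, bool in_cleanup, int depth)
{
    if (depth > max_depth_seen)
        max_depth_seen = depth;

    int ret = fake_metadata_write();
    if (ret < 0) {
        if (!in_cleanup)
            update_guarded(addend, true, depth + 1);
        return ret;
    }
    return 0;
}
```

Calling update_unsafe(-1, 0) recurses until the demo cap is reached, whereas update_guarded(-1, false, 0) nests only one level before returning the error. Frame #0 in both dumps stops at the opening brace of raw_pwrite_aligned() (line 250), which fits a stack overflow on function entry.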

Comment 6 Eduardo Habkost 2009-12-30 14:40:21 UTC
*** Bug 551387 has been marked as a duplicate of this bug. ***

Comment 8 Gleb Natapov 2010-01-05 09:55:03 UTC
*** Bug 549747 has been marked as a duplicate of this bug. ***

Comment 9 lihuang 2010-01-14 04:43:45 UTC
Verified in kvm-83-144.el5.

steps :
1. Create a logical volume:
   [root@t199 Desktop]# lvs
  LV     VG     Attr   LSize Origin Snap%  Move Log Copy%  Convert
  lvtest vgtest -wi-ao 3.00G                                      

2. #qemu-img create -f qcow2 /dev/vgtest/lvtest 20G

3. Start a VM with the host device:
   #/usr/libexec/qemu-kvm -m 2048 -smp 2 -drive file=vm.test,werror=stop -drive file=/dev/vgtest/lvtest,werror=stop,if=virtio,format=qcow2 -monitor stdio -vnc :1 -net nic,macaddr=00:21:04:58:92:D3 -net tap -usbdevice tablet

4. In the guest, use dd to fill up the attached disk: # dd if=/dev/zero of=/dev/vda1 bs=1M

5. Wait until the VM pauses on 'no space'.

6. Extend the logical volume: #lvextend -L 4G /dev/vgtest/lvtest

7. In the qemu monitor, enter 'cont' to resume the paused VM.

8. Repeat steps 5~7 five times, enlarging the logical volume by 1G each time.

Result: no segfault.
        The VM runs normally before each 'no space' pause.

(Can be reproduced in kvm-83-105.el5_4.13, which doesn't have the patch.)

Comment 12 errata-xmlrpc 2010-03-30 07:52:34 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0271.html

