Bug 565471 - KVM: qemu crash on ide_write_dma_cb()
Summary: KVM: qemu crash on ide_write_dma_cb()
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5
Hardware: x86_64
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Kevin Wolf
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: Rhel5KvmTier2
TreeView+ depends on / blocked
 
Reported: 2010-02-15 12:25 UTC by Yaniv Kaul
Modified: 2013-07-04 01:45 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-11-02 08:56:57 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Yaniv Kaul 2010-02-15 12:25:42 UTC
Description of problem:
I'm not sure exactly how it happened, it was after migration (not sure it's related), during light IO, with qcow2 over NFS (running a XP VM).
GDB output:

Program terminated with signal 11, Segmentation fault.
#0  ide_write_dma_cb (opaque=0x12368e20, ret=0) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/hw/ide.c:1124
1124        n = s->io_buffer_size >> 9;
(gdb) bt
#0  ide_write_dma_cb (opaque=0x12368e20, ret=0) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/hw/ide.c:1124
#1  0x000000000049a7da in qcow_aio_write_cb (opaque=0x12368f70, ret=0) at block-qcow2.c:1558
#2  0x0000000000419757 in posix_aio_read (opaque=<value optimized out>) at block-raw-posix.c:512
#3  0x00000000004639b1 in qemu_aio_wait () at aio.c:179
#4  0x0000000000461e5d in bdrv_write_em (bs=<value optimized out>, sector_num=<value optimized out>, buf=<value optimized out>, nb_sectors=<value optimized out>) at block.c:1626
#5  0x000000000042b241 in ide_sector_write (s=0x12368260) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/hw/ide.c:1040
#6  0x00000000004feac1 in kvm_outw (opaque=<value optimized out>, addr=0, data=0) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:690
#7  0x000000000052a16f in handle_io (kvm=0x11fbece0, run=0x2aaaab1ac000, vcpu=0) at libkvm.c:739
#8  0x000000000052aa22 in kvm_run (kvm=0x11fbece0, vcpu=0, env=0x120a2ba0) at libkvm.c:964
#9  0x00000000004ff229 in kvm_cpu_exec (env=0x0) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:206
#10 0x00000000004ff4b3 in kvm_main_loop_cpu (env=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:402
#11 ap_main_loop (env=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:443
#12 0x00002af902e87617 in start_thread () from /lib64/libpthread.so.0
#13 0x00002af904117d1d in clone () from /lib64/libc.so.6
(gdb) l
1119        if (ret < 0) {
1120            if (ide_handle_rw_error(s, -ret,  BM_STATUS_DMA_RETRY))
1121                return;
1122        }
1123
1124        n = s->io_buffer_size >> 9;
1125        sector_num = ide_get_sector(s);
1126        if (n > 0) {
1127            sector_num += n;
1128            ide_set_sector(s, sector_num);
(gdb) p s
$1 = (IDEState *) 0x0


Version-Release number of selected component (if applicable):
83-156

How reproducible:
Not sure. I just saw a core dump.

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Kevin Wolf 2010-02-17 13:53:49 UTC
Adding some more information from the core dump. The interesting thing is that the BMDMAState has a NULL device, but the device correctly links the BMDMAState. Also, the status is 0 which excludes some bm->ide_if = NULL lines (they all set BM_STATUS_INT) unless the status is cleared later. I don't know yet how it can become NULL during a running request.


(gdb) p *(BMDMAState *)opaque
$1 = {cmd = 0 '\000', status = 0 '\000', addr = 39116536, pci_dev = 0x12368010, cur_addr = 39116544, cur_prd_last = 2147483648, 
  cur_prd_addr = 152395776, cur_prd_len = 0, ide_if = 0x0, dma_cb = 0, aiocb = 0x0, sector_num = 1315487, nsector = 6, bh = 0x0}
(gdb) frame 5
#5  0x000000000042b241 in ide_sector_write (s=0x12368260) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/hw/ide.c:1040
1040        ret = bdrv_write(s->bs, sector_num, s->io_buffer, n);
(gdb) p *s
$2 = {is_cdrom = 0, is_cf = 0, cylinders = 16383, heads = 16, sectors = 63, nb_sectors = 41943040, mult_sectors = 16, identify_set = 1, 
  identify_data = {64, 16383, 0, 16, 32256, 512, 63, 0, 0, 0, 12642, 11618, 13881, 25645, 13107, 12596, 14689, 14641, 13113, 26210, 3, 512, 4, 
    12334, 14638, 12576, 8224, 20805, 19797, 8264, 16722, 17476, 18771, 19232, 8224 <repeats 13 times>, 32784, 1, 2816, 0, 512, 512, 7, 16383, 16, 
    63, 64528, 251, 272, 0, 640, 7, 1031, 0, 120, 120, 120, 120, 0 <repeats 11 times>, 240, 22, 16384, 29696, 16384, 16416, 29696, 16384, 63, 0, 0, 
    0, 0, 24577, 0, 0, 0, 0, 0, 0, 0, 640, 0 <repeats 154 times>}, irq = 0x123444c0, pci_dev = 0x0, bmdma = 0x12368e20, drive_serial = 1, 
  drive_serial_str = "1b-b69d-33149a9139fb", feature = 3 '\003', error = 0 '\000', nsector = 8, sector = 7 '\a', lcyl = 161 '\241', hcyl = 94 '^', 
  hob_feature = 0 '\000', hob_nsector = 0 '\000', hob_sector = 165 '\245', hob_lcyl = 18 '\022', hob_hcyl = 20 '\024', select = 96 '`', 
  status = 80 'P', cmd = 0 '\000', lba48 = 0 '\000', cur_drive = 0x12368260, bs = 0x11fd4010, sense_key = 0 '\000', asc = 0 '\000', 
  cdrom_changed = 0 '\000', packet_transfer_size = 0, elementary_transfer_size = 0, io_buffer_index = 3072, lba = 0, cd_sector_size = 0, 
  atapi_dma = 0, io_buffer_size = 3072, req_nb_sectors = 1, end_transfer_func = 0x42b200 <ide_sector_write>, data_ptr = 0x12397600 "", 
  data_end = 0x12397600 "", io_buffer = 0x12397400 "RCRD(", sector_write_timer = 0x12040ad0, irq_count = 0, ext_error = 0 '\000', mdata_size = 0, 
  mdata_storage = 0x0, media_changed = 1}
(gdb) p s->bmdma 
$3 = (struct BMDMAState *) 0x12368e20

Comment 2 Kevin Wolf 2010-02-17 14:30:25 UTC
Yaniv, do you know if this was immediately after migration or did it run a while on the new host until it crashed?

Comment 3 Yaniv Kaul 2010-02-17 15:02:41 UTC
(In reply to comment #2)
> Yaniv, do you know if this was immediately after migration or did it run a
> while on the new host until it crashed?    

I'm *almost* sure it ran after migration.

Comment 5 Kevin Wolf 2010-02-23 10:04:07 UTC
Christoph, do you have an idea if this stack trace is expected? (Running a synchronous bdrv_write while there is still an AIO request in flight) I always thought IDE would do only one request at the same time, but then I'm not really familiar with the IDE code.

Anyway, something must have destroyed bm->aiocb (and most other fields of bm) in the meantime.

Comment 7 chellwig@redhat.com 2010-03-28 12:46:06 UTC
The most likely case I can imagine would be a bug in the guest driver (or a bug in our ide emulation causing the guest driver to do something stupid) where the guest does another

WIN_READDMA*/WIN_WRITEDMA* command without waiting for completion on the previous one.  It's easy enough to check for that in the driver, but I'm not familar enough with the ATA and Intel PIIX specs to figure how to return that error to the guest.

Comment 8 Omri Hochman 2010-04-29 09:44:11 UTC
reproduced on my environment: 
according to what i see there was no migration but a storage issue on the host caused the guest to pause/resume few time before it crashed,.

"VM paused due to unknown storage error (Error Code: EIO)".
*******************************************************************************

Disconnect host from his storage can be done by running: 
-  iptables -A OUTPUT -p tcp -d 10.35.255.101 -j REJECT   

(the actino takes few minutes ..) 

to reverse the action and reconnect back the host with storage run:
-  iptables -D OUTPUT -p tcp -d 10.35.255.101 -j REJECT

Note: only in case the host is known as SPM the Vm's running on it will switch to paused after it been disconnected from storage , otherwise all VM's will be migrated to another host in the cluster.

Comment 9 Kevin Wolf 2010-11-02 08:56:57 UTC
This doesn't seem to be reproducible, there wasn't any activity since April and the data collected in this bug doesn't really make clear what's going on. Closing as suggested by Miroslav Matousek.


Note You need to log in before you can comment on or make changes to this bug.