Bug 1285928
Summary: | linux-aio aborts on io_submit() failure | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Han Han <hhan> | ||||||||
Component: | qemu-kvm-rhev | Assignee: | Kevin Wolf <kwolf> | ||||||||
Status: | CLOSED ERRATA | QA Contact: | CongLi <coli> | ||||||||
Severity: | medium | Docs Contact: | |||||||||
Priority: | medium | ||||||||||
Version: | 7.2 | CC: | aliang, chayang, coli, dyuan, hachen, hhan, huding, jsuchane, jtomko, juzhang, knoel, kwolf, meyang, michen, mkolaja, mrezanin, ngu, pbonzini, pingl, rbalakri, virt-maint, xfu, xuwei, xuzhang, yanyang | ||||||||
Target Milestone: | rc | ||||||||||
Target Release: | --- | ||||||||||
Hardware: | x86_64 | ||||||||||
OS: | Linux | ||||||||||
Whiteboard: | |||||||||||
Fixed In Version: | qemu-kvm-rhev-2.8.0-1.el7 | Doc Type: | If docs needed, set a value | ||||||||
Doc Text: | Story Points: | --- | |||||||||
Clone Of: | Environment: | ||||||||||
Last Closed: | 2017-08-01 23:29:42 UTC | Type: | Bug | ||||||||
Regression: | --- | Mount Type: | --- | ||||||||
Documentation: | --- | CRM: | |||||||||
Verified Versions: | Category: | --- | |||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Embargoed: | |||||||||||
Bug Depends On: | |||||||||||
Bug Blocks: | 1401400 | ||||||||||
Attachments: |
|
Moving to qemu-kvm-rhev for further analysis of the qemu-kvm crash. I'm sorry that the lock space in comment1 should be scsi_lockspace_dir = "/var/lib/libvirt/lockd/scsivolumes". But the bug can be reproduced,too. One more issue is related to the bug. But it is no always reproduced. 1. # vim /etc/libvirt/qemu-lockd.conf auto_disk_leases = 1 require_lease_for_disks = 1 file_lockspace_dir = "/var/lib/libvirt/lockd/files" 2. # systemctl restart virtlockd;systemctl restart libvirtd prepare 2 guest n1 n2 Cold plug a disk file to n1 and n2: <disk type='file' device='disk'> <driver name='qemu' type='raw'/> <source file='/var/lib/libvirt/images/test.img'/> <backingStore/> <target dev='hda' bus='ide'/> <alias name='ide0-0-0'/> </disk> start n1 n2, n2 will be start failure due to lockd # virsh start n1;virsh start n2 Domain n1 started error: Failed to start domain n2 error: resource busy: Lockspace resource '06bbb3c2da705d3532bcf1066e5058c5d29ff9d154697073b3818ec1da3b6fd3' is locked login n1, check dmesg info, if there is ABRT or error 3. # dmesg|grep ata -i get error like this: [ 1.572718] libata version 3.00 loaded. [ 1.657973] ata_piix 0000:00:01.1: version 2.13 [ 1.663004] scsi host0: ata_piix [ 1.666029] scsi host1: ata_piix [ 1.666393] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc120 irq 14 [ 1.667082] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc128 irq 15 [ 1.819109] ata1.01: NODEV after polling detection [ 1.819329] ata1.00: ATA-7: QEMU HARDDISK, 2.3.0, max UDMA/100 [ 1.819958] ata1.00: 1024000 sectors, multi 16: LBA48 [ 1.820938] ata1.00: configured for MWDMA2 [ 1.821442] scsi 0:0:0:0: Direct-Access ATA QEMU HARDDISK 0 PQ: 0 ANSI: 5 [ 1.848895] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [ 1.849614] ata1.00: BMDMA stat 0x4 [ 1.850010] ata1.00: failed command: READ DMA : [ 1.853789] ata1.00: configured for MWDMA2 [ 1.854272] ata1: EH complete [ 1.864919] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [ 1.865656] ata1.00: BMDMA stat 0x4 [ 1.866039] ata1.00: failed command: READ DMA n [ 1.868118] ata1.00: status: { DRDY ERR } [ 1.868536] ata1.00: error: { ABRT } [ 1.869656] ata1.00: configured for MWDMA2 [ 1.870257] ata1: EH complete [ 1.880942] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [ 1.881653] ata1.00: BMDMA stat 0x4 [ 1.882045] ata1.00: failed command: READ DMA And the disk newly added cannot read and write. Using lvm lockspace, not reproduced the problem. Created attachment 1100342 [details]
The dmesg info
Paolo, do you remember why we abort() on io_submit() failure? Of course, I also can't see why it would fail in the first place (other than -EAGAIN), but maybe abort() is a bit too much. Han, can you please provide the output of "bt full", so that the local variables are included in the stack trace? I'm afraid that the interesting part will be optimised out anyway, but it's worth a try. If this doesn't reveal the io_submit() error code because 'ret' is optimised out, attaching strace to qemu just before it fails could help us to understand why io_submit() fails. (gdb) bt full #0 0x00007f5b3c9af5f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 resultvar = 0 pid = 5430 selftid = 5456 #1 0x00007f5b3c9b0ce8 in __GI_abort () at abort.c:90 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x0} sigs = {__val = {32, 0 <repeats 15 times>}} #2 0x00007f5b4478ffc6 in ioq_submit (s=s@entry=0x7f5b47302000) at block/linux-aio.c:208 ret = <optimized out> len = 1 aiocb = <optimized out> iocbs = {0x7f5b46a33030, 0x0 <repeats 95 times>, 0x7f0, 0x7f5b3fca6048 <(anonymous namespace)::do_memalign(size_t, size_t)+40>, 0x0, 0x7f0, 0x7f5b4c8f7a90, 0x800, 0xa0, 0x1, 0xa, 0x7f5b3fcc6772 <tc_posix_memalign(void**, size_t, size_t)+66>, 0xa0, 0x48, 0x800, 0x7f5b42965e88 <slab_allocator_alloc_chunk+136>, 0x7f5b46a33000, 0x844cd79ec949b600, 0x0, 0x7f5b46a32000, 0x98, 0x7f5b46a320a0, 0xa0, 0x7f5b429b096d <g_slice_alloc+1357>, 0x0, 0x7f5b46a33000, 0x90, 0x7f5b46a33500, 0x0, 0x7f5b44bc4e60 <laio_aiocb_info>, 0x7f5b472f8000, 0x7f5b44751ef0 <bdrv_co_io_em_complete>, 0x7f5b4c8f7c30, 0x0} completed = {sqh_first = 0x0, sqh_last = 0x0} #3 0x00007f5b447903ac in laio_submit (bs=bs@entry=0x7f5b472f8000, aio_ctx=0x7f5b47302000, fd=20, sector_num=sector_num@entry=0, qiov=qiov@entry=0x7f5b483dd7f8, nb_sectors=nb_sectors@entry=1, cb=cb@entry=0x7f5b44751ef0 <bdrv_co_io_em_complete>, opaque=opaque@entry=0x7f5b4c8f7c30, type=type@entry=1) at block/linux-aio.c:277 s = 0x7f5b47302000 laiocb = 0x7f5b46a33000 iocbs = 0x7f5b46a33030 offset = 0 __func__ = "laio_submit" #4 0x00007f5b4478e086 in raw_aio_submit (bs=0x7f5b472f8000, sector_num=0, qiov=0x7f5b483dd7f8, nb_sectors=1, cb=0x7f5b44751ef0 <bdrv_co_io_em_complete>, opaque=0x7f5b4c8f7c30, type=type@entry=1) at block/raw-posix.c:1317 s = 0x7f5b469a6580 #5 0x00007f5b4478e0d0 in raw_aio_readv (bs=<optimized out>, sector_num=<optimized out>, qiov=<optimized out>, nb_sectors=<optimized out>, cb=<optimized out>, opaque=<optimized out>) at block/raw-posix.c:1361 No locals. #6 0x00007f5b44751fa6 in bdrv_co_io_em (bs=0x7f5b472f8000, sector_num=0, nb_sectors=1, iov=0x7f5b483dd7f8, is_write=<optimized out>) at block.c:5153 co = {coroutine = 0x7f5b4730a880, ret = 0} acb = <optimized out> #7 0x00007f5b4475700d in bdrv_aligned_preadv (bs=bs@entry=0x7f5b472f8000, req=req@entry=0x7f5b4c8f7d70, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=512, qiov=qiov@entry=0x7f5b483dd7f8, flags=flags@entry=0) at block.c:3090 total_sectors = 83886080 max_nb_sectors = 83886080 drv = 0x7f5b44c0f1e0 <bdrv_host_device> ret = <optimized out> sector_num = 0 nb_sectors = 1 __PRETTY_FUNCTION__ = "bdrv_aligned_preadv" #8 0x00007f5b44757313 in bdrv_co_do_preadv (bs=0x7f5b472f8000, offset=0, bytes=512, qiov=0x7f5b483dd7f8, flags=(unknown: 0)) at block.c:3193 drv = <optimized out> req = {bs = 0x7f5b472f8000, offset = 0, bytes = 512, is_write = false, serialising = false, overlap_offset = 0, overlap_bytes = 512, list = {le_next = 0x0, le_prev = 0x7f5b472fb290}, co = 0x7f5b4730a880, wait_queue = {entries = {tqh_first = 0x0, tqh_last = 0x7f5b4c8f7db0}}, waiting_for = 0x0} align = 512 head_buf = 0x0 tail_buf = 0x0 local_qiov = {iov = 0x0, niov = 0, nalloc = 0, size = 0} use_local_qiov = false ret = <optimized out> #9 0x00007f5b4475700d in bdrv_aligned_preadv (bs=bs@entry=0x7f5b46a8c800, req=req@entry=0x7f5b4c8f7f00, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=512, qiov=qiov@entry=0x7f5b483dd7f8, flags=flags@entry=0) at block.c:3090 total_sectors = 83886080 max_nb_sectors = 83886080 drv = 0x7f5b44c0b540 <bdrv_raw> ret = <optimized out> sector_num = 0 nb_sectors = 1 __PRETTY_FUNCTION__ = "bdrv_aligned_preadv" #10 0x00007f5b44757313 in bdrv_co_do_preadv (bs=bs@entry=0x7f5b46a8c800, offset=0, bytes=512, qiov=0x7f5b483dd7f8, flags=(unknown: 0)) at block.c:3193 drv = <optimized out> req = {bs = 0x7f5b46a8c800, offset = 0, bytes = 512, is_write = false, serialising = false, overlap_offset = 0, overlap_bytes = 512, list = {le_next = 0x0, le_prev = 0x7f5b46a8fa90}, co = 0x7f5b4730a880, wait_queue = {entries = {tqh_first = 0x0, tqh_last = 0x7f5b4c8f7f40}}, waiting_for = 0x0} align = 512 head_buf = 0x0 tail_buf = 0x0 local_qiov = {iov = 0x0, niov = 0, nalloc = 0, size = 0} use_local_qiov = false ret = <optimized out> #11 0x00007f5b447583ff in bdrv_co_do_readv (flags=<optimized out>, qiov=<optimized out>, nb_sectors=<optimized out>, sector_num=<optimized out>, bs=0x7f5b46a8c800) at block.c:3215 No locals. #12 bdrv_co_do_rw (opaque=0x7f5b469dd2d0) at block.c:4994 acb = 0x7f5b469dd2d0 bs = 0x7f5b46a8c800 #13 0x00007f5b4476257a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:80 self = 0x7f5b4730a880 co = 0x7f5b4730a880 #14 0x00007f5b3c9c1110 in ?? () from /usr/lib64/libc-2.17.so No symbol table info available. #15 0x00007f5b3200ffb0 in ?? () No symbol table info available. #16 0x0000000000000000 in ?? () No symbol table info available. More info: If there is a long time interval between the starting of two guests, the problem cannot be reproduced. For example: # virsh start n3; sleep 30;virsh start n4 It's just because we couldn't "see why it would fail in the first place", indeed. We should prepare a new RPM that logs the error (e.g. with g_assert_cmpint). Can you please try to reproduce with the following scratch build? Before it aborts, it should print some debug information to stderr (so it should end up in libvirt's qemu log). Please report back with this output. https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11512442 I installed the scratch version and try to reproduce it by: # for i in n{1,2};do virsh start $i;sleep 1;done Domain n1 started error: Failed to start domain n2 error: resource busy: Lockspace resource '97177c7bd790b481f854131c62cd658a8adceb6d71532de0b609c064fc1d7c2a' is locked n1 keeps running for a while after n2 start failed and then get SIGABRT. The log of n1: 2016-08-08 02:00:24.082+0000: starting up libvirt version: 2.0.0, package: 4.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-08-02-09:15:12, x86-034.build.eng.bos.redhat.com), qemu version: 2.6.0 (qemu-kvm-rhev-2.6.0-19.el7.bz1285928), hostname: lab.test.me LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name guest=n1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-3-n1/master-key.aes -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off,vmport=off -cpu Penryn -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 660932f2-45a5-4756-880b-bd19345dbd46 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-3-n1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/n1.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/sdb,format=raw,if=none,id=drive-ide0-0-0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=28,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:a7:e8:75,bus=pci.0,addr=0x9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5900,tls-port=5901,addr=0.0.0.0,disable-ticketing,x509-dir=/etc/pki/libvirt-spice,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on char device redirected to /dev/pts/3 (label charserial0) io_submit(len = 1) aiocb[0] = read of 0x1 bytes at offset 0 aiocb[0].qiov[0] = 0x7f7f65451000 + 0x200 2016-08-08 02:00:32.670+0000: shutting down The log of n2: 2016-08-08 02:00:25.298+0000: starting up libvirt version: 2.0.0, package: 4.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-08-02-09:15:12, x86-034.build.eng.bos.redhat.com), qemu version: 2.6.0 (qemu-kvm-rhev-2.6.0-19.el7.bz1285928), hostname: lab.test.me LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name guest=n2,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-4-n2/master-key.aes -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off,vmport=off -cpu Penryn -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 0df3dcb5-4fb2-4dec-b51d-926c9b07df6e -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-4-n2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/n2.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/sdb,format=raw,if=none,id=drive-ide0-0-0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=29,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:36:f4:2c,bus=pci.0,addr=0x9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5902,tls-port=5903,addr=0.0.0.0,disable-ticketing,x509-dir=/etc/pki/libvirt-spice,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on char device redirected to /dev/pts/4 (label charserial0) 2016-08-08 02:00:25.381+0000: shutting down 2016-08-08T02:00:25.382691Z qemu-kvm: terminating on signal 15 from pid 6654 (In reply to Han Han from comment #10) > io_submit(len = 1) > aiocb[0] = read of 0x1 bytes at offset 0 > aiocb[0].qiov[0] = 0x7f7f65451000 + 0x200 > 2016-08-08 02:00:32.670+0000: shutting down Thanks, this is the information we're looking for. Unfortunately, I messed it up and forgot to log the most important part (the error code). Sorry about that. Can you try again with a new scratch build and post those lines? https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11517005 # cat /var/log/libvirt/qemu/n1.log 2016-08-09 01:29:47.114+0000: starting up libvirt version: 2.0.0, package: 4.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-08-02-09:15:12, x86-034.build.eng.bos.redhat.com), qemu version: 2.6.0 (qemu-kvm-rhev-2.6.0-19.el7.bz1285928_v2), hostname: lab.test.me LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name guest=n1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-18-n1/master-key.aes -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off,vmport=off -cpu Penryn -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 660932f2-45a5-4756-880b-bd19345dbd46 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-18-n1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/n1.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/sdb,format=raw,if=none,id=drive-ide0-0-0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=34,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:a7:e8:75,bus=pci.0,addr=0x9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5904,tls-port=5905,addr=0.0.0.0,disable-ticketing,x509-dir=/etc/pki/libvirt-spice,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on char device redirected to /dev/pts/13 (label charserial0) io_submit(len = 1) = -13 (Permission denied) aiocb[0] = read of 0x200 bytes at offset 0 aiocb[0] = fd 21, opcode 7, iov = 0x7f7978821490, qiov = 0x7f7978821490, niov = 1 aiocb[0].qiov[0] = 0x7f7977dfe000 + 0x200 2016-08-09 01:29:56.378+0000: shutting down # cat /var/log/libvirt/qemu/n2.log 2016-08-09 01:29:48.374+0000: starting up libvirt version: 2.0.0, package: 4.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-08-02-09:15:12, x86-034.build.eng.bos.redhat.com), qemu version: 2.6.0 (qemu-kvm-rhev-2.6.0-19.el7.bz1285928_v2), hostname: lab.test.me LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name guest=n2,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-19-n2/master-key.aes -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off,vmport=off -cpu Penryn -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 0df3dcb5-4fb2-4dec-b51d-926c9b07df6e -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-19-n2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/n2.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/sdb,format=raw,if=none,id=drive-ide0-0-0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=35,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:36:f4:2c,bus=pci.0,addr=0x9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5906,tls-port=5907,addr=0.0.0.0,disable-ticketing,x509-dir=/etc/pki/libvirt-spice,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on char device redirected to /dev/pts/14 (label charserial0) 2016-08-09 01:29:48.461+0000: shutting down 2016-08-09T01:29:48.462249Z qemu-kvm: terminating on signal 15 from pid 6654 (In reply to Han Han from comment #12) > io_submit(len = 1) = -13 (Permission denied) I am not completely sure, but such permission problems could be related to SELinux; generally EACCES is not an error code that is expected or documented for io_submit(), which operates on an already opened file descriptor. On the other hand, I'd rather expect EPERM from SELinux, but who knows. Is it possible that libvirt revokes the first VM's access permissions to the device as soon as the second one accessing the same device is started? If so, shouldn't libvirt rather refuse to start the second VM? I'm moving this to libvirt for further investigation. Note that qemu should probably be fixed, too, to deal more gracefully with the problem instead of aborting, but a qemu fix alone would still result in I/O errors, so I think we need to fix the root cause primarily. We can create a new qemu BZ for the abort() if we confirm this to be really a libvirt bug. Yes, libvirt cuts off both the VMs from accessing the device. The goal of virtlockd is to prevent simultaneous writes to the same disk by two different domains: https://bugzilla.redhat.com/show_bug.cgi?id=1191802#c8 We have the long-standing bug 547546 against the security driver to remember the security labels. After that is fixed, starting a new domain should fail before changing other domains' disks' labels. Unless you want to keep this bug for tracking qemu's abort(), I'll close it as a duplicate of bug 547546. Thanks, Ján. I'll use it for my purposes then. Han, I proposed a patch upstream to handle io_submit() failure gracefully. I'm building a backport of this patch for you to test at: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11527226 Hi Kevin, I cannot find the download link of you scratch build. Pls check it. Yes, looks like it was already old enough for it to expire. I made a new build with the same patch for you: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11644607 (In reply to Kevin Wolf from comment #17) > Yes, looks like it was already old enough for it to expire. I made a new > build > with the same patch for you: > > https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11644607 After installing your scratch build, did as comment0 does. The VM n4 start failed due to locking and n3 continue running without SIGABRT. logs: # cat /var/log/libvirt/qemu/n3.log 2016-08-26 06:36:14.966+0000: starting up libvirt version: 2.0.0, package: 5.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-08-10-12:57:06, x86-019.build.eng.bos.redhat.com), qemu version: 2.6.0 (qemu-kvm-rhev-2.6.0-19.el7.bz1285928_v3), hostname: lab.work.me LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name guest=n3,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-4-n3/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off,vmport=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 35047947-6221-4001-87ef-440f2f5fe1fe -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-4-n3/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0xb -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0xa -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/n3.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/sdb,format=raw,if=none,id=drive-ide0-0-0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:83:9d:15,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-4-n3/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -device virtio-mouse-pci,id=input1,bus=pci.0,addr=0xc -device virtio-keyboard-pci,id=input2,bus=pci.0,addr=0xd -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x9 -msg timestamp=on # cat /var/log/libvirt/qemu/n4.log 2016-08-26 06:36:16.392+0000: shutting down Thanks, posting the backport now so that it can be merged. Fix included in qemu-kvm-rhev-2.6.0-23.el7 Did it as bug report on libvirt-2.0.0-8.el7.x86_64 qemu-kvm-rhev-2.6.0-24.el7.x86_64, bug was reproduced. id 5d2a593f91ccb7e768ed66f43c3685ab3b5f0468 reason: qemu-kvm killed by SIGABRT time: Mon 12 Sep 2016 18:50:59 CST cmdline: /usr/libexec/qemu-kvm -name guest=n1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-10-n1/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off,vmport=off -cpu Westmere -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 152945f4-476d-4159-be56-e1a2e84ebc0f -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-10-n1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/n1.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/sdb,format=raw,if=none,id=drive-ide0-0-0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:3e:b1:ea,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-10-n1/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on package: qemu-kvm-rhev-2.6.0-24.el7 uid: 107 (qemu) Directory: /var/spool/abrt/ccpp-2016-09-12-18:50:59-25644 Run 'abrt-cli report /var/spool/abrt/ccpp-2016-09-12-18:50:59-25644' for creating a case in Red Hat Customer Portal The Autoreporting feature is disabled. Please consider enabling it by issuing 'abrt-auto-reporting enabled' as a user with root privileges Created attachment 1200188 [details]
Backtrace and logs on latest qemu-kvm
Hm, indeed, I think I see the problem. For a complete fix, we would have to use a bottom half in the error handling path to avoid entering the request coroutine recursively. I'm afraid this is too late for 7.3 then. Should be fixed in 2.8.0 (commit fe121b9, "linux-aio: fix re-entrant completion processing", 2016-09-28). (In reply to Paolo Bonzini from comment #28) > Should be fixed in 2.8.0 (commit fe121b9, "linux-aio: fix re-entrant > completion processing", 2016-09-28). I'm changing it to POST then. Kevin, any comments? (I'm leaving the needinfo(kwolf) here) (In reply to Paolo Bonzini from comment #28) > Should be fixed in 2.8.0 (commit fe121b9, "linux-aio: fix re-entrant > completion processing", 2016-09-28). Yes, had a look, this should be the one. Remember to set lock_manager = "lockd" in /etc/libvirt/qemu.conf and then restart virtlockd and libvirtd. Thanks Han Han's help. Verified this bug on: qemu-kvm-rhev-2.9.0-2.el7.x86_64 Steps: 1. Set lock_manager = "lockd" in /etc/libvirt/qemu.conf 2. Setup lockd environment # vim /etc/libvirt/qemu-lockd.conf auto_disk_leases = 1 require_lease_for_disks = 1 scsi_lockspace_dir = "/var/lib/libvirt/lockd/scsivolumes" # mkdir -p /var/lib/libvirt/lockd/scsivolumes # systemctl restart virtlockd; systemctl restart libvirtd 3. Two guests attach the same scsi device. # vim disk.xml <disk type='block' device='disk'> <driver name='qemu' type='raw' cache='none' io='native'/> <source dev='/dev/sdb'/> <target dev='hda' bus='ide'/> </disk> # for i in n{3,4};do virsh attach-device $i disk.xml --config;done Device attached successfully Device attached successfully 4. Start 2 guest one by one. # for i in n{3,4};do virsh start $i;sleep 1;done Result: n3 start successfully, and n4 locked, which is as expected. # for i in n{3,4};do virsh start $i;sleep 1;done Domain n3 started error: Failed to start domain n4 error: resource busy: Lockspace resource '35333333000002ee0' is locked Set bug to 'VERIFIED' based on the test result above. Thanks. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392 |
Created attachment 1099511 [details] logs Description of problem: In libvirt-lockd environment, start 2 guests pointing the same scsi device. The start of the second guest will cause the first guest abort when then first guest is booting. Version: libvirt-1.2.17-13.el7.x86_64 qemu-kvm-rhev-2.3.0-31.el7.x86_64 kernel-3.10.0-330.el7.x86_64 Steps to Reproduce: 1. Setup lockd environment # vim /etc/libvirt/qemu-lockd.conf auto_disk_leases = 1 require_lease_for_disks = 1 file_lockspace_dir = "/var/lib/libvirt/lockd/scsivolumes" # mkdir -p /var/lib/libvirt/lockd/scsivolumes # systemctl restart virtlockd; systemctl restart libvirtd 2. Two guests attach the same scsi device. # vim disk.xml <disk type='block' device='disk'> <driver name='qemu' type='raw' cache='none' io='native'/> <source dev='/dev/sdb'/> <target dev='hda' bus='ide'/> </disk> # for i in n{3,4};do virsh attach-device $i disk.xml --config;done Device attached successfully Device attached successfully 3. Start 2 guest one by one. # for i in n{3,4};do virsh start $i;sleep 1;done Domain n3 started error: Failed to start domain n4 error: resource busy: Lockspace resource '97177c7bd790b481f854131c62cd658a8adceb6d71532de0b609c064fc1d7c2a' is locked The ABRT has detected 1 problem: # abrt-cli list --since 1448532246 id 5bb9ab8a439e2fd1e6a0bae5334e7ef76dca6940 reason: qemu-kvm killed by SIGABRT time: Thu 26 Nov 2015 06:06:16 PM CST cmdline: /usr/libexec/qemu-kvm -name n3 -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Penryn -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 0b895613-c1e3-4e10-ace1-d111e61be00b -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-n3/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 -drive file=/var/lib/libvirt/images/n3.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/sdb,if=none,id=drive-ide0-0-0,format=raw,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=26 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:ba:ab:e5,bus=pci.0,addr=0x2 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-n3/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on package: qemu-kvm-rhev-2.3.0-31.el7 uid: 107 (qemu) Directory: /var/spool/abrt/ccpp-2015-11-26-18:06:16-27014 Run 'abrt-cli report /var/spool/abrt/ccpp-2015-11-26-18:06:16-27014' for creating a case in Red Hat Customer Portal Additional info: The libvirtd.log n3.log n4.log files are in attachment. When n3 has booted, the start of n4 don't cause n3 abort immediately. But when you execute reboot in n3, it also aborts. The core file back trace: (gdb) bt #0 0x00007f8ff512f5f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f8ff5130ce8 in __GI_abort () at abort.c:90 #2 0x00007f900072ba36 in ioq_submit (s=s@entry=0x7f9001d50800) at block/linux-aio.c:208 #3 0x00007f900072be1c in laio_submit (bs=bs@entry=0x7f90025b4000, aio_ctx=0x7f9001d50800, fd=16, sector_num=sector_num@entry=0, qiov=qiov@entry=0x7f9003770bf8, nb_sectors=nb_sectors@entry=1, cb=cb@entry=0x7f90006ed960 <bdrv_co_io_em_complete>, opaque=opaque@entry=0x7f9007365c30, type=type@entry=1) at block/linux-aio.c:277 #4 0x00007f9000729af6 in raw_aio_submit (bs=0x7f90025b4000, sector_num=0, qiov=0x7f9003770bf8, nb_sectors=1, cb=0x7f90006ed960 <bdrv_co_io_em_complete>, opaque=0x7f9007365c30, type=type@entry=1) at block/raw-posix.c:1317 #5 0x00007f9000729b40 in raw_aio_readv (bs=<optimized out>, sector_num=<optimized out>, qiov=<optimized out>, nb_sectors=<optimized out>, cb=<optimized out>, opaque=<optimized out>) at block/raw-posix.c:1361 #6 0x00007f90006eda16 in bdrv_co_io_em (bs=0x7f90025b4000, sector_num=0, nb_sectors=1, iov=0x7f9003770bf8, is_write=<optimized out>) at block.c:5153 #7 0x00007f90006f2a7d in bdrv_aligned_preadv (bs=bs@entry=0x7f90025b4000, req=req@entry=0x7f9007365d70, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=512, qiov=qiov@entry=0x7f9003770bf8, flags=flags@entry=0) at block.c:3090 #8 0x00007f90006f2d83 in bdrv_co_do_preadv (bs=0x7f90025b4000, offset=0, bytes=512, qiov=0x7f9003770bf8, flags=(unknown: 0)) at block.c:3193 #9 0x00007f90006f2a7d in bdrv_aligned_preadv (bs=bs@entry=0x7f9001d48800, req=req@entry=0x7f9007365f00, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=512, qiov=qiov@entry=0x7f9003770bf8, flags=flags@entry=0) at block.c:3090 #10 0x00007f90006f2d83 in bdrv_co_do_preadv (bs=bs@entry=0x7f9001d48800, offset=0, bytes=512, qiov=0x7f9003770bf8, flags=(unknown: 0)) at block.c:3193 #11 0x00007f90006f3e6f in bdrv_co_do_readv (flags=<optimized out>, qiov=<optimized out>, nb_sectors=<optimized out>, sector_num=<optimized out>, bs=0x7f9001d48800) at block.c:3215 #12 bdrv_co_do_rw (opaque=0x7f9001c57ad0) at block.c:4994 #13 0x00007f90006fdfea in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:80 #14 0x00007f8ff5141110 in ?? () from /usr/lib64/libc-2.17.so #15 0x00007f8fea4f8f30 in ?? () #16 0x0000000000000000 in ?? () So I guess it is something wrong in qemu boot queue functions.