Bug 1160169

Summary: Segfault occurred at Dst VM while completed migration upon ENOSPC
Product: Red Hat Enterprise Linux 7 Reporter: Xu Han <xuhan>
Component: qemu-kvm-rhevAssignee: Stefan Hajnoczi <stefanha>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.1CC: coli, hhuang, juzhang, michen, qiguo, scui, shuang, virt-maint, xfu, xuhan
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.3.0-16.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1249718 1249740 (view as bug list) Environment:
Last Closed: 2015-12-04 16:20:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1249718, 1249740    
Attachments:
Description Flags
autotest debug log
none
QMP log (Src) none

Description Xu Han 2014-11-04 08:51:45 UTC
Description of problem:
Segfault occurred at Dst VM while completed migration upon ENOSPC.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.2-5.el7.x86_64

How reproducible:
1/1

Steps to Reproduce:
1. Launch Src and Dst VM.
2. Migrate from Src to Dst.
3. Trigger ENOSPC during migration.

Actual results:
After migration, segfault occurred at dst VM
[qemu output] /tmp/aexpect/cdyRMNjm/aexpect-1PhA84.sh: line 1:  6333 Segmentation fault      (core dumped)

Expected results:
No crash

Additional info:
QEMU cmdline:
/bin/qemu-kvm \
    -S  \
    -name 'virt-tests-vm1'  \
    -sandbox off  \
    -M pc  \
    -nodefaults  \
    -vga qxl  \
    -global qxl-vga.vram_size=33554432 \
    -device intel-hda,bus=pci.0,addr=03 \
    -device hda-duplex  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20141103-011230-5mgFdDOp,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20141103-011230-5mgFdDOp,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=04  \
    -chardev socket,id=devvs,path=/tmp/virtio_port-vs-20141103-011230-5mgFdDOp,server,nowait \
    -device virtserialport,chardev=devvs,name=vs,id=vs,bus=virtio_serial_pci0.0  \
    -chardev socket,id=seabioslog_id_20141103-011230-5mgFdDOp,path=/tmp/seabios-20141103-011230-5mgFdDOp,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20141103-011230-5mgFdDOp,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=05 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=06 \
    -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,file=/root/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.1-64-virtio.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device virtio-net-pci,mac=9a:e3:e4:e5:e6:e7,id=id3NAkRW,vectors=4,netdev=idjqP6gy,bus=pci.0,addr=07  \
    -netdev tap,id=idjqP6gy,vhost=on,vhostfd=42,fd=41  \
    -m 16384  \
    -smp 16,cores=4,threads=2,sockets=2  \
    -cpu 'Westmere',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -spice port=3001,password=123456,addr=0,tls-port=3201,x509-dir=/tmp/spice_x509d,tls-channel=main,tls-channel=inputs,image-compression=auto_glz,zlib-glz-wan-compression=auto,streaming-video=all,agent-mouse=on,playback-compression=on,ipv4  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off  \
    -no-kvm-pit-reinjection \
    -enable-kvm \
    -incoming tcp:0:5200

Comment 1 Xu Han 2014-11-04 08:54:43 UTC
Stack info:

#0  0x00007f55efa584b2 in __memcpy_ssse3_back () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f55f626d1e6 in memcpy (__len=51, __src=<optimized out>, __dest=<optimized out>) at /usr/include/bits/string3.h:51
No locals.
#2  iov_to_buf (iov=iov@entry=0x7f55f766bd50, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=0, buf=buf@entry=0x7f55f766fdb8, bytes=bytes@entry=51)
    at util/iov.c:49
        len = 51
        done = <optimized out>
        i = <optimized out>
#3  0x00007f55f60614ee in virtio_scsi_parse_req (req=req@entry=0x7f55f7663d30, req_size=51, resp_size=108) at /usr/src/debug/qemu-2.1.2/hw/scsi/virtio-scsi.c:140
        in_size = <optimized out>
        out_size = <optimized out>
#4  0x00007f55f6061670 in virtio_scsi_load_request (f=0x7f55f75a2d70, sreq=0x7f55f752c540) at /usr/src/debug/qemu-2.1.2/hw/scsi/virtio-scsi.c:215
        bus = <optimized out>
        s = <optimized out>
        vs = 0x7f55f75f3df8
        __func__ = "virtio_scsi_load_request"
        req = 0x7f55f7663d30
        n = <optimized out>
        __PRETTY_FUNCTION__ = "virtio_scsi_load_request"
#5  0x00007f55f61b8fea in get_scsi_requests (f=0x7f55f75a2d70, pv=0x7f55f75bc510, size=<optimized out>) at hw/scsi/scsi-bus.c:1903
        buf = "*\000\002\024$\200\000\000(\000\000\000\000\000\000"
        tag = <optimized out>
        lun = <optimized out>
        req = 0x7f55f752c540
        s = 0x7f55f75bc510
        bus = 0x7f55f75f3f00
        sbyte = <optimized out>
#6  0x00007f55f60fa8d8 in vmstate_load_state (f=f@entry=0x7f55f75a2d70, vmsd=0x7f55f663c860 <vmstate_scsi_device>, opaque=0x7f55f75bc510, version_id=1)
    at vmstate.c:105
        addr = <optimized out>
        base_addr = 0x7f55f75bc510
        i = 0
        n_elems = 1
        size = 0
        field = 0x7f55f6681520 <__compound_literal.1+576>
        ret = <optimized out>
#7  0x00007f55f60fa884 in vmstate_load_state (f=0x7f55f75a2d70, vmsd=0x7f55f663b760 <vmstate_scsi_disk_state>, opaque=0x7f55f75bc510, version_id=1) at vmstate.c:102
        addr = <optimized out>
        base_addr = 0x7f55f75bc510
        i = 0
        n_elems = 1
        size = 488
        field = 0x7f55f6680ac0 <__compound_literal.0>
        ret = <optimized out>
#8  0x00007f55f6042bda in qemu_loadvm_state (f=f@entry=0x7f55f75a2d70) at /usr/src/debug/qemu-2.1.2/savevm.c:1008
        instance_id = 0
        version_id = 1
        len = <optimized out>
        section_id = 81
        idstr = "0000:00:06.0/0:0:0/scsi-disk\000\177\000\000\000\000\000\000\000\000\000\000,\254\"\366U\177\000\000 \317\357\365U\177\000\000,\254\"\366U\177\000\000`\317\357\365U\177\000\000Y\256\"\366U\177\000\000\000\002\000\000\000\000\000\000\240\244\"\366U\177\000\000 \317\357\365U\177", '\000' <repeats 11 times>, "\002\000\000\000\000\000\000u\255!\366U\177\000\000\000\000\000\000U\177\000\000\060\232#\367U\177\000\000\000\340\071\367U\177\000\000\000\344\071\367U\177", '\000' <repeats 15 times>, "\002", '\000' <repeats 27 times>...
        loadvm_handlers = {
          lh_first = 0x7f55f7528200
        }
        le = <optimized out>
        new_le = <optimized out>
        section_type = <optimized out>
        v = <optimized out>
        ret = <optimized out>
#9  0x00007f55f60f8e36 in process_incoming_migration_co (opaque=0x7f55f75a2d70) at migration.c:97
        f = 0x7f55f75a2d70
        local_err = 0x0
        ret = <optimized out>
#10 0x00007f55f622ae2a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:118
        self = 0x7f55f721cfa0
        co = 0x7f55f721cfa0
#11 0x00007f55ef9551d0 in ?? () from /lib64/libc.so.6
No symbol table info available.
#12 0x00007fffa54647d0 in ?? ()
No symbol table info available.
#13 0x0000000000000000 in ?? ()
No symbol table info available.

Comment 2 Xu Han 2014-11-04 08:56:58 UTC
Created attachment 953493 [details]
autotest debug log

Comment 3 Xu Han 2014-11-04 08:58:50 UTC
Created attachment 953494 [details]
QMP log (Src)

Comment 4 Xu Han 2014-11-04 09:03:06 UTC
QMP log (Src):

2014-11-03 01:12:46: {"execute": "query-status", "id": "JoO4RM9B"}
2014-11-03 01:12:46: {"return": {"status": "running", "singlestep": false, "running": true}, "id": "JoO4RM9B"}
...
2014-11-03 01:12:47: {"execute": "query-migrate", "id": "QZTfVHFK"}
2014-11-03 01:12:47: {"return": {"expected-downtime": 300, "status": "active", "setup-time": 22, "total-time": 14060, "ram": {"total": 17281396736, "dirty-sync-count": 0, "remaining": 824799232, "mbps": 268.57688, "transferred": 459743476, "duplicate": 3914285, "dirty-pages-rate": 0, "skipped": 0, "normal-bytes": 423686144, "normal": 103439}}, "id": "QZTfVHFK"}
2014-11-03 01:12:49: {"timestamp": {"seconds": 1414948369, "microseconds": 87668}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "__com.redhat_debug_info": {"message": "No space left on device", "errno": 28}, "nospace": true, "__com.redhat_reason": "enospc", "reason": "No space left on device", "operation": "write", "action": "stop"}}
2014-11-03 01:12:49: {"timestamp": {"seconds": 1414948369, "microseconds": 96940}, "event": "STOP"}
2014-11-03 01:12:49: {"execute": "query-migrate", "id": "DqcW1xdK"}
2014-11-03 01:12:49: {"return": {"expected-downtime": 300, "status": "active", "setup-time": 22, "total-time": 16068, "ram": {"total": 17281396736, "dirty-sync-count": 0, "remaining": 752181248, "mbps": 268.57112, "transferred": 526889207, "duplicate": 3915656, "dirty-pages-rate": 0, "skipped": 0, "normal-bytes": 490688512, "normal": 119797}}, "id": "DqcW1xdK"}
...
2014-11-03 01:12:51: {"execute": "query-status", "id": "GabY4Oku"}
2014-11-03 01:12:51: {"return": {"status": "io-error", "singlestep": false, "running": false}, "id": "GabY4Oku"}

Comment 8 Stefan Hajnoczi 2015-07-30 13:45:59 UTC
Hi have posted patches upstream.  Please test the following RPM to confirm that the bug has been fixed:
https://brewweb.devel.redhat.com/taskinfo?taskID=9614929

Comment 9 Xu Han 2015-07-31 09:14:05 UTC
(In reply to Stefan Hajnoczi from comment #8)
> Hi have posted patches upstream.  Please test the following RPM to confirm
> that the bug has been fixed:
> https://brewweb.devel.redhat.com/taskinfo?taskID=9614929

Have tested the issue with this build, the bug has gone, dst VM won't crash any more.

{"QMP": {"version": {"qemu": {"micro": 0, "minor": 3, "major": 2}, "package": " (qemu-kvm-rhev-2.3.0-13.el7.test)"}, "capabilities": []}}
{"execute": "qmp_capabilities"}
{"return": {}}
{"timestamp": {"seconds": 1438333604, "microseconds": 942816}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "vs"}}
{"timestamp": {"seconds": 1438333604, "microseconds": 953135}, "event": "RESUME"}
{"timestamp": {"seconds": 1438333605, "microseconds": 11287}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "reason": "No space left on device", "operation": "write", "action": "stop"}}
...
{"execute": "query-status"}
{"return": {"status": "io-error", "singlestep": false, "running": false}}

Comment 10 Miroslav Rezanina 2015-08-07 09:30:43 UTC
Fix included in qemu-kvm-rhev-2.3.0-16.el7

Comment 12 Qian Guo 2015-08-13 07:05:32 UTC
Reproduced with qemu-kvm-rhev-2.1.2-21.el7.x86_64

steps:
Boot guest as:
/usr/libexec/qemu-kvm \
    -name rhel7.0 \
    -S \
    -machine pc \
    -cpu Penryn \
    -m 4096 \
    -realtime mlock=off \
    -smp 4,sockets=1,cores=4,threads=1 \
    -uuid fbf54917-5833-48f2-b3fb-5ce2ad294d93 \
    -no-user-config \
    -nodefaults \
    -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7cp1.0.monitor,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control \
    -rtc base=utc,driftfix=slew \
    -global kvm-pit.lost_tick_policy=discard \
    -no-hpet \
    -no-shutdown \
    -boot menu=on \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 \
    -drive file=/home/rhel72qiguo.qcow2,snapshot=off,cache=none,if=none,id=drive-virtio-disk0,format=qcow2 \
    -device virtio-blk-pci,bus=pci.0,addr=0x7,id=test1,drive=drive-virtio-disk0 \
    -netdev tap,vhost=on,script=/etc/qemu-ifup,id=hostnet0 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:0b:02:81,bus=pci.0,addr=0x3 \
    -chardev pty,id=charserial0 \
    -device isa-serial,chardev=charserial0,id=serial0 \
    -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/rhel7cp1.0.org.qemu.guest_agent.0,server,nowait \
    -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 \
    -chardev spicevmc,id=charchannel1,name=vdagent \
    -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 \
    -spice port=5901,disable-ticketing,seamless-migration=on \
    -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2 \
    -device intel-hda,id=sound0,bus=pci.0,addr=0x4 \
    -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 \
    -msg timestamp=on \
    -monitor stdio \
    -qmp unix:/tmp/q1,server,nowait \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
    -drive file=/mnt/test.qcow2,snapshot=off,cache=none,if=none,id=drive-virtio-disk1,format=qcow2 \
    -device virtio-scsi-pci,bus=pci.0,addr=0xe,id=scsi1 \
    -device scsi-hd,drive=drive-virtio-disk1,bus=scsi1.0 \

2.migrate guest
(qemu) migrate -d tcp:0:4444

3. trigger ENOSPC for the scsi disk
(qemu) block I/O error in device 'drive-virtio-disk1': No space left on device (28)

Result:
After migration, qemu crashed:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff0a9ac50 in __memcpy_ssse3 () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff0a9ac50 in __memcpy_ssse3 () from /lib64/libc.so.6
#1  0x00005555558418a6 in memcpy (__len=51, __src=<optimized out>, __dest=<optimized out>)
    at /usr/include/bits/string3.h:51
#2  iov_to_buf (iov=iov@entry=0x555556722f70, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=0, 
    buf=buf@entry=0x555556726f94, bytes=bytes@entry=51) at util/iov.c:49
#3  0x0000555555636287 in virtio_scsi_parse_req (req=req@entry=0x55555671af10, req_size=51, resp_size=108)
    at /usr/src/debug/qemu-2.1.2/hw/scsi/virtio-scsi.c:152
#4  0x0000555555636450 in virtio_scsi_load_request (f=0x55555657ff50, sreq=0x555556569020)
    at /usr/src/debug/qemu-2.1.2/hw/scsi/virtio-scsi.c:243
#5  0x000055555578e0fa in get_scsi_requests (f=0x55555657ff50, pv=0x55555647c210, size=<optimized out>)
    at hw/scsi/scsi-bus.c:1905
#6  0x00005555556cf518 in vmstate_load_state (f=f@entry=0x55555657ff50, 
    vmsd=0x555555c107c0 <vmstate_scsi_device>, opaque=0x55555647c210, version_id=1) at vmstate.c:105
#7  0x00005555556cf4c4 in vmstate_load_state (f=0x55555657ff50, vmsd=0x555555c0f6c0 <vmstate_scsi_disk_state>, 
    opaque=0x55555647c210, version_id=1) at vmstate.c:102
#8  0x00005555556183aa in qemu_loadvm_state (f=f@entry=0x55555657ff50) at /usr/src/debug/qemu-2.1.2/savevm.c:1008
#9  0x00005555556cda86 in process_incoming_migration_co (opaque=0x55555657ff50) at migration.c:97
#10 0x000055555580007a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>)
    at coroutine-ucontext.c:118
#11 0x00007ffff09a00f0 in ?? () from /lib64/libc.so.6
#12 0x00007fffffffcaf0 in ?? ()
#13 0x0000000000000000 in ?? ()


So this bug is reproduced.


Verified with qemu-kvm-rhev-2.3.0-17.el7.x86_64

steps as above.

Result:
After migration, the status of qemu is io-error and qemu does not crash.
(qemu) info status
VM status: paused (io-error)


So this bug is fixed in x86 platform.

Comment 13 juzhang 2015-08-17 04:16:56 UTC
According to comment12, set this issue as verified.

Comment 15 errata-xmlrpc 2015-12-04 16:20:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html