Bug 1456086
Summary: | Random SIGABRT when migrate from 7.4 to 7.3 !(bs->open_flags & 0x0800) | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Han Han <hhan> | ||||
Component: | qemu-kvm-rhev | Assignee: | Dr. David Alan Gilbert <dgilbert> | ||||
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | huiqingding <huding> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 7.3 | CC: | chayang, dyuan, fjin, hhan, huding, juzhang, knoel, michen, peterx, quintela, virt-maint, xfu, xuzhang | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2017-06-08 11:04:41 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1376765 | ||||||
Attachments: |
|
Hi, Can you confirm - is it the 7.3 destination that's aborting? Dave Sure. The destination is RHEL7.3.z. Hi, 1) Can you tell me what does the 'dominfo' image you're running does - is it some type of stress test or what? 2) What guest OS is it running? 3) Please repeat the test until you hit a failure, when that happens please check on the state of the source VM - is it still running? Dave Hi - can you repeat this one? Hi you are missing cache=none on the disk. Once you change that, if you can still reproduce it, could you check if it also happens when you use virtio-blk for the disk instead of ide? Thanks, Juan. Hi I didn't hit the issue any more. I also tried virtio-blk with cache=none, but not reproduced. (In reply to Han Han from comment #7) > Hi I didn't hit the issue any more. > I also tried virtio-blk with cache=none, but not reproduced. Hmm ok, I guess we'll need to close it then if we can't reproduce it; however, please answer the questions about the guest OS from comment 4. (In reply to Dr. David Alan Gilbert from comment #4) > Hi, > 1) Can you tell me what does the 'dominfo' image you're running does - is > it some type of stress test or what? > 2) What guest OS is it running? > 3) Please repeat the test until you hit a failure, when that happens > please check on the state of the source VM - is it still running? > > Dave 1) As I remember I didn't run any stress test. But the dst host cannot resolve the src host's hostname when migration. 2) Guest OS is RHEL7.4 with kernel-3.10.0-675.el7.x86_64. 3) I didn't hit the bug any more. As I remember VM kept running on src host after meet the failure. OK, then lets close it, but keep an eye out for anything similar. |
Created attachment 1282799 [details] all thread backtrace Description of problem: As subject Version-Release number of selected component (if applicable): src host: qemu-kvm-rhev-2.9.0-6.el7.x86_64 libvirt-3.2.0-6.el7.x86_64 kernel-3.10.0-668.el7.x86_64 dst host: qemu-kvm-rhev-2.6.0-28.el7_3.10.x86_64 libvirt-2.0.0-10.el7_3.9.x86_64 kernel-3.10.0-514.21.1.el7.x86_64 How reproducible: Seldom Steps to Reproduce: 1. Run a host as following on RHEL7.4: /usr/libexec/qemu-kvm -name guest=dominfo,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1876-dominfo/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off,dump-guest-core=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 48df3ffd-0d6d-4977-95bc-9b5c48413cb1 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1876-dominfo/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -no-acpi -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device pxb,bus_nr=10,id=pci.1,bus=pci.0,addr=0x8 -device pxb,bus_nr=50,id=pci.2,bus=pci.0,addr=0x9 -device pxb,bus_nr=100,id=pci.3,bus=pci.0,addr=0xa -device pci-bridge,chassis_nr=4,id=pci.4,bus=pci.1,addr=0x0 -device pci-bridge,chassis_nr=5,id=pci.5,bus=pci.2,addr=0x0 -device pci-bridge,chassis_nr=6,id=pci.6,bus=pci.3,addr=0x0 -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/exports/dominfo.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=27,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:86:ba:d7,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on 2. Migrate to a RHEL7.3 host # virsh migrate dominfo qemu+ssh://intel-e5530-8-2.lab.eng.pek2.redhat.com/system --verbose --unsafe Actual results: Sometimes migrate failed and get SIGABRT on dst host: # abrt-cli ls id a5f8397fe659fcd68103b7105838c24aca3d80e1 reason: qemu-kvm killed by SIGABRT time: 2017年05月26日 星期五 05时49分11秒 cmdline: /usr/libexec/qemu-kvm -name guest=dominfo,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-21-dominfo/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 48df3ffd-0d6d-4977-95bc-9b5c48413cb1 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-21-dominfo/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -no-acpi -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device pxb,bus_nr=10,id=pci.1,bus=pci.0,addr=0x8 -device pxb,bus_nr=50,id=pci.2,bus=pci.0,addr=0x9 -device pxb,bus_nr=100,id=pci.3,bus=pci.0,addr=0xa -device pci-bridge,chassis_nr=4,id=pci.4,bus=pci.1,addr=0x0 -device pci-bridge,chassis_nr=5,id=pci.5,bus=pci.2,addr=0x0 -device pci-bridge,chassis_nr=6,id=pci.6,bus=pci.3,addr=0x0 -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/exports/dominfo.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=37,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:86:ba:d7,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5902,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on package: qemu-kvm-rhev-2.6.0-28.el7_3.10 uid: 107 (qemu) count: 1 Directory: /var/spool/abrt/ccpp-2017-05-26-05:49:11-28336 Run 'abrt-cli report /var/spool/abrt/ccpp-2017-05-26-05:49:11-28336' for creating a case in Red Hat Customer Portal Expected results: NO SIGABRT Additional info: The backtrace: (gdb) bt fu #0 0x00007fa30ee781d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 resultvar = 0 pid = 28336 selftid = 28336 #1 0x00007fa30ee798c8 in __GI_abort () at abort.c:90 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x7ffe6f99f5f8, sa_sigaction = 0x7ffe6f99f5f8}, sa_mask = {__val = {140338307790064, 140338727754896, 1342, 140338785189312, 140338306427555, 4, 140338785189152, 47244640257, 128, 0, 0, 0, 0, 21474836480, 140338307790064, 140338307802088}}, sa_flags = 665579520, sa_restorer = 0x7fa30efc23e8} sigs = {__val = {32, 0 <repeats 15 times>}} #2 0x00007fa30ee71146 in __assert_fail_base (fmt=0x7fa30efc23e8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7fa328041d23 "!(bs->open_flags & 0x0800)", file=file@entry=0x7fa328041c90 "block/io.c", line=line@entry=1342, function=function@entry=0x7fa328042110 <__PRETTY_FUNCTION__.34660> "bdrv_co_do_pwritev") at assert.c:92 str = 0x7fa32abea5b0 "Ф\276*\243\177" total = 4096 #3 0x00007fa30ee711f2 in __GI___assert_fail (assertion=assertion@entry=0x7fa328041d23 "!(bs->open_flags & 0x0800)", file=file@entry=0x7fa328041c90 "block/io.c", line=line@entry=1342, function=function@entry=0x7fa328042110 <__PRETTY_FUNCTION__.34660> "bdrv_co_do_pwritev") at assert.c:101 No locals. #4 0x00007fa327f2eaf7 in bdrv_co_do_pwritev (bs=0x7fa32ac95400, offset=<optimized out>, bytes=65536, qiov=0x7ffe6f99cfc0, flags=(unknown: 0)) at block/io.c:1342 req = {bs = 0x7fa32ac95400, offset = 11119296512, bytes = 65536, type = BDRV_TRACKED_READ, serialising = false, overlap_offset = 11119296512, overlap_bytes = 65536, list = {le_next = 0x0, le_prev = 0x7fa32ac985b8}, co = 0x7fa32ac2c880, wait_queue = {entries = {tqh_first = 0x7fa327a41c38, tqh_last = 0x2}}, waiting_for = 0x7fa32ac95400} align = 512 head_buf = 0x0 tail_buf = 0x0 local_qiov = {iov = 0x7fa32ac01000, niov = 725630976, nalloc = 32675, size = 512} use_local_qiov = false ret = <optimized out> __PRETTY_FUNCTION__ = "bdrv_co_do_pwritev" #5 0x00007fa327f2ebc2 in bdrv_rw_co_entry (opaque=0x7ffe6f99cf70) at block/io.c:588 rwco = 0x7ffe6f99cf70 #6 0x00007fa327f9218a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:78 self = 0x7fa32ac2c880 co = 0x7fa32ac2c880 #7 0x00007fa30ee89cf0 in ?? () from /usr/lib64/libc-2.17.so No symbol table info available. ---Type <return> to continue, or q <return> to quit---0 #8 0x00007ffe6f99c690 in ?? () No symbol table info available. #9 0x0000000000000000 in ?? () No symbol table info available. Since the bugs is hard to reproduce(only one time when I run the test), could you analysis the backtrace and find a proper way to reproduce it easily. Thanks