Bug 1781637
| Summary: | qemu crashed when do mem and disk snapshot | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | yisun |
| Component: | qemu-kvm | Assignee: | Kevin Wolf <kwolf> |
| qemu-kvm sub component: | General | QA Contact: | aihua liang <aliang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | areis, chayang, coli, ddepaula, dyuan, eshames, eshenitz, hhan, jinzhao, juzhang, kwolf, lcheng, virt-maint, weizhan, xiaohli, yisun |
| Version: | 8.2 | Keywords: | Automation, Regression, TestBlocker |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | qemu-kvm-4.2.0-10.module+el8.2.0+5740+c3dff59e | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-05 09:52:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1798462 | | |
| Attachments: | | | |
Description (yisun, 2019-12-10 10:50:21 UTC)
The qemu process is as follows:

```
(.libvirt-ci-venv-ci-runtest-AglsHs) [root@hp-dl385g8-01 tmp]# ps -ef | grep avocado-vt-vm1
qemu 2797 1 10 05:59 ? 00:00:00 /usr/libexec/qemu-kvm -name guest=avocado-vt-vm1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-11-avocado-vt-vm1/master-key.aes -machine pc-q35-rhel8.1.0,accel=kvm,usb=off,dump-guest-core=off -cpu Opteron_G5,vme=on,x2apic=on,tsc-deadline=on,hypervisor=on,arat=on,tsc-adjust=on,bmi1=on,arch-capabilities=on,ssbd=on,mmxext=on,fxsr-opt=on,cmp-legacy=on,cr8legacy=on,osvw=on,perfctr-core=on,ibpb=on,amd-ssbd=on,virt-ssbd=on,rdctl-no=on,skip-l1dfl-vmentry=on,mds-no=on,svm=off,vmx=off -m 1024 -overcommit mem-lock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 5f4b696f-3fb4-48b9-8b58-6e1020064005 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=40,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null} -device virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=libvirt-1-format,id=virtio-disk0,bootindex=1 -netdev tap,fd=42,id=hostnet0,vhost=on,vhostfd=43 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:02:46:85,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=44,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 127.0.0.1:1 -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
```

Hi, testing with a snapshot on disk only, I don't hit this issue.
Test Env:
qemu-kvm version: qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64
kernel version: 4.18.0-160.el8.x86_64

Test steps:

1. Start the guest with these qemu cmds:

```
/usr/libexec/qemu-kvm \
 -name 'avocado-vt-vm1' \
 -sandbox on \
 -machine q35 \
 -nodefaults \
 -device VGA,bus=pcie.0,addr=0x1 \
 -m 7168 \
 -smp 4,maxcpus=4,cores=2,threads=1,dies=1,sockets=2 \
 -cpu 'Skylake-Client',+kvm_pv_unhalt \
 -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20191210-025743-Q0JzJpKT,server,nowait \
 -mon chardev=qmp_id_qmpmonitor1,mode=control \
 -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20191210-025743-Q0JzJpKT,server,nowait \
 -mon chardev=qmp_id_catch_monitor,mode=control \
 -device pvpanic,ioport=0x505,id=idDU2Q2E \
 -chardev socket,server,path=/var/tmp/serial-serial0-20191210-025743-Q0JzJpKT,id=chardev_serial0,nowait \
 -device isa-serial,id=serial0,chardev=chardev_serial0 \
 -chardev socket,id=seabioslog_id_20191210-025743-Q0JzJpKT,path=/var/tmp/seabios-20191210-025743-Q0JzJpKT,server,nowait \
 -device isa-debugcon,chardev=seabioslog_id_20191210-025743-Q0JzJpKT,iobase=0x402 \
 -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0,multifunction=on \
 -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
 -object iothread,id=iothread0 \
 -object iothread,id=iothread1 \
 -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x2.0x1,bus=pcie.0 \
 -device virtio-scsi-pci,id=virtio_scsi_pci0,iothread=iothread0,bus=pcie.0-root-port-3,addr=0x0 \
 -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
 -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
 -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
 -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x2.0x2,bus=pcie.0 \
 -blockdev node-name=file_data1,driver=file,aio=threads,filename=/home/data.qcow2,cache.direct=on,cache.no-flush=off \
 -blockdev node-name=drive_data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_data1 \
 -device virtio-blk-pci,id=data1,drive=drive_data1,write-cache=on,bus=pcie.0-root-port-4,iothread=iothread1 \
 -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x2.0x3,bus=pcie.0 \
 -device virtio-net-pci,mac=9a:9b:1d:13:61:86,id=idKg9AzR,netdev=idxDM2m8,bus=pcie.0-root-port-5,addr=0x0 \
 -netdev tap,id=idxDM2m8,vhost=on \
 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
 -vnc :0 \
 -rtc base=utc,clock=host,driftfix=slew \
 -boot menu=off,order=cdn,once=c,strict=off \
 -enable-kvm \
 -monitor stdio \
 -device pcie-root-port,id=pcie_extra_root_port_0,slot=6,chassis=6,addr=0x2.0x4,bus=pcie.0 \
 -qmp tcp:0:3000,server,nowait
```

2. Create the snapshot target:

```
{'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1'}}
{"return": {}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null}}
{"return": {}}
{'execute':'job-dismiss','arguments':{'id':'job1'}}
{"return": {}}
{'execute':'job-dismiss','arguments':{'id':'job2'}}
{"return": {}}
```

3. Do the snapshot on drive_image1:

```
{"execute":"blockdev-snapshot","arguments":{"node":"drive_image1","overlay":"sn1"}}
{"return": {}}
```

After step 3, qemu does not crash.

Created attachment 1644258 [details] Domain XMLs, ARGVs, libvirtd log, detailed backtrace

I think it is caused by the blockdev features.
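The QMP sequence above can be restated as data. The sketch below is only an illustration of the command order (blockdev-create the raw file, blockdev-add it, blockdev-create the qcow2 overlay on it, blockdev-add the overlay with `backing: null`, then blockdev-snapshot); it builds and serializes the JSON but does not talk to a real QMP socket, and the node names and paths are the ones from the reproducer:

```python
import json

def snapshot_overlay_cmds(base_node, overlay_path, size, backing_file):
    """Build the QMP command sequence for installing an external
    snapshot overlay (illustrative only; nothing is sent anywhere)."""
    return [
        # 1. Create the raw file that will hold the overlay.
        {"execute": "blockdev-create",
         "arguments": {"options": {"driver": "file",
                                   "filename": overlay_path,
                                   "size": size},
                       "job-id": "job1"}},
        # 2. Attach it as a protocol node.
        {"execute": "blockdev-add",
         "arguments": {"driver": "file", "node-name": "drive_sn1",
                       "filename": overlay_path}},
        # 3. Format it as qcow2 with the running image as backing file.
        {"execute": "blockdev-create",
         "arguments": {"options": {"driver": "qcow2", "file": "drive_sn1",
                                   "size": size,
                                   "backing-file": backing_file,
                                   "backing-fmt": "qcow2"},
                       "job-id": "job2"}},
        # 4. Attach the qcow2 node; backing is wired up by the snapshot.
        {"execute": "blockdev-add",
         "arguments": {"driver": "qcow2", "node-name": "sn1",
                       "file": "drive_sn1", "backing": None}},
        # 5. Install sn1 as the new active layer above the base node.
        {"execute": "blockdev-snapshot",
         "arguments": {"node": base_node, "overlay": "sn1"}},
    ]

cmds = snapshot_overlay_cmds(
    "drive_image1", "/root/sn1", 21474836480,
    "/home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2")
# Each element serializes to one line of QMP input:
wire = [json.dumps(c) for c in cmds]
```

Note that Python's `None` serializes to JSON `null`, matching the `'backing':null` in the transcript above.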
Evidence: with the same steps, the crash is reproduced when blockdev is enabled and not reproduced when blockdev is disabled.

Version:
libvirt-5.10.0-1.module+el8.2.0+5040+bd433686.x86_64
qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb.x86_64

Steps:
1. Start a VM
2. Create a memory & disk snapshot:

```
# virsh snapshot-create-as pc s1 --no-metadata --diskspec vda,file=/tmp/vda.s1 --memspec /tmp/mem.s1
```

Then the VM gets SIGABRT:

```
# abrt-cli ls
id 266e34a039bca1273d0c2a4e1af8db1a28fba6ff
reason: _nl_load_domain.cold.0(): qemu-kvm killed by SIGABRT
time: Wed 11 Dec 2019 10:38:37 PM EST
cmdline: /usr/libexec/qemu-kvm -name guest=pc,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-pc/master-key.aes -machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off -cpu Opteron_G3 -m 1024 -overcommit mem-lock=off -smp 1,sockets=1,cores=1,threads=1 -uuid c0fae7ee-7f4e-4cb0-aca4-25fa945a3f83 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=37,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -blockdev '{\"driver\":\"file\",\"filename\":\"/var/lib/libvirt/images/RHEL-8.2-x86_64-latest.qcow2\",\"node-name\":\"libvirt-2-storage\",\"auto-read-only\":true,\"discard\":\"unmap\"}' -blockdev '{\"node-name\":\"libvirt-2-format\",\"read-only\":false,\"driver\":\"qcow2\",\"file\":\"libvirt-2-storage\",\"backing\":null}' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=libvirt-2-format,id=virtio-disk0,bootindex=1,serial=SB -blockdev '{\"driver\":\"file\",\"filename\":\"/tmp/copy.img1\",\"aio\":\"native\",\"node-name\":\"libvirt-1-storage\",\"cache\":{\"direct\":true,\"no-flush\":false},\"auto-read-only\":true,\"discard\":\"unmap\"}' -blockdev '{\"node-name\":\"libvirt-1-format\",\"read-only\":false,\"discard\":\"ignore\",\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":\"libvirt-1-storage\"}' -device virtio-blk-pci,ioeventfd=on,scsi=off,bus=pci.0,addr=0xa,drive=libvirt-1-format,id=virtio-disk1,write-cache=off,werror=report,rerror=report -netdev tap,fd=39,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=52:54:00:09:f8:0c,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=13,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
package: 15:qemu-kvm-core-4.2.0-1.module+el8.2.0+4793+b09dd2fb
uid: 107 (qemu)
count: 1
Directory: /var/spool/abrt/ccpp-2019-12-11-22:38:37-6076
Run 'abrt-cli report /var/spool/abrt/ccpp-2019-12-11-22:38:37-6076' for creating a case in Red Hat Customer Portal
```
Additional info:

- Not reproduced with a memory-only or a disk-only snapshot.
- Not reproduced with blockdev disabled:

```xml
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
...
  <qemu:capabilities>
    <qemu:add capability='drive'/>
    <qemu:del capability='blockdev'/>
  </qemu:capabilities>
</domain>
```

- Not reproduced on RHEL-8.1.1-AV (libvirt-5.6.0-10.module+el8.1.1+5131+a6fe889c.x86_64, qemu-kvm-4.1.0-18.module+el8.1.1+5150+45ce6c40.x86_64).
- It can be reproduced on the q35 machine type. qemu log:

```
/usr/libexec/qemu-kvm \
 -name guest=q35,debug-threads=on \
 -S \
 -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-q35/master-key.aes \
 -machine pc-q35-rhel8.1.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off \
 -cpu Opteron_G3 \
 -m 1024 \
 -overcommit mem-lock=off \
 -smp 1,sockets=1,cores=1,threads=1 \
 -uuid 516b83cb-bd68-47cf-b577-9ed07741601e \
 -no-user-config \
 -nodefaults \
 -chardev socket,id=charmonitor,fd=37,server,nowait \
 -mon chardev=charmonitor,id=monitor,mode=control \
 -rtc base=utc,driftfix=slew \
 -global kvm-pit.lost_tick_policy=delay \
 -no-hpet \
 -no-shutdown \
 -global ICH9-LPC.disable_s3=1 \
 -global ICH9-LPC.disable_s4=1 \
 -boot strict=on \
 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x5.0x7 \
 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x5 \
 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x5.0x1 \
 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x5.0x2 \
 -device virtio-scsi-pci,id=scsi0,bus=pcie.0,addr=0x6 \
 -device virtio-serial-pci,id=virtio-serial0,bus=pcie.0,addr=0x7 \
 -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/RHEL-8.2-x86_64-latest-clone.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
 -blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \
 -device virtio-blk-pci,scsi=off,bus=pcie.0,addr=0x9,drive=libvirt-1-format,id=virtio-disk0,bootindex=1 \
 -netdev tap,fd=39,id=hostnet0 \
 -device e1000,netdev=hostnet0,id=net0,mac=52:54:00:87:15:bd,bus=pcie.0,addr=0x3 \
 -chardev pty,id=charserial0 \
 -device isa-serial,chardev=charserial0,id=serial0 \
 -chardev spicevmc,id=charchannel0,name=vdagent \
 -device virtserialport,bus=virtio-serial0.0,nr=13,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
 -device usb-tablet,id=input0,bus=usb.0,port=1 \
 -spice port=5900,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
 -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x2 \
 -device intel-hda,id=sound0,bus=pcie.0,addr=0x4 \
 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
 -chardev spicevmc,id=charredir0,name=usbredir \
 -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 \
 -chardev spicevmc,id=charredir1,name=usbredir \
 -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 \
 -device virtio-balloon-pci,id=balloon0,bus=pcie.0,addr=0x8 \
 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
 -msg timestamp=on

char device redirected to /dev/pts/6 (label charserial0)
qemu-kvm: block/io.c:1879: bdrv_co_write_req_prepare: Assertion `child->perm & BLK_PERM_WRITE' failed.
2019-12-12 06:06:39.735+0000: shutting down, reason=crashed
```

(In reply to aihua liang from comment #4)
> Hi, test with snapshot on disk, don't hit this issue.
> [...]

Hi Aihua,
I found that the crash happens after the snapshot, but only after an uncertain delay.
Today when I tried to reproduce it, it took about 1 minute to crash:

```
[root@libvirt-rhel-8 ~]# virsh snapshot-create-as avocado-vt-vm1 s2 --memspec file=/var/lib/avocado/data/avocado-vt/images/avv-mem.s2
Domain snapshot s2 created
[root@libvirt-rhel-8 ~]# for i in {1..1000}; do sleep 1; echo Time:$i; virsh domstate avocado-vt-vm1; done
Time:1
running
Time:2
running
Time:3
running
...
Time:54
running
Time:55
shut off
...
[root@libvirt-rhel-8 ~]# cat /var/log/libvirt/qemu/avocado-vt-vm1.log
qemu-kvm: block/io.c:1879: bdrv_co_write_req_prepare: Assertion `child->perm & BLK_PERM_WRITE' failed.
2019-12-13 06:14:06.943+0000: shutting down, reason=crashed
```

Could you please double-check whether the qemu process terminates after a longer period?

(In reply to yisun from comment #7)
> I found that the crash happens after the snapshot, but only after an uncertain delay.
> [...]
> Could you please double-check whether the qemu process terminates after a longer period?
Hi YiSun,

Tested again and waited for 2 minutes; still can't reproduce. After the migration

```
{"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri": "exec:gzip -c > STATEFILE.gz"},"id":"libvirt-16"}
```

finished, the vm entered "paused (postmigrate)" status successfully.

BR,
Aliang

(In reply to aihua liang from comment #8)
> Tested again and waited for 2 minutes; still can't reproduce.
> [...]
Thanks for the info.

Hi Ademar,
Since the problem has not been reproduced from the pure qemu command line yet, could you please take a look at the backtrace or the libvirtd log to see if any clue can be found? Thanks.

Hm, an idea, yes. After migration completes, all block devices are inactive, i.e. ownership is released. However, if you do blockdev-add in this state, the new node will be active, which breaks the assumption that all nodes are in the same active/inactive state. Taking a snapshot means that we now have an active overlay over an inactive backing file.

When doing 'cont', we recursively activate all image trees from their root. However, since the root is already active, we skip the activation of the whole subtree, so the backing file stays inactive and the next write to it causes an assertion failure.

With blockdev-snapshot-sync rather than blockdev-snapshot, the flags are copied from the old top layer, so bdrv_open() for the new overlay is called with BDRV_O_INACTIVE. I don't think bdrv_open() was ever supposed to be called with that flag, but it could explain why the case accidentally worked with libvirt in -drive mode.

For the record: this BZ prevents the whole virt module from passing gating without manual intervention. Please consider raising its priority.

I'll see what I can do, but please note that this bug isn't new in any way in QEMU. This is the first report of it and we don't have an upstream fix yet, so fixing it upstream and then backporting will take a few days. If it's really blocking your work, consider backing out whatever made it apparent. (Some libvirt update? Additional test cases?)

Thank you Kevin, I disabled the failing test. The side effect is that, due to how osci is designed, we might have to disable all perl-sys-virt tests (actually, make them non-gating).

Encountered a QEMU crash when taking a VM snapshot with memory in oVirt.
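Kevin's activation analysis can be modeled in a few lines. The sketch below is a toy block-node tree, not QEMU code: activation recurses from the root and returns early when the root is already active, so a freshly added active overlay on top of an inactive (post-migration) backing file leaves the backing file inactive, and the next write to it trips a stand-in for the `child->perm & BLK_PERM_WRITE` assertion. Node names mirror the reproducer; everything else is hypothetical.

```python
class Node:
    """Toy block node: just a name, an active flag, and a backing link."""
    def __init__(self, name, active, backing=None):
        self.name, self.active, self.backing = name, active, backing

def activate_all(root):
    """Mimics recursive activation on 'cont': if the root is already
    active, the whole subtree is assumed active too (the buggy part)."""
    if root.active:
        return                      # skips the inactive backing file
    root.active = True
    if root.backing:
        activate_all(root.backing)

def write(node):
    # Stand-in for bdrv_co_write_req_prepare()'s permission check.
    assert node.active, "child->perm & BLK_PERM_WRITE"

# After migration the old top layer is inactive; blockdev-add creates
# an *active* overlay above it, then 'cont' activates from the root.
base = Node("libvirt-1-format", active=False)
overlay = Node("sn1", active=True, backing=base)
activate_all(overlay)               # returns immediately: overlay is active

# A copy-on-write access to the backing file now fails, like the crash:
try:
    write(base)
    crashed = False
except AssertionError:
    crashed = True
```

With the blockdev-snapshot-sync behavior Kevin describes (overlay opened inactive, inheriting the old top layer's flags), `overlay.active` would start out False here, `activate_all` would recurse into the backing file, and the write would succeed.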
QEMU has recently been split into sub-components, and as a one-time operation to avoid breaking tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

*** Bug 1798462 has been marked as a duplicate of this bug. ***

Can reproduce it on qemu-kvm-4.2.0-9.module+el8.2.0+5699+b5331ee5 (libvirt version: libvirt-6.0.0-6.module+el8.2.0+5821+109ee33c.x86_64):

```
# virsh create avocado-test_x86.xml
Domain avocado-vt-vm1 created from avocado-test_x86.xml
[root@hp-dl385g10-04 home]# virsh list
 Id   Name             State
--------------------------------
 2    avocado-vt-vm1   running

# virsh snapshot-create-as avocado-vt-vm1 s1 --no-metadata --diskspec vda,file=/tmp/vda.s1 --memspec /tmp/mem.s1
Domain snapshot s1 created

# virsh list
 Id   Name   State
--------------------

# cat /var/log/libvirt/qemu/avocado-vt-vm1.log
...
2020-02-24 10:50:16.143+0000: Domain id=2 is tainted: custom-hypervisor-feature
char device redirected to /dev/pts/6 (label charserial0)
2020-02-24T10:50:16.243809Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
qemu-kvm: block/io.c:1879: bdrv_co_write_req_prepare: Assertion `child->perm & BLK_PERM_WRITE' failed.
2020-02-24 10:50:45.114+0000: shutting down, reason=crashed
```

Tested on qemu-kvm-4.2.0-11.module+el8.2.0+5837+4c1442ec (libvirt version: libvirt-6.0.0-6.module+el8.2.0+5821+109ee33c.x86_64); don't hit this issue any more:

```
# virsh create avocado-test_x86.xml
Domain avocado-vt-vm1 created from avocado-test_x86.xml

# virsh list
 Id   Name             State
--------------------------------
 1    avocado-vt-vm1   running

# virsh snapshot-create-as avocado-vt-vm1 s1 --no-metadata --diskspec vda,file=/tmp/vda.s1 --memspec /tmp/mem.s1
Domain snapshot s1 created

# virsh list
 Id   Name             State
--------------------------------
 1    avocado-vt-vm1   running
```

Set the bug's status to "Verified".
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017