Bug 1888344
| Summary: | VM with IO threads enabled crashes randomly during snapshot operations | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Anitha Udgiri <audgiri> |
| Component: | qemu-kvm | Assignee: | Sergio Lopez <slopezpa> |
| qemu-kvm sub component: | Block Jobs | QA Contact: | aihua liang <aliang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | aliang, coli, jinzhao, juzhang, mjankula, mkalinin, ngu, qzhang, rhodain, slopezpa, virt-maint, ymankad |
| Version: | 8.2 | Keywords: | TestOnly, Triaged, ZStream |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | qemu-kvm-6.0.0-17.module+el8.5.0+11173+c9fce0bb | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1973828 (view as bug list) | Environment: | |
| Last Closed: | 2021-11-16 07:51:01 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1897025, 1973828, 1985906 | | |
Comment 3
aihua liang
2020-10-15 12:12:23 UTC
Details:
Created a snapshot chain of length 11 with iothread enabled via the auto test; this issue cannot be hit.
...
-object iothread,id=iothread0 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0,iothread=iothread0 \
-blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel821-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-blockdev node-name=file_sn10,driver=file,aio=threads,filename=/var/lib/avocado/data/avocado-vt/sn10.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_sn10,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_sn10 \
-device scsi-hd,id=sn10,drive=drive_sn10,write-cache=on \
...
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_data1', 'overlay': 'drive_sn0'}, 'id': 'wuOhlHNJ'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn0', 'overlay': 'drive_sn1'}, 'id': 'PgAFMAuP'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn1', 'overlay': 'drive_sn2'}, 'id': 'uxVo0BLz'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn2', 'overlay': 'drive_sn3'}, 'id': 'euSdVYL4'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn3', 'overlay': 'drive_sn4'}, 'id': 'nCzebCYL'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn4', 'overlay': 'drive_sn5'}, 'id': 'VIifEExP'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn5', 'overlay': 'drive_sn6'}, 'id': 'BsvGDV6o'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn6', 'overlay': 'drive_sn7'}, 'id': 'iwMUuaNx'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn7', 'overlay': 'drive_sn8'}, 'id': '5rZQu3wq'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn8', 'overlay': 'drive_sn9'}, 'id': 'sXpwrBBw'}
{'execute': 'blockdev-snapshot', 'arguments': {'node': 'drive_sn9', 'overlay': 'drive_sn10'}, 'id': 'pf0xpTfc'}
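For reference, the eleven blockdev-snapshot calls above can be replayed as one batch over the QMP socket. A minimal sketch only, assuming the drive_sn0 through drive_sn10 overlay nodes have already been created (blockdev-create + blockdev-add, as in the test) and using a hypothetical socket path /tmp/qmp.sock:

#!/bin/sh
# Sketch only: replays the snapshot chain over QMP via nc. The real test
# uses an avocado-generated monitor socket path, not /tmp/qmp.sock, and
# the sleeps are a crude stand-in for waiting on each QMP reply.
{
  echo '{"execute": "qmp_capabilities"}'; sleep 1
  echo '{"execute": "blockdev-snapshot", "arguments": {"node": "drive_data1", "overlay": "drive_sn0"}}'; sleep 1
  for i in $(seq 0 9); do
    echo "{\"execute\": \"blockdev-snapshot\", \"arguments\": {\"node\": \"drive_sn$i\", \"overlay\": \"drive_sn$((i+1))\"}}"
    sleep 1
  done
} | nc -U /tmp/qmp.sock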
(01/20) repeat1.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (168.90 s)
(02/20) repeat2.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (171.83 s)
(03/20) repeat3.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (165.35 s)
(04/20) repeat4.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (164.45 s)
(05/20) repeat5.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (164.74 s)
(06/20) repeat6.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (166.24 s)
(07/20) repeat7.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (171.87 s)
(08/20) repeat8.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (173.18 s)
(09/20) repeat9.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (162.99 s)
(10/20) repeat10.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (165.92 s)
(11/20) repeat11.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (168.26 s)
(12/20) repeat12.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (167.72 s)
(13/20) repeat13.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (170.33 s)
(14/20) repeat14.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (169.58 s)
(15/20) repeat15.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (171.56 s)
(16/20) repeat16.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (173.00 s)
(17/20) repeat17.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (170.65 s)
(18/20) repeat18.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (173.43 s)
(19/20) repeat19.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (167.46 s)
(20/20) repeat20.Host_RHEL.m8.u2.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.1.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (164.03 s)
Fixing the product to RHEL Advanced Virtualization.

To clarify, we are talking about RHV snapshot operations, not RHEL-KVM snapshots. To explain it in one sentence: it is when RHV takes the qcow2 image, creates another one, and links the two under the same VM disk.

Sergio, is there anything else we need from the customer to help reproduce this, or perhaps something to deploy on the customer's site? Or do you have enough information via the coredump?

(In reply to Marina Kalinin from comment #6)
> To clarify, we are talking about RHV snapshots operations, not RHEL-KVM
> snapshots.
> It is when RHV takes the qcow2 image and creates another one and links the
> two under same vm disk, if to explain in one sentence.

Hi Niny,
Could you please have a try? Thanks.
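For context, the RHV snapshot operation described above corresponds, at the image level, to creating a qcow2 overlay whose backing file is the existing disk. A minimal sketch with hypothetical file names (base.qcow2 stands in for the existing VM disk):

# Hypothetical names: new guest writes go to overlay.qcow2, while
# base.qcow2 becomes its read-only backing file, i.e. the two images
# are linked under the same VM disk.
qemu-img create -f qcow2 -b base.qcow2 -F qcow2 overlay.qcow2
qemu-img info --backing-chain overlay.qcow2   # overlay.qcow2 -> base.qcow2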
Tried to reproduce it again; still failed to reproduce it.

Test env:
Kernel version: 4.18.0-193.el8.x86_64
qemu-kvm version: qemu-kvm-4.2.0-29.module+el8.2.1+7712+3c3fe332.2
Guest: RHEL 7.6
Machine type: pc
Drive format: virtio_blk
Data plane: enabled
Test Steps:
1. Start guest with qemu cmd:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-machine pc-i440fx-rhel7.6.0 \
-nodefaults \
-device VGA,bus=pci.0,addr=0x2 \
-m 30720 \
-smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2 \
-cpu 'Cascadelake-Server',+kvm_pv_unhalt \
-chardev socket,server=on,id=qmp_id_qmpmonitor1,wait=off,path=/tmp/monitor-qmpmonitor1-20210510-045626-P2QqGJ5E \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,server=on,id=qmp_id_catch_monitor,wait=off,path=/tmp/monitor-catch_monitor-20210510-045626-P2QqGJ5E \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idcKGxjw \
-chardev socket,server=on,id=chardev_serial0,wait=off,path=/tmp/serial-serial0-20210510-045626-P2QqGJ5E \
-device isa-serial,id=serial0,chardev=chardev_serial0 \
-chardev socket,id=seabioslog_id_20210510-045626-P2QqGJ5E,path=/tmp/seabios-20210510-045626-P2QqGJ5E,server=on,wait=off \
-device isa-debugcon,chardev=seabioslog_id_20210510-045626-P2QqGJ5E,iobase=0x402 \
-device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-object iothread,id=iothread0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel76-64-virtio.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,bus=pci.0,addr=0x4,iothread=iothread0,scsi=off \
-device virtio-net-pci,mac=9a:0a:a8:f0:c5:08,id=id2V3RTk,netdev=id0s6sUn,bus=pci.0,addr=0x6 \
-netdev tap,id=id0s6sUn,vhost=on \
-vnc :0 \
-rtc base=utc,clock=host,driftfix=slew \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-monitor stdio \
-qmp tcp:0:3000,server,nowait
2. Create a snapshot node and take the snapshot.
#create snapshot node
{'execute':'qmp_capabilities' }
{'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1'}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/kvm_autotest_root/images/rhel76-64-virtio.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null}}
{'execute':'job-dismiss','arguments':{'id':'job1'}}
{'execute':'job-dismiss','arguments':{'id':'job2'}}
{'execute':'blockdev-snapshot','arguments':{'node':'drive_image1','overlay':'sn1'}}
3. Create target snapshot image and backing image.
#qemu-img create -f qcow2 mirror.qcow2 20G
#qemu-img create -f qcow2 mirror_sn.qcow2 -b mirror.qcow2 20G
4. Write a file in the guest with dd.
#dd if=/dev/urandom of=test bs=1M count=1000
5. Add the target snapshot mirror node and start the mirror with sync=top.
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_mirror_sn','filename':'/home/mirror_sn.qcow2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'mirror_sn','file':'drive_mirror_sn','backing':null}}
{"execute": "blockdev-mirror", "arguments": {"sync": "top", "device": "sn1","target":"mirror_sn","job-id": "j1"}}
{"timestamp": {"seconds": 1620645242, "microseconds": 859775}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "j1"}}
{"timestamp": {"seconds": 1620645242, "microseconds": 859855}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "j1"}}
{"return": {}}
{"timestamp": {"seconds": 1620645242, "microseconds": 884363}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}}
{"timestamp": {"seconds": 1620645242, "microseconds": 884430}, "event": "BLOCK_JOB_READY", "data": {"device": "j1", "len": 21757952, "offset": 21757952, "speed": 0, "type": "mirror"}}
6. Convert the backing image from source to target.
#qemu-img convert /home/kvm_autotest_root/images/rhel76-64-virtio.qcow2 /home/mirror.qcow2 -O qcow2
7. Take the snapshot on the destination.
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_mirror','filename':'/home/mirror.qcow2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'mirror','file':'drive_mirror'}}
{"execute":"blockdev-snapshot","arguments":{"node":"mirror","overlay":"mirror_sn"}}
8. Complete the mirror job.
{"execute":"job-complete","arguments":{"id":"j1"}}
{"return": {}}
{"timestamp": {"seconds": 1620645305, "microseconds": 139017}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "j1"}}
{"timestamp": {"seconds": 1620645305, "microseconds": 139062}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "j1"}}
{"timestamp": {"seconds": 1620645305, "microseconds": 139161}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "j1", "len": 22085632, "offset": 22085632, "speed": 0, "type": "mirror"}}
{"timestamp": {"seconds": 1620645305, "microseconds": 139201}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "j1"}}
{"timestamp": {"seconds": 1620645305, "microseconds": 139230}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "j1"}}
9. Check block info
(qemu)info block
mirror_sn: json:{"backing": {"driver": "qcow2", "file": {"driver": "file", "filename": "/home/mirror.qcow2"}}, "driver": "qcow2", "file": {"driver": "file", "filename": "/home/mirror_sn.qcow2"}} (qcow2)
Attached to: /machine/peripheral/image1/virtio-backend
Cache mode: writeback
Backing file: /home/mirror.qcow2 (chain depth: 1)
10. Quit the VM.
(qemu)system_powerdown
After step 10, the VM shut down successfully.
Repeated steps 1 to 10 ten times; still cannot reproduce this issue.
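As an optional image-level cross-check of the `info block` output in step 9, the pivoted backing chain can also be inspected offline once the VM has exited (paths taken from the steps above):

# After shutdown, the new active layer mirror_sn.qcow2 should report
# mirror.qcow2 as its backing file (chain depth 1), matching step 9.
qemu-img info --backing-chain /home/mirror_sn.qcow2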
Hi Sergio,
Can you help check whether the reproduction scenario is correct? Thanks.

And hi Anitha,
Do we have any updates from the customer now? Is this case still active? Thanks.

I noticed the issue in the customer portal is closed.
*** Bug 1967895 has been marked as a duplicate of this bug. ***

Ran regression tests on qemu-kvm-6.0.0-21.module+el8.5.0+11555+e0ab0d09; all tests pass.
Log links:
1. virtio_blk+iothread+storage_vm_migration_blockdev:
http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=5505620
2. virtio_scsi+iothread+storage_vm_migration_blockdev:
http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=5500885
As per comment 26, setting the bug's status to "Verified".

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684