Hi Stefan,

QE ran a test on qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64 and hit 2 different core dumps. Could you please help confirm whether the usage is correct, or whether all relevant patches have been merged into qemu 3.1 (fixed in version qemu-3.0)?

1. raw image:

Step 1: boot up src and dst guest (on the same host)

src:
-drive id=drive_image1,if=none,snapshot=off,aio=threads,format=raw,file=/home/kvm_autotest_root/images/win2019-64-virtio.raw,file.x-check-cache-dropped=on,cache=writeback \
-device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
-device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pcie.0-root-port-6,addr=0x0 \

dst:
-drive id=drive_image1,if=none,snapshot=off,aio=threads,format=raw,file=/home/kvm_autotest_root/images/win2019-64-virtio.raw,file.x-check-cache-dropped=on,cache=writeback \
-device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
-device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pcie.0-root-port-6,addr=0x0 \
-incoming tcp:0:5888 \

Step 2: migrate src to dst ({"execute": "migrate","arguments":{"uri": "tcp:0:5888"}})

Step 3: resume dst guest when migration finished; qemu core dumps.

(qemu) qemu-kvm: block/io.c:1619: bdrv_co_write_req_prepare: Assertion `child->perm & BLK_PERM_WRITE' failed.
drive1.sh: line 23: 31700 Aborted (core dumped) MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm -S -name 'avocado-vt-vm1' -machine q35 -nodefaults -device VGA,bus=pcie.0,addr=0x1 -drive id=drive_image1,if=none,snapshot=off,aio=threads,format=raw,file=/home/kvm_autotest_root/images/win2019-64-virtio.raw,file.x-check-cache-dropped=on,cache=writeback -device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pcie.0-root-port-6,addr=0x0 -device pcie-root-port,id=pcie.0-root-port-8,slot=8,chassis=8,addr=0x8,bus=pcie.0 -device virtio-net-pci,mac=9a:13:14:15:16:17,id=id3oiTUl,vectors=4,netdev=idA1dqov,bus=pcie.0-root-port-8,addr=0x0 -netdev tap,id=idA1dqov,vhost=on -m 15360 -smp 12,maxcpus=12,cores=6,threads=1,sockets=2 -cpu 'Opteron_G5',+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time -vnc :1 -rtc base=localtime,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -enable-kvm -monitor stdio -qmp tcp:localhost:6666,server,nowait -incoming tcp:0:5888

#0  0x00007f7d8d35793f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f7d8d341c95 in __GI_abort () at abort.c:79
#2  0x00007f7d8d341b69 in __assert_fail_base (fmt=0x7f7d8d4a8d70 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x56252cb61103 "child->perm & BLK_PERM_WRITE", file=0x56252cb610ed "block/io.c", line=1619, function=<optimized out>) at assert.c:92
#3  0x00007f7d8d34fdf6 in __GI___assert_fail (assertion=assertion@entry=0x56252cb61103 "child->perm & BLK_PERM_WRITE", file=file@entry=0x56252cb610ed "block/io.c", line=line@entry=1619, function=function@entry=0x56252cb61a50 <__PRETTY_FUNCTION__.26609> "bdrv_co_write_req_prepare") at assert.c:101
#4  0x000056252c96dd15 in bdrv_co_write_req_prepare (child=<optimized out>, child=<optimized out>, flags=0, req=0x7f7d505ffeb0, bytes=4096, offset=608976896) at block/io.c:1619
#5  0x000056252c9713de in bdrv_co_write_req_prepare (child=0x56252db11fa0, child=0x56252db11fa0, flags=0, req=0x7f7d505ffeb0, bytes=4096, offset=608976896) at block/io.c:1619
#6  0x000056252c9713de in bdrv_aligned_pwritev (child=child@entry=0x56252db11fa0, req=req@entry=0x7f7d505ffeb0, offset=offset@entry=608976896, bytes=bytes@entry=4096, align=align@entry=1, qiov=qiov@entry=0x56252dd85370, flags=0) at block/io.c:1699
#7  0x000056252c9723c9 in bdrv_co_pwritev (child=0x56252db11fa0, offset=offset@entry=608976896, bytes=bytes@entry=4096, qiov=qiov@entry=0x56252dd85370, flags=flags@entry=0) at block/io.c:1961
#8  0x000056252c9603b1 in blk_co_pwritev (blk=0x56252db8d1a0, offset=608976896, bytes=4096, qiov=0x56252dd85370, flags=0) at block/block-backend.c:1203
#9  0x000056252c96044e in blk_aio_write_entry (opaque=0x56252e83b810) at block/block-backend.c:1409
#10 0x000056252ca00803 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
#11 0x00007f7d8d36d600 in __start_context () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91
#12 0x00007ffd4062cd10 in ()
#13 0x0000000000000000 in ()

2. qcow2 image:

Same steps as raw image.

(qemu) qemu-kvm: util/error.c:57: error_setv: Assertion `*errp == NULL' failed.
drive1.sh: line 23: 31450 Aborted (core dumped) MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm -S -name 'avocado-vt-vm1' -machine q35 -nodefaults -device VGA,bus=pcie.0,addr=0x1 -drive id=drive_image1,if=none,snapshot=off,aio=threads,format=qcow2,file=/home/kvm_autotest_root/images/win2019-64-virtio.qcow2,file.x-check-cache-dropped=on,cache=writeback -device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pcie.0-root-port-6,addr=0x0 -device pcie-root-port,id=pcie.0-root-port-8,slot=8,chassis=8,addr=0x8,bus=pcie.0 -device virtio-net-pci,mac=9a:13:14:15:16:17,id=id3oiTUl,vectors=4,netdev=idA1dqov,bus=pcie.0-root-port-8,addr=0x0 -netdev tap,id=idA1dqov,vhost=on -m 15360 -smp 12,maxcpus=12,cores=6,threads=1,sockets=2 -cpu 'Opteron_G5',+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time -vnc :1 -rtc base=localtime,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -enable-kvm -monitor stdio -qmp tcp:localhost:6666,server,nowait -incoming tcp:0:5888

#0  0x00007f71e06b793f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f71e06a1c95 in __GI_abort () at abort.c:79
#2  0x00007f71e06a1b69 in __assert_fail_base (fmt=0x7f71e0808d70 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x558b816054bd "*errp == NULL", file=0x558b816054b0 "util/error.c", line=57, function=<optimized out>) at assert.c:92
#3  0x00007f71e06afdf6 in __GI___assert_fail (assertion=assertion@entry=0x558b816054bd "*errp == NULL", file=file@entry=0x558b816054b0 "util/error.c", line=line@entry=57, function=function@entry=0x558b81605578 <__PRETTY_FUNCTION__.15573> "error_setv") at assert.c:101
#4  0x0000558b81480395 in error_setv (errp=0x7f71ad7faf30, src=0x558b815eee0e "block/file-posix.c", line=2554, func=0x558b815ef810 <__func__.27556> "check_cache_dropped", err_class=ERROR_CLASS_GENERIC_ERROR, fmt=0x558b815eef87 "page cache still in use!", ap=0x7f71ad7fade0, suffix=0x0) at util/error.c:57
#5  0x0000558b814804e4 in error_setg_internal (errp=errp@entry=0x7f71ad7faf30, src=src@entry=0x558b815eee0e "block/file-posix.c", line=line@entry=2554, func=func@entry=0x558b815ef810 <__func__.27556> "check_cache_dropped", fmt=fmt@entry=0x558b815eef87 "page cache still in use!") at util/error.c:95
#6  0x0000558b813f5a7e in check_cache_dropped (errp=0x7f71ad7faf30, bs=<optimized out>) at block/file-posix.c:2554
#7  0x0000558b813f5a7e in raw_co_invalidate_cache (bs=<optimized out>, errp=0x7f71ad7faf30) at block/file-posix.c:2603
#8  0x0000558b813ba4d5 in bdrv_co_invalidate_cache (bs=0x558b82851b10, errp=errp@entry=0x7f71ad7faf70) at block.c:4531
#9  0x0000558b813ba44a in bdrv_co_invalidate_cache (bs=0x558b8284b450, errp=0x7fff9837f8a8) at block.c:4500
#10 0x0000558b813ba664 in bdrv_invalidate_cache_co_entry (opaque=0x7fff9837f860) at block.c:4572
#11 0x0000558b8148f803 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
#12 0x00007f71e06cd600 in __start_context () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91
#13 0x00007fff9837f090 in ()
#14 0x0000000000000000 in ()

Thanks.
(In reply to CongLi from comment #6)
> Step 1: boot up src and dst guest (in the same host)
> src:
> -drive
> id=drive_image1,if=none,snapshot=off,aio=threads,format=raw,file=/home/
> kvm_autotest_root/images/win2019-64-virtio.raw,file.x-check-cache-dropped=on,
> cache=writeback \
> -device
> pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
> -device
> virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pcie.0-root-port-
> 6,addr=0x0 \

Correction: there is no file.x-check-cache-dropped=on on the src side.
(In reply to CongLi from comment #6)
> Hi Stefan,
> 
> QE have a test on qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64, met 2
> different core dumps, could you please help confirm if the usage if correct
> or if all relevant patches have been merged into qemu 3.1 (fixed in version
> qemu-3.0)?
> 
> 1. raw image:
> 
> Step 1: boot up src and dst guest (in the same host)

x-check-cache-dropped=on doesn't work well if you migrate on the same host, and the consistency problems that this patch solves only happen when migrating between two different hosts. Therefore testing on a single host is not effective. Please test migration between two different hosts using shared storage (e.g. an image file on NFS or a SAN LUN).
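For reference, a minimal two-host shared-storage setup for this kind of test could look like the sketch below. The NFS server name, export path, and image name are placeholders, not values from this BZ; only the -drive flags mirror the ones discussed here.

# On both the source and destination hosts, mount the same NFS export so both
# QEMUs open the identical image file (hostname and paths are examples only):
mount -t nfs nfs-server.example.com:/export/images /mnt/images

# src host: plain cache=writeback, no check option
/usr/libexec/qemu-kvm ... -drive id=drive_image1,if=none,format=raw,cache=writeback,file=/mnt/images/guest.raw ...

# dst host: same image plus the cache-drop check and -incoming
/usr/libexec/qemu-kvm ... -drive id=drive_image1,if=none,format=raw,cache=writeback,file.x-check-cache-dropped=on,file=/mnt/images/guest.raw ... -incoming tcp:0:5888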
Hi, Dave,

a) Referring to comment 8, I understand it means we only need to test this function on fast train, yes?

b) In our current migration test we only use "cache=none" on both the source side and the destination side, so for this RFE issue we need to add a polarion case covering the following scope:
(qemu cli:)
source <----------------> destination
"cache=writeback" <-----> "cache=writeback,file.x-check-cache-dropped=on"
"cache=writethrough" <--> "cache=writethrough,file.x-check-cache-dropped=on"
Do you think so?

c) Since cache=none and cache=directsync are still working, could you give their priorities? I guess the priority of cache=none is still P1. All the cache modes are as follows:
"cache=none" <----------> "cache=none"
"cache=directsync" <----> "cache=directsync"
"cache=writeback" <-----> "cache=writeback,file.x-check-cache-dropped=on"
"cache=writethrough" <--> "cache=writethrough,file.x-check-cache-dropped=on"

If there is something wrong, please point it out, thanks.
(In reply to xianwang from comment #11)
> Hi, Dave,
> a) Referring to comment 8, I understand it means we only need to test this
> function on fast train, yes?
> 
> b) About our current migration test, we just only use "cache=none" on both
> source side and destination side, so for this RFE issue, we need to add a
> polarion case that covering following scope:
> (qemu cli:)
> source <----------------> destination
> "cache=writeback" <-----> "cache=writeback,file.x-check-cache-dropped=on"
> "cache=writethrough" <--> "cache=writethrough,file.x-check-cache-dropped=on"
> do you think so?

Furthermore, does "forward and backward migration" need to be tested with this function, e.g. rhel7.6 <--> rhel8.0? If so, is it right that we could only test migration from rhel7.6 to rhel8.0 but not from rhel8.0 to rhel7.6, because "file.x-check-cache-dropped" is not supported on rhel7.6 and it must be specified on the destination side?

> c)Due to cache=none and cache=directsync are still working, could you give
> their priorities?
> I just guess the priority of cache=none is still P1. All the cache mode is
> as following:
> "cache=none"<-----------> "cache=none"
> "cache=directsync:<-----> "cache=directsync"
> "cache=writeback" <-----> "cache=writeback,file.x-check-cache-dropped=on"
> "cache=writethrough" <--> "cache=writethrough,file.x-check-cache-dropped=on"
> 
> If there is something wrong, please point it out, thanks
(In reply to xianwang from comment #11)
> Hi, Dave,

It's probably best to ask Stefan since these are his changes.

> a) Referring to comment 8, I understand it means we only need to test this
> function on fast train, yes?

Check with Stefan as to which version implements this feature; I think it is currently after 2.12, so just fast train. I don't see why comment 8 is relevant.

> b) About our current migration test, we just only use "cache=none" on both
> source side and destination side, so for this RFE issue, we need to add a
> polarion case that covering following scope:
> (qemu cli:)
> source <----------------> destination
> "cache=writeback" <-----> "cache=writeback,file.x-check-cache-dropped=on"
> "cache=writethrough" <--> "cache=writethrough,file.x-check-cache-dropped=on"
> do you think so?

Again, check with Stefan.

> c)Due to cache=none and cache=directsync are still working, could you give
> their priorities?
> I just guess the priority of cache=none is still P1. All the cache mode is
> as following:
> "cache=none"<-----------> "cache=none"
> "cache=directsync:<-----> "cache=directsync"
> "cache=writeback" <-----> "cache=writeback,file.x-check-cache-dropped=on"
> "cache=writethrough" <--> "cache=writethrough,file.x-check-cache-dropped=on"
> 
> If there is something wrong, please point it out, thanks

Note that 'x-check-cache-dropped' is just a test feature to help us check that it's safe. You should also be testing with cache=writeback(?) without x-check-cache-dropped, to check that migration works reliably in those modes. Also, you need to do a check with postcopy, and a test with a migration cancel.

Best to ask Stefan as to which cache= mode is now recommended.

Adding Stefan for needinfo.
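As an illustration only (not commands taken from this BZ), the postcopy and cancel variants mentioned above can be exercised from the HMP monitor roughly like this; the destination host/port is a placeholder, and the postcopy-ram capability has to be enabled on the destination monitor as well:

# postcopy variant (on the source monitor)
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate -d tcp:<dst-host>:5888
(qemu) migrate_start_postcopy

# cancel variant: start a migration in the background, then abort it before it completes
(qemu) migrate -d tcp:<dst-host>:5888
(qemu) migrate_cancel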
Migration completed successfully between multiple hosts with shared storage.

Hi Stefan,

Could you please give some guidance on triggering a verification failure?

"""
mincore(2) checks whether pages are resident. Use it to verify that
page cache has been dropped.

You can trigger a verification failure by mmapping the image file from
another process that loads a byte from a page, forcing it to become
resident. bdrv_co_invalidate_cache() will fail while that process is
alive.
"""

Thanks.
(In reply to CongLi from comment #14)
> Could you please give some guidance of triggering a verification failure ?
> 
> """
> mincore(2) checks whether pages are resident. Use it to verify that
> page cache has been dropped.
> 
> You can trigger a verification failure by mmapping the image file from
> another process that loads a byte from a page, forcing it to become
> resident. bdrv_co_invalidate_cache() will fail while that process is
> alive.
> """

Run this script on the destination host and leave it running during migration:

$ cat mmap-image.py
#!/usr/bin/python2
import sys
import mmap

with open(sys.argv[1], 'rb') as f:
    m = mmap.mmap(f.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
    print 'First byte:', m[0].encode('hex')
    raw_input('Waiting... (Press Ctrl+C to interrupt)')

$ python2 mmap-image.py path/to/vm.img

When migration completes the x-check-cache-dropped= check will fail because the first page of the image file is in RAM.
(In reply to xianwang from comment #11)
> a) Referring to comment 8, I understand it means we only need to test this
> function on fast train, yes?

It's hard to tell whether cache=writeback migration completed safely without x-check-cache-dropped=on, so I think it should be tested in both fast train and rhel streams (assuming they both ship this feature).

> b) About our current migration test, we just only use "cache=none" on both
> source side and destination side, so for this RFE issue, we need to add a
> polarion case that covering following scope:
> (qemu cli:)
> source <----------------> destination
> "cache=writeback" <-----> "cache=writeback,file.x-check-cache-dropped=on"
> "cache=writethrough" <--> "cache=writethrough,file.x-check-cache-dropped=on"
> do you think so?

Yes, please. I wouldn't worry about cross-version migration for cache=writeback (e.g. 7.6 -> 8.0).

> c)Due to cache=none and cache=directsync are still working, could you give
> their priorities?
> I just guess the priority of cache=none is still P1. All the cache mode is
> as following:
> "cache=none"<-----------> "cache=none"
> "cache=directsync:<-----> "cache=directsync"
> "cache=writeback" <-----> "cache=writeback,file.x-check-cache-dropped=on"
> "cache=writethrough" <--> "cache=writethrough,file.x-check-cache-dropped=on"

From my perspective cache=writeback migration is a low-priority feature. We don't expect many users to rely on it because cache=none is and will remain the recommended setting.
(In reply to Stefan Hajnoczi from comment #15)
> (In reply to CongLi from comment #14)
> > Could you please give some guidance of triggering a verification failure ?
> 
> Run this script on the destination host and leave it running during
> migration:
> 
> $ cat mmap-image.py
> #!/usr/bin/python2
> import sys
> import mmap
> 
> with open(sys.argv[1], 'rb') as f:
>     m = mmap.mmap(f.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
>     print 'First byte:', m[0].encode('hex')
>     raw_input('Waiting... (Press Ctrl+C to interrupt)')
> 
> $ python2 mmap-image.py path/to/vm.img
> 
> When migration completes the x-check-cache-dropped= check will fail because
> the first page of the image file is in RAM.

Hi Stefan,

Could you please specify what the failure would be? I could not trigger the failure with this script. I've tried many times but all attempts failed: migration completes successfully, the guest works well, there is no error from qemu, and the script also runs fine. Could you please help confirm it?

1. Run the script on the dst host.
2. Do migration from src to dst with cache=writeback, dst with x-check-cache-dropped.
3. When migration completed, no error occurred.

# python3 mmap-image.py /mnt/rhel80-64-virtio.qcow2
First byte: 0x51
Waiting... (Press Ctrl+C to interrupt)

# mmap-image.py with python3:
# cat mmap-image.py
#!/usr/bin/python3
import sys
import mmap

with open(sys.argv[1], 'rb') as f:
    m = mmap.mmap(f.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
    print('First byte: %s' % hex(m[0]))
    input('Waiting... (Press Ctrl+C to interrupt)')

Thanks.
I'm not 100% sure, but it could be because you are using a qcow2 image file and using x-check-cache-dropped= requires a different command-line in that case. Previously in this BZ we discussed command-lines for a raw image file.

On my machine I can trigger the warning like this:

(src)# qemu-system-x86_64 -M accel=kvm -m 1G -drive if=virtio,file=test.img,format=raw,cache=writeback
(dst)# qemu-system-x86_64 -M accel=kvm -m 1G -drive if=virtio,file=test.img,format=raw,cache=writeback,file.x-check-cache-dropped=on -incoming tcp::1234
(dst)# python3 mmap-image.py test.img
First byte: 0xeb
Waiting... (Press Ctrl+C to interrupt)

(src-qemu) migrate tcp:...:1234

Now the destination QEMU prints the following warning:

qemu-system-x86_64: page cache still in use!

If I skip the python script then the warning is not printed.
(In reply to Stefan Hajnoczi from comment #18)
> I'm not 100% sure but it could be because you are using a qcow2 image file
> and using x-check-cache-dropped= requires a different command-line in that
> case. Previously in this BZ we discussed command-lines for a raw image file.
> 
> On my machine I can trigger the warning like this:

Hi Stefan,

Could you please provide the qemu version?

QE used the latest downstream version, qemu-kvm-3.1.0-4.module+el8+2681+819ab34d.x86_64, but still failed to trigger the failure.

I used the CML you provided:

(src) # /usr/libexec/qemu-kvm -M accel=kvm -m 1G -drive if=virtio,file=/mnt/rhel80-64-virtio.raw,format=raw,cache=writeback -monitor stdio
(dst) # /usr/libexec/qemu-kvm -M accel=kvm -m 1G -drive if=virtio,file=/mnt/rhel80-64-virtio.raw,format=raw,cache=writeback,file.x-check-cache-dropped=on -incoming tcp:0:1234 -monitor stdio
(dst) # python3 mmap-image.py /mnt/rhel80-64-virtio.raw
First byte: 0xeb
Waiting... (Press Ctrl+C to interrupt)

(src) (qemu) migrate tcp:10.73.130.201:1234
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off
Migration status: completed
total time: 49026 milliseconds
downtime: 49 milliseconds
setup: 15 milliseconds
transferred ram: 916941 kbytes
throughput: 153.33 mbps
remaining ram: 0 kbytes
total ram: 1065800 kbytes
duplicate: 61581 pages
skipped: 0 pages
normal: 228653 pages
normal bytes: 914612 kbytes
dirty sync count: 5
page size: 4 kbytes
multifd bytes: 0 kbytes
(qemu)
(qemu) info status
VM status: paused (postmigrate)

The dst QEMU prints nothing:

(qemu)
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off
Migration status: completed
total time: 0 milliseconds
(qemu) info status
VM status: running
(qemu)

> 
> (src)# qemu-system-x86_64 -M accel=kvm -m 1G -drive
> if=virtio,file=test.img,format=raw,cache=writeback
> (dst)# qemu-system-x86_64 -M accel=kvm -m 1G -drive
> if=virtio,file=test.img,format=raw,cache=writeback,file.x-check-cache-
> dropped=on -incoming tcp::1234
> (dst)# python3 mmap-image.py test.img
> First byte: 0xeb
> Waiting... (Press Ctrl+C to interrupt)
> 
> (src-qemu) migrate tcp:...:1234
> 
> Now the destination QEMU prints the following warning:
> 
> qemu-system-x86_64: page cache still in use!
> 
> If I skip the python script then the warning is not printed.
(In reply to CongLi from comment #19)
> (In reply to Stefan Hajnoczi from comment #18)
> > I'm not 100% sure but it could be because you are using a qcow2 image file
> > and using x-check-cache-dropped= requires a different command-line in that
> > case. Previously in this BZ we discussed command-lines for a raw image file.
> > 
> > On my machine I can trigger the warning like this:
> 
> Hi Stefan,
> 
> Could you please provide the qemu version ?
> 
> QE used latest downstream version
> qemu-kvm-3.1.0-4.module+el8+2681+819ab34d.x86_64
> but still fail to trigger the failure.
> 
> I used the CML you provided:
> 
> (src) # /usr/libexec/qemu-kvm -M accel=kvm -m 1G -drive if=virtio,
> file=/mnt/rhel80-64-virtio.raw,format=raw,cache=writeback -monitor stdio
> (dst) # /usr/libexec/qemu-kvm -M accel=kvm -m 1G -drive if=virtio,
> file=/mnt/rhel80-64-virtio.raw,format=raw,cache=writeback,file.x-check-cache-
> dropped=on
> -incoming tcp:0:1234 -monitor stdio
> (dst) # python3 mmap-image.py /mnt/rhel80-64-virtio.raw
> First byte: 0xeb
> Waiting... (Press Ctrl+C to interrupt)
> 
> (src) (qemu) migrate tcp:10.73.130.201:1234

I reproduced your results. It seems all cached pages in the file are dropped on flush (could be NFS-specific). It's still possible to make the test fail on purpose by modifying the script:

#!/usr/bin/python3
import sys
import mmap

with open(sys.argv[1], 'rb') as f:
    m = mmap.mmap(f.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
    while True:
        x = m[0]

There is now a race condition between this script accessing m[0] (which fetches the page) and QEMU checking if pages are resident in memory. If you run it a few times you'll see the "qemu-system-x86_64: page cache still in use!" warning printed by the destination QEMU.

Hope this helps you demonstrate that the check can fail.

In any case, what you've found is good news - it means that with NFS shared storage live migration the risk of inconsistencies is low because cached pages are dropped on the destination. It takes work to make it fail :).
(In reply to Stefan Hajnoczi from comment #21)
> (In reply to CongLi from comment #19)
> > (In reply to Stefan Hajnoczi from comment #18)
> In any case, what you've found is good news - it means that with NFS shared
> storage live migration the risk of inconsistencies is low because cached
> pages are dropped on the destination. It takes work to make it fail :).

I should clarify that the QEMU patches from this BZ are still necessary. The pages are probably dropped thanks to the flush that was added on live migration handover.
(In reply to Stefan Hajnoczi from comment #21)
> I reproduced your results. It seems all cached pages in the file are
> dropped on flush (could be NFS-specific). It's still possible to make the
> test fail on purpose by modifying the script:
> 
> #!/usr/bin/python3
> import sys
> import mmap
> 
> with open(sys.argv[1], 'rb') as f:
>     m = mmap.mmap(f.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
>     while True:
>         x = m[0]
> 
> There is now a race condition between this script accessing m[0] (which
> fetches the page) and QEMU checking if pages are resident in memory. If you
> run it a few times you'll see the "qemu-system-x86_64: page cache still in
> use!" warning printed by the destination QEMU.
> 
> Hope this helps you demonstrate that the check can fail.
> 
> In any case, what you've found is good news - it means that with NFS shared
> storage live migration the risk of inconsistencies is low because cached
> pages are dropped on the destination. It takes work to make it fail :).

Thanks Stefan, it works now. I could trigger the failure when pages are resident in memory.
dst: '(qemu) qemu-kvm: page cache still in use!'

1. Could you please share the command line for a qcow2 image?
2. Could you please help confirm how to use the 'fincore' tool to confirm the pages are resident?

Thanks.
(In reply to CongLi from comment #24)
> (In reply to Stefan Hajnoczi from comment #21)
> 1. Could you please share the command line of qcow2 image ?

I've tested that the raw command-line also works for qcow2:

-drive if=virtio,file=test.qcow2,format=raw,cache=writeback,file.x-check-cache-dropped=on

Sorry, I thought it would be necessary to use a different syntax, but I was wrong.

> 2. And could you please help confirm the way of using 'fincore' tool to
> confirm the
> pages are resident.

It is not possible to accurately use fincore(1) since the check must be performed at the moment of live migration handover. I'm not aware of a way to pause the destination QEMU at the right point in time when fincore(1) should be run. Immediately afterwards the destination QEMU may begin accessing the image file again and the page cache will become populated.
(In reply to Stefan Hajnoczi from comment #25)
> (In reply to CongLi from comment #24)
> > (In reply to Stefan Hajnoczi from comment #21)
> > 1. Could you please share the command line of qcow2 image ?
> 
> I've tested that the raw command-line also works for qcow2:
> 
> -drive
> if=virtio,file=test.qcow2,format=raw,cache=writeback,file.x-check-cache-
> dropped=on
> 
> Sorry, I thought it would be necessary to use a different syntax, but I was
> wrong.
> 
> > 2. And could you please help confirm the way of using 'fincore' tool to
> > confirm the
> > pages are resident.
> 
> It is not possible to accurately use fincore(1) since the check must be
> performed at the moment of live migration handover. I'm not aware of a way
> to pause the destination QEMU at the right point in time when fincore(1)
> should be run. Immediately afterwards the destination QEMU may begin
> accessing the image file again and the page cache will become populated.

Thanks Stefan for the explanation.

Tested on qemu-kvm-3.1.0-15.module+el8+2792+e33e01a0.x86_64:

Qcow2 works well too with x-check-cache-dropped=on. QEMU could trigger the failure when pages are resident in memory.
dst: '(qemu) qemu-kvm: page cache still in use!'

Thanks.
Note I've changed Fixed In to QEMU 4.0.0 because we need commit f357fcd890a8d6ced6d261338b859a41414561e9 ("file-posix: add drop-cache=on|off option") so that libvirt can detect this feature.
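Purely as an illustration of how that detection can be confirmed by hand (this is not the libvirt implementation, and the exact invocation is an assumption, not taken from this BZ), the new option should be visible in the QMP introspection schema of a build that carries the commit:

# Probe a qemu-kvm binary for the drop-cache option via query-qmp-schema;
# prints a non-zero count if the option is present in the schema.
echo '{"execute":"qmp_capabilities"}
{"execute":"query-qmp-schema"}
{"execute":"quit"}' | /usr/libexec/qemu-kvm -S -display none -qmp stdio | grep -c 'drop-cache'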
(In reply to Stefan Hajnoczi from comment #27)
> Note I've changed Fixed In to QEMU 4.0.0 because we need commit
> f357fcd890a8d6ced6d261338b859a41414561e9 ("file-posix: add drop-cache=on|off
> option") so that libvirt can detect this feature.

Tested on qemu-kvm-4.0.0-0.module+el8.1.0+3169+3c501422.x86_64; it seems the drop-cache option has not been merged downstream. The result is the same for raw.

qemu-kvm: -drive if=none,file=/home/kvm_autotest_root/images/rhel810-64-virtio-scsi.qcow2,format=qcow2,cache=writeback,file.x-check-cache-dropped=on,drop-cache=on,id=drive_image1: Block format 'qcow2' does not support the option 'drop-cache'
(In reply to CongLi from comment #29)
> (In reply to Stefan Hajnoczi from comment #27)
> > Note I've changed Fixed In to QEMU 4.0.0 because we need commit
> > f357fcd890a8d6ced6d261338b859a41414561e9 ("file-posix: add drop-cache=on|off
> > option") so that libvirt can detect this feature.
> 
> Tested on qemu-kvm-4.0.0-0.module+el8.1.0+3169+3c501422.x86_64,
> seems drop-cache option has not been merged in downstream.

I asked Mirek where this build came from and it's a virt-preview build. Please try the regular RHEL-AV 8.1.0 qemu-kvm builds that are out now. I have verified that RHEL-AV 8.1.0 qemu-kvm has the required patch.
(In reply to Stefan Hajnoczi from comment #30)
> (In reply to CongLi from comment #29)
> > (In reply to Stefan Hajnoczi from comment #27)
> > > Note I've changed Fixed In to QEMU 4.0.0 because we need commit
> > > f357fcd890a8d6ced6d261338b859a41414561e9 ("file-posix: add drop-cache=on|off
> > > option") so that libvirt can detect this feature.
> > 
> > Tested on qemu-kvm-4.0.0-0.module+el8.1.0+3169+3c501422.x86_64,
> > seems drop-cache option has not been merged in downstream.
> 
> I asked Mirek where this build came from and it's a virt-preview build.
> Please try the regular RHEL-AV 8.1.0 qemu-kvm builds that are out now. I
> have verified that RHEL-AV 8.1.0 qemu-kvm has the required patch.

I've tested on the latest downstream qemu, qemu-kvm-4.0.0-1.module+el8.1.0+3225+a8268fde.x86_64, and still hit this issue.

# /usr/libexec/qemu-kvm -drive if=none,file=/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2,format=qcow2,cache=writeback,file.x-check-cache-dropped=on,drop-cache=on,id=drive_image1
qemu-kvm: -drive if=none,file=/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2,format=qcow2,cache=writeback,file.x-check-cache-dropped=on,drop-cache=on,id=drive_image1: Block format 'qcow2' does not support the option 'drop-cache'

Hi Miroslav,

Could you please help confirm it?

Thanks.
Moved to ON_QA for retesting.
The commit from comment #27 is present in the tree. If this issue is still reproducible, we need an additional fix.
Hi Stefan,

The issue still exists on qemu-kvm-4.0.0-4.module+el8.1.0+3356+cda7f1ee.x86_64.

qemu-kvm: -drive if=none,file=/home/kvm_autotest_root/images/rhel810-64-virtio.qcow2,format=qcow2,cache=writeback,file.x-check-cache-dropped=on,drop-cache=on,id=drive_image1: Block format 'qcow2' does not support the option 'drop-cache'

Could you please help confirm the status of this bug?

Thanks.
(In reply to CongLi from comment #34)
> Hi Stefan,
> 
> The issue is still existed on
> qemu-kvm-4.0.0-4.module+el8.1.0+3356+cda7f1ee.x86_64.
> 
> qemu-kvm: -drive
> if=none,file=/home/kvm_autotest_root/images/rhel810-64-virtio.qcow2,
> format=qcow2,cache=writeback,file.x-check-cache-dropped=on,drop-cache=on,
> id=drive_image1: Block format 'qcow2' does not support the option
> 'drop-cache'
> 
> Could you please help confirm the status of this bug ?

Please use file.drop-cache=on instead of drop-cache=on.
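For illustration, applying that suggestion to the command line from comment #34 gives a -drive option along these lines; drop-cache belongs to the file (protocol) layer, so it takes the same "file." prefix as x-check-cache-dropped when a format layer such as qcow2 sits on top:

# hypothetical corrected invocation derived from comment #34, not a command run in this BZ
/usr/libexec/qemu-kvm -drive if=none,file=/home/kvm_autotest_root/images/rhel810-64-virtio.qcow2,format=qcow2,cache=writeback,file.x-check-cache-dropped=on,file.drop-cache=on,id=drive_image1 ...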
(In reply to Miroslav Rezanina from comment #33)
> Commit from comment #27 is present in the tree. In case this issue is still
> reproducible, we need additional fix.

We just needed an adjustment to the test case (see comment #35). Moving it back to ON_QA.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3723