Description of problem:
Querying domjobinfo fails while a snapshot is being taken.

Version-Release number of selected component (if applicable):
libvirt-5.0.0-6.module+el8+2860+4e0fe96a.x86_64
qemu-kvm-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a VM.
2. Take a snapshot:
# virsh snapshot-create-as df899f5c-db94-48b2-867a-e0c266b59b7a sp8 --memspec file=/tmp/foo.image.8
3. Query domain job info during the snapshot:
# virsh domjobinfo df899f5c-db94-48b2-867a-e0c266b59b7a
error: internal error: invalid job statistics type
4. After the snapshot completes, query domain job info again:
# virsh domjobinfo df899f5c-db94-48b2-867a-e0c266b59b7a --completed
error: internal error: invalid job statistics type

Actual results:
As in steps 3 and 4, querying domain job info for the snapshot fails.

Expected results:
Querying domain job info for the snapshot succeeds.

Additional info:
Hmm, looks like the snapshot code does not properly set jobInfo->statsType:

(gdb) p jobInfo
$1 = {
  status = QEMU_DOMAIN_JOB_STATUS_COMPLETED,
  operation = VIR_DOMAIN_JOB_OPERATION_SNAPSHOT,
  started = 1552567278659,
  stopped = 1552567278667,
  sent = 0,
  received = 0,
  timeElapsed = 30,
  timeDelta = 0,
  timeDeltaSet = false,
  statsType = QEMU_DOMAIN_JOB_STATS_TYPE_NONE,
  stats = {
    mig = {
      status = 6,
      total_time = 17,
      downtime_set = true,
      downtime = 22,
      setup_time_set = true,
      setup_time = 1,
      ram_transferred = 3510941,
      ram_remaining = 0,
      ram_total = 168370176,
      ram_bps = 228062937,
      ram_duplicate_set = true,
      ram_duplicate = 40339,
      ram_normal = 767,
      ram_normal_bytes = 3141632,
      ram_dirty_rate = 0,
      ram_page_size = 4096,
      ram_iteration = 2,
      ...
    },
    dump = {
      status = 6,
      completed = 17,
      total = 1
    }
  },
  mirrorStats = {
    transferred = 0,
    total = 0
  }
}
The patch was sent upstream for review: https://www.redhat.com/archives/libvir-list/2019-March/msg00971.html
This bug is fixed upstream by

commit 1c2a9260e865af8ad7dde9cdd21515800d1864e7
Refs: v5.1.0-237-g1c2a9260e8
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Mar 14 15:33:26 2019 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 15 09:39:19 2019 +0100

    qemu: Set job statsType for external memory snapshot

    Any job which is able to provide statistics that can be queried via
    virDomainGetJob{Stats,Info} has to set an appropriate statsType. Without
    a proper statsType qemuDomainJobInfoToParams and qemuDomainJobInfoToInfo
    have no idea what statistics should be sent to the API caller.

    https://bugzilla.redhat.com/show_bug.cgi?id=1688774

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Erik Skultety <eskultet>
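
For readers unfamiliar with the code, the failure mode described in the commit message can be illustrated with a small stand-alone model. This is only a sketch, not the libvirt source: the type and function names (JobStatsType, JobInfo, job_info_to_text, convert_sample_job) are hypothetical stand-ins, and the real converters are qemuDomainJobInfoToParams and qemuDomainJobInfoToInfo. It assumes the converter dispatches on statsType to decide which member of the stats union to read, and treats "none" as an error. The sample values are taken from the gdb output earlier in this bug.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's job bookkeeping. */
typedef enum {
    JOB_STATS_TYPE_NONE = 0,   /* job never declared what its stats mean */
    JOB_STATS_TYPE_MIGRATION,  /* stats.mig is valid */
    JOB_STATS_TYPE_DUMP        /* stats.dump is valid */
} JobStatsType;

typedef struct {
    JobStatsType statsType;
    union {
        struct { unsigned long long ram_transferred, ram_total; } mig;
        struct { unsigned long long completed, total; } dump;
    } stats;
} JobInfo;

/* Returns 0 and writes the statistics to buf, or -1 and writes an
 * error message when the job did not say which union member to use. */
static int job_info_to_text(const JobInfo *job, char *buf, size_t len)
{
    switch (job->statsType) {
    case JOB_STATS_TYPE_MIGRATION:
        snprintf(buf, len, "processed=%llu total=%llu",
                 job->stats.mig.ram_transferred, job->stats.mig.ram_total);
        return 0;
    case JOB_STATS_TYPE_DUMP:
        snprintf(buf, len, "completed=%llu total=%llu",
                 job->stats.dump.completed, job->stats.dump.total);
        return 0;
    case JOB_STATS_TYPE_NONE:
    default:
        /* The data may be present in the union, but it cannot be interpreted. */
        snprintf(buf, len, "internal error: invalid job statistics type");
        return -1;
    }
}

/* Builds a job carrying the RAM counters from the gdb dump and converts it
 * with the given statsType, so the two outcomes can be compared. */
static int convert_sample_job(JobStatsType type, char *buf, size_t len)
{
    JobInfo job = { .statsType = type };
    job.stats.mig.ram_transferred = 3510941ULL;
    job.stats.mig.ram_total = 168370176ULL;
    return job_info_to_text(&job, buf, len);
}
```

Calling convert_sample_job with JOB_STATS_TYPE_NONE reproduces the error text seen in virsh even though the counters are populated, while the same data with JOB_STATS_TYPE_MIGRATION converts cleanly. Which statsType the real fix assigns to the snapshot job is not stated in this bug, so the migration-style type here is only an assumption for illustration.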
I hit a new problem after the fix; please help confirm.

=========================
Do the following at the same time in different consoles:
=========================
a. (IN VM) Generate some data to make the snapshot process last longer:
# dd if=/dev/urandom of=/tmp/1.file bs=1K count=5000000

b. (IN HOST CONSOLE 1) Create the snapshot as follows:
[root@hp-z220-02 tmp]# virsh snapshot-create-as 61a1423f-5591-4b36-94ad-be0d982c34e5 s1 --memspec file=/tmp/snap.1

c. (IN HOST CONSOLE 2) Check the domjobinfo:
[root@hp-z220-02 ~]# time virsh domjobinfo 61a1423f-5591-4b36-94ad-be0d982c34e5

=========================
RESULT:
=========================
At step c above, the command hangs and returns only after step b has finished, as follows:

b1:
[root@hp-z220-02 tmp]# virsh snapshot-create-as 61a1423f-5591-4b36-94ad-be0d982c34e5 s1 --memspec file=/tmp/snap.1
Domain snapshot s1 created

c1:
[root@hp-z220-02 ~]# time virsh domjobinfo 61a1423f-5591-4b36-94ad-be0d982c34e5
Job type: None

real 0m27.377s
user 0m0.011s
sys 0m0.005s

=========================
Problem:
=========================
While the snapshot is being taken, domjobinfo should display information about the running job, something like the following:

Tue Apr 10 07:03:56 EDT 2018
Job type: Unbounded
Operation: Snapshot
Time elapsed: 1027 ms
Data processed: 110.118 MiB
Data remaining: 151.859 MiB
Data total: 1.126 GiB
Memory processed: 110.118 MiB
Memory remaining: 151.859 MiB
Memory total: 1.126 GiB
Memory bandwidth: 223.485 MiB/s
Dirty rate: 0 pages/s
Iteration: 1
Constant pages: 228613
Normal pages: 27634
Normal data: 107.945 MiB
Expected downtime: 300 ms
Setup time: 8 ms
Hi yisun

The "hang" problem you met is a different issue and is tracked in this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1565552#c2

Actually, you can only get the domjobinfo during the memory snapshot phase, which is very short, but you could increase the guest memory size to make the phase longer. Note: adding memory load in the guest will not help, because the guest is paused at the beginning.

Also note that you can monitor the libvirtd log while running the test; once you see libvirtd send the "migrate" command to the qemu monitor, try to query domjobinfo immediately.
(In reply to Fangge Jin from comment #7)
> Hi yisun
>
> The "hang" problem you met is a different issue and is tracked in this bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1565552#c2
>
> Actually you can only get the domjobinfo in memory snapshot phase which is
> very short, but you could increase the guest memory size to make the phase
> longer. Note: adding memory load in guest will not help, because guest is
> paused at the beginning.
>
> And note that you can monitor libvirtd log when do the test, after you see
> libvirtd sends command "migrate" to qemu monitor, you could try to query
> domjobinfo immediately.

Thx for the info

Verified with:
libvirt-5.0.0-7.module+el8+2887+effa3c42.x86_64

1. in vm run:
[root@localhost ~]# dd if=/dev/urandom of=/tmp/1.file bs=1K count=5000000

2. in host console 1 run:
[root@localhost ~]# virsh snapshot-create-as 5d9e1bfb-9342-4530-82b1-f8c2dac70e8d s1 --memspec file=/tmp/snap.1

3. in host console 2 run:
[root@localhost ~]# virsh domjobinfo 5d9e1bfb-9342-4530-82b1-f8c2dac70e8d

4. check console 1 until snapshot finished:
[root@localhost ~]# virsh snapshot-create-as 5d9e1bfb-9342-4530-82b1-f8c2dac70e8d s1 --memspec file=/tmp/snap.1
Domain snapshot s1 created

5. check console 2 make sure there is no error:
[root@localhost ~]# virsh domjobinfo 5d9e1bfb-9342-4530-82b1-f8c2dac70e8d
Job type: None
<=== This is hang until snapshot finished, tracked by bz565552

[root@localhost ~]# virsh domjobinfo 5d9e1bfb-9342-4530-82b1-f8c2dac70e8d --completed
Job type: Completed
Operation: Snapshot
Time elapsed: 2298 ms
Data processed: 700.835 MiB
Data remaining: 0.000 B
Data total: 1.126 GiB
Memory processed: 700.835 MiB
Memory remaining: 0.000 B
Memory total: 1.126 GiB
Memory bandwidth: 1.340 GiB/s
Dirty rate: 0 pages/s
Page size: 4096 bytes
Iteration: 3
Postcopy requests: 0
Constant pages: 116251
Normal pages: 178809
Normal data: 698.473 MiB
Total downtime: 526 ms
Setup time: 6 ms
(In reply to yisun from comment #8)
> 5. check console 2 make sure there is no error:
> [root@localhost ~]# virsh domjobinfo 5d9e1bfb-9342-4530-82b1-f8c2dac70e8d
> Job type: None
> <=== This is hang until snapshot finished, tracked by bz565552

Typo here: this should be bz1565552.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:1293