Bug 2017928 - [incremental_backup] Expose scratch disk allocation (wr_highest_offset) in the API
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: libvirt
Version: 8.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 8.5
Assignee: Peter Krempa
QA Contact: yisun
URL:
Whiteboard:
Depends On:
Blocks: 1913387
 
Reported: 2021-10-27 18:01 UTC by Nir Soffer
Modified: 2022-05-10 13:33 UTC
CC List: 8 users

Fixed In Version: libvirt-7.10.0-1.module+el8.6.0+13502+4f24a11d
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-10 13:21:40 UTC
Type: Bug
Target Upstream Version: 7.10.0
Embargoed:




Links
System                   ID                Last Updated
Red Hat Issue Tracker    RHELPLAN-101036   2021-10-29 04:20:25 UTC
Red Hat Product Errata   RHSA-2022:1759    2022-05-10 13:22:38 UTC

Description Nir Soffer 2021-10-27 18:01:12 UTC
Description of problem:

During pull mode backup, libvirt exposes the scratch disk index:

    <disk name='sdb' backup='yes' type='block' backupmode='full' exportname='sdb' index='7'>
      <driver type='qcow2'/>
      <scratch dev='/path/to/image'>
        <seclabel model='dac' relabel='no'/>
      </scratch>
    </disk>

RHV will use the index name ("sdb[7]") to register a block threshold on the
scratch disk.
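
For illustration, registering that threshold from the libvirt Python
binding could look like this. A minimal sketch, assuming domain ID 1 as
in the virsh examples below and an arbitrary 1 GiB threshold; the
"sdb[7]" name comes from the XML above:

    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByID(1)  # domain ID from the virsh examples

    # virDomainSetBlockThreshold on the scratch disk, addressed by its
    # target[index] name; libvirt emits a one-shot BLOCK_THRESHOLD event
    # once the guest writes past this offset.
    dom.setBlockThreshold("sdb[7]", 1024 * 1024 * 1024)  # assumed 1 GiB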

But this is not enough to track scratch disk allocation and extend
the scratch disks when needed.

We have to handle these cases:
- block threshold was set after the allocation exceeded the threshold
  (we don't get an event)
- VM was paused before block threshold was set
- Event was submitted while vdsm was disconnected from libvirt
  (e.g. crash or restart)

To handle these cases, we use virDomainGetBlockInfo(path) to get the
current allocation, and we extend the volume if the volume is too full.
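
A minimal sketch of that polling fallback with the Python binding; the
90% policy and extend_volume() helper are assumptions:

    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByID(1)

    # virDomainGetBlockInfo returns [capacity, allocation, physical].
    capacity, allocation, physical = dom.blockInfo("/path/to/image")
    if allocation >= capacity * 0.9:  # assumed "too full" policy
        extend_volume()               # hypothetical helper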

However, this API does not provide access to scratch disks:

    virsh # domblkinfo 1 /path/to/image
    error: invalid argument: invalid path /path/to/image not assigned to domain

    virsh # domblkinfo 1 sdb[7]
    error: invalid argument: invalid path sdb[7] not assigned to domain

We can also get allocation info from:
- getAllDomainStats
- domainListGetStats

But they do not expose the scratch disk, only the actual disks and
the backing chain.

I tested "domstats --block 1" and "domstats --block --backing 1", and
there is no "block.*.backingIndex=7".
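
For reference, the equivalent check through the Python bulk-stats
binding (a sketch; the flag constants are stock libvirt ones):

    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByID(1)

    # Equivalent of "domstats --block --backing": returns (domain, dict)
    # pairs whose dicts use flat "block.N.*" keys.
    flags = libvirt.VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING
    for _, stats in conn.domainListGetStats(
            [dom], libvirt.VIR_DOMAIN_STATS_BLOCK, flags):
        print([v for k, v in stats.items() if k.endswith(".backingIndex")])
        # the scratch disk's backingIndex=7 is missing from this list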

qemu exposes the required information:

virsh # qemu-monitor-command 1 --pretty '{"execute":"query-blockstats", "arguments": {"query-nodes": true}}'

...
      "stats": {
        "unmap_operations": 0,
        "unmap_merged": 0,
        "flush_total_time_ns": 0,
        "wr_highest_offset": 1342177280,
        "wr_total_time_ns": 0,
        "failed_wr_operations": 0,
        "failed_rd_operations": 0,
        "wr_merged": 0,
        "wr_bytes": 0,
        "timed_stats": [

        ],
        "failed_unmap_operations": 0,
        "failed_flush_operations": 0,
        "account_invalid": false,
        "rd_total_time_ns": 0,
        "invalid_unmap_operations": 0,
        "flush_operations": 0,
        "wr_operations": 0,
        "unmap_bytes": 0,
        "rd_merged": 0,
        "rd_bytes": 0,
        "unmap_total_time_ns": 0,
        "invalid_flush_operations": 0,
        "account_failed": false,
        "rd_operations": 0,
        "invalid_wr_operations": 0,
        "invalid_rd_operations": 0
      },
      "node-name": "libvirt-7-format"

Note that the scratch disk node "libvirt-7-format" is *not* exposed when
using:

    virsh # qemu-monitor-command 1 --pretty '{"execute":"query-blockstats", "arguments": {"query-nodes": false}}'

which seems to be what libvirt is doing now, as a fix for bug 2015281.

We could use virDomainQemuMonitorCommand
https://libvirt.org/html/libvirt-libvirt-qemu.html#virDomainQemuMonitorCommand

But I think it is not exposed in the Python binding, and it also requires
depending on internal implementation details (e.g. "libvirt-7-format").

I think we need to expose the scratch disk in:
- virDomainGetBlockInfo
- virConnectGetAllDomainStats
- virDomainListGetStats

The most important API for us is virDomainGetBlockInfo since it returns
what we need, and it is much easier to use.

Version-Release number of selected component (if applicable):
libvirt-daemon-7.6.0-6.el8_rc.e6185cc768.x86_64

Comment 1 yisun 2021-10-29 04:17:52 UTC
Reproduce steps:
1. Prepare a backup xml
[root@dell-per730-59 ~]# cat backup.xml
<domainbackup mode='pull'>
  <server transport='unix' socket='/tmp/bkup.socket'/>
  <disks>
      <disk name='vda' backup='no'/>
      <disk name='vdb' backup='yes' type='block' backupmode='full' exportname='vdb'>
          <driver type='qcow2'/>
          <scratch dev='/dev/sdd'/>
      </disk>
  </disks>
</domainbackup>


2. Start a vm with two disks vda and vdb; here they have existing external snapshots
[root@dell-per730-59 ~]# virsh domstate vm1
running

[root@dell-per730-59 ~]# virsh dumpxml vm1 | awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/jeos-27-x86_64.snap1' index='3'/>
      <backingStore type='file' index='2'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/jeos-27-x86_64.qcow2'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/vdb.snap1' index='4'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/vdb.qcow2'/>
        <backingStore/>
      </backingStore>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </disk>


3. Start the backup
[root@dell-per730-59 ~]# virsh backup-begin vm1 backup.xml
Backup started
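
(For reference, the same step from the Python binding; a sketch assuming
the backup.xml file from step 1:)

    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("vm1")

    # virDomainBackupBegin starts the pull-mode backup job described by
    # the <domainbackup> XML; no checkpoint XML is passed here.
    with open("backup.xml") as f:
        dom.backupBegin(f.read(), None)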

4. Check the scratch info
[root@dell-per730-59 ~]# virsh backup-dumpxml vm1 | awk '/<domainbackup/,/<\/domainbackup/'
<domainbackup mode='pull'>
  <server transport='unix' socket='/tmp/bkup.socket'/>
  <disks>
    <disk name='vda' backup='no'/>
    <disk name='vdb' backup='yes' type='block' backupmode='full' exportname='vdb' index='6'>
      <driver type='qcow2'/>
      <scratch dev='/dev/sdd'/>
    </disk>
  </disks>
</domainbackup>

5. The scratch device info is not exposed by domblkinfo or domstats
[root@dell-per730-59 ~]# virsh domblkinfo vm1 /dev/sdd
error: invalid argument: invalid path /dev/sdd not assigned to domain

[root@dell-per730-59 ~]# virsh domstats vm1 --backing | grep /dev/sdd | wc -l
0

Comment 2 yisun 2021-10-29 04:26:46 UTC
Hi Peter,
If we provide such info in domblkinfo, will the external snapshots' backing nodes be exposed at the same time? I am just curious whether exposing other kinds of nodes will have side effects that need to be considered and tested.

For example, currently we can only get info for the outermost layer from virsh domblkinfo:
1. vm's xml
[root@dell-per730-59 ~]# virsh dumpxml vm1 | awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/jeos-27-x86_64.snap1' index='3'/>
      <backingStore type='file' index='2'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/jeos-27-x86_64.qcow2'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/vdb.snap1' index='4'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/vdb.qcow2'/>
        <backingStore/>
      </backingStore>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </disk>

2. Can get info for snap1 node, but cannot get backing node info
[root@dell-per730-59 ~]# virsh domblkinfo vm1 /var/lib/libvirt/images/jeos-27-x86_64.snap1
Capacity:       10737418240
Allocation:     2101248
Physical:       1769472

[root@dell-per730-59 ~]# virsh domblkinfo vm1 /var/lib/libvirt/images/jeos-27-x86_64.qcow2
error: invalid argument: invalid path /var/lib/libvirt/images/jeos-27-x86_64.qcow2 not assigned to domain

Comment 4 Peter Krempa 2021-10-29 12:42:28 UTC
(In reply to yisun from comment #2)
> hi Peter,
> If we'll provide such info in domblkinfo, will the external snapshots'
> backend nodes exposed at the same time? I am just curious if there will be a
> side effect to expose other kind of nodes which needs to be considered and
> tested.

No, the new data will be exported via the bulk-stats API (virsh domstats). The old APIs will not be extended.

Comment 5 Peter Krempa 2021-11-01 09:32:49 UTC
(In reply to Nir Soffer from comment #0)

[...]

> I think what we need to expose the scratch disk in:
> - virDomainGetBlockInfo
> - virConnectGetAllDomainStats
> - virDomainListGetStats
> 
> The most important API for us is virDomainGetBlockInfo since it returns
> what we need, and it is much easier to use.

Unfortunately that's a legacy API and thus it will _not_ be extended any more. You'll need to fetch the stats from the bulk stats API virConnectGetAllDomainStats/virDomainListGetStats (same backend). It also solves one more of your complaints: if you are querying multiple disks, the JSON data from qemu is fetched only once.
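
For illustration, a fetch through the bulk stats API from the Python
binding might look like this (a sketch; the constants are stock libvirt
ones):

    import libvirt

    conn = libvirt.open("qemu:///system")

    # One call fetches qemu's JSON data only once and returns stats for
    # all domains; with the BACKING flag the backup scratch images get
    # their own flat "block.N.*" entries once the new data is exported.
    for dom, stats in conn.getAllDomainStats(
            libvirt.VIR_DOMAIN_STATS_BLOCK,
            libvirt.VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING):
        blocks = {k: v for k, v in stats.items() if k.startswith("block.")}
        print(dom.name(), blocks)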

Comment 6 Nir Soffer 2021-11-01 15:12:42 UTC
(In reply to Peter Krempa from comment #5)
> > I think what we need to expose the scratch disk in:
> > - virDomainGetBlockInfo
> > - virConnectGetAllDomainStats
> > - virDomainListGetStats
> > 
> > The most important API for us is virDomainGetBlockInfo since it returns
> > what we need, and it is much easier to use.
> 
> Unfortunately that's a legacy API and thus it will _not_ be extended any
> more. You'll need to fetch the stats from the bulk stats API
> virConnectGetAllDomainStats/virDomainListGetStats (same backend). It also
> solves one more of your complaints: if you are querying multiple disks,
> the JSON data from qemu is fetched only once.

Thanks, this works for us.

Comment 7 Peter Krempa 2021-11-01 16:23:49 UTC
Patches adding the backup disk stats to 'virsh domstats':

https://listman.redhat.com/archives/libvir-list/2021-November/msg00023.html

Comment 9 yisun 2021-11-02 09:44:20 UTC
Tested with a scratch build; the result is as expected.

1. Prepare a vm; here we'll use its vdb as the test disk
[root@dell-per740xd-27 ~]# virsh domblklist vm1
 Target   Source
--------------------------------------------------------
 vda      /var/lib/libvirt/images/jeos-27-x86_64.qcow2
 vdb      /var/lib/libvirt/images/vdb.qcow2


2. In the vm, write 200MB of data to the test disk:
[root@localhost ~]# dd if=/dev/urandom of=/dev/vdb bs=1M count=200; sync
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.298 s, 162 MB/s

3. Prepare pull-mode backup xml
[root@dell-per740xd-27 ~]# cat backup.xml
<domainbackup mode='pull'>
  <server transport='unix' socket='/tmp/bkup.socket'/>
  <disks>
      <disk name='vda' backup='no'/>
      <disk name='vdb' backup='yes' type='block' backupmode='full' exportname='vdb'>
          <driver type='qcow2'/>
          <scratch dev='/dev/sdb'/>
      </disk>
  </disks>
</domainbackup>

4. Start the backup job
[root@dell-per740xd-27 ~]# virsh backup-begin vm1 backup.xml
Backup started

5. Check that the usage info for the scratch device is displayed
[root@dell-per740xd-27 ~]# virsh domstats vm1 --block --backing
Domain: 'vm1'
  block.count=3
  ...
  block.2.name=vdb
  block.2.path=/dev/sdb
  block.2.backingIndex=3
  block.2.allocation=196624
  block.2.capacity=1073741824
  block.2.physical=1048576000

6. In terminal 2, add an event watcher for the scratch file
[root@dell-per740xd-27 ~]# virsh domblkthreshold vm1 vdb[3] 100000000
[root@dell-per740xd-27 ~]# virsh event --all --loop

7. Rewrite 200MB of data on the same disk in the vm
[root@localhost ~]# dd if=/dev/urandom of=/dev/vdb bs=1M count=200; sync
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.33676 s, 157 MB/s


8. The block threshold event is triggered
[root@dell-per740xd-27 ~]# virsh event --all --loop
event 'block-threshold' for domain 'vm1': dev: vdb[3](/dev/sdb) 100000000 7936
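
(For reference, a minimal sketch of consuming this event from the Python
binding instead of "virsh event"; the callback signature follows
libvirt's block-threshold event, everything else is an assumption:)

    import libvirt

    def on_threshold(conn, dom, dev, path, threshold, excess, opaque):
        # dev is e.g. "vdb[3]" and path the scratch device; this is where
        # vdsm would re-arm the threshold and extend the volume.
        print(dev, path, threshold, excess)

    # The default event loop must be registered before opening the
    # connection for events to be delivered.
    libvirt.virEventRegisterDefaultImpl()
    conn = libvirt.open("qemu:///system")
    conn.domainEventRegisterAny(
        None, libvirt.VIR_DOMAIN_EVENT_ID_BLOCK_THRESHOLD,
        on_threshold, None)
    while True:
        libvirt.virEventRunDefaultImpl()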

9. Check the domstats; it contains the correct allocation info for vdb[3]
[root@dell-per740xd-27 ~]# virsh domstats vm1 --block --backing
Domain: 'vm1'
  block.count=3
  ...
  block.2.name=vdb
  block.2.path=/dev/sdb
  block.2.backingIndex=3
  block.2.allocation=210042880
  block.2.capacity=1073741824
  block.2.physical=1048576000

10. Abort the backup job; the vdb[3] info is not in domstats anymore, as expected
[root@dell-per740xd-27 ~]# virsh domjobabort vm1
[root@dell-per740xd-27 ~]# virsh domstats vm1 --block --backing
  ...
  no vdb[3] info

Comment 10 Peter Krempa 2021-11-04 10:08:38 UTC
The patches were pushed upstream:

commit 045a87c526778b49662d0d5d4898bd39aa2e6985
Author: Peter Krempa <pkrempa>
Date:   Fri Oct 29 16:04:45 2021 +0200

    qemuDomainGetStatsBlockExportDisk: Report stats also for helper images
    
    Add stat entries also for the mirror destination and the backup job
    scratch/target file. This is possible with '-blockdev' as we use unique
    index for the entries.
    
    The stats are reported when the VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING
    is used.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2017928
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Ján Tomko <jtomko>

commit bc24810c2cabc21d1996fa814737e2f996f2c2bb
Author: Peter Krempa <pkrempa>
Date:   Mon Nov 1 11:35:41 2021 +0100

    qemuMonitorJSONQueryBlockstats: query stats for helper images
    
    Use the 'query-nodes' flag to return all stats. The flag was introduced
    prior to qemu-2.11 so we can always use it, but we invoke it only when
    querying stats. The other invocation is used for detecting the nodenames
    which is fragile code.
    The images without a frontend don't have the device field so the
    extraction code checks need to be relaxed.
    
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Ján Tomko <jtomko>

commit 6448470eca50e529658861c6eb3ae8109366db86
Author: Peter Krempa <pkrempa>
Date:   Mon Nov 1 14:31:42 2021 +0100

    qemustatusxml2xmldata: backup-pull: Add private data for scratch image
    
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Ján Tomko <jtomko>

commit 1e4aff444c93b83435f91c814dd5ae4465918d36
Author: Peter Krempa <pkrempa>
Date:   Mon Nov 1 12:42:39 2021 +0100

    virDomainBackupDefFormat: Propagate private data callbacks
    
    The formatter for the backup job data didn't pass the virDomainXMLOption
    struct to the disk formatter which meant that the private data of the
    disk source were not formatted.
    
    This didn't pose a problem for now as the blockjob list remembered the
    nodenames for the jobs, but the backup source lost them.
    
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Ján Tomko <jtomko>

v7.9.0-44-g045a87c526

Comment 13 yisun 2021-12-21 12:38:21 UTC
Verified with: libvirt-7.10.0-1.module+el8.6.0+13502+4f24a11d.x86_64
Same steps as comment 9.

Comment 15 errata-xmlrpc 2022-05-10 13:21:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759

