Bug 1470634
Summary: Wrong allocation value after virDomainBlockCopy() (alloc=capacity)
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.4
Reporter: Nir Soffer <nsoffer>
Assignee: Kevin Wolf <kwolf>
QA Contact: Qianqian Zhu <qizhu>
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Keywords: Regression, ZStream
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: qemu-kvm-rhev-2.10.0-1.el7
Doc Type: If docs needed, set a value
CC: amureini, areis, berrange, bmcclain, chayang, dyuan, ebenahar, hhan, jsuchane, juzhang, knoel, kwolf, lmen, michen, mkalinin, mrezanin, mtessun, nsoffer, qizhu, ratamir, rbalakri, salmy, sgordon, toneata, virt-bugs, virt-maint, xuzhang
Clones: 1473145 (view as bug list)
Bug Blocks: 1473145
Type: Bug
Last Closed: 2018-04-11 00:26:27 UTC
Description
Nir Soffer
2017-07-13 10:54:48 UTC
Raz, can you reproduce this issue again with libvirt debug logs, both on current libvirt and the scratch build for bug 1470127?

This is the blockCopy call used in the tested vdsm version (the call used in oVirt 4.1):

    flags = (libvirt.VIR_DOMAIN_BLOCK_COPY_SHALLOW |
             libvirt.VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)

    self._dom.blockCopy(drive.name, destxml, flags=flags)

On master we switched to persistent domains yesterday, so we are now calling:

    flags = (libvirt.VIR_DOMAIN_BLOCK_COPY_SHALLOW |
             libvirt.VIR_DOMAIN_BLOCK_COPY_REUSE_EXT |
             VIR_DOMAIN_BLOCK_COPY_TRANSIENT_JOB)

    self._dom.blockCopy(drive.name, destxml, flags=flags)

Need to test again with master.

We will test it today.

First off, BlockCopy is a different area of the code... I'm not as "knowledgeable" about that code. In particular, that multi-level <backingStore ...> is something I'd need to look at more closely to see how it works... It might also be interesting to know what virsh domstats returns for sda, e.g.:

    virsh domstats $DOM --backing | grep -E "name|allocation|capacity|physical"

FWIW: the blockstats values are a closer representation of what QEMU knows about, while blockinfo has some gray area. For domstats, those values are:

"block.<num>.allocation" - offset of the highest written sector, as unsigned long long.
"block.<num>.capacity" - logical size in bytes of the block device backing image, as unsigned long long.
"block.<num>.physical" - physical size in bytes of the container of the backing image, as unsigned long long.
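The per-disk counters described above come back from the all-domain-stats API as a flat dict of `block.<num>.<field>` keys. A minimal sketch of regrouping such a dict per disk index; the helper name `group_block_stats` is made up for illustration, and the sample values are taken from the first domstats dump in this bug:

```python
def group_block_stats(stats):
    """Regroup flat 'block.<num>.<field>' domstats keys by disk index.

    `stats` is a dict shaped like the key/value pairs printed by
    `virsh domstats --backing` (or returned per-domain by
    virConnectGetAllDomainStats).
    """
    disks = {}
    for key, value in stats.items():
        parts = key.split(".")
        # Only 'block.<num>.<field>' keys carry per-disk data;
        # 'block.count' and non-block keys are skipped.
        if len(parts) == 3 and parts[0] == "block" and parts[1].isdigit():
            disks.setdefault(int(parts[1]), {})[parts[2]] = value
    return disks


# Sample values from the domstats output quoted in this bug report.
stats = {
    "block.count": 2,
    "block.0.name": "hdc",
    "block.0.allocation": 0,
    "block.0.physical": 0,
    "block.1.name": "sda",
    "block.1.allocation": 0,
    "block.1.capacity": 21474836480,
    "block.1.physical": 21474836480,
}
disks = group_block_stats(stats)
print(disks[1]["name"], disks[1]["allocation"], disks[1]["capacity"])
```

This is the one-call-per-VM shape that vdsm wants to move to: one stats dict covers every disk, including backing-chain entries.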
For blkinfo, those values are:

capacity - logical size in bytes of the image (how much storage the guest will see)
allocation - host storage in bytes occupied by the image (such as the highest allocated extent if there are no holes, similar to 'du')
physical - host physical size in bytes of the image container (last offset, similar to 'ls')

It seems to me there's a desire to get statistical information using the BlockInfo API.

The issue is that all RHV versions are using BlockInfo to get the allocation. We cannot fix this for customers running existing products; these customers' setups will break when upgrading to libvirt 3.2. Breaking backward compatibility is not an option.

For the next version, we can move to the domstats API. We wanted to do this anyway, since we can do one call per VM instead of one call per VM per disk, and we also get the allocation of internal volumes, which can be useful for the live merge flow.

Couldn't reproduce using:
4.2 master
vdsm-4.20.1-176.gitecbab6b.el7.centos.x86_64
libvirt-daemon-3.2.0-15.el7_rc.9a49cd47ef.x86_64

VM with a RHEL OS and an additional disk:
Tested with a 10G thin provisioned disk - created as a 1G LV
Tested also with a preallocated 25G disk - created as a 25G LV

For both cases: started the VM and live migrated the disk (for the preallocated disk, the allocation policy changed to thin due to snapshot creation for live storage migration). The VM remains running through the whole live storage migration operation.

(In reply to Elad from comment #10)
> Couldn't reproduce using:
> 4.2 master
> vdsm-4.20.1-176.gitecbab6b.el7.centos.x86_64

We moved to persistent domains yesterday; maybe it does not happen in this configuration. Can you test with latest 4.1 (vdsm-4.19.x)?

I understand the historical usage problem, as libvirt faces that often too. There are a few different variables in this particular bz as opposed to just the allocation and/or capacity variable changing.
Migration using BlockCopy and a multi-level backingStore presents a different challenge, and I'm not 100% sure whether underlying QEMU adjustments in the block layer have an impact as well. Looking through the libvirt sources, I don't see anywhere obvious where allocation would be set to capacity. If anything, allocation could be set to physical - so I'm not sure what could have caused the allocation to jump like that. Looks like we'll need more libvirtd debug data in an environment where the issue is reproduced. Don't forget to get the domstats output, as that should be beneficial.

I did another run with libvirt debug logs.

1. Starting the vm:

virsh # dumpxml 2
<domain type='kvm' id='2'> <name>bz1470634</name> <uuid>585d96ca-49d7-4ffc-800b-c24990349383</uuid> <metadata xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0"> <ovirt-tune:qos/> <ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0"> <ovirt-vm:agentChannelName>com.redhat.rhevm.vdsm</ovirt-vm:agentChannelName> <ovirt-vm:destroy_on_reboot type="bool">False</ovirt-vm:destroy_on_reboot> <ovirt-vm:startTime type="float">1499971723.4</ovirt-vm:startTime> </ovirt-vm:vm> </metadata> <maxMemory slots='16' unit='KiB'>4194304</maxMemory> <memory unit='KiB'>1048576</memory> <currentMemory unit='KiB'>1048576</currentMemory> <vcpu placement='static' current='2'>16</vcpu> <cputune> <shares>1020</shares> </cputune> <resource> <partition>/machine</partition> </resource> <sysinfo type='smbios'> <system> <entry name='manufacturer'>oVirt</entry> <entry name='product'>oVirt Node</entry> <entry name='version'>7.4-18.el7</entry> <entry name='serial'>77A14806-F954-4D2A-AEA6-C098A4469927</entry> <entry name='uuid'>585d96ca-49d7-4ffc-800b-c24990349383</entry> </system> </sysinfo> <os> <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type> <smbios mode='sysinfo'/> </os> <features> <acpi/> </features> <cpu mode='custom' match='exact' check='full'> <model fallback='forbid'>SandyBridge</model> <topology
sockets='16' cores='1' threads='1'/> <feature policy='require' name='vme'/> <feature policy='require' name='hypervisor'/> <feature policy='require' name='arat'/> <feature policy='require' name='xsaveopt'/> <numa> <cell id='0' cpus='0-1' memory='1048576' unit='KiB'/> </numa> </cpu> <clock offset='variable' adjustment='0' basis='utc'> <timer name='rtc' tickpolicy='catchup'/> <timer name='pit' tickpolicy='delay'/> <timer name='hpet' present='no'/> </clock> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>destroy</on_crash> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source startupPolicy='optional'/> <backingStore/> <target dev='hdc' bus='ide'/> <readonly/> <alias name='ide0-1-0'/> <address type='drive' controller='0' bus='1' target='0' unit='0'/> </disk> <disk type='block' device='disk' snapshot='no'> <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/> <source dev='/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/6d8fa77d-9b67-4150-bfef-06a4e7c66cf4/058224ec-791b-4373-b006-aa4d6e2f7283'/> <backingStore/> <target dev='sda' bus='scsi'/> <serial>6d8fa77d-9b67-4150-bfef-06a4e7c66cf4</serial> <boot order='1'/> <alias name='scsi0-0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> <controller type='usb' index='0' model='piix3-uhci'> <alias name='usb'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> </controller> <controller type='scsi' index='0' model='virtio-scsi'> <alias name='scsi0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </controller> <controller type='virtio-serial' index='0' ports='16'> <alias name='virtio-serial0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </controller> <controller type='pci' index='0' model='pci-root'> <alias name='pci.0'/> </controller> <controller type='ide' 
index='0'> <alias name='ide'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> </controller> <interface type='bridge'> <mac address='00:00:00:ff:41:09'/> <source bridge='ovirtmgmt'/> <target dev='vnet0'/> <model type='virtio'/> <filterref filter='vdsm-no-mac-spoofing'/> <link state='up'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/channels/585d96ca-49d7-4ffc-800b-c24990349383.com.redhat.rhevm.vdsm'/> <target type='virtio' name='com.redhat.rhevm.vdsm' state='disconnected'/> <alias name='channel0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel> <channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/channels/585d96ca-49d7-4ffc-800b-c24990349383.org.qemu.guest_agent.0'/> <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/> <alias name='channel1'/> <address type='virtio-serial' controller='0' bus='0' port='2'/> </channel> <channel type='spicevmc'> <target type='virtio' name='com.redhat.spice.0' state='disconnected'/> <alias name='channel2'/> <address type='virtio-serial' controller='0' bus='0' port='3'/> </channel> <input type='mouse' bus='ps2'> <alias name='input0'/> </input> <input type='keyboard' bus='ps2'> <alias name='input1'/> </input> <graphics type='spice' tlsPort='5900' autoport='yes' listen='10.35.0.110' defaultMode='secure' passwdValidTo='1970-01-01T00:00:01'> <listen type='network' address='10.35.0.110' network='vdsm-ovirtmgmt'/> <channel name='main' mode='secure'/> <channel name='display' mode='secure'/> <channel name='inputs' mode='secure'/> <channel name='cursor' mode='secure'/> <channel name='playback' mode='secure'/> <channel name='record' mode='secure'/> <channel name='smartcard' mode='secure'/> <channel name='usbredir' mode='secure'/> </graphics> <video> <model type='qxl' ram='65536' vram='8192' vgamem='16384' heads='1' 
primary='yes'/> <alias name='video0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video> <memballoon model='virtio'> <alias name='balloon0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </memballoon> <rng model='virtio'> <backend model='random'>/dev/urandom</backend> <alias name='rng0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </rng> </devices> <seclabel type='dynamic' model='selinux' relabel='yes'> <label>system_u:system_r:svirt_t:s0:c354,c582</label> <imagelabel>system_u:object_r:svirt_image_t:s0:c354,c582</imagelabel> </seclabel> <seclabel type='dynamic' model='dac' relabel='yes'> <label>+107:+107</label> <imagelabel>+107:+107</imagelabel> </seclabel> </domain> virsh # domstats --backing 2 Domain: 'bz1470634' state.state=1 state.reason=1 cpu.time=2609290083 cpu.user=110000000 cpu.system=230000000 balloon.current=1048576 balloon.maximum=1048576 balloon.last-update=0 balloon.rss=35112 vcpu.current=2 vcpu.maximum=16 vcpu.0.state=1 vcpu.0.time=1860000000 vcpu.0.wait=0 vcpu.0.halted=yes vcpu.1.state=1 vcpu.1.time=0 vcpu.1.wait=0 vcpu.1.halted=yes net.count=1 net.0.name=vnet0 net.0.rx.bytes=90 net.0.rx.pkts=1 net.0.rx.errs=0 net.0.rx.drop=0 net.0.tx.bytes=0 net.0.tx.pkts=0 net.0.tx.errs=0 net.0.tx.drop=0 block.count=2 block.0.name=hdc block.0.rd.reqs=0 block.0.rd.bytes=0 block.0.rd.times=0 block.0.wr.reqs=0 block.0.wr.bytes=0 block.0.wr.times=0 block.0.fl.reqs=0 block.0.fl.times=0 block.0.allocation=0 block.0.physical=0 block.1.name=sda block.1.path=/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/6d8fa77d-9b67-4150-bfef-06a4e7c66cf4/058224ec-791b-4373-b006-aa4d6e2f7283 block.1.rd.reqs=1 block.1.rd.bytes=512 block.1.rd.times=1078839 block.1.wr.reqs=0 block.1.wr.bytes=0 block.1.wr.times=0 block.1.fl.reqs=0 block.1.fl.times=0 block.1.allocation=0 block.1.capacity=21474836480 block.1.physical=21474836480 2. 
Starting live storage migration in oVirt. This does:
a. create a snapshot
b. create the destination disk on another storage
c. call libvirt.virDomain.blockCopy() with the destination disk

virsh # dumpxml 2
<domain type='kvm' id='2'> <name>bz1470634</name> <uuid>585d96ca-49d7-4ffc-800b-c24990349383</uuid> <metadata xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0"> <ovirt-tune:qos/> <ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0"> <ovirt-vm:agentChannelName>com.redhat.rhevm.vdsm</ovirt-vm:agentChannelName> <ovirt-vm:destroy_on_reboot type="bool">False</ovirt-vm:destroy_on_reboot> <ovirt-vm:startTime type="float">1499971723.4</ovirt-vm:startTime> </ovirt-vm:vm> </metadata> <maxMemory slots='16' unit='KiB'>4194304</maxMemory> <memory unit='KiB'>1048576</memory> <currentMemory unit='KiB'>1048576</currentMemory> <vcpu placement='static' current='2'>16</vcpu> <cputune> <shares>1020</shares> </cputune> <resource> <partition>/machine</partition> </resource> <sysinfo type='smbios'> <system> <entry name='manufacturer'>oVirt</entry> <entry name='product'>oVirt Node</entry> <entry name='version'>7.4-18.el7</entry> <entry name='serial'>77A14806-F954-4D2A-AEA6-C098A4469927</entry> <entry name='uuid'>585d96ca-49d7-4ffc-800b-c24990349383</entry> </system> </sysinfo> <os> <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type> <smbios mode='sysinfo'/> </os> <features> <acpi/> </features> <cpu mode='custom' match='exact' check='full'> <model fallback='forbid'>SandyBridge</model> <topology sockets='16' cores='1' threads='1'/> <feature policy='require' name='vme'/> <feature policy='require' name='hypervisor'/> <feature policy='require' name='arat'/> <feature policy='require' name='xsaveopt'/> <numa> <cell id='0' cpus='0-1' memory='1048576' unit='KiB'/> </numa> </cpu> <clock offset='variable' adjustment='0' basis='utc'> <timer name='rtc' tickpolicy='catchup'/> <timer name='pit' tickpolicy='delay'/> <timer name='hpet' present='no'/> </clock>
<on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>destroy</on_crash> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source startupPolicy='optional'/> <backingStore/> <target dev='hdc' bus='ide'/> <readonly/> <alias name='ide0-1-0'/> <address type='drive' controller='0' bus='1' target='0' unit='0'/> </disk> <disk type='block' device='disk' snapshot='no'> <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/> <source dev='/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/6d8fa77d-9b67-4150-bfef-06a4e7c66cf4/ba76fbfc-3cb8-4876-9ec6-6173d63772e2'/> <backingStore type='block' index='1'> <format type='raw'/> <source dev='/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/6d8fa77d-9b67-4150-bfef-06a4e7c66cf4/058224ec-791b-4373-b006-aa4d6e2f7283'/> <backingStore/> </backingStore> <mirror type='block' job='copy' ready='yes'> <format type='qcow2'/> <source dev='/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/6d8fa77d-9b67-4150-bfef-06a4e7c66cf4/ba76fbfc-3cb8-4876-9ec6-6173d63772e2'/> </mirror> <target dev='sda' bus='scsi'/> <serial>6d8fa77d-9b67-4150-bfef-06a4e7c66cf4</serial> <boot order='1'/> <alias name='scsi0-0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> <controller type='usb' index='0' model='piix3-uhci'> <alias name='usb'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> </controller> <controller type='scsi' index='0' model='virtio-scsi'> <alias name='scsi0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </controller> <controller type='virtio-serial' index='0' ports='16'> <alias name='virtio-serial0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </controller> <controller type='pci' index='0' model='pci-root'> <alias name='pci.0'/> </controller> 
<controller type='ide' index='0'> <alias name='ide'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> </controller> <interface type='bridge'> <mac address='00:00:00:ff:41:09'/> <source bridge='ovirtmgmt'/> <target dev='vnet0'/> <model type='virtio'/> <filterref filter='vdsm-no-mac-spoofing'/> <link state='up'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/channels/585d96ca-49d7-4ffc-800b-c24990349383.com.redhat.rhevm.vdsm'/> <target type='virtio' name='com.redhat.rhevm.vdsm' state='disconnected'/> <alias name='channel0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel> <channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/channels/585d96ca-49d7-4ffc-800b-c24990349383.org.qemu.guest_agent.0'/> <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/> <alias name='channel1'/> <address type='virtio-serial' controller='0' bus='0' port='2'/> </channel> <channel type='spicevmc'> <target type='virtio' name='com.redhat.spice.0' state='disconnected'/> <alias name='channel2'/> <address type='virtio-serial' controller='0' bus='0' port='3'/> </channel> <input type='mouse' bus='ps2'> <alias name='input0'/> </input> <input type='keyboard' bus='ps2'> <alias name='input1'/> </input> <graphics type='spice' tlsPort='5900' autoport='yes' listen='10.35.0.110' defaultMode='secure' passwdValidTo='1970-01-01T00:00:01'> <listen type='network' address='10.35.0.110' network='vdsm-ovirtmgmt'/> <channel name='main' mode='secure'/> <channel name='display' mode='secure'/> <channel name='inputs' mode='secure'/> <channel name='cursor' mode='secure'/> <channel name='playback' mode='secure'/> <channel name='record' mode='secure'/> <channel name='smartcard' mode='secure'/> <channel name='usbredir' mode='secure'/> </graphics> <video> <model type='qxl' ram='65536' vram='8192' 
vgamem='16384' heads='1' primary='yes'/> <alias name='video0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video> <memballoon model='virtio'> <alias name='balloon0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </memballoon> <rng model='virtio'> <backend model='random'>/dev/urandom</backend> <alias name='rng0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </rng> </devices> <seclabel type='dynamic' model='selinux' relabel='yes'> <label>system_u:system_r:svirt_t:s0:c354,c582</label> <imagelabel>system_u:object_r:svirt_image_t:s0:c354,c582</imagelabel> </seclabel> <seclabel type='dynamic' model='dac' relabel='yes'> <label>+107:+107</label> <imagelabel>+107:+107</imagelabel> </seclabel> </domain> virsh # domstats --backing 2 Domain: 'bz1470634' state.state=3 state.reason=1 cpu.time=4324202072 cpu.user=150000000 cpu.system=260000000 balloon.current=1048576 balloon.maximum=1048576 balloon.last-update=0 balloon.rss=36836 vcpu.current=2 vcpu.maximum=16 vcpu.0.state=1 vcpu.0.time=3000000000 vcpu.0.wait=0 vcpu.0.halted=yes vcpu.1.state=1 vcpu.1.time=0 vcpu.1.wait=0 vcpu.1.halted=yes net.count=1 net.0.name=vnet0 net.0.rx.bytes=230 net.0.rx.pkts=3 net.0.rx.errs=0 net.0.rx.drop=901 net.0.tx.bytes=0 net.0.tx.pkts=0 net.0.tx.errs=0 net.0.tx.drop=0 block.count=3 block.0.name=hdc block.0.rd.reqs=0 block.0.rd.bytes=0 block.0.rd.times=0 block.0.wr.reqs=0 block.0.wr.bytes=0 block.0.wr.times=0 block.0.fl.reqs=0 block.0.fl.times=0 block.0.allocation=0 block.0.physical=0 block.1.name=sda block.1.path=/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/6d8fa77d-9b67-4150-bfef-06a4e7c66cf4/ba76fbfc-3cb8-4876-9ec6-6173d63772e2 block.1.rd.reqs=1 block.1.rd.bytes=512 block.1.rd.times=1078839 block.1.wr.reqs=0 block.1.wr.bytes=0 block.1.wr.times=0 block.1.fl.reqs=0 block.1.fl.times=0 block.1.allocation=0 block.1.capacity=21474836480 block.1.physical=21474836480 <<<< this is 
wrong, this is a 1g lv on the destination

block.2.name=sda block.2.path=/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/6d8fa77d-9b67-4150-bfef-06a4e7c66cf4/058224ec-791b-4373-b006-aa4d6e2f7283 block.2.backingIndex=1 block.2.rd.reqs=0 block.2.rd.bytes=0 block.2.rd.times=0 block.2.wr.reqs=0 block.2.wr.bytes=0 block.2.wr.times=0 block.2.fl.reqs=0 block.2.fl.times=0 block.2.allocation=0 block.2.capacity=21474836480 block.2.physical=21474836480

virsh # domblkinfo 2 sda
Capacity: 21474836480
Allocation: 21474836480
Physical: 3221225472

virsh # list
 Id Name State
----------------------------------------------------
 2 bz1470634 paused

Looks like qemu returns a wrong physical value (physical == capacity) during block copy. We assumed that this code was correct because physical is always zero for block devices:

/* Unlike GetStatsBlock, this API has defined the expected return values
 * for allocation and physical slightly differently.
 *
 * Having a zero for either or if they're the same is an indication that
 * there's a sparse file backing this device. In this case, we'll force
 * the setting of physical based on the on disk file size.
 *
 * Additionally, if qemu hasn't written to the file yet, then set the
 * allocation to whatever qemu returned for physical (e.g. the "actual-
 * size" from the json query) as that will match the expected allocation
 * value for this API. NB: May still be 0 for block. */
if (entry->physical == 0 || info->allocation == 0 ||
    info->allocation == entry->physical) {
    if (info->allocation == 0)
        info->allocation = entry->physical;

But if allocation is zero and physical is not, we are overriding the allocation value.
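The override in the quoted snippet can be modeled as a pure function to show how a raw block device ends up reporting allocation == capacity. This is a simplified sketch of the quoted logic, not libvirt code; the numbers are from the domblkinfo output in this bug:

```python
def blockinfo_fixup(allocation, physical):
    """Simplified model of the quoted qemuDomainGetBlockInfo adjustment.

    For a sparse file, allocation/physical of 0 is expected and the
    override is harmless.  For a raw block device, qemu can report
    allocation (wr_highest_offset) as 0 while physical equals the full
    device size, so the override inflates allocation to that size.
    """
    if physical == 0 or allocation == 0 or allocation == physical:
        if allocation == 0:
            # This is the problematic assignment for block devices.
            allocation = physical
    return allocation


capacity = 21474836480  # 20G virtual size, as reported by qemu
# Nothing written yet (allocation 0), physical reported as capacity:
print(blockinfo_fixup(allocation=0, physical=capacity))  # 21474836480
```

A normally-reporting disk (nonzero allocation different from physical) passes through the condition untouched, which is why the bug only shows up in the blockCopy window where qemu's values degenerate.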
I think the safe way to make the API happy without breaking backward compatibility is:

if (entry->physical == 0 || info->allocation == 0 ||
    info->allocation == entry->physical) {
    if (virStorageSourceGetActualType(disk->src) == VIR_STORAGE_TYPE_FILE &&
        info->allocation == 0)
        info->allocation = entry->physical;

Created attachment 1297854 [details]
Vdsm and libvirt debug logs for run 2
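The guarded check proposed in the comment above (only borrow physical for the allocation value when the source is file-backed) can be modeled the same way. A sketch under the assumption that `is_file` stands in for the `virStorageSourceGetActualType(disk->src) == VIR_STORAGE_TYPE_FILE` test:

```python
def blockinfo_fixup_guarded(allocation, physical, is_file):
    """Model of the proposed fix: restrict the allocation override to
    file-backed disks, so a raw block device keeps its real (zero)
    allocation instead of inheriting the device size."""
    if physical == 0 or allocation == 0 or allocation == physical:
        if is_file and allocation == 0:
            allocation = physical
    return allocation


size = 21474836480
# File-backed sparse image: behaviour unchanged, allocation = physical.
print(blockinfo_fixup_guarded(0, size, is_file=True))   # 21474836480
# Block-backed LV during blockCopy: allocation stays 0.
print(blockinfo_fixup_guarded(0, size, is_file=False))  # 0
```

The trade-off discussed in the thread applies here too: if qemu later starts reporting a sane physical value for block devices, this guard keeps the old (zero) allocation rather than the corrected one.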
Reproduced also with vdsm 4.2, using persistent domains; see comment 4.

While I can understand your assertion and I'm not against it per se, adjusting the code based on a bug or bad value received from a lower layer could mean that, once that bug is fixed, the correct value won't be displayed because of the new layer of protection added to cope with the bad data. The usage of ActualType in the previous check (a few lines above, while managing the wr_highest_offset value) was a result of commit id 'cf436a56'. So what does "qemu-img info --backing-chain" say about this source path at this point in time? Does it display the correct or incorrect values?

(In reply to John Ferlan from comment #16)

I agree that hiding a bad value from qemu is a bad idea, but I suspect that the issue is in libvirt's handling of the values from qemu. In the libvirt log you can see that qemu returns "actual-size": 0. The wrong value 21474836480 appears in the qemu response only in "virtual-size". My theory is:

1. qemu returns "virtual-size": 21474836480, "actual-size": 0
2. the libvirt code preparing a stats entry changes actual-size to the value of virtual-size - but only after starting blockCopy.
3. the code trying to "fix" allocation, assuming that physical is always 0 for block devices, sets allocation to physical.

And regarding https://bugzilla.redhat.com/show_bug.cgi?id=1467826#c29, we cannot use domstats because it returns a wrong physical value. We need both allocation and physical to extend volumes when they become full. We basically do:

    if physical - alloc < drive.watermarkLimit:
        extend the drive...

This must work, even when we will use the new libvirt threshold events to minimize polling.

Regarding your question about actual lv sizes in this state, here is another run - this time with a file based disk (this disk was moved from block storage while the vm was running).

virsh # dumpxml 4 ...
<disk type='file' device='disk' snapshot='no'> <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/> <source file='/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_40/d6e4a622-bd31-4d8f-904d-1e26b7286757/images/a5930327-05bc-4b26-b1b5-5a09c720039e/5f9cfbe2-8ced-4e25-bda8-cbec2b3f1d72'/> <backingStore type='file' index='1'> <format type='raw'/> <source file='/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_40/d6e4a622-bd31-4d8f-904d-1e26b7286757/images/a5930327-05bc-4b26-b1b5-5a09c720039e/25df7f6e-f5de-4dac-8eb8-2f7b991d4bdb'/> <backingStore/> </backingStore> <mirror type='block' job='copy' ready='yes'> <format type='qcow2'/> <source dev='/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/a5930327-05bc-4b26-b1b5-5a09c720039e/5f9cfbe2-8ced-4e25-bda8-cbec2b3f1d72'/> </mirror> <target dev='sda' bus='scsi'/> <serial>a5930327-05bc-4b26-b1b5-5a09c720039e</serial> <boot order='1'/> <alias name='scsi0-0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> virsh # domstats 4 --block --backing Domain: 'bz1467826' block.count=3 block.0.name=hdc block.0.rd.reqs=0 block.0.rd.bytes=0 block.0.rd.times=0 block.0.wr.reqs=0 block.0.wr.bytes=0 block.0.wr.times=0 block.0.fl.reqs=0 block.0.fl.times=0 block.0.allocation=0 block.0.physical=0 block.1.name=sda block.1.path=/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_40/d6e4a622-bd31-4d8f-904d-1e26b7286757/images/a5930327-05bc-4b26-b1b5-5a09c720039e/5f9cfbe2-8ced-4e25-bda8-cbec2b3f1d72 block.1.rd.reqs=1 block.1.rd.bytes=512 block.1.rd.times=827287 block.1.wr.reqs=0 block.1.wr.bytes=0 block.1.wr.times=0 block.1.fl.reqs=0 block.1.fl.times=0 block.1.allocation=0 block.1.capacity=21474836480 block.1.physical=21474836480 <<< wrong value block.2.name=sda block.2.path=/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_40/d6e4a622-bd31-4d8f-904d-1e26b7286757/images/a5930327-05bc-4b26-b1b5-5a09c720039e/25df7f6e-f5de-4dac-8eb8-2f7b991d4bdb block.2.backingIndex=1 
block.2.rd.reqs=0 block.2.rd.bytes=0 block.2.rd.times=0 block.2.wr.reqs=0 block.2.wr.bytes=0 block.2.wr.times=0 block.2.fl.reqs=0 block.2.fl.times=0 block.2.allocation=0 block.2.capacity=21474836480 block.2.physical=200704

virsh # domblkinfo 4 sda
Capacity: 21474836480
Allocation: 21474836480 <<< wrong, see qemu-img check result below
Physical: 196928

# qemu-img info --backing /rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_40/d6e4a622-bd31-4d8f-904d-1e26b7286757/images/a5930327-05bc-4b26-b1b5-5a09c720039e/5f9cfbe2-8ced-4e25-bda8-cbec2b3f1d72
image: /rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_40/d6e4a622-bd31-4d8f-904d-1e26b7286757/images/a5930327-05bc-4b26-b1b5-5a09c720039e/5f9cfbe2-8ced-4e25-bda8-cbec2b3f1d72
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 196K
cluster_size: 65536
backing file: 25df7f6e-f5de-4dac-8eb8-2f7b991d4bdb (actual path: /rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_40/d6e4a622-bd31-4d8f-904d-1e26b7286757/images/a5930327-05bc-4b26-b1b5-5a09c720039e/25df7f6e-f5de-4dac-8eb8-2f7b991d4bdb)
backing file format: raw
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_40/d6e4a622-bd31-4d8f-904d-1e26b7286757/images/a5930327-05bc-4b26-b1b5-5a09c720039e/25df7f6e-f5de-4dac-8eb8-2f7b991d4bdb
file format: raw
virtual size: 20G (21474836480 bytes)
disk size: 0

# qemu-img check /rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_40/d6e4a622-bd31-4d8f-904d-1e26b7286757/images/a5930327-05bc-4b26-b1b5-5a09c720039e/5f9cfbe2-8ced-4e25-bda8-cbec2b3f1d72
No errors were found on the image.
Image end offset: 262144

I will send the same output for the block disk based flow later.

Here is the same flow with a block based disk:

1. Starting the vm:

virsh # dumpxml 5 ...
<disk type='block' device='disk' snapshot='no'> <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/> <source dev='/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7'/> <backingStore/> <target dev='sda' bus='scsi'/> <serial>31cb9434-0511-4519-aa22-d35869cd7fa2</serial> <boot order='1'/> <alias name='scsi0-0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> # qemu-img info --backing /rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7 image: /rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7 file format: raw virtual size: 20G (21474836480 bytes) disk size: 0 virsh # domblkinfo 5 sda Capacity: 21474836480 Allocation: 0 Physical: 21474836480 virsh # domstats 5 --block --backing Domain: 'bz1467826' block.count=2 block.0.name=hdc block.0.rd.reqs=0 block.0.rd.bytes=0 block.0.rd.times=0 block.0.wr.reqs=0 block.0.wr.bytes=0 block.0.wr.times=0 block.0.fl.reqs=0 block.0.fl.times=0 block.0.allocation=0 block.0.physical=0 block.1.name=sda block.1.path=/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7 block.1.rd.reqs=1 block.1.rd.bytes=512 block.1.rd.times=958460 block.1.wr.reqs=0 block.1.wr.bytes=0 block.1.wr.times=0 block.1.fl.reqs=0 block.1.fl.times=0 block.1.allocation=0 block.1.capacity=21474836480 block.1.physical=21474836480 2. Starting live storage migration in ovirt to another block storage The first step in live storage migration is creating a snapshot, blockCopy was not called yet... virsh # dumpxml 5 ... 
<disk type='block' device='disk' snapshot='no'> <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/> <source dev='/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677'/> <backingStore type='block' index='1'> <format type='raw'/> <source dev='/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7'/> <backingStore/> </backingStore> <target dev='sda' bus='scsi'/> <serial>31cb9434-0511-4519-aa22-d35869cd7fa2</serial> <boot order='1'/> <alias name='scsi0-0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> After blockCopy was called, the vm paused: virsh # list Id Name State ---------------------------------------------------- 5 bz1467826 paused virsh # dumpxml 5 ... <disk type='block' device='disk' snapshot='no'> <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/> <source dev='/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677'/> <backingStore type='block' index='1'> <format type='raw'/> <source dev='/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7'/> <backingStore/> </backingStore> <mirror type='block' job='copy' ready='yes'> <format type='qcow2'/> <source dev='/rhev/data-center/mnt/blockSD/6ffbc483-0031-403a-819b-3bb2f0f8de0a/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677'/> </mirror> <target dev='sda' bus='scsi'/> <serial>31cb9434-0511-4519-aa22-d35869cd7fa2</serial> <boot order='1'/> <alias name='scsi0-0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> virsh # domblkinfo 5 sda Capacity: 21474836480 Allocation: 21474836480 << wrong 
Physical: 3221225472 virsh # domstats 5 --block --backing Domain: 'bz1467826' block.count=3 block.0.name=hdc block.0.rd.reqs=0 block.0.rd.bytes=0 block.0.rd.times=0 block.0.wr.reqs=0 block.0.wr.bytes=0 block.0.wr.times=0 block.0.fl.reqs=0 block.0.fl.times=0 block.0.allocation=0 block.0.physical=0 block.1.name=sda block.1.path=/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677 block.1.rd.reqs=1 block.1.rd.bytes=512 block.1.rd.times=958460 block.1.wr.reqs=0 block.1.wr.bytes=0 block.1.wr.times=0 block.1.fl.reqs=0 block.1.fl.times=0 block.1.allocation=0 block.1.capacity=21474836480 block.1.physical=21474836480 <<< wrong block.2.name=sda block.2.path=/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7 block.2.backingIndex=1 block.2.rd.reqs=0 block.2.rd.bytes=0 block.2.rd.times=0 block.2.wr.reqs=0 block.2.wr.bytes=0 block.2.wr.times=0 block.2.fl.reqs=0 block.2.fl.times=0 block.2.allocation=0 block.2.capacity=21474836480 block.2.physical=21474836480 # qemu-img info --backing /rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677 image: /rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677 file format: qcow2 virtual size: 20G (21474836480 bytes) disk size: 0 cluster_size: 65536 backing file: cd04fdaf-0aff-4720-930a-e2d24ed624a7 (actual path: /rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7) backing file format: raw Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 corrupt: false image: 
/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7 file format: raw virtual size: 20G (21474836480 bytes) disk size: 0 # qemu-img check /rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677 No errors were found on the image. Image end offset: 262144 (Adding missing info for comment 18) The snapshot lv was already removed - we delete the snapshot after live storage migration completes. Here are logs from vdsm showing the real size of the lv: Creating 1g lv: 2017-07-14 19:44:09,656+0300 INFO (tasks/3) [storage.LVM] Creating LV (vg=aed577ea-d1ca-4ebe-af80-f852c7ce59bb, lv=0838b0fe-52a5-44b3-9e5f-0104d03e9677, size=1024m, activate=True, contiguous=False, initialTags=('OVIRT_VOL_INITIALIZING',)) (lvm:1147) After taking a snapshot, using now this lv: 2017-07-14 19:44:25,859+0300 DEBUG (jsonrpc/5) [virt.vm] (vmId='c6ba6960-bf68-47fd-8ba2-b3543bef1263') Drive sda moved from u'/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/cd04fdaf-0aff-4720-930a-e2d24ed624a7' to u'/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677' (storage:358) Response showing the lv size: 2017-07-14 19:44:26,030+0300 INFO (jsonrpc/3) [vdsm.api] FINISH getVolumeInfo return={'info': {'status': 'OK', 'domain': 'aed577ea-d1ca-4ebe-af80-f852c7ce59bb', 'voltype': 'LEAF', 'description': '', 'parent': 'cd04fdaf-0aff-4720-930a-e2d24ed624a7', 'format': 'COW', 'generation': 0, 'image': '31cb9434-0511-4519-aa22-d35869cd7fa2', 'ctime': '1500050655', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '', 'capacity': '21474836480', 'uuid': u'0838b0fe-52a5-44b3-9e5f-0104d03e9677', 'truesize': '1073741824', 'type': 
'SPARSE', 'lease': {'owners': [], 'version': None}}} from=::ffff:10.35.0.111,39084, flow_id=d2e9931d-b813-4dcc-a9e3-55ad65ea6a89, task_id=994e6092-0b27-4a5e-a548-c335b0dd328c (api:52)

NOTE: 'apparentsize': '1073741824'

Starting blockCopy:

2017-07-14 19:44:33,978+0300 DEBUG (jsonrpc/2) [virt.vm] (vmId='c6ba6960-bf68-47fd-8ba2-b3543bef1263') Replicating drive sda to <?xml version='1.0' encoding='UTF-8'?> <disk device="disk" snapshot="no" type="block"><source dev="/rhev/data-center/mnt/blockSD/6ffbc483-0031-403a-819b-3bb2f0f8de0a/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677" /><driver cache="none" error_policy="stop" io="native" name="qemu" type="qcow2" /></disk> (vm:4186)

Receiving block job VIR_DOMAIN_BLOCK_JOB_READY for this block copy:

2017-07-14 19:44:33,993+0300 DEBUG (libvirt/events) [virt.vm] (vmId='c6ba6960-bf68-47fd-8ba2-b3543bef1263') unhandled libvirt event (event_name=Unknown id 8, args=('/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/31cb9434-0511-4519-aa22-d35869cd7fa2/0838b0fe-52a5-44b3-9e5f-0104d03e9677', 2, 3, 8)) (clientIF:559)

NOTE: we don't use block job events yet

We receive a bad allocation value from libvirt and pause the vm:

2017-07-14 19:44:34,887+0300 ERROR (periodic/5) [virt.vm] (vmId='c6ba6960-bf68-47fd-8ba2-b3543bef1263') Improbable extension request for volume 0838b0fe-52a5-44b3-9e5f-0104d03e9677 on domain aed577ea-d1ca-4ebe-af80-f852c7ce59bb, pausing the VM to avoid corruptions (capacity: 21474836480, allocated: 21474836480, physical: 1073741824, next physical size: 3221225472) (vm:1084)

Suffice to say there are some moving parts in all this - qemu changing, libvirt changing, and rhv changing. Now you're pointing out different types of storage rhv uses, and suffice to say there's a lot of data to chomp on and digest. That's going to take some time and may require even more data.
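The guard that produced the "Improbable extension request" log line above can be sketched roughly as follows (a hypothetical simplification for illustration - the chunk size and function name are assumptions, not vdsm's actual code; the numbers come from the log):

```python
CHUNK_SIZE = 1024**3  # assumed 1 GiB extension chunk

def check_extension(capacity, allocation, physical):
    """Decide what to do with a thin-provisioned lv.

    qemu reports allocation as the highest written offset. If that is
    larger than the lv's physical size, the value cannot be trusted
    (writes past the end of the lv would mean corruption), so the only
    safe reaction is to pause the VM instead of extending.
    """
    next_size = min(capacity, physical + 2 * CHUNK_SIZE)
    if allocation > physical:
        return "pause", next_size
    if physical - allocation < CHUNK_SIZE // 2:
        return "extend", next_size
    return "ok", physical

# The values from the log: capacity=20G, bogus allocation == capacity,
# physical=1G -> pause, with next physical size 3221225472 (3G)
action, next_size = check_extension(21474836480, 21474836480, 1073741824)
```

With the bogus allocation == capacity reported during blockcopy, this kind of check has no safe option left but to pause the VM.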
First off, GetBlockInfo was designed long before qcow2 container files, sparse lvs/files, and multi-level backingStore chains became as complicated as they are today. Using GetBlockInfo is becoming more and more troublesome because it was never designed to provide the depth of detail that domain block stats can provide, nor the level of detail rhv appears to need. GetBlockInfo massages @allocation based on some historical factors, but DomainStats does not massage allocation - it returns what QEMU gives it. To add to the woes, if the target hasn't been written, qemu will have no allocation or "wr_highest_offset" value, so in order to "handle" that libvirt does more detection. For rhv purposes that detection is bad/wrong; however, for someone else that detection may be right. It's a proverbial rock and hard place.

In any case, to lay the ground work, for domain stats:

capacity == virtual-size
physical == actual-size
allocation == wr_highest_offset

If 'actual-size' is missing from the returned qemu data, then physical = capacity because the image is not thin provisioned (see qemuMonitorJSONBlockStatsUpdateCapacityOne). Oh, and this is the same algorithm used by the BlkInfo code to at least get the qemu values.

Furthermore, domain stats can massage the returned physical value in qemuDomainGetStatsOneBlock:

    if (entry->physical) {
        QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx,
                                 "physical", entry->physical);
    } else {
        if (qemuDomainStorageUpdatePhysical(driver, cfg, dom, src) == 0) {
            QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx,
                                     "physical", src->physical);
        } else {
            virResetLastError();
        }
    }

So, if "actual-size == 0" (like you show at the top of comment 17), then the stats code will go off and change that, which you're calling wrong, but I need to let Peter answer since he wrote that Stats code. I have a feeling it's because a physical size == 0 makes zero sense.
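The fallback rule described above ("if 'actual-size' is missing, physical = capacity") can be illustrated with a small sketch (an assumed simplification of qemuMonitorJSONBlockStatsUpdateCapacityOne, not the actual libvirt code):

```python
def capacity_from_image(image, bstats):
    """Fill capacity/physical from a query-block 'image' dict.

    Mirrors the rule described above: capacity comes from
    'virtual-size'; if 'actual-size' is absent, the image is assumed
    not to be thin provisioned, so physical falls back to capacity.
    """
    bstats["capacity"] = image["virtual-size"]
    # missing actual-size -> physical = capacity
    bstats["physical"] = image.get("actual-size", bstats["capacity"])
    return bstats

# An image that reports no actual-size ends up with
# physical == capacity -- the 21474836480 value seen in the stats
stats = capacity_from_image({"virtual-size": 21474836480}, {})
```

This is exactly the path that turns a missing actual-size into the suspicious physical == capacity numbers reported in the stats output above.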
Furthermore, without the qemu data I cannot know for sure if actual-size == 0 or 20G in this case.

Not that I want more data, but you're only showing a partial picture of all the data at each step, so it's really hard to put together a logically based response. The data you show is from after the copy/mirror starts, so for what you say is wrong I can only show from the code why those values are the way they are.

Based on what you show in comment 17 (cut-n-paste for brevity) here's what I can deduce:

<disk type='file' device='disk' snapshot='no'>
<driver name='qemu' type='qcow2' .../>
<source file='*images*f1d72'/>
<backingStore type='file' index='1'>
<format type='raw'/>
<source file='*images*d4bdb'/>
<backingStore/>
</backingStore>
<mirror type='block' job='copy' ready='yes'>
<format type='qcow2'/>
<source dev='*images*f1d72'/>
</mirror>

and

block.1.name=sda
block.1.path=*images*f1d72
block.1.allocation=0
block.1.capacity=21474836480
block.1.physical=21474836480 <<< wrong value
block.2.name=sda
block.2.path=*images/*d4bdb
block.2.backingIndex=1
block.2.allocation=0
block.2.capacity=21474836480
block.2.physical=200704

and

virsh # domblkinfo 4 sda
Capacity: 21474836480
Allocation: 21474836480 <<< wrong, see qemu-img check result below
Physical: 196928

and

qemu-img info
virtual size: 20G (21474836480 bytes) [== domblkinfo capacity]
disk size: 196K [== domblkinfo physical]

IIRC, the qemu-img data shows "disk size" of 196K for that 'raw' area as a result of a call to bdrv_getlength(bs) from block/qapi.c - bdrv_query_image_info() which, if I'm reading things right, will call raw_getlength(), which I chased once and ended up returning essentially the stat() st_size value that the libvirt call to virStorageSourceUpdatePhysicalSize will fetch for a TYPE_FILE. For the 'block' area it returns 20G for capacity and 0 (zero) for disk size. I tried following that code once, but got lost quickly.
For block.1 (mirror type='block'), it would seem to have actual-size == 0 (it's a block device). This results in a call to qemuDomainStorageUpdatePhysical. For a block device that's an lseek(). For block.2 (format type='raw'), it seems actual-size = 200704. Due to how the algorithm works keying off the block name, GetBlockInfo will use the "last" qemu data returned, which seems to be block.1, but only because if it was block.2, then I'd expect to see 200704 in the allocation field (but I'm also in data overload, so I may have read something wrong). It may also be a case of 'actual-size' not being present resulting in physical = capacity (which also changes the answer a bit to block.1 above which won't make that call). I really think I'd need to see the qemu data to be sure though. On to comment 18 or the 'block' example. Here we have: <disk type='block' device='disk' snapshot='no'> <driver name='qemu' type='raw' .../> <source dev='*mnt/blockSD*624a7'/> ... <backingStore/> ... [NB: source type = block and source format = raw are different than previous example. So are the paths obviously. I'm assuming the former was somehow created as a qcow2 image and the latter is a device backed to some raw file.] and block.1.name=sda block.1.allocation=0 block.1.capacity=21474836480 block.1.physical=21474836480 and GetBlockInfo displays: Allocation: 0 This is deemed OK and right, correct? In this case, what's interesting is it would seem 'actual-size == 0'. In that case, the stats would change the physical to be the call to qemuDomainStorageUpdatePhysical. For the BlkInfo, allocation is set to actual-size (which is 0 for the block device) and it too calls the same function for physical. So far, so good. Then you do your migration and blockCopy and we have: <disk type='block' device='disk' snapshot='no'> <driver name='qemu' type='qcow2' ... /> <source dev='*mnt/blockSD*e9677'/> ... 
<backingStore type='block' index='1'>
<format type='raw'/>
<source dev='*mnt/blockSD*624a7'/> <== e.g. same as "source"
...
<backingStore/>
</backingStore>
<mirror type='block' job='copy' ready='yes'>
<format type='qcow2'/>
<source dev='*/mnt/blockSD/*e9677'/>
...
</mirror>

and

block.1.name=sda
block.1.path=*/mnt/blockSD/*e9677
block.1.allocation=0
block.1.capacity=21474836480
block.1.physical=21474836480 <<< wrong
...
block.2.name=sda
block.2.path=*/mnt/blockSD/*624a7
block.2.backingIndex=1
block.2.allocation=0
block.2.capacity=21474836480
block.2.physical=21474836480
...

and domblkinfo 5 sda

Capacity: 21474836480
Allocation: 21474836480 << wrong
Physical: 3221225472

and qemu-img info

virtual size: 20G (21474836480 bytes)
disk size: 0

In this case for type = block, qemu-img takes a different path to get the "disk size:" output, but reading that code would take someone who understands it better from the qemu side, as I don't think I could do it justice! I think the display of 0 just means it cannot get to it because of being a block device.

As for the libvirt side...

For block.1 (mirror type=block) actual-size must be 0 upon return (it's block, right?) so we go ask virStorageSourceUpdatePhysicalSize. That will return an lseek() to the end of the device (the standard way to get the size of a block device), hence the value. For block.2 (format type = raw) - I'm assuming the actual-size is returned as 20G.

For BlkInfo, if I'm reading the code right, 'entry->' values get assigned to the *last* 'sda' found in the returned qemu data - IOW block.2 - and that's what BlkInfo uses for the basis of its output. So that means Capacity = 20G, actual-size = 20G, allocation = 0. With these settings, allocation = actual-size and physical = the return from virStorageSourceUpdatePhysicalSize on a raw device (e.g. the result of a stat() 'st_size').
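The two lookups mentioned here - stat() st_size for a regular file, lseek() to the end for a block device - can be sketched like this (an assumed simplification of virStorageSourceUpdatePhysicalSize, for illustration only):

```python
import os
import stat

def update_physical_size(path):
    """Return the physical size of a storage source.

    For a block device, seek to the end of the device to get its
    length (this is where the lv sizes above come from); for a
    regular file, use the stat() st_size value.
    """
    st = os.stat(path)
    if stat.S_ISBLK(st.st_mode):
        fd = os.open(path, os.O_RDONLY)
        try:
            # lseek(SEEK_END) is the standard way to size a block device
            return os.lseek(fd, 0, os.SEEK_END)
        finally:
            os.close(fd)
    return st.st_size
```

Either way the value is taken from the host, not from qemu, which is why it can disagree with what query-block reports for the same image.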
So again, based upon the algorithms - it appears libvirt is providing the data; however, without having the actual qemu data for each step it's only a somewhat educated assumption. So prior to perhaps 3 or 4 different changes within the BlkInfo space I made over 6-8 months ago - it was deemed libvirt was displaying the wrong values. I cannot speak to the stats implementation as Peter Krempa has a better handle on that. For me this all started as a result of a question in an upstream list which sent me down the rabbit hole into a twisty maze of passages all alike with the sound of a babbling brook to the west (it's an old computer game reference). After much research and digging, changes were proposed and pushed upstream. But those changes now conflict with the rhv expectations and usage. So again, libvirt is stuck between a rock and hard place. Hopefully I've done the data presented thus far some justice. At this point, I don't know what's right or wrong. Unfortunately more analysis and information will be required. Whether that can be done in 3 days, who knows. I wonder though if those cases that are "working" sometimes for you are a result of qemu data being returned in a different order thus not hitting the allocation == capacity issue. That is the qemu data returned that gets parsed during JSON parsing of the returned hunk of data that uses virHashLookup. My assumption has been that data gets keyed off the "sda" name and since each block.#.name is the same for each of the backingStores, what BlockInfo gets for data depends highly on the order of data returned I would think. Ok - looked too long at this - need a fresh day and/or set of eyes. (In reply to John Ferlan from comment #23) > If 'actual-size' is missing from the returned qemu data, then physical = > capacity because the image is not thin provisioned (see > qemuMonitorJSONBlockStatsUpdateCapacityOne). Oh and this is the same > algorithm used by the BlkInfo code to at least get the qemu values. 
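The ordering hazard described above is easy to demonstrate: if the parsed blockstats entries are stored in a mapping keyed only by the device name, every backing entry named 'sda' silently overwrites the previous one (a sketch of the suspected failure mode, not libvirt's actual parser):

```python
def index_by_name(entries):
    """Key query-blockstats entries by device name only.

    With a backing chain, the top image and every backing image all
    report the same device name ('sda'), so whichever entry qemu
    happens to return last wins -- the result depends on output order.
    """
    table = {}
    for entry in entries:
        table[entry["name"]] = entry  # duplicate names clobber earlier ones
    return table

entries = [
    {"name": "sda", "path": "top.qcow2", "physical": 196928},
    {"name": "sda", "path": "base.raw", "physical": 21474836480},
]
info = index_by_name(entries)
# a lookup for 'sda' now returns the *backing* image's numbers
```

If qemu returned the entries in the opposite order, the lookup would yield the top image instead, which would explain why the reported values sometimes look right and sometimes don't.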
Looking at the libvirt debug logs, there is no actual-size in the results of "query-blockstats", but it is always present in the results of "query-block".

> Furthermore, domain stats can massage the returned physical value in
> qemuDomainGetStatsOneBlock:
>
>     if (entry->physical) {
>         QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx,
>                                  "physical", entry->physical);
>     } else {
>         if (qemuDomainStorageUpdatePhysical(driver, cfg, dom, src) == 0) {
>             QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx,
>                                      "physical", src->physical);
>         } else {
>             virResetLastError();
>         }
>     }
>
> So, if "actual-size == 0" (like you show at the top of comment 17), then the
> stats code will go off and change that, which you're calling wrong,

Adding the correct physical value instead of 0 is not wrong, it is useful.

> but I
> need to let Peter answer since he wrote that Stats code. I have a feeling
> it's because a physical size == 0 makes zero sense. Furthermore, without the
> qemu data I cannot know for sure if actual-size == 0 or 20G in this case.

Massaging the allocation value cannot be right; we need the raw value from qemu without change. The actual size in this case is 1g.

> Not that I want more data, but you're only showing a partial picture of all
> the data at each step, so it's really hard to put together a logically based
> response. The data you show is after the copy/mirror starts so what you say
> is wrong I can only show from the code why those values are the way they are.
> > Based on what you show in comment 17 (cut-n-paste for brevity) here's what I > can deduce: > > <disk type='file' device='disk' snapshot='no'> > <driver name='qemu' type='qcow2' .../> > <source file='*images*f1d72'/> > <backingStore type='file' index='1'> > <format type='raw'/> > <source file='*images*d4bdb'/> > <backingStore/> > </backingStore> > <mirror type='block' job='copy' ready='yes'> > <format type='qcow2'/> > <source dev='*images*f1d72'/> > </mirror> > > and > > block.1.name=sda > block.1.path=*images*f1d72 > block.1.allocation=0 > block.1.capacity=21474836480 > block.1.physical=21474836480 <<< wrong value > block.2.name=sda > block.2.path=*images/*d4bdb > block.2.backingIndex=1 > block.2.allocation=0 > block.2.capacity=21474836480 > block.2.physical=200704 > > and > > virsh # domblkinfo 4 sda > Capacity: 21474836480 > Allocation: 21474836480 <<< wrong, see qemu-img check result bellow > Physical: 196928 > > and > > qemu-img info > > virtual size: 20G (21474836480 bytes) [== domblkinfo capacity] > disk size: 196K [== domblkinfo physical] > > IIRC, the qemu-img data shows "disk size" of 196K for that 'raw' area as a > result of a call to bdrv_getlength(bs) from block/qapi.c - > bdrv_query_image_info() which if I'm reading things right will call > raw_getlength() which I chased once and ended up returning essentially the > stat() st_size value that the libivrt call to > virStorageSourceUpdatePhysicalSize will fetch for a TYPE_FILE. For the > 'block' area it returns 20G for capacity and 0 (zero) for disk size. I tried > following that code once, but got lost quickly. 
Wait, block.1 is a qcow2 file:

> block.1.name=sda
> block.1.path=*images*f1d72
> block.1.allocation=0
> block.1.capacity=21474836480
> block.1.physical=21474836480 <<< wrong value

block.2 is a raw sparse file:

> block.2.name=sda
> block.2.path=*images/*d4bdb
> block.2.backingIndex=1
> block.2.allocation=0
> block.2.capacity=21474836480
> block.2.physical=200704

The issue is showing a physical size of 21474836480 for block.1. This is an empty snapshot, and it can have no data in this case. I ran this test with an empty raw disk - no operating system installed, so no data can be written to any of the volumes.

> From the libvirt side...
>
> For block.1 (mirror type='block'), it would seem to have actual-size == 0
> (it's a block device). This results in a call to
> qemuDomainStorageUpdatePhysical. For a block device that's an lseek().

lseek on that empty snapshot should return 196928. We create the qcow2 volume on file storage using basically:

qemu-img create -f qcow2 -b backing-file file

> For
> block.2 (format type='raw'), it seems actual-size = 200704.
>
> Due to how the algorithm works keying off the block name, GetBlockInfo will
> use the "last" qemu data returned, which seems to be block.1, but only
> because if it was block.2, then I'd expect to see 200704 in the allocation
> field

Looking at the data - the physical of block.2 looks close to the right value for block.1, and the physical of block.1 looks like the right value for block.2.

> I really think I'd need to see the qemu data to be sure though.

What do you mean by qemu data? How do you suggest to extract it?

> On to comment 18 or the 'block' example.
>
> Here we have:
>
> <disk type='block' device='disk' snapshot='no'>
> <driver name='qemu' type='raw' .../>
> <source dev='*mnt/blockSD*624a7'/>
> ...
> <backingStore/>
> ...
>
> [NB: source type = block and source format = raw are different than previous
> example. So are the paths obviously.
I'm assuming the former was somehow > created as a qcow2 image and the latter is a device backed to some raw file.] This is the base volume, before we start the live storage migration. > > and > > block.1.name=sda > block.1.allocation=0 > block.1.capacity=21474836480 > block.1.physical=21474836480 > > and GetBlockInfo displays: > > Allocation: 0 > > This is deemed OK and right, correct? Right > In this case, what's interesting is it > would seem 'actual-size == 0'. In that case, the stats would change the > physical to be the call to qemuDomainStorageUpdatePhysical. For the BlkInfo, > allocation is set to actual-size (which is 0 for the block device) and it > too calls the same function for physical. So far, so good. > > > Then you do your migration and blockCopy and we have: > > <disk type='block' device='disk' snapshot='no'> > <driver name='qemu' type='qcow2' ... /> > <source dev='*mnt/blockSD*e9677'/> > ... > > <backingStore type='block' index='1'> > <format type='raw'/> > <source dev='*mnt/blockSD*624a7'/> <== e.g. same as "source" > ... > <backingStore/> > </backingStore> > <mirror type='block' job='copy' ready='yes'> > <format type='qcow2'/> > <source dev='*/mnt/blockSD/*e9677'/> > ... > </mirror> > > and > > block.1.name=sda > block.1.path=*/mnt/blockSD/*e9677 > block.1.path=*images*e9677 > block.1.allocation=0 > block.1.capacity=21474836480 > block.1.physical=21474836480 <<< wrong > ... > block.2.name=sda > block.2.path=*/mnt/blockSD/*624a7 > block.2.backingIndex=1 > block.2.allocation=0 > block.2.capacity=21474836480 > block.2.physical=21474836480 > ... 
>
> and domblkinfo 5 sda
>
> Capacity: 21474836480
> Allocation: 21474836480 << wrong
> Physical: 3221225472
>
> and qemu-img info
>
> virtual size: 20G (21474836480 bytes)
> disk size: 0
>
> In this case for type = block, qemu-img takes a different path to get the
> "disk size:" output, but reading that code would take someone that
> understands it better from the qemu side as I don't think I could do it
> justice! I think the display of 0 just means it cannot get to it because of
> being a block device.

Right, we expect this from qemu.

> As for the libvirt side...
>
> For block.1 (mirror type=block) actual-size must be 0 upon return (it's
> block, right?) so we go ask virStorageSourceUpdatePhysicalSize. That will
> return an lseek() to the end of the device (standard way to get the size of
> block device), hence the value. For Block.2 (format type = raw) - I'm
> assuming the actual-size is returned as 20G.

Right for block.2. For block.1 the size starts at 1g. After calling blockCopy, vdsm extends the lv to 3g. When the vm was paused the size was 1g; when I ran the virsh command the size was 3g.

> For BlkInfo, if I'm reading the code right 'entry->' values get assigned to
> *last* 'sda' found in the returned qemu data - IOW block.2 and that's what
> BlkInfo uses for the basis of its output. So that means Capacity = 20G,
> actual-size = 20G, allocation = 0. With these settings, allocation =
> actual-size and physical = return from virStorageSourceUpdatePhysicalSize on
> a raw device (e.g. the result of a stat() 'st_size').

Using the values from block.2 cannot work, and it does not explain how we get correct values before we start the blockCopy.

> So again, based upon the algorithms - it appears libvirt is providing the
> data; however, without having the actual qemu data for each step it's only a
> somewhat educated assumption.
Look again at the data:

> Capacity: 21474836480
> Allocation: 21474836480 << wrong
> Physical: 3221225472

How can allocation be more than physical?

I can easily collect more data in the vdsm code around the blockCopy call. Please suggest what info you need in each step of the process. For example we can do:

# run qemu-img info on drive.path...
# log output of virsh domstats, domblkinfo...
self._dom.blockCopy(drive.name, destxml, flags=flags)
# sleep a couple of seconds
# log output of qemu-img info on drive.path...
# log output of virsh domstats, domblkinfo...

Or maybe we'd better reproduce this using virsh, so it will be easier to debug on your side? If you show me how to create a snapshot from an existing file and start blockCopy using virsh, I can cook a reproducer.
setup script for creating block storage domains
This script creates 2 ovirt-like storage domains backed by loop devices.
Created attachment 1298911 [details]
reproducer script
This script reproduces the issue using the block storage domains created by setup.sh.
Created attachment 1298912 [details]
output of the reproducer script
This is the output of the reproduce.sh script, showing both qemu and libvirt info.
John, to reproduce this issue, use the scripts:

# attachment 1298910 [details]
sh setup.sh

# attachment 1298911 [details]
sh reproduce.sh

I also attached a log from my 7.4 host (attachment 1298912 [details]).

Looking at the output of query-block in the log, I think the issue is the difference between the query-block output before the blockcopy job and during the blockcopy job.

Here is cleaned up output before blockcopy:

{
    "device": "drive-scsi0-0-0-0",
    "inserted": {
        "backing_file": "/dev/src/base",
        "backing_file_depth": 1,
        "drv": "qcow2",
        "file": "/dev/src/top",
        "image": {
            "actual-size": 0,
            "backing-filename": "/dev/src/base",
            "backing-filename-format": "raw",
            "backing-image": {
                "actual-size": 0,
                "filename": "/dev/src/base",
                "format": "raw",
                "virtual-size": 4294967296
            },
            "filename": "/dev/src/top",
            "format": "qcow2",
            "full-backing-filename": "/dev/src/base",
            "virtual-size": 4294967296
        },
        "node-name": "#block119",
    },
}

And during blockcopy:

{
    "device": "drive-scsi0-0-0-0",
    "inserted": {
        "backing_file": "/dev/src/top",
        "backing_file_depth": 2,
        "drv": "mirror_top",
        "file": "/dev/src/top",
        "image": {
            "backing-filename": "/dev/src/top",
            "backing-filename-format": "qcow2",
            "backing-image": {
                "actual-size": 0,
                "backing-filename": "/dev/src/base",
                "backing-filename-format": "raw",
                "backing-image": {
                    "actual-size": 0,
                    "filename": "/dev/src/base",
                    "format": "raw",
                    "virtual-size": 4294967296
                },
                "filename": "/dev/src/top",
                "format": "qcow2",
                "full-backing-filename": "/dev/src/base",
                "virtual-size": 4294967296
            },
            "filename": "/dev/src/top",
            "format": "mirror_top",
            "full-backing-filename": "/dev/src/top",
            "virtual-size": 4294967296
        },
    },
}

During block copy we have a new node in this chain:

filename: /dev/src/top
backing-filename: /dev/src/top
format: mirror_top

This node does not have actual-size. Maybe libvirt is confused by this node, treating it as the top layer?
In this case libvirt adds a fake physical value based on the virtual size, and later this code would change the allocation:

    if (entry->physical == 0 || info->allocation == 0 ||
        info->allocation == entry->physical) {
        if (info->allocation == 0)
            info->allocation = entry->physical;

If this is the case, I think we need to:

1. fix handling of query-block during block copy.
2. avoid modifying allocation for a block device, it is always wrong.

I think we have a similar issue during block commit, when we get the wrong allocation value after the block job is finished. See https://bugzilla.redhat.com/1442266#c7 (libvirt 2.0)

Kevin, is the output of query-block during a drive-mirror block job in comment 28 expected? Was it changed recently?

FWIW: The code in qemuMonitorJSONBlockStatsUpdateCapacityOne which both the Stats and GetInfo APIs would use:

    /* if actual-size is missing, image is not thin provisioned */
    if (virJSONValueObjectGetNumberUlong(image, "actual-size",
                                         &bstats->physical) < 0)
        bstats->physical = bstats->capacity;

Looks like the original commit is 8caded6, but it would seemingly only be for stats. Not sure what BlockInfo would be using at the time. Peter says he'll look more in depth on Monday.

(In reply to Nir Soffer from comment #29)
> Kevin, is the output of query-block during drive-mirror block job in comment
> 28 expected? was it changed in lately?

This behaviour is new in qemu 2.9. We actually assumed that since libvirt doesn't manage the block graph on a node level, it would only ever look at the top level (which represents a whole guest device), so we were free to make this change. It seems this assumption was wrong...

Now that the change is already made (and we do need this automatically inserted node in qemu), I think we should try to make libvirt deal with it instead of making qemu lie about the graph structure.
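One conceivable way for libvirt to cope with the new node (a hypothetical sketch, not the fix that actually landed): when walking the query-block 'image' chain, skip implicit filter nodes such as mirror_top and take the sizes from the first real format node underneath. The FILTER_FORMATS set is an assumption for illustration.

```python
# qemu may insert implicit filter nodes during block jobs; mirror_top
# is the one visible in the query-block output above (commit_top is
# assumed to behave similarly).
FILTER_FORMATS = {"mirror_top", "commit_top"}

def effective_top_image(image):
    """Descend past implicit filter nodes to the real top image."""
    while image.get("format") in FILTER_FORMATS:
        image = image["backing-image"]
    return image

# Shape of the 'image' dict during blockcopy, abbreviated from the
# query-block output shown earlier
mirror = {
    "format": "mirror_top",
    "filename": "/dev/src/top",
    "virtual-size": 4294967296,
    "backing-image": {
        "format": "qcow2",
        "filename": "/dev/src/top",
        "actual-size": 0,
        "virtual-size": 4294967296,
    },
}
top = effective_top_image(mirror)
# top is the qcow2 node again, which does carry 'actual-size', so no
# fallback to physical == capacity would be triggered
```

The real filter node lacks actual-size, so treating it as the top layer is what forces the bogus physical == capacity fallback; descending to the qcow2 node restores the expected values.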
Created attachment 1299921 [details]
reproducer log with build 13667657

In this log we see that the allocation and physical values are correct during blockcopy.

Before blockcopy:

*** vm block stats

Domain: 'bz1470634'
block.0.allocation=0
block.0.capacity=4294967296
block.0.physical=1073741824
block.1.allocation=0
block.1.capacity=4294967296
block.1.physical=4294967296

*** vm block info for sda

Capacity: 4294967296
Allocation: 0
Physical: 1073741824

During blockcopy:

*** vm block stats

Domain: 'bz1470634'
block.0.allocation=0
block.0.capacity=4294967296
block.0.physical=1073741824
block.1.allocation=0
block.1.capacity=4294967296
block.1.physical=4294967296

*** vm block info for sda

Capacity: 4294967296
Allocation: 0
Physical: 1073741824

The fix looks correct for this flow.

Created attachment 1299924 [details]
virsh log during ovirt live storage migration with build 13667657

This log shows output from virsh commands during live storage migration in ovirt master. The operation completed successfully with build 13667657. This flow contains snapshot, blockcopy, and blockcommit.

However this build does not include the fix for bug 1470127, so a wrong allocation value in domblkinfo may be hidden. We really should test a build including both fixes.
Before live storage migration
-----------------------------

  <disk type='block' device='disk' snapshot='no'>
    <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
    <source dev='/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/7aab49ae-d987-4592-9cec-a3cb38df5a8d/b16a9e58-c3e8-4344-a2d9-49dbf7e81177'/>
    <backingStore type='block' index='1'>
      <format type='qcow2'/>
      <source dev='/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/7aab49ae-d987-4592-9cec-a3cb38df5a8d/93331705-46be-4cb8-9dc2-c1559843fd4a'/>
      <backingStore/>
    </backingStore>

virsh # domstats 1 --block --backing
Domain: 'bz1470634'
  block.0.name=hdc
  block.0.path=/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_iso-02/b9cda581-f9ac-4650-83e0-17d00d7dfc75/images/11111111-1111-1111-1111-111111111111/Fedora-Server-dvd-x86_64-25-1.3.iso
  block.0.allocation=0
  block.0.capacity=2018508800
  block.0.physical=2018508800
  block.1.name=sda
  block.1.path=/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/7aab49ae-d987-4592-9cec-a3cb38df5a8d/b16a9e58-c3e8-4344-a2d9-49dbf7e81177
  block.1.allocation=27394048
  block.1.capacity=8589934592
  block.1.physical=1073741824
  block.2.name=sda
  block.2.path=/rhev/data-center/mnt/blockSD/aed577ea-d1ca-4ebe-af80-f852c7ce59bb/images/7aab49ae-d987-4592-9cec-a3cb38df5a8d/93331705-46be-4cb8-9dc2-c1559843fd4a
  block.2.backingIndex=1
  block.2.allocation=0
  block.2.capacity=8589934592
  block.2.physical=3221225472

# virsh -r domblkinfo 1 sda
Capacity:       8589934592
Allocation:     0
Physical:       1073741824

After live storage migration
----------------------------

  <disk type='block' device='disk' snapshot='no'>
    <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
    <source dev='/rhev/data-center/mnt/blockSD/6ffbc483-0031-403a-819b-3bb2f0f8de0a/images/7aab49ae-d987-4592-9cec-a3cb38df5a8d/b16a9e58-c3e8-4344-a2d9-49dbf7e81177'/>
    <backingStore type='block' index='1'>
      <format type='qcow2'/>
      <source dev='/rhev/data-center/mnt/blockSD/6ffbc483-0031-403a-819b-3bb2f0f8de0a/images/7aab49ae-d987-4592-9cec-a3cb38df5a8d/93331705-46be-4cb8-9dc2-c1559843fd4a'/>
      <backingStore/>
    </backingStore>
    <target dev='sda' bus='scsi'/>
    <serial>7aab49ae-d987-4592-9cec-a3cb38df5a8d</serial>
    <boot order='1'/>
    <alias name='scsi0-0-0-0'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>

# virsh -r domstats 1 --block --backing
Domain: 'bz1470634'
  block.count=3
  block.0.name=hdc
  block.0.path=/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_iso-02/b9cda581-f9ac-4650-83e0-17d00d7dfc75/images/11111111-1111-1111-1111-111111111111/Fedora-Server-dvd-x86_64-25-1.3.iso
  block.0.allocation=0
  block.0.capacity=2018508800
  block.0.physical=2018508800
  block.1.name=sda
  block.1.path=/rhev/data-center/mnt/blockSD/6ffbc483-0031-403a-819b-3bb2f0f8de0a/images/7aab49ae-d987-4592-9cec-a3cb38df5a8d/b16a9e58-c3e8-4344-a2d9-49dbf7e81177
  block.1.allocation=87359488
  block.1.capacity=8589934592
  block.1.physical=5368709120
  block.2.name=sda
  block.2.path=/rhev/data-center/mnt/blockSD/6ffbc483-0031-403a-819b-3bb2f0f8de0a/images/7aab49ae-d987-4592-9cec-a3cb38df5a8d/93331705-46be-4cb8-9dc2-c1559843fd4a
  block.2.backingIndex=1
  block.2.allocation=0
  block.2.capacity=8589934592
  block.2.physical=3221225472

# virsh -r domblkinfo 1 sda
Capacity:       8589934592
Allocation:     0
Physical:       5368709120

I reproduced it on:
libvirt-3.2.0-14.el7.x86_64
qemu-kvm-rhev-2.9.0-14.el7.x86_64

1. Prepare a running VM with disk vda as its OS image.

2. Create a raw image, then attach it:

# qemu-img create /var/lib/libvirt/images/HH-vdb.raw 100M
Formatting '/var/lib/libvirt/images/HH-vdb.raw', fmt=raw size=104857600

# virsh attach-disk HH /var/lib/libvirt/images/HH-vdb.raw vdb
Disk attached successfully

# virsh domblkinfo HH vdb
Capacity:       104857600
Allocation:     0
Physical:       104857600

3. Create an external snapshot for vdb:

# virsh snapshot-create-as HH --no-metadata --disk-only --diskspec vda,snapshot=no
Domain snapshot 1500367827 created

# virsh domblkinfo HH vdb
Capacity:       104857600
Allocation:     200704
Physical:       196616

4. Do a shallow blockcopy and check domblkinfo:

# qemu-img create /var/lib/libvirt/images/HH-vdb.copy 100M -f qcow2
Formatting '/var/lib/libvirt/images/HH-vdb.copy', fmt=qcow2 size=104857600 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

# virsh blockcopy HH vdb /var/lib/libvirt/images/HH-vdb.copy --transient-job --shallow --reuse-external --wait --verbose
Block Copy: [100 %]
Now in mirroring phase

# virsh domblkinfo HH vdb
Capacity:       104857600
Allocation:     104857600
Physical:       196616

@Capacity == @Allocation. Bug reproduced.

Created attachment 1300580 [details]
reproducer log with qemu build 13677788

This is a run of the reproducer script with the libvirt fix from bug 1467826 and qemu scratch build 13677788. With this build everything looks fine. Please see the attachment for full details; here is a summary from the full log.
Version tested
--------------
# rpm -qa | egrep 'libvirt|qemu' | sort
ipxe-roms-qemu-20170123-1.git4e85b27.el7.noarch
libvirt-client-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-config-nwfilter-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-interface-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-network-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-nodedev-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-nwfilter-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-qemu-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-secret-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-core-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-disk-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-gluster-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-iscsi-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-logical-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-mpath-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-rbd-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-scsi-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-kvm-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-libs-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-lock-sanlock-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-python-3.2.0-3.el7.x86_64
qemu-img-rhev-2.9.0-16.el7_4.2.skip_implicit.x86_64
qemu-kvm-common-rhev-2.9.0-16.el7_4.2.skip_implicit.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.2.skip_implicit.x86_64

Before blockcopy
----------------
*** vm xml
...
  <disk type='block' device='disk'>
    <driver name='qemu' type='qcow2' cache='none' io='native'/>
    <source dev='/dev/src/top'/>
    <backingStore type='block' index='1'>
      <format type='raw'/>
      <source dev='/dev/src/base'/>
      <backingStore/>
    </backingStore>
    <target dev='sda' bus='scsi'/>
    <alias name='scsi0-0-0-0'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>
...

*** vm query-block (cleaned up)
{
  "id": "libvirt-21",
  "return": [
    {
      "device": "drive-scsi0-0-0-0",
      "inserted": {
        "backing_file": "/dev/src/base",
        "backing_file_depth": 1,
        "drv": "qcow2",
        "file": "/dev/src/top",
        "image": {
          "actual-size": 0,
          "backing-filename": "/dev/src/base",
          "backing-filename-format": "raw",
          "backing-image": {
            "actual-size": 0,
            "filename": "/dev/src/base",
            "format": "raw",
            "virtual-size": 4294967296
          },
          "filename": "/dev/src/top",
          "format": "qcow2",
          "full-backing-filename": "/dev/src/base",
          "virtual-size": 4294967296
        },
      },
    }
  ]
}

*** vm block stats
Domain: 'bz1470634'
  block.count=2
  block.0.name=sda
  block.0.path=/dev/src/top
  block.0.allocation=0
  block.0.capacity=4294967296
  block.0.physical=1073741824
  block.1.name=sda
  block.1.path=/dev/src/base
  block.1.backingIndex=1
  block.1.allocation=0
  block.1.capacity=4294967296
  block.1.physical=4294967296

*** vm block info for sda
Capacity:       4294967296
Allocation:     0
Physical:       1073741824

After blockcopy
---------------
*** vm xml
...
  <disk type='block' device='disk'>
    <driver name='qemu' type='qcow2' cache='none' io='native'/>
    <source dev='/dev/src/top'/>
    <backingStore type='block' index='1'>
      <format type='raw'/>
      <source dev='/dev/src/base'/>
      <backingStore/>
    </backingStore>
    <mirror type='block' job='copy' ready='yes'>
      <format type='qcow2'/>
      <source dev='/dev/dst/top'/>
    </mirror>
    <target dev='sda' bus='scsi'/>
    <alias name='scsi0-0-0-0'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>
...
*** vm query-block (cleaned up)
{
  "id": "libvirt-30",
  "return": [
    {
      "device": "drive-scsi0-0-0-0",
      "inserted": {
        "backing_file": "/dev/src/base",
        "backing_file_depth": 1,
        "drv": "qcow2",
        "file": "/dev/src/top",
        "image": {
          "actual-size": 0,
          "backing-filename": "/dev/src/base",
          "backing-filename-format": "raw",
          "backing-image": {
            "actual-size": 0,
            "filename": "/dev/src/base",
            "format": "raw",
            "virtual-size": 4294967296
          },
          "filename": "/dev/src/top",
          "format": "qcow2",
          "full-backing-filename": "/dev/src/base",
          "virtual-size": 4294967296
        },
      },
    }
  ]
}

*** vm block stats
Domain: 'bz1470634'
  block.count=2
  block.0.name=sda
  block.0.path=/dev/src/top
  block.0.allocation=0
  block.0.capacity=4294967296
  block.0.physical=1073741824
  block.1.name=sda
  block.1.path=/dev/src/base
  block.1.backingIndex=1
  block.1.allocation=0
  block.1.capacity=4294967296
  block.1.physical=4294967296

*** vm block info for sda
Capacity:       4294967296
Allocation:     0
Physical:       1073741824

Created attachment 1300606 [details]
Live storage migration log with qemu build 13677788

This is a log from live storage migration in ovirt 4.1 using the libvirt fix for bug 1467826 and qemu build 13677788. The live storage migration completed without errors, and the output from the virsh commands looks fine.
Here is a summary from the full log; see the attachment for the full details:

Tested versions
---------------
# rpm -qa | egrep 'vdsm|libvirt|qemu' | sort
ipxe-roms-qemu-20170123-1.git4e85b27.el7.noarch
libvirt-client-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-config-nwfilter-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-interface-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-network-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-nodedev-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-nwfilter-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-qemu-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-secret-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-core-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-disk-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-gluster-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-iscsi-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-logical-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-mpath-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-rbd-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-driver-storage-scsi-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-daemon-kvm-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-libs-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-lock-sanlock-3.2.0-15.el7_rc.9a49cd47ef.x86_64
libvirt-python-3.2.0-3.el7.x86_64
qemu-img-rhev-2.9.0-16.el7_4.2.skip_implicit.x86_64
qemu-kvm-common-rhev-2.9.0-16.el7_4.2.skip_implicit.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.2.skip_implicit.x86_64
vdsm-4.19.20-1.el7ev.x86_64
vdsm-api-4.19.20-1.el7ev.noarch
vdsm-client-4.19.20-1.el7ev.noarch
vdsm-hook-vmfex-dev-4.19.20-1.el7ev.noarch
vdsm-jsonrpc-4.19.20-1.el7ev.noarch
vdsm-python-4.19.20-1.el7ev.noarch
vdsm-xmlrpc-4.19.20-1.el7ev.noarch
vdsm-yajsonrpc-4.19.20-1.el7ev.noarch

Before live storage migration
-----------------------------
# virsh -r dumpxml 1
...
  <disk type='block' device='disk' snapshot='no'>
    <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/>
    <source dev='/rhev/.../373e8c55-283f-41d4-8433-95c1ef1bbd1a/.../5a2a304a-0fd6-4c5f-a785-8320fee90bf8'/>
    <backingStore/>
    <target dev='sda' bus='scsi'/>
    <serial>959d5609-0bea-4d94-a8bc-ef93b8591c1e</serial>
    <boot order='1'/>
    <alias name='scsi0-0-0-0'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>

# virsh -r domstats 1 --block --backing
Domain: 'bz1470634'
  block.count=2
  block.0.name=hdc
  block.0.allocation=0
  block.0.physical=0
  block.1.name=sda
  block.1.path=/rhev/.../373e8c55-283f-41d4-8433-95c1ef1bbd1a/.../5a2a304a-0fd6-4c5f-a785-8320fee90bf8
  block.1.allocation=0
  block.1.capacity=21474836480
  block.1.physical=21474836480

# virsh -r domblkinfo 1 sda
Capacity:       21474836480
Allocation:     0
Physical:       21474836480

# lvs 373e8c55-283f-41d4-8433-95c1ef1bbd1a/5a2a304a-0fd6-4c5f-a785-8320fee90bf8
  LV                                   VG                                   Attr       LSize
  5a2a304a-0fd6-4c5f-a785-8320fee90bf8 373e8c55-283f-41d4-8433-95c1ef1bbd1a -wi-ao---- 20.00g

During live storage migration
-----------------------------
# virsh -r dumpxml 1
...
  <disk type='block' device='disk' snapshot='no'>
    <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
    <source dev='/rhev/data-center/6c99f4e5-8588-46f5-a818-e11151c1d19c/373e8c55-283f-41d4-8433-95c1ef1bbd1a/images/959d5609-0bea-4d94-a8bc-ef93b8591c1e/ab333078-4cb6-4f0e-9539-86a5ed6d8a62'/>
    <backingStore type='block' index='1'>
      <format type='raw'/>
      <source dev='/rhev/data-center/6c99f4e5-8588-46f5-a818-e11151c1d19c/373e8c55-283f-41d4-8433-95c1ef1bbd1a/images/959d5609-0bea-4d94-a8bc-ef93b8591c1e/5a2a304a-0fd6-4c5f-a785-8320fee90bf8'/>
      <backingStore/>
    </backingStore>
    <mirror type='block' job='copy' ready='yes'>
      <format type='qcow2'/>
      <source dev='/rhev/data-center/6c99f4e5-8588-46f5-a818-e11151c1d19c/6ffbc483-0031-403a-819b-3bb2f0f8de0a/images/959d5609-0bea-4d94-a8bc-ef93b8591c1e/ab333078-4cb6-4f0e-9539-86a5ed6d8a62'/>
    </mirror>
    <target dev='sda' bus='scsi'/>
    <serial>959d5609-0bea-4d94-a8bc-ef93b8591c1e</serial>
    <boot order='1'/>
    <alias name='scsi0-0-0-0'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>

# virsh -r domstats 1 --block --backing
Domain: 'bz1470634'
  block.count=3
  block.0.name=hdc
  block.0.allocation=0
  block.0.physical=0
  block.1.name=sda
  block.1.path=/rhev/.../373e8c55-283f-41d4-8433-95c1ef1bbd1a/.../ab333078-4cb6-4f0e-9539-86a5ed6d8a62
  block.1.allocation=0
  block.1.capacity=21474836480
  block.1.physical=3221225472
  block.2.name=sda
  block.2.path=/rhev/.../373e8c55-283f-41d4-8433-95c1ef1bbd1a/.../5a2a304a-0fd6-4c5f-a785-8320fee90bf8
  block.2.backingIndex=1
  block.2.allocation=0
  block.2.capacity=21474836480
  block.2.physical=21474836480

# virsh -r domblkinfo 1 sda
Capacity:       21474836480
Allocation:     0
Physical:       3221225472

# lvs 373e8c55-283f-41d4-8433-95c1ef1bbd1a/ab333078-4cb6-4f0e-9539-86a5ed6d8a62
  LV                                   VG                                   Attr       LSize
  ab333078-4cb6-4f0e-9539-86a5ed6d8a62 373e8c55-283f-41d4-8433-95c1ef1bbd1a -wi-ao---- 3.00g

After live storage migration
----------------------------
# virsh -r dumpxml 1
...
  <disk type='block' device='disk' snapshot='no'>
    <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/>
    <source dev='/rhev/data-center/6c99f4e5-8588-46f5-a818-e11151c1d19c/6ffbc483-0031-403a-819b-3bb2f0f8de0a/images/959d5609-0bea-4d94-a8bc-ef93b8591c1e/5a2a304a-0fd6-4c5f-a785-8320fee90bf8'/>
    <backingStore/>
    <target dev='sda' bus='scsi'/>
    <serial>959d5609-0bea-4d94-a8bc-ef93b8591c1e</serial>
    <boot order='1'/>
    <alias name='scsi0-0-0-0'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>
...

# virsh -r domstats 1 --block --backing
Domain: 'bz1470634'
  block.count=2
  block.0.name=hdc
  block.0.allocation=0
  block.0.physical=0
  block.1.name=sda
  block.1.path=/rhev/.../6ffbc483-0031-403a-819b-3bb2f0f8de0a/.../5a2a304a-0fd6-4c5f-a785-8320fee90bf8
  block.1.allocation=0
  block.1.capacity=21474836480
  block.1.physical=21474836480

# virsh -r domblkinfo 1 sda
Capacity:       21474836480
Allocation:     0
Physical:       21474836480

# lvs 6ffbc483-0031-403a-819b-3bb2f0f8de0a/5a2a304a-0fd6-4c5f-a785-8320fee90bf8
  LV                                   VG                                   Attr       LSize
  5a2a304a-0fd6-4c5f-a785-8320fee90bf8 6ffbc483-0031-403a-819b-3bb2f0f8de0a -wi-ao---- 20.00g

Kevin's patch to stop qemu reporting the extra information that breaks libvirt in this case is far simpler and less risky than the refactoring which would be required in libvirt at this point to achieve the same result. I'm reassigning this to qemu, so that the backport can be tracked there.

Reproduced at the qemu level. qemu inserts a node on top of the snapshot chain when mirroring, with no "actual-size" provided.

Version:
qemu-kvm-rhev-2.9.0-10.el7.x86_64
kernel-3.10.0-679.el7.x86_64

Steps:
1. Launch a guest with a secondary disk:

/usr/libexec/qemu-kvm -name rhel7_4 -m 1G -smp 2 \
  -drive file=/home/kvm_autotest_root/images/rhel74-64-virtio.qcow2,format=qcow2,id=drive_image1,if=none,cache=none,media=disk \
  -device virtio-blk-pci,drive=drive_image1,id=virtio-blk-0,bootindex=0 \
  -qmp stdio -vnc :3 \
  -netdev tap,id=hostnet0,script=/etc/qemu-ifup \
  -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=70:e2:84:14:0e:15 \
  -drive file=/home/test.raw,id=drive_image2,format=raw,if=none,cache=none,media=disk \
  -device virtio-blk-pci,drive=drive_image2,id=virtio-blk-1

2. Live snapshot:

{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive_image2", "snapshot-file": "/home/sn1", "format": "qcow2", "mode": "absolute-paths" } }

3. Drive mirror with option sync=top:

{ "execute": "drive-mirror", "arguments": { "device": "drive_image2", "target": "/home/mirror.qcow2", "sync": "top", "mode": "existing" } }

4. Query block and check the information of drive_image2: a node is added, backing_file_depth increases to 2, and this node has no "actual-size".

Detail:

{ "execute": "query-block" }
{"device": "drive_image2",
 ...
 "inserted": {
   ...
   "image": {
     "backing-image": {
       "backing-image": {
         "virtual-size": 104857600,
         "filename": "/home/test.raw",
         "actual-size": 0,
         ...
       },
       "virtual-size": 104857600,
       "filename": "/home/sn1",
       "actual-size": 200704,
       "backing-filename": "/home/test.raw",
       ...
     },
     "virtual-size": 104857600,
     "filename": "/home/sn1",
     "format": "mirror_top",
     "backing-filename": "/home/sn1"
     ...
   },
   "node-name": "#block1322",
   "backing_file_depth": 2,
   "drv": "mirror_top",
   "backing_file": "/home/sn1",
   "file": "/home/sn1",
   ...
 },
 "type": "unknown"}

Verified on qemu-kvm-rhev-2.10.0-4.el7.x86_64:

Steps same as comment 51.
Result: No extra node is inserted in the 'query-block' returned information, so the missing actual-size issue no longer occurs:

{"io-status": "ok", "device": "drive_image2", "locked": false, "removable": false,
 "inserted": {
   "iops_rd": 0,
   "detect_zeroes": "off",
   "image": {
     "backing-image": {
       "virtual-size": 1073741824,
       "filename": "/home/test.raw",
       "format": "raw",
       "actual-size": 0,
       "dirty-flag": false
     },
     "backing-filename-format": "raw",
     "virtual-size": 1073741824,
     "filename": "/home/sn1",
     "cluster-size": 65536,
     "format": "qcow2",
     "actual-size": 200704,
     ...
     "full-backing-filename": "/home/test.raw",
     "backing-filename": "/home/test.raw",
     "dirty-flag": false
   },
   "iops_wr": 0,
   "ro": false,
   "node-name": "#block1075",
   "backing_file_depth": 1,
   "drv": "qcow2",
   "iops": 0,
   "bps_wr": 0,
   "write_threshold": 0,
   "backing_file": "/home/test.raw",
   ...
   "file": "/home/sn1",
   "encryption_key_missing": false
 },
 ...
 "type": "unknown"}

Therefore the issue is resolved at the qemu layer; I am moving this bz to VERIFIED. Please reopen it if you still hit the problem.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104
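For future regressions of this kind, the buggy and fixed outputs above can be distinguished programmatically by walking the "image" tree of a query-block entry and looking for chain members without "actual-size". This is an illustrative sketch only — the helpers below are hypothetical, and the JSON is a trimmed-down version of the buggy output from comment 51 (extra mirror_top level, backing_file_depth 2).

```python
import json

# Sketch: walk the "image" tree of one query-block entry and report
# chain members that carry no "actual-size". With the buggy qemu, the
# implicit mirror_top filter node appeared as an extra level with
# "actual-size" missing, which is what confused libvirt's blockinfo.
# Both helpers are hypothetical, written only for this check.

def image_chain(image):
    """Yield each image dict in the backing chain, top first."""
    while image is not None:
        yield image
        image = image.get('backing-image')

def missing_actual_size(image):
    """Filenames of chain members that do not report actual-size."""
    return [img.get('filename') for img in image_chain(image)
            if 'actual-size' not in img]

# Trimmed-down reconstruction of the buggy "image" tree from comment 51.
buggy = json.loads('''
{"filename": "/home/sn1", "format": "mirror_top",
 "virtual-size": 104857600,
 "backing-image": {"filename": "/home/sn1", "format": "qcow2",
   "actual-size": 200704, "virtual-size": 104857600,
   "backing-image": {"filename": "/home/test.raw", "format": "raw",
     "actual-size": 0, "virtual-size": 104857600}}}
''')

print(missing_actual_size(buggy))          # ['/home/sn1'] -- the mirror_top node
print(len(list(image_chain(buggy))) - 1)   # backing depth: 2
```

On a fixed build the chain has no filter level, so `missing_actual_size()` returns an empty list and the depth drops back to 1.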