Bug 1467826
| Summary: | RHEL7.4: virDomainGetBlockInfo always returns alloc=0 on block storage |
|---|---|
| Product: | Red Hat Enterprise Linux 7 |
| Component: | libvirt |
| Version: | 7.4 |
| Hardware: | x86_64 |
| OS: | Unspecified |
| Status: | CLOSED ERRATA |
| Severity: | urgent |
| Priority: | urgent |
| Reporter: | Elad <ebenahar> |
| Assignee: | John Ferlan <jferlan> |
| QA Contact: | Han Han <hhan> |
| CC: | amureini, bmcclain, dyuan, dzheng, fromani, hhan, jferlan, jiyan, jreznik, jsuchane, lmen, mtessun, mzhan, nsoffer, pkrempa, rbalakri, salmy, xuzhang, yisun, ykaul, ylavi |
| Target Milestone: | rc |
| Target Release: | --- |
| Keywords: | Regression, ZStream |
| Fixed In Version: | libvirt-3.7.0-1.el7 |
| Doc Type: | If docs needed, set a value |
| Clones: | 1470127 1473706 (view as bug list) |
| Bug Blocks: | 1470127, 1473706 |
| Type: | Bug |
| Last Closed: | 2018-04-10 10:52:40 UTC |
Description
Elad
2017-07-05 09:04:42 UTC
To reproduce using oVirt: 1. Create vm with disk on block storage (in this example we have 2 disks): virsh # dumpxml 1 <domain type='kvm' id='1'> <name>extend</name> <uuid>702b8d03-e246-46d2-808d-e17a54bba708</uuid> <metadata xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0"> <ovirt-tune:qos/> <ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0"> <ovirt-vm:agentChannelName>com.redhat.rhevm.vdsm</ovirt-vm:agentChannelName> <ovirt-vm:startTime type="float">1499213407.28</ovirt-vm:startTime> <ovirt-vm:destroy_on_reboot type="bool">False</ovirt-vm:destroy_on_reboot> </ovirt-vm:vm> </metadata> <maxMemory slots='16' unit='KiB'>4194304</maxMemory> <memory unit='KiB'>1048576</memory> <currentMemory unit='KiB'>1048576</currentMemory> <vcpu placement='static' current='2'>16</vcpu> <cputune> <shares>1020</shares> </cputune> <resource> <partition>/machine</partition> </resource> <sysinfo type='smbios'> <system> <entry name='manufacturer'>oVirt</entry> <entry name='product'>oVirt Node</entry> <entry name='version'>7.4-18.el7</entry> <entry name='serial'>94CBF655-4D26-456A-B5D0-40AD803C28E9</entry> <entry name='uuid'>702b8d03-e246-46d2-808d-e17a54bba708</entry> </system> </sysinfo> <os> <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type> <smbios mode='sysinfo'/> </os> <features> <acpi/> </features> <cpu mode='custom' match='exact' check='full'> <model fallback='forbid'>SandyBridge</model> <topology sockets='16' cores='1' threads='1'/> <feature policy='require' name='vme'/> <feature policy='require' name='hypervisor'/> <feature policy='require' name='arat'/> <feature policy='require' name='xsaveopt'/> <numa> <cell id='0' cpus='0-1' memory='1048576' unit='KiB'/> </numa> </cpu> <clock offset='variable' adjustment='0' basis='utc'> <timer name='rtc' tickpolicy='catchup'/> <timer name='pit' tickpolicy='delay'/> <timer name='hpet' present='no'/> </clock> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> 
<on_crash>destroy</on_crash> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source file='/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_iso-02/b9cda581-f9ac-4650-83e0-17d00d7dfc75/images/11111111-1111-1111-1111-111111111111/Fedora-Server-dvd-x86_64-25-1.3.iso' startupPolicy='optional'/> <backingStore/> <target dev='hdc' bus='ide'/> <readonly/> <boot order='2'/> <alias name='ide0-1-0'/> <address type='drive' controller='0' bus='1' target='0' unit='0'/> </disk> <disk type='block' device='disk' snapshot='no'> <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/> <source dev='/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/4cc7444a-e5a4-48c4-82a3-809088989c08/02415b1a-ec47-40f9-a250-f912ccf1bbde'/> <backingStore type='block' index='1'> <format type='qcow2'/> <source dev='/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/4cc7444a-e5a4-48c4-82a3-809088989c08/93331705-46be-4cb8-9dc2-c1559843fd4a'/> <backingStore/> </backingStore> <target dev='sda' bus='scsi'/> <serial>4cc7444a-e5a4-48c4-82a3-809088989c08</serial> <boot order='1'/> <alias name='scsi0-0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> <disk type='block' device='disk' snapshot='no'> <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/> <source dev='/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/3dadd84a-530d-47db-a434-5dee71119d26/60f9f879-cc10-462f-af34-910c872f58de'/> <backingStore/> <target dev='sdb' bus='scsi'/> <serial>3dadd84a-530d-47db-a434-5dee71119d26</serial> <alias name='scsi0-0-0-1'/> <address type='drive' controller='0' bus='0' target='0' unit='1'/> </disk> <controller type='usb' index='0' model='piix3-uhci'> <alias name='usb'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> </controller> <controller type='scsi' index='0' 
model='virtio-scsi'> <alias name='scsi0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </controller> <controller type='virtio-serial' index='0' ports='16'> <alias name='virtio-serial0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </controller> <controller type='pci' index='0' model='pci-root'> <alias name='pci.0'/> </controller> <controller type='ide' index='0'> <alias name='ide'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> </controller> <interface type='bridge'> <mac address='00:00:00:ff:41:00'/> <source bridge='ovirtmgmt'/> <target dev='vnet0'/> <model type='virtio'/> <filterref filter='vdsm-no-mac-spoofing'/> <link state='up'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/channels/702b8d03-e246-46d2-808d-e17a54bba708.com.redhat.rhevm.vdsm'/> <target type='virtio' name='com.redhat.rhevm.vdsm' state='disconnected'/> <alias name='channel0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel> <channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/channels/702b8d03-e246-46d2-808d-e17a54bba708.org.qemu.guest_agent.0'/> <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/> <alias name='channel1'/> <address type='virtio-serial' controller='0' bus='0' port='2'/> </channel> <channel type='spicevmc'> <target type='virtio' name='com.redhat.spice.0' state='disconnected'/> <alias name='channel2'/> <address type='virtio-serial' controller='0' bus='0' port='3'/> </channel> <input type='mouse' bus='ps2'> <alias name='input0'/> </input> <input type='keyboard' bus='ps2'> <alias name='input1'/> </input> <graphics type='spice' tlsPort='5900' autoport='yes' listen='10.35.0.112' defaultMode='secure' passwdValidTo='1970-01-01T00:00:01'> <listen type='network' address='10.35.0.112' 
network='vdsm-ovirtmgmt'/> <channel name='main' mode='secure'/> <channel name='display' mode='secure'/> <channel name='inputs' mode='secure'/> <channel name='cursor' mode='secure'/> <channel name='playback' mode='secure'/> <channel name='record' mode='secure'/> <channel name='smartcard' mode='secure'/> <channel name='usbredir' mode='secure'/> </graphics> <video> <model type='qxl' ram='65536' vram='8192' vgamem='16384' heads='1' primary='yes'/> <alias name='video0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video> <memballoon model='virtio'> <alias name='balloon0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </memballoon> <rng model='virtio'> <backend model='random'>/dev/urandom</backend> <alias name='rng0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </rng> </devices> <seclabel type='dynamic' model='selinux' relabel='yes'> <label>system_u:system_r:svirt_t:s0:c57,c95</label> <imagelabel>system_u:object_r:svirt_image_t:s0:c57,c95</imagelabel> </seclabel> <seclabel type='dynamic' model='dac' relabel='yes'> <label>+107:+107</label> <imagelabel>+107:+107</imagelabel> </seclabel> </domain> 2. Login to the vm, and write some data to one of the disks: # time dd if=/dev/zero of=/dev/sdb bs=8M count=66 oflag=direct 66+0 records in 66+0 records out 553648128 bytes (554 MB, 528 MiB) copied, 5.24422 s, 106 MB/s real 0m5.298s user 0m0.002s sys 0m0.138s 3. 
Check disk allocation:

virsh # domblkinfo 1 sdb
Capacity:       8589934592
Allocation:     0
Physical:       1073741824

In vdsm, we can see that we get alloc=0 when we check the disk allocation:

2017-07-05 14:19:53,809+0300 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getStats succeeded in 0.01 seconds (__init__:583)
2017-07-05 14:19:55,456+0300 DEBUG (periodic/3) [virt.vm] (vmId='702b8d03-e246-46d2-808d-e17a54bba708') Extension info for drive sda volume 02415b1a-ec47-40f9-a250-f912ccf1bbde (capacity=8589934592, allocated=0, physical=5368709120) (vm:1053)
2017-07-05 14:19:55,458+0300 DEBUG (periodic/3) [virt.vm] (vmId='702b8d03-e246-46d2-808d-e17a54bba708') Extension info for drive sdb volume 60f9f879-cc10-462f-af34-910c872f58de (capacity=8589934592, allocated=0, physical=1073741824) (vm:1053)
2017-07-05 14:19:57,459+0300 DEBUG (periodic/2) [virt.vm] (vmId='702b8d03-e246-46d2-808d-e17a54bba708') Extension info for drive sda volume 02415b1a-ec47-40f9-a250-f912ccf1bbde (capacity=8589934592, allocated=0, physical=5368709120) (vm:1053)
2017-07-05 14:19:57,463+0300 DEBUG (periodic/2) [virt.vm] (vmId='702b8d03-e246-46d2-808d-e17a54bba708') Extension info for drive sdb volume 60f9f879-cc10-462f-af34-910c872f58de (capacity=8589934592, allocated=0, physical=1073741824) (vm:1053)

Since vdsm gets alloc=0, it never extends the disk.

4. Eventually, the disk becomes full and the vm pauses. For example, here I wrote more than 1g, the initial size of the lv:

dd if=/dev/sda of=/dev/sdb bs=8M count=129 oflag=direct

Vdsm gets many messages about this.
On each message, vdsm checks whether some disk should be extended, but since blockInfo always returns alloc=0, no disk is extended:

2017-07-05 14:19:58,716+0300 INFO (libvirt/events) [virt.vm] (vmId='702b8d03-e246-46d2-808d-e17a54bba708') abnormal vm stop device scsi0-0-0-1 error enospc (vm:4430)
2017-07-05 14:19:58,716+0300 INFO (libvirt/events) [virt.vm] (vmId='702b8d03-e246-46d2-808d-e17a54bba708') CPU stopped: onIOError (vm:5359)

Vdsm handles ENOSPC errors by checking if drives should be extended:

2017-07-05 14:20:02,195+0300 DEBUG (libvirt/events) [virt.vm] (vmId='702b8d03-e246-46d2-808d-e17a54bba708') Extension info for drive sda volume 02415b1a-ec47-40f9-a250-f912ccf1bbde (capacity=8589934592, allocated=0, physical=5368709120) (vm:1053)
2017-07-05 14:20:02,200+0300 DEBUG (libvirt/events) [virt.vm] (vmId='702b8d03-e246-46d2-808d-e17a54bba708') Extension info for drive sdb volume 60f9f879-cc10-462f-af34-910c872f58de (capacity=8589934592, allocated=0, physical=1073741824) (vm:1053)
2017-07-05 14:20:02,200+0300 INFO (libvirt/events) [virt.vm] (vmId='702b8d03-e246-46d2-808d-e17a54bba708') No VM drives were extended (vm:4437)

Since alloc=0, no drives were extended. We get many more ENOSPC messages for some reason. This practically breaks thin provisioning on block storage for RHV.

Created attachment 1294548 [details]
Logs for run when vm was paused
In this run I started with a 1g LV, writing more than 1g to this device:
dd if=/dev/sda of=/dev/sdb bs=8M count=129
The tarball includes:
- output from rpm -qa
- libvirtd.log (using log_filters="1:qemu 1:locking")
- vdsm.log
Looking at the libvirtd log, we get this reply from qemu:
2017-07-05 11:19:57.462+0000: 3304: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7f341c00f9a0 reply={"return": [{"io-status": "ok", "device": "drive-ide0-1-0", "l
ocked": true, "removable": true, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 2018508800, "filename": "/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_iso-
02/b9cda581-f9ac-4650-83e0-17d00d7dfc75/images/11111111-1111-1111-1111-111111111111/Fedora-Server-dvd-x86_64-25-1.3.iso", "format": "raw", "actual-size": 2018508800, "dirty-flag": false}, "i
ops_wr": 0, "ro": true, "node-name": "#block185", "backing_file_depth": 0, "drv": "raw", "iops": 0, "bps_wr": 0, "write_threshold": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "cache": {"n
o-flush": false, "direct": false, "writeback": true}, "file": "/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_iso-02/b9cda581-f9ac-4650-83e0-17d00d7dfc75/images/11111111-1111-1111-1111-1
11111111111/Fedora-Server-dvd-x86_64-25-1.3.iso", "encryption_key_missing": false}, "tray_open": false, "type": "unknown"}, {"io-status": "ok", "device": "drive-scsi0-0-0-0", "locked": false
, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"backing-image": {"virtual-size": 8589934592, "filename": "/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103
-b00c-b9bc4ac77085/images/4cc7444a-e5a4-48c4-82a3-809088989c08/93331705-46be-4cb8-9dc2-c1559843fd4a", "cluster-size": 65536, "format": "qcow2", "actual-size": 0, "format-specific": {"type":
"qcow2", "data": {"compat": "1.1", "lazy-refcounts": false, "refcount-bits": 16, "corrupt": false}}, "dirty-flag": false}, "backing-filename-format": "qcow2", "virtual-size": 8589934592, "fi
lename": "/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/4cc7444a-e5a4-48c4-82a3-809088989c08/02415b1a-ec47-40f9-a250-f912ccf1bbde", "cluster-size": 65536, "format
": "qcow2", "actual-size": 0, "format-specific": {"type": "qcow2", "data": {"compat": "1.1", "lazy-refcounts": false, "refcount-bits": 16, "corrupt": false}}, "full-backing-filename": "/rhev
/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/4cc7444a-e5a4-48c4-82a3-809088989c08/93331705-46be-4cb8-9dc2-c1559843fd4a", "backing-filename": "/rhev/data-center/mnt/bl
ockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/4cc7444a-e5a4-48c4-82a3-809088989c08/93331705-46be-4cb8-9dc2-c1559843fd4a", "dirty-flag": false}, "iops_wr": 0, "ro": false, "node-name": "#
block382", "backing_file_depth": 1, "drv": "qcow2", "iops": 0, "bps_wr": 0, "write_threshold": 0, "backing_file": "/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/4
cc7444a-e5a4-48c4-82a3-809088989c08/93331705-46be-4cb8-9dc2-c1559843fd4a", "encrypted": false, "bps": 0, "bps_rd": 0, "cache": {"no-flush": false, "direct": true, "writeback": true}, "file":
"/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/4cc7444a-e5a4-48c4-82a3-809088989c08/02415b1a-ec47-40f9-a250-f912ccf1bbde", "encryption_key_missing": false}, "typ
e": "unknown"}, {"io-status": "ok", "device": "drive-scsi0-0-0-1", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 858993459
2, "filename": "/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/3dadd84a-530d-47db-a434-5dee71119d26/60f9f879-cc10-462f-af34-910c872f58de", "cluster-size": 65536, "
format": "qcow2", "actual-size": 0, "format-specific": {"type": "qcow2", "data": {"compat": "1.1", "lazy-refcounts": false, "refcount-bits": 16, "corrupt": false}}, "dirty-flag": false}, "io
ps_wr": 0, "ro": false, "node-name": "#block727", "backing_file_depth": 0, "drv": "qcow2", "iops": 0, "bps_wr": 0, "write_threshold": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "cache": {
"no-flush": false, "direct": true, "writeback": true}, "file": "/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/3dadd84a-530d-47db-a434-5dee71119d26/60f9f879-cc10-4
62f-af34-910c872f58de", "encryption_key_missing": false}, "type": "unknown"}], "id": "libvirt-481"}
Formatting the json in a readable way, and extracting
the interesting device (vol_id=60f9f879-cc10-462f-af34-910c872f58de):
{
  "locked": false,
  "type": "unknown",
  "io-status": "ok",
  "removable": false,
  "device": "drive-scsi0-0-0-1",
  "inserted": {
    "bps_rd": 0,
    "ro": false,
    "backing_file_depth": 0,
    "encrypted": false,
    "image": {
      "cluster-size": 65536,
      "format": "qcow2",
      "filename": "/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/3dadd84a-530d-47db-a434-5dee71119d26/60f9f879-cc10-462f-af34-910c872f58de",
      "virtual-size": 8589934592,
      "dirty-flag": false,
      "format-specific": {
        "data": {
          "compat": "1.1",
          "refcount-bits": 16,
          "corrupt": false,
          "lazy-refcounts": false
        },
        "type": "qcow2"
      },
      "actual-size": 0
    },
    "cache": {
      "no-flush": false,
      "writeback": true,
      "direct": true
    },
    "bps_wr": 0,
    "drv": "qcow2",
    "node-name": "#block727",
    "bps": 0,
    "iops": 0,
    "write_threshold": 0,
    "file": "/rhev/data-center/mnt/blockSD/d0812abd-2beb-4103-b00c-b9bc4ac77085/images/3dadd84a-530d-47db-a434-5dee71119d26/60f9f879-cc10-462f-af34-910c872f58de",
    "iops_rd": 0,
    "encryption_key_missing": false,
    "detect_zeroes": "off",
    "iops_wr": 0
  }
}
It looks like qemu does not return the expected allocation info.
Kevin, do you think this is actually a qemu issue?
Created attachment 1294559 [details]
qemu log from /var/log/libvirt/qemu/vmname.log
In order for me to reproduce, it would help to have a few details about:
In oVirt:
1. Create a guest with a thin-provisioned disk residing on block-based storage (VDSM creates a 1G LV)
2. Perform a write operation to the guest storage (could be achieved by simply installing an OS on the guest)
IOW: steps to create the <source dev='$path'...> for targets sda and sdb
At first, when you said thin-provisioned, I assumed libvirt logical pools with sparse lv's, but I think you mean lv's that you've created using real thin provisioning (via --thinpool, I assume).
I do believe something from commit id 'b9b1aa63' is at fault here... I'm trying to reconfigure my system to the test config I used in order to generate the logic. Something was certainly strange with some backing file format... Although, re-reading the code:
    if (entry->physical == 0 || info->allocation == 0 ||
        info->allocation == entry->physical) {
        info->allocation = entry->physical;
        if (info->allocation == 0)
            info->allocation = entry->physical;
        ...
I do wonder what caused me to add that setting and the check.
In your case, physical (i.e. actual-size) == 0, so I can see why the value is the way it is...
(In reply to John Ferlan from comment #8)
> In order for me to reproduce it would help to have a few details about:
>
> In oVirt:
> 1. Create a guest with thin provisioned disk resides on a block-based
> storage (VDSM creates a 1G LV)

Right

> 2. Perform write operation to the guest storage (could be acheived by simply
> installing OS on the guest)

Right, installing an os typically triggers an extend. An easier way to test this is to hot-plug a disk using a block device, and write directly to the disk within the guest. If libvirt reports correct allocation values, we are ok.

> IOW: steps to create the <source dev='$path'...> for targets sda and sdb
>
> At first I assumed when you say thin-provisioned, I thought libvirt logical
> pools with sparse lv's, but I think you mean lv's that you've created using
> real thin provisioning (via --thinpool, I assume).

No, we use ovirt-thin provisioning, which is a regular lv. We extend the lv when the highest offset reaches the threshold (by default, physical - 512m).

> I do believe something from commit id 'b9b1aa63' is at fault here... trying
> to reconfigure my system to the test config I used in order to generate the
> logic. Something was certainly strange with some backing file format...
> Although in relooking at the code:
>
>     if (entry->physical == 0 || info->allocation == 0 ||
>         info->allocation == entry->physical) {
>         info->allocation = entry->physical;
>         if (info->allocation == 0)
>             info->allocation = entry->physical;
>     ...
>
> I do wonder who caused me to do that setting and the check.
>
> In your case, physical (e.g. actual-size == 0), so I can see why the value
> is the way it is...

Looks like this manipulation is correct only for file-based disks.

Still not enough details - the first few lines above which you "Right"'d on are copies of the problem statement. I can either hotplug or coldplug and start. It really doesn't matter; that's not the issue being chased.
For me to test a fix, it would be far easier to know how to create a disk device in my guest similar to what's been created. I don't have the secret decoder ring for that. I don't use ovirt-thin provisioning, so I have no idea how the file is created. That's what I need. It'd also be useful to get the values for entry->physical and entry->wr_highest_offset in the first else where info->allocation gets set. I'm still working on resetting up my environment...

I got the alloc=0 result with a thin LV:
1. Create thin LV
# lsblk /dev/sdd
sdd 8:48 1 14.6G 0 disk
└─sdd1 8:49 1 14.6G 0 part
# pvcreate /dev/sdd1
# vgcreate vg /dev/sdd1
# lvcreate -n pool0 -L 10G vg
# lvcreate -n pool0meta -L 1G vg
# lvconvert --type thin-pool --poolmetadata vg/pool0meta vg/pool0
# lvcreate -n thin1 -V 1T --thinpool pool0 vg
# lvcreate -n thin2 -V 1T --thinpool pool0 vg
# lvdisplay vg/thin2
LV Path /dev/vg/thin2
LV Name thin2
VG Name vg
LV UUID zfznA1-0EgD-z8m1-Fizk-833I-CqRR-R11UYQ
LV Write Access read/write
LV Creation host, time no-server.com, 2017-07-06 14:19:24 +0800
LV Pool name pool0
LV Status available
# open 0
LV Size 1.00 TiB
Mapped size 0.00%
Current LE 262144
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:7
2. Start VM with the thin lv
# virsh domblklist n2
Target Source
------------------------------------------------
vda /var/lib/libvirt/images/n2.qcow2
vdb /dev/mapper/vg-thin2
# virsh domblkinfo n2 vdb
Capacity: 1099511627776
Allocation: 0
Physical: 1099511627776
# virsh qemu-monitor-command n2 '{"execute":"query-block"}'
{"return":[{"io-status":"ok","device":"drive-virtio-disk0","locked":false,"removable":false,"inserted":{"iops_rd":0,"detect_zeroes":"off","image":{"virtual-size":10737418240,"filename":"/var/lib/libvirt/images/n2.qcow2","cluster-size":65536,"format":"qcow2","actual-size":1286078464,"format-specific":{"type":"qcow2","data":{"compat":"1.1","lazy-refcounts":false,"refcount-bits":16,"corrupt":false}},"dirty-flag":false},"iops_wr":0,"ro":false,"node-name":"#block185","backing_file_depth":0,"drv":"qcow2","iops":0,"bps_wr":0,"write_threshold":0,"encrypted":false,"bps":0,"bps_rd":0,"cache":{"no-flush":false,"direct":false,"writeback":true},"file":"/var/lib/libvirt/images/n2.qcow2","encryption_key_missing":false},"type":"unknown"},{"io-status":"ok","device":"drive-virtio-disk1","locked":false,"removable":false,"inserted":{"iops_rd":0,"detect_zeroes":"off","image":{"virtual-size":1099511627776,"filename":"/dev/mapper/vg-thin2","format":"raw","actual-size":0,"dirty-flag":false},"iops_wr":0,"ro":false,"node-name":"#block310","backing_file_depth":0,"drv":"raw","iops":0,"bps_wr":0,"write_threshold":0,"encrypted":false,"bps":0,"bps_rd":0,"cache":{"no-flush":false,"direct":false,"writeback":true},"file":"/dev/mapper/vg-thin2","encryption_key_missing":false},"type":"unknown"}],"id":"libvirt-22"}
# lvdisplay vg/thin2
/dev/mapper/36001405d688ea2ccd464b41901158e03: read failed after 0 of 4096 at 0: Input/output error
/dev/mapper/36001405d688ea2ccd464b41901158e03: read failed after 0 of 4096 at 5368643584: Input/output error
/dev/mapper/36001405d688ea2ccd464b41901158e03: read failed after 0 of 4096 at 5368700928: Input/output error
/dev/mapper/36001405d688ea2ccd464b41901158e03: read failed after 0 of 4096 at 4096: Input/output error
--- Logical volume ---
LV Path /dev/vg/thin2
LV Name thin2
VG Name vg
LV UUID zfznA1-0EgD-z8m1-Fizk-833I-CqRR-R11UYQ
LV Write Access read/write
LV Creation host, time no-server.com, 2017-07-06 14:19:24 +0800
LV Pool name pool0
LV Status available
# open 1
LV Size 1.00 TiB
Mapped size 0.00%
Current LE 262144
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:7
3. Write 1G data to the thin LV in VM and check again
(in vm) # dd if=/dev/urandom of=/dev/vdb bs=300M count=20
dd: warning: partial read (33554431 bytes); suggest iflag=fullblock
0+20 records in
0+20 records out
671088620 bytes (671 MB) copied, 2.75019 s, 244 MB/s
# virsh domblkinfo n2 vdb
Capacity: 1099511627776
Allocation: 0
Physical: 1099511627776
# virsh qemu-monitor-command n2 '{"execute":"query-block"}'
{"return":[{"io-status":"ok","device":"drive-virtio-disk0","locked":false,"removable":false,"inserted":{"iops_rd":0,"detect_zeroes":"off","image":{"virtual-size":10737418240,"filename":"/var/lib/libvirt/images/n2.qcow2","cluster-size":65536,"format":"qcow2","actual-size":1286078464,"format-specific":{"type":"qcow2","data":{"compat":"1.1","lazy-refcounts":false,"refcount-bits":16,"corrupt":false}},"dirty-flag":false},"iops_wr":0,"ro":false,"node-name":"#block185","backing_file_depth":0,"drv":"qcow2","iops":0,"bps_wr":0,"write_threshold":0,"encrypted":false,"bps":0,"bps_rd":0,"cache":{"no-flush":false,"direct":false,"writeback":true},"file":"/var/lib/libvirt/images/n2.qcow2","encryption_key_missing":false},"type":"unknown"},{"io-status":"ok","device":"drive-virtio-disk1","locked":false,"removable":false,"inserted":{"iops_rd":0,"detect_zeroes":"off","image":{"virtual-size":1099511627776,"filename":"/dev/mapper/vg-thin2","format":"raw","actual-size":0,"dirty-flag":false},"iops_wr":0,"ro":false,"node-name":"#block310","backing_file_depth":0,"drv":"raw","iops":0,"bps_wr":0,"write_threshold":0,"encrypted":false,"bps":0,"bps_rd":0,"cache":{"no-flush":false,"direct":false,"writeback":true},"file":"/dev/mapper/vg-thin2","encryption_key_missing":false},"type":"unknown"}],"id":"libvirt-25"}
Command 'lvdisplay vg/thin2' hangs when VM is running.
After vm is destroyed, the result:
LV Path /dev/vg/thin2
LV Name thin2
VG Name vg
LV UUID zfznA1-0EgD-z8m1-Fizk-833I-CqRR-R11UYQ
LV Write Access read/write
LV Creation host, time no-server.com, 2017-07-06 14:19:24 +0800
LV Pool name pool0
LV Status available
# open 1
LV Size 1.00 TiB
Mapped size 0.06%
Current LE 262144
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:7
Please check the result
The disk xml in comment 11 is:

# virsh dumpxml n2 | awk '/<disk/,/<\/disk/'
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/n2.qcow2'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</disk>
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/mapper/vg-thin2'/>
  <target dev='vdb' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
</disk>

Soffer, I am not familiar with vdsm APIs. Could you help check if the thin LV creation steps are the same as what vdsm does? Thanks

Perfect - thanks Han - that's exactly what I needed. I can now (easily) reproduce what's being seen. BTW: use query-blockstats and search on wr_highest_offset in order to "see" what is supposed to be used by libvirt to get the allocation value. The commit referenced in comment 9 is what the problem is. In particular:

>     info->allocation = entry->physical;
>     if (info->allocation == 0)
>         info->allocation = entry->physical;

Should seemingly just drop the:

>     info->allocation = entry->physical;

I'm wondering now if that was an editing remnant or some stray (and perhaps erroneous) edge condition that I was trying to handle. I've reconfigured a guest to have the environment from my notes and I'm checking on that now. In any case, I should have a patch posted upstream sometime today.

Vdsm does not use a thinpool lv, but this is good enough to reproduce the issue. To test an ovirt-like "thin" lv, you can do:

pvcreate /dev/sdxxx
vgcreate ovirt-vg /dev/sdxxx
lvcreate --name ovirt-lv --size 1g ovirt-vg
qemu-img create -f qcow2 /dev/ovirt-vg/ovirt-lv 10g

And use /dev/ovirt-vg/ovirt-lv as a disk.

Here is a simple way to reproduce this, given the lv created in comment 15:
$ sudo virt-install \
--name bz1467826 \
--memory 1024 \
--disk /dev/ovirt-vg/ovirt-lv,format=qcow2,driver_type=qcow2 \
--cdrom Fedora-Server-dvd-x86_64-25-1.3.iso
During the installation, we can see:
virsh # domblkinfo 6 vda
Capacity: 5368709120
Allocation: 0
Physical: 1073741824
(continued from comment 16)

virsh # qemu-monitor-command 6 '{"execute":"query-blockstats"}'
{"return":[{"device":"drive-virtio-disk0","parent":{"stats":{"flush_total_time_ns":0,"wr_highest_offset":1076109312,...

At this point the vm is paused; we can extend the lv and resume:

lvextend --size +1g ovirt-vg/ovirt-lv

virsh # resume 6
Domain 6 resumed

virsh # qemu-monitor-command 6 '{"execute":"query-blockstats"}'
{"return":[{"device":"drive-virtio-disk0","parent":{"stats":{"flush_total_time_ns":0,"wr_highest_offset":1245970432,...

Thanks for the example... I've posted a patch upstream that will remove an unconditional setting of @allocation. Still not quite clear what caused me to put it there other than perhaps disjoint patches... If you start at commit id 'cf436a56' and work backwards, you'll see a number of adjustments in the code where IIRC the goal was to try to make common code paths between storage pools and domains.

Upstream patch for this issue: https://www.redhat.com/archives/libvir-list/2017-July/msg00143.html

This is now pushed upstream:
commit fde654be5307a570b7b0f31537e18e70a274cd50
Author: John Ferlan <jferlan>
Date: Thu Jul 6 14:08:38 2017 -0400
qemu: Fix qemuDomainGetBlockInfo allocation value setting
...
Commit id 'b9b1aa639' was supposed to add logic to set the allocation
for sparse files when wr_highest_offset was zero; however, an unconditional
setting was done just prior. For block devices, this means allocation is
always returning 0 since 'actual-size' will be zero.
Remove the unconditional setting and add the note about it being possible
to still be zero for block devices. As soon as the guest starts writing to
the volume, the allocation value will then be obtainable from qemu via
the wr_highest_offset.
$ git describe fde654be5307a570b7b0f31537e18e70a274cd50
v3.5.0-72-gfde654b
$
When should we expect the downstream build to test? This is urgent.

Yaniv - the best I can say is as long as the process takes. Unfortunately for me, I'm not in the office today as I have family business out of town. I'm sure someone will pick up the rest of this "today" at some point.

(In reply to John Ferlan from comment #21)
> This is now pushed upstream:
>
> commit fde654be5307a570b7b0f31537e18e70a274cd50
> Author: John Ferlan <jferlan>
> Date: Thu Jul 6 14:08:38 2017 -0400
>
>     qemu: Fix qemuDomainGetBlockInfo allocation value setting
>
>     ...
>
>     Commit id 'b9b1aa639' was supposed to add logic to set the allocation
>     for sparse files when wr_highest_offset was zero; however, an unconditional
>     setting was done just prior. For block devices, this means allocation is
>     always returning 0 since 'actual-size' will be zero.
>
>     Remove the unconditional setting and add the note about it being possible
>     to still be zero for block devices. As soon as the guest starts writing to
>     the volume, the allocation value will then be obtainable from qemu via
>     the wr_highest_offset.
>
> $ git describe fde654be5307a570b7b0f31537e18e70a274cd50
> v3.5.0-72-gfde654b
> $

This patch works in virsh. Tested with the scratch build containing this patch. Steps:

Prepare a LV:

# lvs VG/lv2
  LV   VG Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv2  VG -wi-a----- 10.00g

Attach the disk to the VM:

# virsh attach-disk HH /dev/mapper/VG-lv2 vdb
Disk attached successfully

# virsh domblkinfo HH vdb
Capacity:       10737418240
Allocation:     0
Physical:       10737418240

Write some data in the VM and check domblkinfo:

(in VM) # dd if=/dev/urandom of=/dev/vdb bs=10M count=1
1+0 records in
1+0 records out
10485760 bytes (10 MB) copied, 0.879529 s, 11.9 MB/s

# virsh domblkinfo HH vdb
Capacity:       10737418240
Allocation:     10485760
Physical:       10737418240

Extend the lv and check domblkinfo:

# lvextend --size +1g VG/lv2
  Size of logical volume VG/lv2 changed from 10.00 GiB (2560 extents) to 11.00 GiB (2816 extents).
  Logical volume VG/lv2 successfully resized.

# virsh domblkinfo HH vdb
Capacity:       10737418240
Allocation:     10485760
Physical:       11811160064

All works well. Bug fixed.

Hi John, is the 'Allocation' of domblkinfo equal to the highest offset of a written or read byte?

The short answer is: it's complicated... and can vary depending on active or inactive domain. The BlockInfo is (and has been) a bit vague about what gets returned for Allocation when the domain is running. There's a bit of history on this that I ended up following when I made the prior change, but essentially, for an active domain, the @allocation ends up being the highest written offset "once written", or the value that QEMU returns for actual-size (which is similar to what 'du' would return) if not written yet (IOW: the highest read). For a device backed by a file, the answer is generally speaking spot on. For a device backed by a block device, the answer gets a lot more complicated, which is what all the code in the area of the change is handling.

If you read the commit messages from http://libvirt.org/git/?p=libvirt.git;a=commit;h=b9b1aa639 and http://libvirt.org/git/?p=libvirt.git;a=commit;h=0282ca45a I think you'll get a bit more understanding of the "history". The first commit also references a couple of other commits which could be followed. The second commit notes that physical cannot be obtained via virsh, which is no longer true since one can pass the --physical flag and get the physical value as well. There's another bz on that - I don't have the reference handy.

Personally, I think for the purposes of what's being "required" for oVirt, getting the high water mark of the device for the active domain, it would perhaps be better to use the DomainStats API, because it always returns the offset of the highest written sector and only works for active domains.
For reference, see: http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockInfo and http://libvirt.org/html/libvirt-libvirt-domain.html#virConnectGetAllDomainStats

Tested on libvirt-3.7.0-2.el7.x86_64, qemu-kvm-rhev-2.9.0-16.el7_4.6.x86_64. Passed.

1. Start a VM
2. Attach a block device to the VM:
# virsh attach-disk test /dev/mapper/rhel_dell--op740--04-test vdb
Disk attached successfully
3. Check with domblkinfo:
# virsh domblkinfo test vdb
Capacity:       10737418240
Allocation:     0
Physical:       10737418240
4. In the VM, write 4096 bytes to the block device:
(In VM) # dd if=/dev/urandom of=/dev/vdb count=1 bs=4096
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00253738 s, 1.6 MB/s
5. Check that domblkinfo's Allocation value matches the size of the write:
# virsh domblkinfo test vdb
Capacity:       10737418240
Allocation:     4096
Physical:       10737418240
The result is expected.

Verified on libvirt-3.9.0-1.el7.x86_64, qemu-kvm-rhev-2.10.0-4.el7.x86_64 with the steps in comment 31. Passed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704