Bug 1160568
| Summary: | [PPC][vdsm] VM enters 'Paused' on 'no storage space error' while writing to its thin-provision disk | | |
|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | Elad <ebenahar> |
| Component: | General | Assignee: | Nobody <nobody> |
| Status: | CLOSED NOTABUG | QA Contact: | Raz Tamir <ratamir> |
| Severity: | low | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | --- | CC: | alitke, amureini, bazulay, bugproxy, bugs, ebenahar, fnovak, lbopf, lpeer, lsurette, nsoffer, scohen, srevivo, tnisan, ykaul, ylavi |
| Target Milestone: | ovirt-4.2.0 | Flags: | rule-engine: ovirt-4.2+ |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-19 16:55:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
------- Comment From fnovak.com 2014-11-13 14:49 EDT -------
reverse mirror Bug 1160568 - [PPC][vdsm] VM enters 'Paused' on 'no storage space error' while writing to its thin-provision disk located on FC domain

Thin provisioned block volumes can be extended in two ways (proactive and reactive). Proactive extension is the preferred approach and the one you describe above. Under normal circumstances we are able to detect that the VM will run out of space soon and preemptively extend the volume. If we don't handle it in time, libvirt pauses the VM and raises an event. We then reactively perform the extension and unpause the VM. This is the fallback scenario, and VM installation is a known case that exercises this flow due to the high rate of block allocation that happens when installing to a fresh disk.

I am inclined to say that everything is working as designed here, but let's confirm a few things. After looking at vdsm.log, it appears that the VM was stopped 2 or 3 times and only briefly (until reactive extension completed). It also looks like the VM then ran normally thereafter. Can you confirm these two observations? If so, then I don't think there is anything to change here.

As a workaround, vdsm provides tunables which could make the scanning more responsive to these types of scenarios:

```
[irs]
volume_utilization_percent = 50
volume_utilization_chunk_mb = 1024
vol_size_sample_interval = 60
```

As you can see, by default we only check once per minute whether extension is required. You could specify a smaller interval in /etc/vdsm/vdsm.conf to check more frequently. You could also increase the chunk_mb value to 2048 so that each extension is bigger.

It might be interesting to detect when this scenario is happening and automatically boost those values for a limited time, but that would be something for 3.6 at the earliest.

Adam, thanks for the explanation, very informative.
> After looking at vdsm.log it appears that the VM was stopped 2 or 3 times
> and only briefly (until reactive extension completed). It also looks like
> the VM then ran normally thereafter. Can you confirm these two
> observations? If so, then I don't think there is anything to change here.

Confirmed.

> As a workaround, vdsm provides tunables which could make the scanning more
> responsive to these types of scenarios:
>
> [irs]
> volume_utilization_percent = 50
> volume_utilization_chunk_mb = 1024
> vol_size_sample_interval = 60
>
> As you can see, by default we only check once per minute if extension is
> required. You could specify a smaller interval in /etc/vdsm/vdsm.conf to
> check more frequently. Also, you could increase the chunk_mb value to 2048
> so that extensions are bigger each time.
>
> It might be interesting to detect when this scenario is happening and
> automatically boost those values for a limited time, but that would be
> something for 3.6 at the earliest.

That would be great. But still, the fact that a VM gets paused just because the user installs an OS on it gives the impression that something is broken in the system. I'd suggest increasing the volume_utilization_chunk_mb value if a high-speed HBA/NIC is used for accessing the storage. I encountered the issue while using a host that has a 4Gb FC HBA. It can be much more noticeable with higher bandwidth.

In light of Comment 3, I am converting this to an RFE. The code is currently working as designed but could be improved. In a future release, it might be nice to boost the vol_size_sample_interval and volume_utilization_chunk_mb parameters for VMs which are experiencing frequent pauses due to -EIO.

Created attachment 979345 [details]
logs
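The watermark logic behind the tunables discussed above (extend the volume by `volume_utilization_chunk_mb` whenever the guest has written past `volume_utilization_percent` of the current allocation) can be sketched roughly as follows. This is a simplified illustration, not vdsm's actual code; all function and variable names are hypothetical:

```python
# Simplified sketch of proactive thin-volume extension logic.
# Names are hypothetical; the real vdsm implementation differs.

def needs_extension(written_mb, allocated_mb, utilization_percent=50):
    """True once the guest has written into more than
    utilization_percent of the currently allocated space."""
    watermark_mb = allocated_mb * utilization_percent / 100.0
    return written_mb >= watermark_mb

def next_allocation(allocated_mb, chunk_mb=1024, capacity_mb=None):
    """Grow the allocation by one chunk, never past virtual capacity."""
    new_size = allocated_mb + chunk_mb
    if capacity_mb is not None:
        new_size = min(new_size, capacity_mb)
    return new_size
```

With the default values above, a 1024 MiB allocation that is more than half written would be grown to 2048 MiB on the next monitoring pass, which is why raising `volume_utilization_chunk_mb` (or checking more often) reduces the chance of the guest outrunning the allocation.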
(In reply to Adam Litke from comment #2)

> Thin provisioned block volumes can be extended two ways (proactive and
> reactive). Proactive extension is the preferred approach and the one you
> describe above. Under normal circumstances we are able to detect that the
> VM will run out of space soon and preemptively extend the volume. If we
> don't handle it in time then libvirt pauses the VM and raises an event. We
> then reactively perform the extension and unpause the VM.

Actually the reactive extension is pretty useless, since we check whether a drive needs extension every 2 seconds. On average, it can save 1 second, which is not important since the SPM checks for extension requests once every 3 seconds, and HSM checks for the extension reply once every 3 seconds.

> As a workaround, vdsm provides tunables which could make the scanning more
> responsive to these types of scenarios:

This is not a workaround, just tunables so we can tune the values in the field if needed.

> [irs]
> volume_utilization_percent = 50
> volume_utilization_chunk_mb = 1024
> vol_size_sample_interval = 60

vol_size_sample_interval is not related to extension and has no effect.

> As you can see, by default we only check once per minute if extension is
> required.

We check every 2 seconds; the relevant configuration is:

```
[vars]
# How often should we sample each vm for statistics (seconds).
vm_watermark_interval = 2
```

(Yes, the documentation is bogus.)

I think we should understand better why the VM got paused in this case before we plan fancy dynamic solutions. In the average case, the extension flow can look like this:

| Time | Operation |
|---|---|
| 00.00 | We detect low free space and send an extend request |
| 01.50 | The SPM gets the request and performs lvextend; this can block if many operations run in parallel due to LVM internal locks. Let's assume it took 0.5 seconds |
| 02.00 | The SPM sends the extension reply |
| 03.50 | HSM gets the extension reply and performs lvrefresh; this can block if many operations run in parallel due to LVM internal locks. Let's assume it took 0.5 seconds |
| 04.00 | The disk was extended |

In the worst case, it can take an extra 3 seconds, or just block forever if one of the LVM operations gets stuck :-) Maybe the issue is blocking LVM operations, like in bug 1195421?

Nir, I used your awesome new repoplot tool to examine the vdsm log and I am not seeing any signs of LVM contention for this case. I am pretty sure they are just installing from a very fast source and are hitting a few pauses along the way.

(In reply to Adam Litke from comment #7)

> I used your awesome new repoplot tool to examine the vdsm log and I am not
> seeing any signs of LVM contention for this case. I am pretty sure they are
> just installing from a very fast source and are hitting a few pauses along
> the way.

Can you attach your plot to the bug for inspection?

How about monitoring the write rate in vdsm? For example, keep a running average of the last 15 reads, and adapt the extension size based on the rate. With the current defaults, we check every 2 seconds and an extend takes 3-6 seconds, so our extend rate is about 1 GiB per 8 seconds. If the write rate is higher, we need to increase the extend rate. Using a running average, our extend rate will change nicely based on the actual write rate.

This is not an RFE, it's a bug. The code may be working as designed, but that design is causing this bug. Any change to the design should be to resolve this. Therefore, reverting the change.

Please answer the needinfo from Nir.

Created attachment 1239209 [details]
repoplot analysis
This repoplot from the vdsm logs shows normal storage conditions with acceptable lvm command latency. As I stated earlier, this bug is a classic case of a VM writing faster than our extension monitoring can keep up with.
For such write rates, we just need to change the default configuration. Pushing out.
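The running-average idea suggested above can be sketched as follows: keep a moving average of recent write-rate samples and size each extension chunk so one extend covers at least one full extend-latency window at that rate. This is a hypothetical illustration of the proposal, not vdsm code; all names and the 8-second latency figure (taken from the flow described above) are assumptions:

```python
from collections import deque

class AdaptiveChunk:
    """Sketch: adapt the extension chunk to the observed write rate.

    Keeps a running average of the last `window` write-rate samples
    (MiB/s) and sizes the chunk so that one extension covers at least
    `extend_latency_s` seconds of writing at that average rate.
    """

    def __init__(self, base_chunk_mb=1024, window=15, extend_latency_s=8.0):
        self.base_chunk_mb = base_chunk_mb
        self.extend_latency_s = extend_latency_s
        self.samples = deque(maxlen=window)  # bounded: old samples fall off

    def record(self, written_mb, interval_s):
        """Record one monitoring-pass sample of the write rate."""
        self.samples.append(written_mb / interval_s)

    def chunk_mb(self):
        """Chunk size to request on the next extension."""
        if not self.samples:
            return self.base_chunk_mb
        avg_rate = sum(self.samples) / len(self.samples)
        needed = avg_rate * self.extend_latency_s
        return max(self.base_chunk_mb, int(needed))
```

For example, a guest sustaining 256 MiB/s would get chunks of at least 2048 MiB instead of the default 1024 MiB, while an idle or slow guest keeps the default, so the extend rate tracks the actual write rate as suggested.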
Created attachment 953936 [details]
logs

Description of problem:
I have a VM with a thin provision disk attached to it. The disk is located on an FC storage domain. When writing to the disk from the guest, the VM enters the Paused state due to a "no space left" error. After a few seconds, the VM is resumed. It seems that the volume extension is not being done fast enough, which causes the VM disk to consume all the free space allocated to it on the storage before vdsm is able to extend it to the new actual size. (Encountered the bug on an IBM PPC environment.)

Version-Release number of selected component (if applicable):
vdsm-4.14.17-1.pkvm2_1.1.ppc64
libvirt-1.1.3-1.pkvm2_1.17.11.ppc64
qemu-1.6.0-2.pkvm2_1.17.10.ppc64
rhev 3.4.3
rhevm-3.4.3-1.2.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a VM with a thin provision disk attached, located on an FC domain
2. Install an OS on the guest

Actual results:
During OS installation, the VM enters the Paused state. It seems that the writing to the disk is being done too fast for vdsm, and it does not manage to extend the disk before it runs out of space.

The VM is stopped due to a no space error:

```
libvirtEventLoop::INFO::2014-11-05 07:20:46,696::vm::4602::vm.Vm::(_onIOError) vmId=`bf87e50e-b931-4504-b5c8-4d704369da34`::abnormal vm stop device virtio-disk1 error enospc
```

The extension request is sent after the VM was stopped:

```
libvirtEventLoop::DEBUG::2014-11-05 07:20:46,701::vm::2608::vm.Vm::(__extendDriveVolume) vmId=`bf87e50e-b931-4504-b5c8-4d704369da34`::Requesting an extension for the volume: {'newSize': 3072, 'domainID': 'e4556645-98aa-44e5-834b-bfa49279cbc2', 'name': u'vdb', 'volumeID': 'b21d482b-9a0e-42a6-937c-50c45d67b81b', 'imageID': 'd3aec65f-44dd-48f0-a9d9-c9dfd7586f97', 'poolID': '3e0aea33-e35d-4968-a30c-f69af942a328'}
```

Expected results:
The extension request should be sent before the disk runs out of space. The VM should not enter the Paused state.

Additional info:
I wasn't able to reproduce the issue on a RHEV 3.5 environment (with FC). It could be because of FC HBA speed differences between the 3.4.3 hosts and the 3.5 hosts.

Attaching:
engine.log
vdsm.log
supervdsm.log
libvirt.log
qemu log for the specific VM
messages
sanlock.log
audit logs
/usr/share/hwdata/pci.ids from the host for the FC HW info
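To see how quickly the extension request followed each ENOSPC pause, log lines like the ones quoted above can be scanned with a small script such as the sketch below. The timestamp format and search strings are taken from the quoted lines; the function name is hypothetical and the patterns would need adjusting for other vdsm versions:

```python
# Sketch: extract ENOSPC pause and extension-request events (with
# timestamps) from vdsm.log lines like those quoted in this report.
import re
from datetime import datetime

TS = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")

def enospc_events(lines):
    """Yield (timestamp, kind) pairs, kind being 'paused' or
    'extend-requested', in log order."""
    for line in lines:
        m = TS.search(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if "abnormal vm stop" in line and "enospc" in line:
            yield ts, "paused"
        elif "Requesting an extension for the volume" in line:
            yield ts, "extend-requested"
```

On the two lines quoted above, this would show the extension request being issued about 5 ms after the pause, confirming that the request went out only after the VM had already stopped.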