Created attachment 927425 [details]
vdsm+engine logs

Description of problem:
When running a VM with thin-provisioned disks on a domain with a free-space deficit (disk virtual size > domain's free space), vdsm writes to the VM's disk until it fails, producing multiple tracebacks:

9f9e08df-1e64-4c7b-a073-05eaeb11af51::ERROR::2014-08-17 11:39:37,021::storage_mailbox::172::Storage.SPM.Messages.Extend::(processRequest) processRequest: Exception caught while trying to extend volume: b574fb8b-f1f3-4bf8-b082-078888fd2627 in domain: dd4677a6-8d22-4d6b-837f-c5556dd15ab2
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storage_mailbox.py", line 166, in processRequest
    pool.extendVolume(volume['domainID'], volume['volumeID'], size)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1300, in extendVolume
    sdCache.produce(sdUUID).extendVolume(volumeUUID, size, isShuttingDown)
  File "/usr/share/vdsm/storage/blockSD.py", line 1315, in extendVolume
    lvm.extendLV(self.sdUUID, volumeUUID, size)  # , isShuttingDown)
  File "/usr/share/vdsm/storage/lvm.py", line 1143, in extendLV
    _resizeLV("lvextend", vgName, lvName, size)
  File "/usr/share/vdsm/storage/lvm.py", line 1137, in _resizeLV
    free_size / constants.MEGAB))
VolumeGroupSizeError: Volume Group not big enough: ('dd4677a6-8d22-4d6b-837f-c5556dd15ab2/b574fb8b-f1f3-4bf8-b082-078888fd2627 4096 > 512 (MiB)',)

861c4431-b605-4ae3-9421-092e7cc3fc0c::ERROR::2014-08-17 11:39:38,217::task::866::Storage.TaskManager.Task::(_setError) Task=`b3b2eaea-00e2-4b5b-992d-fdec1d24b9bf`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/virt/vm.py", line 2630, in __afterVolumeExtension
    apparentSize, trueSize = self.__verifyVolumeExtension(volInfo)
  File "/usr/share/vdsm/virt/vm.py", line 2603, in __verifyVolumeExtension
    (volInfo['name'], volInfo['domainID'], volInfo['volumeID']))

861c4431-b605-4ae3-9421-092e7cc3fc0c::ERROR::2014-08-17 11:39:38,221::threadPool::209::Storage.ThreadPool.WorkerThread::(_processNextTask) Task <function runTask at 0x20b9848> failed
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/threadPool.py", line 201, in _processNextTask
    cmd(args)
  File "/usr/share/vdsm/storage/storage_mailbox.py", line 79, in runTask
    ctask.prepare(cmd, *args)
  File "/usr/share/vdsm/storage/task.py", line 103, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1179, in prepare
    raise self.error
RuntimeError: Volume extension failed for vdb (domainID: dd4677a6-8d22-4d6b-837f-c5556dd15ab2, volumeID: b574fb8b-f1f3-4bf8-b082-078888fd2627)

Version-Release number of selected component (if applicable):
rc1

How reproducible:
100%

Steps to Reproduce:
Setup:
- a domain with 3 GB of free space
- a VM with an OS disk plus a thin-provisioned disk (virtual size > 3 GB), where the thin disk is on the free-space-deficit domain

1. Run engine-config -s FreeSpaceCriticalLowInGB=1
2. Run service ovirt-engine restart
3. Run the VM and dd to its disk until multiple failures occur

Actual results:
vdsm fails to extend the volume; the VM crashes shortly after.

Expected results:
1. The engine should not allow FreeSpaceCriticalLowInGB to be set to negative values (already tracked as bug 1130030).
2. The engine should check the VM disk's virtual size and compare it to the domain's threshold correctly before running the VM.

Additional info:
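For illustration, the failure mode in the first traceback can be sketched as follows. This is a minimal, hypothetical reconstruction of the kind of check vdsm's _resizeLV performs (names like resize_lv and the exact message format here are illustrative, not the actual vdsm source): an LV extension request that exceeds the volume group's remaining free space is rejected with VolumeGroupSizeError instead of running lvextend.

```python
# Hypothetical sketch of the free-space check behind the
# "Volume Group not big enough" error above; not actual vdsm code.

MEGAB = 2 ** 20  # bytes per MiB, as in vdsm's constants module


class VolumeGroupSizeError(Exception):
    pass


def resize_lv(vg_name, lv_name, requested_mb, vg_free_bytes):
    """Refuse to extend an LV beyond the VG's remaining free space."""
    free_mb = vg_free_bytes // MEGAB
    if requested_mb > free_mb:
        raise VolumeGroupSizeError(
            "Volume Group not big enough: ('%s/%s %d > %d (MiB)',)"
            % (vg_name, lv_name, requested_mb, free_mb))
    # ...otherwise an "lvextend" command would run here...
    return requested_mb


# Reproduces the shape of the logged failure: a 4096 MiB extension
# requested while only 512 MiB remain free in the VG.
try:
    resize_lv("dd4677a6-8d22-4d6b-837f-c5556dd15ab2",
              "b574fb8b-f1f3-4bf8-b082-078888fd2627",
              4096, 512 * MEGAB)
except VolumeGroupSizeError as e:
    print(e)  # prints the "Volume Group not big enough" message
```

Once the domain is genuinely out of space, no retry of this path can succeed, which is why the extension requests keep failing until the VM pauses or crashes.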
(In reply to Ori from comment #0)
> Actual results:
> vdsm fail to extend the volume,vm crashes shortly after

If there's no space on the domain, there's no way to extend this disk. What do you expect to happen here?

> Expected results:
> 1.engine should not allow FreeSpaceCriticalLowInGB to be set to negative
> values (there's already a bug on it BZ # 1130030 )

Agreed - handled as part of bug 1130030.

> 2.engine should check vm's disk's virtual size and compare it to domain's
> threshold correctly before running the vm

I disagree. IMO, this is the point of over-committing. If the admin is willing to overcommit in this fashion, I see no reason to prevent the VM from running. Since this is essentially an SLA decision - Scott, your call.
(In reply to Allon Mureinik from comment #1)
> > 2.engine should check vm's disk's virtual size and compare it to domain's
> > threshold correctly before running the vm
> I disagree. IMO, this is the point of over-committing. If the admin is
> willing to overcommit in this fashion, I see no reason in preventing the VM
> from running. Since this is essentially an SLA decision - Scott, your call.

Closing based on this comment. Scott, if you feel this is wrong, please reopen and specify the required behavior.
This is considered a feature, not a bug: it allows users to overcommit storage and plan future storage additions on an as-needed basis. We also display the allocated space and overcommit ratio in the UI, so the user is informed of this status. No action item here.
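To make the overcommit figure mentioned above concrete, here is a small sketch (a hypothetical helper, not oVirt engine code) of the kind of ratio the UI can display for a storage domain: the total virtual size of its thin-provisioned disks divided by the domain's actual capacity, where a value above 1.0 means the domain is overcommitted.

```python
# Hypothetical illustration of a storage-domain overcommit ratio;
# not the actual oVirt engine calculation.

def overcommit_ratio(total_virtual_gb, domain_capacity_gb):
    """Return virtual/physical ratio; > 1.0 means overcommitted."""
    return total_virtual_gb / domain_capacity_gb


# Roughly the situation in this report: a thin disk whose virtual
# size exceeds the 3 GB of space available on the domain.
ratio = overcommit_ratio(total_virtual_gb=10, domain_capacity_gb=3)
print("overcommitted" if ratio > 1.0 else "ok")
```

Displaying this ratio lets the admin decide deliberately to overcommit and add physical storage later, which is the behavior being defended in the comments above.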