Description of problem: While running up to 100 VMs on a VDSM host, 'Set new balloon target failed' errors appear; the problem seems to affect libvirt connectivity. In addition, the host counts those VMs as running when they are not: filtering the running VMs in the UI shows only 36 of them up, while connecting to the host and searching for the actual KVM processes shows yet another number, 84. It also takes a long time for the VMs to start. See the logs:

PolicyEngine::ERROR::2014-05-29 12:00:42,562::vm::4588::vm.Vm::(reportError) vmId=`1293b710-93fb-475f-b14e-33f8fefbec4a`::Set new balloon target failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 4600, in setBalloonTarget
    self._dom.setMemory(target)
  File "/usr/share/vdsm/vm.py", line 868, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1417, in setMemory
    if ret == -1: raise libvirtError ('virDomainSetMemory() failed', dom=self)
libvirtError: invalid argument: cannot set memory higher than max memory

PolicyEngine::DEBUG::2014-05-29 12:00:42,626::libvirtconnection::108::libvirtconnection::(wrapper) Unknown libvirterror: ecode: 8 edom: 10 level: 2 message: invalid argument: cannot set memory higher than max memory

PolicyEngine::ERROR::2014-05-29 12:00:42,641::vm::4588::vm.Vm::(reportError) vmId=`866f56a5-603c-48c0-b3b7-38a379ac4569`::Set new balloon target failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 4600, in setBalloonTarget
    self._dom.setMemory(target)
  File "/usr/share/vdsm/vm.py", line 868, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1417, in setMemory
    if ret == -1: raise libvirtError ('virDomainSetMemory() failed', dom=self)
libvirtError: invalid argument: cannot set memory higher than max memory

Version-Release number of selected component (if applicable): 3.3.3-0.52.el6ev (latest)

How reproducible: 100%

Steps to Reproduce:
1. Build a host with 100 VMs (VM configuration: 256 MB RAM, 1 CPU, balloon enabled).
2. Search the logs for the lines above.
3. The host displays the VMs as running, when in fact only 36 of them are running.

Actual results: only 36 of the 100 VMs are actually running, even though the host has enough memory to run all 100 VMs with 256 MB each.

Expected results:
- The balloon computation should use the correct free memory, or the estimated overcommit size.
- The host should not report a VM as running unless the actual KVM process is running.

Additional info: attached logs
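For context, the traceback shows virDomainSetMemory() rejecting any balloon target above the domain's maximum memory. As a minimal sketch (not VDSM's actual code; the helper name is mine), a caller could clamp the target before issuing the call, with all sizes in KiB to match libvirt's units:

```python
def clamp_balloon_target(target_kib, max_kib):
    """Clamp a requested balloon target to the domain's maximum memory.

    libvirt's virDomainSetMemory() fails with 'invalid argument: cannot
    set memory higher than max memory' when the target exceeds the
    maximum, so clamping first avoids the exception.  Sizes are in KiB.
    """
    return min(target_kib, max_kib)

# With the misconfigured values from this bug: a 1048576 KiB (1024 MB)
# minimum-guaranteed target against a 262144 KiB (256 MB) maximum.
clamp_balloon_target(1048576, 262144)  # returns 262144
```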
I found that the VMs which are not running yet have been in 'Waiting to launch' status for over 33 minutes. I tried to trace one of these VMs, but there is nothing in the logs besides the XML-RPC creation call.
Bug 1102701 - [Scale] - the VM status does not correlate between the engine and VDSM; probably a side effect.
Created attachment 900342 [details] vdsm log
Hardware configuration:

Host:
- 24 cores
- 64 GB RAM
- 1 TB disk
- 1 Gb network (~128 MB/s)
- NFS storage

VM:
- 256 MB RAM
- 1 vCPU
- 20 GB disk, thin provisioned

After 55 minutes it looks like all the VMs are running.
> -host should not report the vm as running unless the actual kvm process is running.

It doesn't; that is what the WaitForLaunch state is for. The exception should not happen, though: MOM should be aware that the KVM process has not been started yet.
Can you also attach /var/log/mom/mom.log please? MOM does not act on VMs that are not in the Up state, but in this case it seems that VDSM reports Up before the process actually starts.
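To illustrate the behavior described above (MOM only acting on VMs in the Up state), here is a rough sketch of such a filter; the 'status' field name is an assumption mirroring VDSM's per-VM stats, not MOM's actual code:

```python
def vms_eligible_for_ballooning(vm_stats):
    """Keep only VMs that VDSM reports as Up.

    vm_stats: list of per-VM stat dicts; the 'status' key is assumed
    to mirror what VDSM reports (e.g. 'Up', 'WaitForLaunch').
    """
    return [vm for vm in vm_stats if vm.get('status') == 'Up']

# A VM still in WaitForLaunch would be skipped:
stats = [{'vmId': 'a', 'status': 'Up'},
         {'vmId': 'b', 'status': 'WaitForLaunch'}]
[vm['vmId'] for vm in vms_eligible_for_ballooning(stats)]  # ['a']
```

The bug described here would defeat such a filter: if VDSM reports Up before the KVM process exists, MOM has no way to exclude the VM.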
Also the two VMs from that exception are not present in the VDSM log at all. We need both logs to correlate to be able to debug this.
Created attachment 900395 [details] mom logs
Created attachment 900398 [details] vdsm _1
Created attachment 900402 [details] Wrong memory configuration

This screenshot explains the data I got from the server. vdsClient -s 0 getAllVmStats reported the following for the affected VMs:

balloonInfo = {'balloon_max': '262144', 'balloon_min': '1048576', 'balloon_target': '262144', 'balloon_cur': '262144'}

You might notice that the minimum is bigger than the maximum. That is caused by a wrong configuration of the VM: the memory size was configured to be 256 MB and the minimum guaranteed memory to be 1024 MB, which is not a valid configuration.
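A quick way to spot this kind of misconfiguration in the getAllVmStats output is to compare the two bounds. A small sketch, assuming the values are KiB-sized strings as printed by vdsClient (the helper name is mine, not a VDSM API):

```python
def balloon_info_is_valid(info):
    """Check a VDSM balloonInfo dict for consistency.

    Values are strings holding sizes in KiB, as printed by
    'vdsClient -s 0 getAllVmStats'.  The minimum guaranteed memory
    must not exceed the configured memory size.
    """
    return int(info['balloon_min']) <= int(info['balloon_max'])

# The values reported for the affected VMs:
bad = {'balloon_max': '262144', 'balloon_min': '1048576',
       'balloon_target': '262144', 'balloon_cur': '262144'}
balloon_info_is_valid(bad)  # False: 1024 MB minimum > 256 MB maximum
```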
Created attachment 900414 [details] Situation after attempt to reproduce on 3.5

I tried to reproduce this on the 3.5 snapshot, and the dialog does not allow entering the invalid values.
This issue was caused by a misconfiguration that is no longer possible; it was fixed in 3.4. *** This bug has been marked as a duplicate of bug 1045131 ***