Bug 1102664 - [Scale] - Set new balloon target failed on large scale deployment, and host count vms as running when they not
Keywords:
Status: CLOSED DUPLICATE of bug 1045131
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.3.0
Hardware: x86_64
OS: Linux
unspecified
urgent
Target Milestone: ---
: ---
Assignee: Martin Sivák
QA Contact:
URL:
Whiteboard: sla
Depends On:
Blocks:
 
Reported: 2014-05-29 12:16 UTC by Eldad Marciano
Modified: 2019-10-10 09:22 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-05-29 15:43:58 UTC
oVirt Team: SLA
Target Upstream Version:
Embargoed:


Attachments
vdsm log (753.86 KB, application/x-gzip)
2014-05-29 12:46 UTC, Eldad Marciano
mom logs (287.19 KB, application/zip)
2014-05-29 14:55 UTC, Eldad Marciano
vdsm _1 (10.72 MB, application/zip)
2014-05-29 14:57 UTC, Eldad Marciano
Wrong memory configuration (15.76 KB, image/png)
2014-05-29 15:21 UTC, Martin Sivák
Situation after attempt to reproduce on 3.5 (53.18 KB, image/png)
2014-05-29 15:38 UTC, Martin Sivák

Description Eldad Marciano 2014-05-29 12:16:00 UTC
Description of problem:
With up to 100 VMs running on vdsm, 'Set new balloon target failed' is logged.
It looks like the problem affects libvirt connectivity.


In addition, the host counts those VMs as running when they are not.
I filtered for running VMs in the UI and only 36 of them were running.
I also connected to the host and searched for the actual kvm processes, and found a different number: 84.
It also takes a long time to get them running.
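
One way to get the host-side number is to count the hypervisor processes. A rough sketch, assuming the process name is `qemu-kvm` and that the input is the output of `ps -eo comm=` (both are assumptions, not taken from this report); `count_kvm_processes` is a hypothetical helper:

```python
def count_kvm_processes(ps_output, name='qemu-kvm'):
    """Count lines of `ps -eo comm=` output that equal the given name."""
    return sum(1 for line in ps_output.splitlines() if line.strip() == name)


# Illustrative sample, not real output from the affected host.
sample = "init\nqemu-kvm\nlibvirtd\nqemu-kvm\nvdsm\n"
print(count_kvm_processes(sample))
```

The result can then be compared with the number of VMs the UI shows as running.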


see the logs:
PolicyEngine::ERROR::2014-05-29 12:00:42,562::vm::4588::vm.Vm::(reportError) vmId=`1293b710-93fb-475f-b14e-33f8fefbec4a`::Set new balloon target failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 4600, in setBalloonTarget
    self._dom.setMemory(target)
  File "/usr/share/vdsm/vm.py", line 868, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1417, in setMemory
    if ret == -1: raise libvirtError ('virDomainSetMemory() failed', dom=self)
libvirtError: invalid argument: cannot set memory higher than max memory
PolicyEngine::DEBUG::2014-05-29 12:00:42,626::libvirtconnection::108::libvirtconnection::(wrapper) Unknown libvirterror: ecode: 8 edom: 10 level: 2 message: invalid argument: cannot set memory higher than max memory
PolicyEngine::ERROR::2014-05-29 12:00:42,641::vm::4588::vm.Vm::(reportError) vmId=`866f56a5-603c-48c0-b3b7-38a379ac4569`::Set new balloon target failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 4600, in setBalloonTarget
    self._dom.setMemory(target)
  File "/usr/share/vdsm/vm.py", line 868, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1417, in setMemory
    if ret == -1: raise libvirtError ('virDomainSetMemory() failed', dom=self)
libvirtError: invalid argument: cannot set memory higher than max memory
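
The libvirt error says the requested balloon target was higher than the domain's maximum memory. A minimal sketch of a caller-side guard, assuming a domain object that exposes `maxMemory()` and `setMemory()` (both in KiB) as the libvirt Python bindings do; `FakeDomain` and `set_balloon_target_checked` are hypothetical names for illustration and are not the vdsm code shown in the traceback:

```python
class FakeDomain(object):
    """Stand-in for a libvirt domain so the sketch runs without libvirt."""

    def __init__(self, max_kib):
        self._max_kib = max_kib
        self.current_kib = max_kib

    def maxMemory(self):
        return self._max_kib

    def setMemory(self, kib):
        # Mimics the failure seen in the log: targets above the maximum fail.
        if kib > self._max_kib:
            raise ValueError('cannot set memory higher than max memory')
        self.current_kib = kib


def set_balloon_target_checked(dom, target_kib):
    """Reject a target above the domain maximum before calling setMemory."""
    if target_kib > dom.maxMemory():
        return False
    dom.setMemory(target_kib)
    return True


# Illustrative values in KiB.
dom = FakeDomain(max_kib=262144)
print(set_balloon_target_checked(dom, 1048576))  # target above the maximum
print(set_balloon_target_checked(dom, 131072))   # target within the maximum
```

Whether such a guard belongs in the caller or in the policy that computes the target is a design choice this sketch does not settle.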

Version-Release number of selected component (if applicable):
3.3.3-0.52.el6ev (latest)

How reproducible:
100%

Steps to Reproduce:
1. Build a host with 100 VMs (VM configuration: 256 MB RAM, 1 CPU, balloon enabled).
2. Search the logs for the lines shown above.
3. The host displays the VMs as running, when in fact only 36 of them are running.

Actual results:
There is enough memory, so all VMs should be able to run: 100 VMs with 256 MB each.

Expected results:
-The balloon computation should use the correct free memory or the estimated overcommit size.
-The host should not report a VM as running unless the actual kvm process is running.


Additional info:
attached logs

Comment 1 Eldad Marciano 2014-05-29 12:27:32 UTC
I found that the VMs which are not running yet have been in 'Waiting to launch' status for over 33 min.


I tried to follow one of these VMs, but there is nothing in the logs besides the xmlrpc creation call.

Comment 2 Eldad Marciano 2014-05-29 12:42:58 UTC
Bug 1102701 - [Scale] - there no correlation about the vm status between the engine to vdsm

Probably a side effect.

Comment 3 Eldad Marciano 2014-05-29 12:46:47 UTC
Created attachment 900342 [details]
vdsm log

Comment 4 Eldad Marciano 2014-05-29 12:51:27 UTC
Hardware configuration:

Host:
-24 cores
-64 GB RAM
-1 TB disk
-1 Gb network / ~128 MB/s
-NFS storage

VM:
-256 MB RAM
-1 vCPU
-20GB Disk | thin provision



After 55 min it looks like all the VMs are running.

Comment 5 Martin Sivák 2014-05-29 13:15:05 UTC
-host should not report the vm as running unless the actual kvm process is running.

It doesn't; that is what WaitForLaunch is for.

The exception should not happen, though. MOM should be aware that the KVM process has not been started yet.

Comment 6 Martin Sivák 2014-05-29 13:21:31 UTC
Can you also attach /var/log/mom/mom.log please? MOM is not acting on VMs with non-Up state.

But in this case it seems that VDSM reports Up before the process actually starts.

Comment 7 Martin Sivák 2014-05-29 13:25:59 UTC
Also, the two VMs from that exception are not present in the VDSM log at all. We need both logs so that we can correlate them and debug this.

Comment 9 Eldad Marciano 2014-05-29 14:55:48 UTC
Created attachment 900395 [details]
mom logs

Comment 10 Eldad Marciano 2014-05-29 14:57:18 UTC
Created attachment 900398 [details]
vdsm _1

Comment 11 Martin Sivák 2014-05-29 15:21:31 UTC
Created attachment 900402 [details]
Wrong memory configuration

This screenshot explains the data I got from the server:

vdsClient -s 0 getAllVmStats reported the following for the affected VMs:

balloonInfo = {'balloon_max': '262144', 'balloon_min': '1048576', 'balloon_target': '262144', 'balloon_cur': '262144'}

You might notice that the minimum is bigger than the maximum. That is caused by a wrong configuration of the VM: the memory size was configured to be 256MB and the minimum guaranteed memory to be 1024MB. That is not a valid configuration.
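
The inconsistency can also be checked mechanically. A minimal sketch, assuming the balloonInfo values are integers in KiB encoded as strings (262144 KiB = 256 MB and 1048576 KiB = 1024 MB, which matches the configuration described above); `balloon_info_problems` is a hypothetical helper, not part of vdsm or MOM:

```python
def balloon_info_problems(info):
    """Return a list of human-readable problems found in a balloonInfo dict.

    Hypothetical helper: assumes the four keys shown in the getAllVmStats
    output, each holding an integer encoded as a string.
    """
    lo = int(info['balloon_min'])
    hi = int(info['balloon_max'])
    problems = []
    if lo > hi:
        problems.append('balloon_min (%d) > balloon_max (%d)' % (lo, hi))
    for key in ('balloon_target', 'balloon_cur'):
        value = int(info[key])
        if value > hi:
            problems.append('%s (%d) > balloon_max (%d)' % (key, value, hi))
    return problems


# Values reported by getAllVmStats for the affected VMs.
affected = {'balloon_max': '262144', 'balloon_min': '1048576',
            'balloon_target': '262144', 'balloon_cur': '262144'}
print(balloon_info_problems(affected))
```

An empty list would mean the minimum, target and current values all fit under the maximum.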

Comment 12 Martin Sivák 2014-05-29 15:38:19 UTC
Created attachment 900414 [details]
Situation after attempt to reproduce on 3.5

I tried to reproduce on the 3.5 snapshot, and the dialog does not allow entering the invalid values.

Comment 13 Martin Sivák 2014-05-29 15:43:58 UTC
This issue was caused by a misconfiguration that is no longer possible; this was fixed in 3.4.

*** This bug has been marked as a duplicate of bug 1045131 ***

