Created attachment 1540976 [details]
Increasing HE vm memory from UI attempt

Description of problem:
When trying to increase the default amount of memory allocated to the HE vm to 64GB, the memory stays at the default value (16GB). This issue is blocking us from proceeding with our regression tests.

We tried to increase the HE vm memory with several approaches:
1. From the UI --> edit HE vm --> System --> specify 65536 in the appropriate memory boxes.
2. Editing the /run/ovirt-hosted-engine-ha/vm.conf file on the host that runs the HE vm.
3. Making a custom vm.conf file.

None of the methods above worked for us. The absence of documentation for the correct memory-increase procedure makes the task even harder.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.1.1-0.1.el7.noarch

rpm -qa | grep rhv-release
rhv-release-4.3.1-3-001.noarch

How reproducible:
100%

Steps to Reproduce:
1. UI method: edit the HE vm --> System --> set "maximum memory", "guaranteed" and "memory" to 65536 --> click OK. See the video attached.
2. vm.conf method, on the host that runs the HE vm (spelled out as commands below):
   - Set maintenance mode to global and shut down the HE vm.
   - service ovirt-ha-broker stop (in case it was using the file).
   - vim /run/ovirt-hosted-engine-ha/vm.conf
   - Set the value of "memSize" to 65536 and save the file.
   - Start the HE vm and set maintenance mode back to 'none'.

Actual results:
1. UI method: after setting the memory values to the desired ones (65536), clicking the 'OK' button has no effect at all: no errors in the engine log, no notifications in the UI.
2. vm.conf method: shortly after the value is modified, it gets overwritten with the default one (memSize=16384).

Expected results:
UI method: should work, or at least produce a meaningful error or notification.
vm.conf method: the file shouldn't be overwritten, or at least we think so.
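For reference, the vm.conf procedure above spelled out as commands - a sketch using the standard hosted-engine CLI for the maintenance/lifecycle steps:

  # on the host currently running the HE vm
  hosted-engine --set-maintenance --mode=global
  hosted-engine --vm-shutdown
  service ovirt-ha-broker stop              # in case it was using the file
  vim /run/ovirt-hosted-engine-ha/vm.conf   # set memSize=65536 and save
  service ovirt-ha-broker start
  hosted-engine --vm-start
  hosted-engine --set-maintenance --mode=none

As far as we understand, /run/ovirt-hosted-engine-ha/vm.conf is only a runtime copy that ovirt-ha-agent regenerates from the OVF_STORE on the shared storage, which would explain why local edits get overwritten. If so, the shared copy is the one that needs changing; hosted-engine --set-shared-config looks like the intended tool for that, though the exact key name is an assumption on our part (presumably memSize with --type=vm - check with --get-shared-config first).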
Ilan, please attach the relevant logs. The failure is 'Balloon operation is not available':

2019-03-05 14:20:11,637+0000 ERROR (jsonrpc/7) [api] FINISH setBalloonTarget error=Balloon operation is not available (api:129)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 122, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 702, in setBalloonTarget
    return self.vm.setBalloonTarget(target)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5471, in setBalloonTarget
    raise exception.BalloonError(str(e))
BalloonError: Balloon operation is not available
2019-03-05 14:20:11,638+0000 INFO (jsonrpc/7) [api.virt] FINISH setBalloonTarget return={'status': {'message': 'Balloon operation is not available', 'code': 53}} from=::1,37694, vmId=a9a6eaa8-cd51-4d86-89b0-3f8b64dd9f21 (api:52)
2019-03-05 14:20:11,638+0000 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call VM.setBalloonTarget failed (error 53) in 0.00 seconds (__init__:573)
2019-03-05 14:20:12,576+0000 INFO (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:573)
2019-03-05 14:20:12,578+0000 INFO (jsonrpc/1) [api.host] START getCapabilities() from=::1,42324 (api:46)
2019-03-05 14:20:12,934+0000 INFO (jsonrpc/2) [vdsm.api] START getSpmStatus(spUUID=u'bae47151-c1ef-41f7-8361-ae4e1810cb90', options=None) from=::ffff:10.12.69.30,56166, task_id=baabb8a8-19fa-4bcb-bf48-a36be6089106 (api:46)
2019-03-05 14:20:12,940+0000 INFO (jsonrpc/2) [vdsm.api] FINISH getSpmStatus return={'spm_st': {'spmId': 1, 'spmStatus': 'SPM', 'spmLver': 4L}} from=::ffff:10.12.69.30,56166, task_id=baabb8a8-19fa-4bcb-bf48-a36be6089106 (api:52)
2019-03-05 14:20:12,940+0000 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call StoragePool.getSpmStatus succeeded in 0.00 seconds (__init__:573)
2019-03-05 14:20:13,051+0000 INFO (jsonrpc/3) [vdsm.api] START getStoragePoolInfo(spUUID=u'bae47151-c1ef-41f7-8361-ae4e1810cb90', options=None) from=::ffff:10.12.69.30,56196, task_id=b32bc65a-c26a-4caa-917d-c244807714b2 (api:46)

Possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=1523835
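The trace suggests vdsm cannot find a usable memballoon device on the HostedEngine domain. A quick way to check from the host - a read-only libvirt query; 'HostedEngine' is the usual domain name, adjust if yours differs:

  virsh -r dumpxml HostedEngine | grep -i memballoon

If that prints <memballoon model='none'/> (or nothing at all), setBalloonTarget would be expected to fail exactly as in the trace above.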
Simone, do we have a balloon driver in hosted engine?
Updating the BZ: we upgraded to 4.3.2-0.1.el7 downstream and we aren't currently seeing the issue reproduce, though more testing is required. I want to make at least a few more resource changes via the UI before closing this as not a bug - I currently can't do that because the environment is in use, and changing the resources would affect the vm resources and our monitoring results.

@dvaanunu - can you upgrade to the latest downstream on your 04 env, upgrade your HE vm and the host holding the vm (with reboot), and then see if you can change cpu/memory/memory guaranteed via the UI of hosted-engine?
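Since the UI previously swallowed the change silently, it may also be worth pushing the same update through the REST API, which usually returns an explicit fault on failure. A sketch, assuming the standard oVirt v4 endpoint; ENGINE_FQDN, PASSWORD and VM_ID are placeholders, and memory is specified in bytes:

  # look up the HE vm id
  curl -s -k -u admin@internal:PASSWORD \
      "https://ENGINE_FQDN/ovirt-engine/api/vms?search=name%3DHostedEngine"

  # request 64GB (64 * 1024^3 = 68719476736 bytes)
  curl -s -k -u admin@internal:PASSWORD -X PUT \
      -H "Content-Type: application/xml" \
      -d "<vm><memory>68719476736</memory></vm>" \
      "https://ENGINE_FQDN/ovirt-engine/api/vms/VM_ID"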
(In reply to Ryan Barry from comment #3)
> Simone, do we have a balloon driver in hosted engine?

The upstream appliance is based on CentOS 7.6 and the downstream one on RHEL 7.6, so, AFAIK, we don't need special drivers.
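The guest driver should indeed ship with EL 7.6, but a quick sanity check from inside the HE vm can confirm it (virtio_balloon is the standard module name; note it may also be built into the kernel, in which case lsmod shows nothing even though ballooning works):

  lsmod | grep virtio_balloon

Whether the balloon device is present in the domain XML on the host is a separate question - see the memballoon check suggested above.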
Restoring the other needinfo.
scale-04 env:
Host - rhv-release-4.3.2-1-001.noarch
Hosted Engine - rhv-release-4.3.2-1-001.noarch
Hosted Engine - 16GB memory

Changing Hosted-Engine memory via UI.

Before changes:
Memory Size - 16384 MB
Maximum memory - 65536 MB
Physical Memory Guaranteed - 16384 MB

ssh to Hosted-Engine:
[root@scale-hosted-engine-04 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:            15G        2.1G         12G         44M        552M         13G
Swap:          8.0G          0B        8.0G

Edit Hosted-Engine memory via UI to 32GB:
Memory Size - 32768 MB
Maximum memory - 65536 MB
Physical Memory Guaranteed - 32768 MB (changes automatically)
Press "OK"

[root@scale-hosted-engine-04 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:            31G        2.8G         28G         49M        589M         27G
Swap:          8.0G          0B        8.0G

Verify the change in UI --> Edit Hosted-Engine.

After changes:
Memory Size - 16384 MB
Maximum memory - 65536 MB
Physical Memory Guaranteed - 32768 MB
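Note the split result: the guest really did get the memory (free -h reports ~31G), yet "Memory Size" in the UI snapped back to 16384, i.e. the change seems to have been applied at runtime but not persisted. One way to see both values from the host side - a read-only libvirt query; 'HostedEngine' is the usual domain name, adjust if yours differs:

  virsh -r dominfo HostedEngine

If "Used memory" reports ~32 GiB while vm.conf's memSize (and therefore the next cold boot) still says 16384, that would match the behaviour above.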
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
I'm trying to increase mine from 2048 to 4096 MB, and I can confirm this bug is still present in 4.3.7.2.
Can you retest with the latest 4.4? https://bugzilla.redhat.com/show_bug.cgi?id=1523835 should fix it.
Certainly will try when 4.4 reaches GA.
*** This bug has been marked as a duplicate of bug 1523835 ***