Bug 1685569 - [scale] HE - unable to increase HE vm memory to 64GB
Summary: [scale] HE - unable to increase HE vm memory to 64GB
Keywords:
Status: CLOSED DUPLICATE of bug 1523835
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent (1 vote)
Target Milestone: ovirt-4.4.0
Assignee: Liran Rotenberg
QA Contact: Ilan Zuckerman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-05 14:15 UTC by Ilan Zuckerman
Modified: 2020-06-25 11:06 UTC
CC List: 10 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-03-19 12:32:11 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+
pm-rhel: blocker?


Attachments
Increasing HE vm memory from UI attempt (2.60 MB, video/webm), 2019-03-05 14:15 UTC, Ilan Zuckerman

Description Ilan Zuckerman 2019-03-05 14:15:12 UTC
Created attachment 1540976 [details]
Increasing HE vm memory from UI attempt

Description of problem:
When trying to increase the default amount of memory allocated to the HE VM to 64GB, the memory stays at the default value (16GB).
This issue is blocking us from proceeding with our regression tests.

We tried to increase the HE VM memory with several different approaches:
1. From the UI --> edit the HE VM --> System --> specify 65536 in the appropriate memory boxes.
2. Editing the /run/ovirt-hosted-engine-ha/vm.conf file on the host that runs the HE VM.
3. Creating a custom vm.conf file.

None of the methods listed above worked for us.
The absence of documentation for the correct memory-increase procedure makes the task even harder.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.1.1-0.1.el7.noarch

rpm -qa | grep rhv-release
rhv-release-4.3.1-3-001.noarch

How reproducible:
100%

Steps to Reproduce:
1. UI method: Edit the HE VM --> System --> set "Maximum Memory", "Physical Memory Guaranteed", and "Memory" to 65536 --> click OK. See the attached video.

2. Editing vm.conf on the host that runs the HE VM (see the command sketch after these steps):
set maintenance mode to global,
turn off the HE VM,
stop the ovirt-ha-broker service (in case it is using the file),
edit /run/ovirt-hosted-engine-ha/vm.conf,
set the value of "memSize" to 65536 and save the file,
start the HE VM and set maintenance mode back to 'none'.
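
Roughly, these steps map to commands like the following (a sketch only, run on the host holding the HE VM; check hosted-engine --help for the exact option names on your version):

    hosted-engine --set-maintenance --mode=global
    hosted-engine --vm-shutdown
    systemctl stop ovirt-ha-broker            # in case the broker is holding the file
    vim /run/ovirt-hosted-engine-ha/vm.conf   # set memSize=65536
    hosted-engine --vm-start
    hosted-engine --set-maintenance --mode=none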

Actual results:
1. UI method: after setting the memory values to the desired ones (65536), clicking the 'OK' button gives no result at all: no errors in the engine log, no notifications in the UI.

2. vm.conf editing method: shortly after the value is modified, it gets overwritten back to the default (memSize=16384).

Expected results:
UI method: it should work, or at least give a meaningful error or notification.
vm.conf method: the file shouldn't be overwritten, or at least we think it shouldn't.
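
As an alternative path worth noting, the same change could in principle be attempted through the REST API instead of the UI; a minimal sketch (engine FQDN, credentials and VM id are placeholders; memory values are in bytes, 68719476736 = 64GiB):

    curl -k -u admin@internal:PASSWORD \
         -H 'Content-Type: application/xml' -H 'Accept: application/xml' \
         -X PUT \
         -d '<vm><memory>68719476736</memory><memory_policy><guaranteed>68719476736</guaranteed></memory_policy></vm>' \
         'https://ENGINE_FQDN/ovirt-engine/api/vms/VM_ID'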

Comment 1 mlehrer 2019-03-05 14:28:15 UTC
Ilan, please attach the relevant logs for the 'Balloon operation is not available' failure:

2019-03-05 14:20:11,637+0000 ERROR (jsonrpc/7) [api] FINISH setBalloonTarget error=Balloon operation is not available (api:129)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 122, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 702, in setBalloonTarget
    return self.vm.setBalloonTarget(target)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5471, in setBalloonTarget
    raise exception.BalloonError(str(e))
BalloonError: Balloon operation is not available
2019-03-05 14:20:11,638+0000 INFO  (jsonrpc/7) [api.virt] FINISH setBalloonTarget return={'status': {'message': 'Balloon operation is not available', 'code': 53}} from=::1,37694, vmId=a9a6eaa8-cd51-4d86-89b0-3f8b64dd9f21 (api:52)
2019-03-05 14:20:11,638+0000 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call VM.setBalloonTarget failed (error 53) in 0.00 seconds (__init__:573)
2019-03-05 14:20:12,576+0000 INFO  (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:573)
2019-03-05 14:20:12,578+0000 INFO  (jsonrpc/1) [api.host] START getCapabilities() from=::1,42324 (api:46)
2019-03-05 14:20:12,934+0000 INFO  (jsonrpc/2) [vdsm.api] START getSpmStatus(spUUID=u'bae47151-c1ef-41f7-8361-ae4e1810cb90', options=None) from=::ffff:10.12.69.30,56166, task_id=baabb8a8-19fa-4bcb-bf48-a36be6089106 (api:46)
2019-03-05 14:20:12,940+0000 INFO  (jsonrpc/2) [vdsm.api] FINISH getSpmStatus return={'spm_st': {'spmId': 1, 'spmStatus': 'SPM', 'spmLver': 4L}} from=::ffff:10.12.69.30,56166, task_id=baabb8a8-19fa-4bcb-bf48-a36be6089106 (api:52)
2019-03-05 14:20:12,940+0000 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call StoragePool.getSpmStatus succeeded in 0.00 seconds (__init__:573)
2019-03-05 14:20:13,051+0000 INFO  (jsonrpc/3) [vdsm.api] START getStoragePoolInfo(spUUID=u'bae47151-c1ef-41f7-8361-ae4e1810cb90', options=None) from=::ffff:10.12.69.30,56196, task_id=b32bc65a-c26a-4caa-917d-c244807714b2 (api:46)

Possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=1523835
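
To check whether the HE VM actually exposes a memory balloon device, something along these lines on the host holding the VM should show it (a sketch; the libvirt domain name for the hosted engine is assumed to be HostedEngine):

    # read-only libvirt connection, no authentication required
    virsh -r dumpxml HostedEngine | grep -A3 memballoon
    virsh -r dommemstat HostedEngine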

Comment 3 Ryan Barry 2019-03-12 11:50:55 UTC
Simone, do we have a balloon driver in hosted engine?

Comment 4 mlehrer 2019-03-12 12:17:39 UTC
Updating the BZ: we upgraded to 4.3.2-0.1.el7 downstream and we aren't currently seeing the issue reproduce, though more testing is required.
I want to make at least a few more resource changes via the UI before closing this as not a bug. I currently can't do that because the environment is in use, and changing the resources would affect the VM resources and our monitoring results.

@dvaanunu - can you upgrade to the latest downstream on your 04 env, upgrade your HE VM and the host holding the VM (with a reboot), and then see if you can change CPU/memory/memory guaranteed for the hosted engine via the UI?

Comment 5 Simone Tiraboschi 2019-03-12 12:38:25 UTC
(In reply to Ryan Barry from comment #3)
> Simone, do we have a balloon driver in hosted engine?

The upstream appliance is based on CentOS 7.6 and the downstream one on RHEL 7.6, so, AFAIK, we don't need special drivers.

Comment 6 Simone Tiraboschi 2019-03-12 12:39:36 UTC
Restoring the other needinfo flag.

Comment 7 David Vaanunu 2019-03-17 07:55:55 UTC
scale-04 env:
   Host - rhv-release-4.3.2-1-001.noarch
   Hosted Engine - rhv-release-4.3.2-1-001.noarch

Hosted Engine - 16GB Memory
Changing the Hosted-Engine memory via the UI:

Before Changes:
    Memory Size - 16384 MB
    Maximum memory - 65536 MB
    Physical Memory Guaranteed - 16384 MB

    ssh to Hosted-Engine:
             [root@scale-hosted-engine-04 ~]# free -h
                      total        used        free      shared  buff/cache   available
       Mem:            15G        2.1G         12G         44M        552M         13G
       Swap:          8.0G          0B        8.0G

Editing the Hosted-Engine memory to 32GB using the UI:
    Memory Size - 32768 MB
    Maximum memory - 65536 MB
    Physical Memory Guaranteed - 32768 MB (changed automatically)
    Press "OK"

             [root@scale-hosted-engine-04 ~]# free -h
                      total        used        free      shared  buff/cache   available
      Mem:            31G        2.8G         28G         49M        589M         27G
      Swap:          8.0G          0B        8.0G

Verifying the change in the UI --> Edit Hosted-Engine Memory:

After Changes:
    Memory Size - 16384 MB
    Maximum memory - 65536 MB
    Physical Memory Guaranteed - 32768 MB
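
For completeness, the configured versus running memory can also be cross-checked against what libvirt reports on the host (a sketch; domain name assumed to be HostedEngine, values reported in KiB):

    virsh -r dumpxml HostedEngine | grep -E '<memory|<currentMemory'
    virsh -r dominfo HostedEngine | grep -i memory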

Comment 9 RHEL Program Management 2019-04-09 03:06:21 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 10 Clint Goudie 2019-12-08 17:14:57 UTC
I'm trying to increase mine from 2048 to 4096, and I can confirm this bug still occurs in 4.3.7.2.

Comment 11 Liran Rotenberg 2020-01-15 11:43:42 UTC
Can you retest it using the latest 4.4?
https://bugzilla.redhat.com/show_bug.cgi?id=1523835 should fix it.

Comment 14 Clint Goudie 2020-01-15 16:16:35 UTC
Certainly will try when 4.4 reaches GA.

Comment 15 Ryan Barry 2020-03-19 12:32:11 UTC

*** This bug has been marked as a duplicate of bug 1523835 ***

