Created attachment 1239107
logs (you can start looking from the date "Tue Jan 10 15:15:00")

Description of problem:
Update of the HE VM does not work

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-2.1.0-0.0.master.20170105095417.20170105095414.git017505b.el7.centos.noarch
ovirt-hosted-engine-setup-2.1.0-0.0.master.20170104124556.git776e0f1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy HE and add a master storage domain to the engine to initiate the auto-import process (HE VM has 4GB)
2. Enable global maintenance
3. Update OvfUpdateIntervalInMinutes to 1 minute (# engine-config -s OvfUpdateIntervalInMinutes=1 && systemctl restart ovirt-engine)
4. Update HE VM memory to 6GB
5. Wait 5 minutes (to be sure that the OVF is updated)
6. Restart the HE VM (on the host # hosted-engine --vm-poweroff && hosted-engine --vm-start)
7. Check the amount of memory on the HE VM guest OS

Actual results:
Guest OS has 4GB of memory

Expected results:
Guest OS has 6GB of memory

Additional info:
I also tried to reduce the number of CPUs and it also does not work
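For step 7, one way to check the guest memory is to read MemTotal from /proc/meminfo inside the HE VM. The snippet below is a minimal sketch, not part of the original report; the helper name mem_total_gib is made up for illustration. Note that MemTotal is usually a little lower than the configured memory because the kernel reserves some of it.

```python
def mem_total_gib(meminfo_text: str) -> float:
    """Return MemTotal from /proc/meminfo-style text, converted to GiB.

    Hypothetical helper for illustration; /proc/meminfo reports the
    value in kB (1 kB = 1024 bytes here).
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])
            return kib / (1024 * 1024)
    raise ValueError("MemTotal not found")


# Sample input shaped like /proc/meminfo (values invented for the example).
sample = "MemTotal:        6291456 kB\nMemFree:         1048576 kB\n"
print(f"{mem_total_gib(sample):.2f} GiB")
```

On the guest itself the same function can be fed the real file, for example mem_total_gib(open("/proc/meminfo").read()).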
The agent log below shows an overrun. Martin, care to review?

1. 4GiB before the change:
MainThread::DEBUG::2017-01-10 15:16:30,408::config::448::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_file_content_from_shared_storage) Reading 'vm.conf' from '/rhev/data-center/mnt/10.35.110.11:_Compute__NFS_alukiano_he__2/c33f3f2a-22ec-4355-aa1c-978579065b02/images/59951756-4398-46aa-92aa-dcb441dae05e/53bfd738-0e61-4779-b04a-2675b490c181'
MainThread::DEBUG::2017-01-10 15:16:30,409::heconflib::73::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_dd_pipe_tar) executing: 'dd if=/rhev/data-center/mnt/10.35.110.11:_Compute__NFS_alukiano_he__2/c33f3f2a-22ec-4355-aa1c-978579065b02/images/59951756-4398-46aa-92aa-dcb441dae05e/53bfd738-0e61-4779-b04a-2675b490c181 bs=4k'
...
MainThread::DEBUG::2017-01-10 15:16:30,425::heconflib::74::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_dd_pipe_tar) executing: 'tar -xOf - vm.conf'
MainThread::DEBUG::2017-01-10 15:16:30,438::heconflib::92::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_dd_pipe_tar) stdout:
vmId=d53e5737-22ac-44f9-bb10-3006dee22b05
memSize=4096

2. Then we see the update:
MainThread::INFO::2017-01-10 15:16:44,004::config::409::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
MainThread::DEBUG::2017-01-10 15:16:44,009::ovf2VmParams::243::root::(confFromOvf) conf is
cpuType=Conroe
emulatedMachine=pc-i440fx-rhel7.3.0
...
memSize=6144

3. Then it's back to 4GiB:
MainThread::DEBUG::2017-01-10 15:16:44,025::heconflib::142::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(extractConfFile) extracting 'vm.conf' from '/rhev/data-center/mnt/10.35.110.11:_Compute__NFS_alukiano_he__2/c33f3f2a-22ec-4355-aa1c-978579065b02/images/59951756-4398-46aa-92aa-dcb441dae05e/53bfd738-0e61-4779-b04a-2675b490c181'
...
MainThread::DEBUG::2017-01-10 15:16:44,025::heconflib::74::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_dd_pipe_tar) executing: 'tar -xOf - vm.conf'
MainThread::DEBUG::2017-01-10 15:16:44,038::heconflib::92::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_dd_pipe_tar) stdout:
vmId=d53e5737-22ac-44f9-bb10-3006dee22b05
memSize=4096
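To compare the two configurations quoted in the log, the key=value text can be parsed and the memSize values placed side by side. This is a minimal sketch under the assumption that vm.conf holds one key=value pair per line; parse_vm_conf is a hypothetical helper, not a function from ovirt-hosted-engine-ha.

```python
def parse_vm_conf(text: str) -> dict:
    """Parse key=value lines into a dict (hypothetical helper).

    Only the first '=' splits key from value. Repeated keys overwrite
    each other, so this is only suitable for scalar settings such as
    memSize.
    """
    conf = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            conf[key] = value
    return conf


# Values taken from the log excerpts in this comment.
from_vm_conf = parse_vm_conf(
    "vmId=d53e5737-22ac-44f9-bb10-3006dee22b05\nmemSize=4096\n"
)
from_ovf = parse_vm_conf(
    "cpuType=Conroe\nemulatedMachine=pc-i440fx-rhel7.3.0\nmemSize=6144\n"
)

print("vm.conf memSize:", from_vm_conf["memSize"])
print("OVF memSize:   ", from_ovf["memSize"])
```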
Artyom, the time diff (0.021 sec) suggests it may be a race. How often does it reproduce?
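The 0.021 sec figure appears to be the gap between the "Found an OVF for HE VM" line (15:16:44,004) and the extractConfFile line (15:16:44,025) in the log quoted earlier. A small sketch of that arithmetic, assuming the agent log timestamp format is "date time,milliseconds":

```python
from datetime import datetime

# Assumed timestamp layout of the agent log; %f accepts 1-6 digits
# and pads on the right, so "004" is read as 4 milliseconds.
FMT = "%Y-%m-%d %H:%M:%S,%f"


def agent_ts(stamp: str) -> datetime:
    """Parse an agent-log timestamp such as '2017-01-10 15:16:44,004'."""
    return datetime.strptime(stamp, FMT)


ovf_found = agent_ts("2017-01-10 15:16:44,004")
vm_conf_extract = agent_ts("2017-01-10 15:16:44,025")
delta = (vm_conf_extract - ovf_found).total_seconds()
print(f"{delta:.3f} sec")
```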
Artyom, you do not need to do step 3 anymore:

3. Update OvfUpdateIntervalInMinutes to 1 minute (# engine-config -s OvfUpdateIntervalInMinutes=1 && systemctl restart ovirt-engine)

But even if you do, what do you see in the webadmin UI? And does this happen when you leave the update interval set to 60s?
To Doron:
I tried it 3 times on different setups and hit the same issue every time, so it is 100% reproducible.

To Martin:
1. In the UI I can see that the HE VM has the updated configuration (e.g. memory equal to 6GB)
2. I tried it first with the default OVF update interval; when it did not work, I changed the OVF update interval to 1 minute.
Doron, your comment talks about two different files: vm.conf and the OVF store. We have both in the shared storage and only the OVF store is updated (vm.conf is the original configuration as used by setup). The OVF store has contained the proper value since the update and never reverts. So the bug is probably only in the part that decides which file will be used to start the VM.
And that is here:

MainThread::INFO::2017-01-10 15:16:44,009::config::414::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
MainThread::DEBUG::2017-01-10 15:16:44,010::config::448::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_file_content_from_shared_storage) Reading 'vm.conf' from '/rhev/data-center/mnt/10.35.110.11:_Compute__NFS_alukiano_he__2/c33f3f2a-22ec-4355-aa1c-978579065b02/images/59951756-4398-46aa-92aa-dcb441dae05e/53bfd738-0e61-4779-b04a-2675b490c181'

Here we know that the OVF value was properly extracted. And yet the "if not content:" check evaluates to true and the code continues to read the fallback config file.

ovirt_hosted_engine_ha/env/config.py:239
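The intended behaviour described here, where the fallback vm.conf is read only when nothing usable came out of the OVF store, can be sketched as follows. This is an illustration only, not the code from config.py; choose_vm_conf and read_fallback are hypothetical names.

```python
from typing import Callable, Optional, Tuple


def choose_vm_conf(
    ovf_content: Optional[str],
    read_fallback: Callable[[], str],
) -> Tuple[str, str]:
    """Return (content, source), preferring the OVF-derived configuration.

    read_fallback is only called when ovf_content is None or empty,
    which is the behaviour an "if not content:" guard is meant to give.
    """
    content = ovf_content
    source = "OVF_STORE"
    if not content:
        content = read_fallback()
        source = "vm.conf"
    return content, source


# With OVF content available, the shared-storage vm.conf is never read.
conf, source = choose_vm_conf("memSize=6144", lambda: "memSize=4096")
print(source, conf)
```

In the log above both the "Got vm.conf from OVF_STORE" and the "Reading 'vm.conf'" messages appear within a millisecond of each other, which is what such a guard should rule out.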
The code I cited was a bit old, but in the new code the issue was just slightly better hidden. We properly used the OVF content, but then overwrote it when we tried to publish it to the /var/run cache file.
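A minimal sketch of the behaviour one would expect from the publish step: whatever content was selected is written to the local cache file unchanged. The function names and the cache path handling are assumptions for illustration; the actual fix in ovirt-hosted-engine-ha may look quite different.

```python
import os
import tempfile


def publish_vm_conf(content: str, cache_path: str) -> None:
    """Write content to cache_path via a temp file and an atomic rename."""
    directory = os.path.dirname(cache_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(content)
    os.replace(tmp_path, cache_path)


def refresh_local_conf(ovf_content, fallback_content, cache_path):
    """Select the configuration once, then publish exactly that selection."""
    content = ovf_content if ovf_content else fallback_content
    # The selected content is passed straight through; nothing between
    # the selection and the write may reassign it.
    publish_vm_conf(content, cache_path)
    return content


# Example run against a throwaway directory instead of /var/run.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "vm.conf")
    refresh_local_conf("memSize=6144\n", "memSize=4096\n", path)
    with open(path) as cached:
        print(cached.read().strip())
```

A check of this shape (publish, then read the cache file back and compare memSize) would have caught the regression described in this comment.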
The problem still exists under ovirt-hosted-engine-ha-2.1.0-1.el7ev.noarch
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Verified on ovirt-hosted-engine-ha-2.1.0.1-1.el7ev.noarch
1) Memory update - PASS
2) CPU update - PASS
3) Add additional NIC - PASS