Bug 1304387 - HE VM hot plug is not working, it's missing the real max number of cpu definition
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: BLL.HostedEngine
Version: 3.6.2.6
Hardware: x86_64 Linux
Priority: high  Severity: medium
Target Milestone: ovirt-4.0.4
Target Release: 4.0.4
Assigned To: Martin Sivák
QA Contact: Artyom
Keywords: Triaged
Depends On: 1326817
Blocks:
Reported: 2016-02-03 08:21 EST by Artyom
Modified: 2017-05-11 05:27 EDT (History)
9 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: We did not set the maximum allowed number of CPUs for the hosted engine VM. Consequence: Hosted engine CPU hotplug was not working. Fix: The maximum number of CPUs is properly computed and passed to the hosted engine configuration. Result: Hosted engine CPU hotplug should work now.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-26 08:32:34 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
ykaul: ovirt‑4.0.z+
mgoldboi: planning_ack+
msivak: devel_ack+
mavital: testing_ack+


Attachments
engine and vdsm logs (1.57 MB, application/zip)
2016-02-03 08:21 EST, Artyom
no flags
vdsm and engine logs (2.51 MB, application/zip)
2016-09-04 09:48 EDT, Artyom
no flags


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 55541 master MERGED HE: Store maximum number of vcpus to the OVF file 2016-07-13 09:23 EDT
oVirt gerrit 55818 master MERGED Ignore memory and CPU hotplug errors for hosted engine VM 2016-04-11 03:18 EDT
oVirt gerrit 55932 ovirt-engine-3.6 MERGED Ignore memory and CPU hotplug errors for hosted engine VM 2016-04-11 07:00 EDT
oVirt gerrit 55938 ovirt-engine-3.6.5 MERGED Ignore memory and CPU hotplug errors for hosted engine VM 2016-04-11 08:05 EDT
oVirt gerrit 61307 ovirt-engine-4.0 MERGED HE: Store maximum number of vcpus to the OVF file 2016-07-31 09:12 EDT

Description Artyom 2016-02-03 08:21:26 EST
Created attachment 1120783 [details]
engine and vdsm logs

Description of problem:
CPU hotplug does not work for the HE VM; it fails with the error message:
Message: Failed to hot set number of CPUS to VM HostedEngine. Underlying error message: invalid argument: requested vcpus is greater than max allowable vcpus for the domain: 4 > 2

Version-Release number of selected component (if applicable):
rhevm-3.6.2.6-0.1.el6.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy hosted-engine
2. Log in to the engine and attach a storage domain (to start the auto-import process)
3. Update the HE VM's socket count while the VM is running (hotplug)

Actual results:
CPU hotplug fails on the HE VM with the error above

Expected results:
CPU hotplug works for the HE VM


Additional info:
virsh dumpxml shows <vcpu placement='static'>2</vcpu>
Comment 1 Red Hat Bugzilla Rules Engine 2016-02-17 06:40:22 EST
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.
Comment 2 Artyom 2016-04-13 12:10:00 EDT
Checked on rhevm-3.6.5.3-0.1.el6.noarch
CPU hotplug still does not work for HE VM, with the same error.
After the HE VM restarted I can see the new CPU values, but this bug concerns CPU hotplug, so I am moving it back to ASSIGNED.
Comment 3 Red Hat Bugzilla Rules Engine 2016-04-13 12:10:05 EDT
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Comment 4 Yaniv Kaul 2016-04-14 03:40:38 EDT
Moving to 3.6.6, this is certainly not a blocker.
Comment 5 Red Hat Bugzilla Rules Engine 2016-04-17 03:59:14 EDT
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.
Comment 6 Sandro Bonazzola 2016-05-02 06:00:00 EDT
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.
Comment 7 Yaniv Lavi 2016-05-23 09:16:22 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 8 Yaniv Lavi 2016-05-23 09:20:12 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 10 Red Hat Bugzilla Rules Engine 2016-07-25 06:33:55 EDT
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Comment 11 Artyom 2016-09-04 09:48 EDT
Created attachment 1197639 [details]
vdsm and engine logs

Checked on rhevm-4.0.4-0.1.el7ev.noarch

1) Deploy HE
2) Add storage domain to the engine
3) Edit the number of HE VM CPUs from 2 to 4 - I still receive an error in the engine and the HE VM still has only two CPUs

I can see a number of problems when the agent creates the HE VM:
1) HE VM: <vcpu placement='static'>2</vcpu>
   Regular VM: <vcpu placement='static' current='1'>16</vcpu>
2) HE VM: <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
   Regular VM: <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type>
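The two <vcpu> forms above explain the failure: without a current attribute, the element's value (2) is both the boot-time and the maximum vCPU count, so libvirt rejects any hotplug beyond 2. A stdlib-only sketch (the XML snippets are copied from this comment, not from the attached logs) showing how the max/current values parse out:

```python
import xml.etree.ElementTree as ET

def vcpu_limits(vcpu_xml):
    """Return (current, max) vCPUs from a <vcpu> element string."""
    elem = ET.fromstring(vcpu_xml)
    maximum = int(elem.text)
    # Without a 'current' attribute the VM boots with the maximum,
    # leaving no headroom for CPU hotplug.
    current = int(elem.get('current', maximum))
    return current, maximum

# HE VM as deployed: max == current == 2, so hotplug to 4 must fail.
print(vcpu_limits("<vcpu placement='static'>2</vcpu>"))               # (2, 2)
# Regular engine-created VM: boots with 1, can hotplug up to 16.
print(vcpu_limits("<vcpu placement='static' current='1'>16</vcpu>"))  # (1, 16)
```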
Comment 12 Red Hat Bugzilla Rules Engine 2016-09-04 09:48:29 EDT
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Comment 13 Martin Sivák 2016-09-05 09:39:21 EDT
(In reply to Artyom from comment #11)
> Created attachment 1197639 [details]
> vdsm and engine logs
> 
> Checked on rhevm-4.0.4-0.1.el7ev.noarch
> 
> 1) Deploy HE
> 2) Add storage domain to the engine
> 3) Edit number of the HE VM CPU's from 2 to 4 - still receive error in the
> engine and HE VM still has only two CPU's


Artyom, how long did you wait for the CPU setting to propagate? It might take up to an hour unless you modify the OVF sync interval in engine-config.
Comment 14 Artyom 2016-09-05 10:44:27 EDT
I waited 5 minutes, but I thought we already had the fix that updates the OVF immediately, without needing to wait for the OVF sync interval.
Comment 15 Martin Sivák 2016-09-05 10:49:59 EDT
> I waited 5 minutes, but I though we already have the fix that updates OVF
> straight forward without need to wait for OVF interval

We do, but it does not work properly :/

Can you please retest this with OvfUpdateIntervalInMinutes set to one minute?
Comment 16 Artyom 2016-09-05 11:26:18 EDT
I have an HE VM with 2 CPUs.

1) Put the HE environment into global maintenance
2) Update the OVF interval to one minute - # engine-config -s OvfUpdateIntervalInMinutes=1
3) Restart the engine - # systemctl restart ovirt-engine
4) Disable global maintenance
5) Update the HE VM's number of CPUs to 4 - I still see the error from libvirt:
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2559, in setNumberOfCpus
    libvirt.VIR_DOMAIN_AFFECT_CURRENT)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 916, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2433, in setVcpusFlags
    if ret == -1: raise libvirtError ('virDomainSetVcpusFlags() failed', dom=self)
libvirtError: invalid argument: requested vcpus is greater than max allowable vcpus for the domain: 4 > 2
6) Wait 5 minutes
7) Check the HE VM's number of CPUs:
# lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
...


Some additional info: I can see that the OVF was updated correctly (excerpt):
...umber of virtual CPU</rasd:Description>
  <rasd:InstanceId>1</rasd:InstanceId>
  <rasd:ResourceType>3</rasd:ResourceType>
  <rasd:num_of_sockets>4</rasd:num_of_sockets>
  <rasd:cpu_per_socket>1</rasd:cpu_per_socket>
  <rasd:threads_per_cpu>1</rasd:threads_per_cpu>
  <rasd:max_num_of_vcpus>16</rasd:max_num_of_vcpus>
</Item>
<Item>
  <rasd:Caption>4096 MB of memory</rasd:Caption>
  <rasd:Description>Memory Size</rasd:Description>
  <rasd:InstanceId>2</rasd:InstanceId>
  <rasd:ResourceType>4</rasd:ResourceType>
  <rasd:AllocationUnits>MegaBytes</rasd:AllocationUnits>
  <rasd:VirtualQuantity>4096</rasd:VirtualQuantity>
</Item>
<Item>
  <rasd:Caption>virtio-disk0</rasd:Caption>
  <rasd:InstanceId>af6e9e52-3f0f-443d-83f5-465539c7506d</rasd:InstanceId>
  <ras...
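The fix (gerrit 55541) stores rasd:max_num_of_vcpus in the OVF, which is the value visible in the excerpt above. A stdlib-only sketch of reading it back out of a single OVF <Item>; the namespace URI is an assumption based on the standard CIM ResourceAllocationSettingData schema, and the cut-down ITEM string is illustrative, not a verbatim copy of the log:

```python
import xml.etree.ElementTree as ET

# Assumed rasd namespace (standard CIM ResourceAllocationSettingData).
RASD = ('http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/'
        'CIM_ResourceAllocationSettingData')

def max_vcpus_from_ovf_item(item_xml):
    """Extract rasd:max_num_of_vcpus from one OVF <Item>, or None."""
    item = ET.fromstring(item_xml)
    node = item.find('{%s}max_num_of_vcpus' % RASD)
    return int(node.text) if node is not None else None

# Cut-down version of the CPU Item from the OVF excerpt above.
ITEM = ('<Item xmlns:rasd="%s">'
        '<rasd:num_of_sockets>4</rasd:num_of_sockets>'
        '<rasd:max_num_of_vcpus>16</rasd:max_num_of_vcpus>'
        '</Item>' % RASD)

print(max_vcpus_from_ovf_item(ITEM))  # 16
```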

OK, it looks like I found the problem:
Just after deployment, the HE VM has an incorrect value, <vcpu placement='static'>2</vcpu> (I believe this persists until you restart the HE VM with the updated OVF).
Comment 17 Artyom 2016-09-05 11:31:21 EDT
So I will verify this bug and open another one about starting the HE VM without the OVF (just after HE deployment).

Verified on rhevm-4.0.4-0.1.el7ev.noarch

1) Deploy HE
2) Wait until the engine generates the HE OVF
3) Enable global maintenance
4) Restart the HE VM
5) Update the HE VM's number of CPUs to 4
6) Check that the HE VM has the correct number of CPUs
