Bug 1794304 - correctly configured VM fails on start with error: CPU IDs in <numa> exceed the <vcpu> count.
Keywords:
Status: CLOSED DUPLICATE of bug 1437559
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.4.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Michal Skrivanek
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks: 1437559
Reported: 2020-01-23 08:55 UTC by Polina
Modified: 2020-01-23 16:01 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-01-23 16:01:11 UTC
oVirt Team: SLA
Embargoed:


Attachments
logs (1.35 MB, application/gzip)
2020-01-23 08:55 UTC, Polina

Description Polina 2020-01-23 08:55:48 UTC
Created attachment 1654785
logs

Description of problem: The VM fails on start with ERROR: EVENT_ID: VM_DOWN_ERROR(119), VM vm is down with error. Exit message: internal error: CPU IDs in <numa> exceed the <vcpu> count.

Version-Release number of selected component (if applicable):
http://bob-dr.lab.eng.brq.redhat.com/builds/4.4/rhv-4.4.0-14

How reproducible: 100%


Steps to Reproduce:
1. Configure the VM with 5 CPUs (5 Virtual Sockets, 1 Core per Virtual Socket, 1 Thread per Core) and 2 NUMA nodes. Start the VM and check that the engine log shows a correct NUMA configuration:
     <numa>
      <cell id='0' cpus='0-7,16-71' memory='524288' unit='KiB'/>
      <cell id='1' cpus='8-15,72-127' memory='524288' unit='KiB'/>
    </numa>


2. Reconfigure the VM to have 16 CPUs (2 Virtual Sockets, 2 Cores per Virtual Socket, 4 Threads per Core). Leave the NUMA node count at 2. Restart the VM.

Actual results: The VM fails on startup with ERROR: EVENT_ID: VM_DOWN_ERROR(119), VM vm is down with error. Exit message: internal error: CPU IDs in <numa> exceed the <vcpu> count.
In engine.log:
    ...
    <cpu match="exact">
    <model>Westmere</model>
    <topology cores="2" threads="4" sockets="16"/>
    <numa>
      <cell id="1" cpus="3-4,16-77" memory="524288"/>
      <cell id="0" cpus="0-2,78-138" memory="524288"/>
    </numa>
    </cpu>
    ...
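
The mismatch in the fragment above can be checked by hand or with a short script. The following is only a minimal sketch, not libvirt's or the engine's actual validation code: it assumes the maximum vCPU count equals sockets * cores * threads from the <topology> element (the <vcpu> element itself is not shown in the log excerpt), and the helper names parse_cpuset and cells_exceeding are made up for illustration. It also ignores the "^" exclusion syntax that libvirt CPU lists allow.

```python
import xml.etree.ElementTree as ET


def parse_cpuset(spec):
    """Expand a CPU list such as '0-2,78-138' into a set of ints.

    Only plain IDs and 'lo-hi' ranges are handled ('^' exclusions are not).
    """
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            ids.update(range(int(lo), int(hi) + 1))
        else:
            ids.add(int(part))
    return ids


def cells_exceeding(cpu_xml, max_vcpus=None):
    """Return (max_vcpus, {cell id: [CPU IDs >= max_vcpus]}).

    If max_vcpus is not given, it is ASSUMED to be
    sockets * cores * threads from <topology>.
    """
    root = ET.fromstring(cpu_xml)
    if max_vcpus is None:
        topo = root.find("topology")
        max_vcpus = (int(topo.get("sockets"))
                     * int(topo.get("cores"))
                     * int(topo.get("threads")))
    bad = {}
    for cell in root.iter("cell"):
        over = sorted(i for i in parse_cpuset(cell.get("cpus"))
                      if i >= max_vcpus)
        if over:
            bad[cell.get("id")] = over
    return max_vcpus, bad


# The <cpu> fragment as it appears in engine.log in this report.
CPU_XML = """<cpu match="exact">
<model>Westmere</model>
<topology cores="2" threads="4" sockets="16"/>
<numa>
  <cell id="1" cpus="3-4,16-77" memory="524288"/>
  <cell id="0" cpus="0-2,78-138" memory="524288"/>
</numa>
</cpu>"""

max_vcpus, bad = cells_exceeding(CPU_XML)
print("max vCPUs (assumed from topology):", max_vcpus)
for cell_id, ids in bad.items():
    print(f"cell {cell_id}: IDs {ids[0]}-{ids[-1]} are out of range")
```

Under that assumption the topology allows IDs 0-127, so cell 0 (which goes up to 138) is the one that trips the check, while cell 1 stays in range.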

Expected results: The VM starts and the XML contains:
    <numa>
      <cell id='0' cpus='0-7,16-71' memory='524288' unit='KiB'/>
      <cell id='1' cpus='8-15,72-127' memory='524288' unit='KiB'/>
    </numa>

Additional info: 
In the attached logs
2020-01-23 10:07:07,864+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-9) [2b14e989] EVENT_ID: VM_DOWN_ERROR(119), VM vm is down with error. Exit message: internal error: CPU IDs in <numa> exceed the <vcpu> count.

Comment 1 Michal Skrivanek 2020-01-23 09:01:54 UTC
When you change the topology, the NUMA assignment/reservation is not necessarily valid anymore. It should probably be dropped and recreated on any such change.
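
To illustrate what "dropped and recreated" could mean in practice, here is a hypothetical sketch that rebuilds the per-node CPU lists from scratch for a given vCPU count. It is not the engine's algorithm: it simply splits the IDs into contiguous, near-equal ranges, so it will not reproduce the interleaved layout shown under "Expected results" in the description. The function names split_vcpus and format_range are invented for this example.

```python
def format_range(lo, hi):
    """Render an inclusive ID range in CPU-list syntax."""
    return str(lo) if lo == hi else f"{lo}-{hi}"


def split_vcpus(total_vcpus, node_count):
    """Recreate NUMA cell CPU lists for a new vCPU count.

    ASSUMPTION: contiguous, near-equal split; a real policy may differ.
    Every returned ID is < total_vcpus, so the result cannot exceed
    the vCPU count.
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    if total_vcpus < node_count:
        raise ValueError("need at least one vCPU per NUMA node")
    base, extra = divmod(total_vcpus, node_count)
    cells = []
    start = 0
    for node in range(node_count):
        # The first `extra` nodes take one additional vCPU.
        size = base + (1 if node < extra else 0)
        cells.append(format_range(start, start + size - 1))
        start += size
    return cells


# Recompute instead of reusing the old lists after a topology change.
print(split_vcpus(5, 2))
print(split_vcpus(128, 2))
```

The point of the sketch is only that the lists are derived from the current vCPU count on every topology change, rather than carried over from the previous configuration.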

Comment 2 Ryan Barry 2020-01-23 16:01:11 UTC
This is part of testing rhbz#1437559, not a blocker. Closing. Let's resolve it there.

*** This bug has been marked as a duplicate of bug 1437559 ***

