Bug 1794304

Summary: correctly configured VM fails on start with error: CPU IDs in <numa> exceed the <vcpu> count.
Product: [oVirt] ovirt-engine
Component: BLL.Virt
Version: 4.4.0
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: unspecified
Reporter: Polina <pagranat>
Assignee: Michal Skrivanek <michal.skrivanek>
QA Contact: meital avital <mavital>
Docs Contact:
CC: bugs, rbarry
Target Milestone: ---
Target Release: ---
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-01-23 16:01:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1437559

Attachments: logs (flags: none)

Description Polina 2020-01-23 08:55:48 UTC
Created attachment 1654785 [details]
logs

Description of problem: the VM fails to start with ERROR: EVENT_ID: VM_DOWN_ERROR(119), VM vm is down with error. Exit message: internal error: CPU IDs in <numa> exceed the <vcpu> count.

Version-Release number of selected component (if applicable):
http://bob-dr.lab.eng.brq.redhat.com/builds/4.4/rhv-4.4.0-14

How reproducible: 100%


Steps to Reproduce:
1. Configure the VM with 5 CPUs (5 Virtual Sockets, 1 Core per Virtual Socket, 1 Thread per Core) and 2 NUMA nodes. Start the VM and check in the engine log that the NUMA configuration is correct:
     <numa>
      <cell id='0' cpus='0-7,16-71' memory='524288' unit='KiB'/>
      <cell id='1' cpus='8-15,72-127' memory='524288' unit='KiB'/>
    </numa>


2. Reconfigure the VM to have 16 CPUs (2 Virtual Sockets, 2 Cores per Virtual Socket, 4 Threads per Core). Leave the NUMA node count at 2. Restart the VM (see the arithmetic sketch after these steps).
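
For reference, the vCPU count libvirt derives from <topology> is sockets x cores x threads, and every CPU ID listed in a <numa> cell must be smaller than that count. A minimal Python sketch of this arithmetic (the cell ranges are taken from the XML quoted under "Actual results" below; the helper names are illustrative, not engine code):

    def vcpu_count(sockets, cores, threads):
        # libvirt derives the vCPU count from the <topology> element
        return sockets * cores * threads

    def cell_ids_valid(cells, vcpus):
        # every CPU ID referenced by a <numa> cell must be < vcpus
        return all(cpu < vcpus for cpus in cells for cpu in cpus)

    # Step 2 topology: 2 sockets x 2 cores x 4 threads = 16 vCPUs
    vcpus = vcpu_count(sockets=2, cores=2, threads=4)

    # Stale cell ranges carried over from the old configuration
    # (cpus="3-4,16-77" and cpus="0-2,78-138" in the generated XML)
    stale_cells = [range(3, 5), range(16, 78), range(0, 3), range(78, 139)]
    # A fresh split of CPUs 0-15 across the two NUMA nodes
    fresh_cells = [range(0, 8), range(8, 16)]

    print(cell_ids_valid(stale_cells, vcpus))   # False -> libvirt rejects the domain
    print(cell_ids_valid(fresh_cells, vcpus))   # True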

Actual results: the VM fails on startup with ERROR: EVENT_ID: VM_DOWN_ERROR(119), VM vm is down with error. Exit message: internal error: CPU IDs in <numa> exceed the <vcpu> count.
In engine.log:
    ...
    <cpu match="exact">
    <model>Westmere</model>
    <topology cores="2" threads="4" sockets="16"/>
    <numa>
      <cell id="1" cpus="3-4,16-77" memory="524288"/>
      <cell id="0" cpus="0-2,78-138" memory="524288"/>
    </numa>
    </cpu>
    ...

Expected results: the VM starts and the generated XML contains:
    <numa>
      <cell id='0' cpus='0-7,16-71' memory='524288' unit='KiB'/>
      <cell id='1' cpus='8-15,72-127' memory='524288' unit='KiB'/>
    </numa>

Additional info: 
In the attached logs:
2020-01-23 10:07:07,864+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-9) [2b14e989] EVENT_ID: VM_DOWN_ERROR(119), VM vm is down with error. Exit message: internal error: CPU IDs in <numa> exceed the <vcpu> count.

Comment 1 Michal Skrivanek 2020-01-23 09:01:54 UTC
When you change the topology, the NUMA assignment/reservation is not necessarily valid anymore. It should probably be dropped and recreated on any such change.
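
A hedged sketch of what "drop and recreate" could look like: on any topology change, discard the stored per-cell CPU ranges and re-split CPU IDs 0..vcpus-1 across the configured NUMA nodes (illustrative only; the even split and the function name are assumptions, not the engine's actual implementation):

    def recreate_numa_cells(vcpus, numa_nodes):
        # Drop any stored per-cell CPU ranges and re-split 0..vcpus-1
        # as evenly as possible across the configured NUMA nodes.
        base, extra = divmod(vcpus, numa_nodes)
        cells, start = [], 0
        for node in range(numa_nodes):
            count = base + (1 if node < extra else 0)
            cells.append({"id": node, "cpus": list(range(start, start + count))})
            start += count
        return cells

    # After the step-2 change (16 vCPUs, 2 NUMA nodes) this yields
    # cell 0 -> CPUs 0-7 and cell 1 -> CPUs 8-15, which satisfies libvirt's check.
    print(recreate_numa_cells(vcpus=16, numa_nodes=2))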

Comment 2 Ryan Barry 2020-01-23 16:01:11 UTC
This is part of testing rhbz#1437559, not a blocker. Closing. Let's resolve it there.

*** This bug has been marked as a duplicate of bug 1437559 ***