Bug 1568148

Summary: libvirt: vpx:// driver does not get CPU vendor, model or topology from VMware server
Product: Red Hat Enterprise Linux 7
Reporter: Kedar Kulkarni <kkulkarn>
Component: libvirt
Assignee: Pino Toscano <ptoscano>
Status: CLOSED ERRATA
QA Contact: tingting zheng <tzheng>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.5
CC: bthurber, dyuan, fdupont, hkataria, jdenemar, kkulkarn, lavenel, lmen, mpovolny, mxie, obarenbo, ptoscano, rjones, tzheng, xuzhang
Target Milestone: beta
Keywords: Upstream
Target Release: 7.6
Hardware: Unspecified
OS: Unspecified
Whiteboard: V2V
Fixed In Version: libvirt-4.3.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-30 09:53:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1541908
Attachments:
Description                        Flags
PreMigration                       none
PostMigration                      none
comment12-v2v.log                  none
window2016_8-sockets_1-core.vmx    none

Description Kedar Kulkarni 2018-04-16 20:34:36 UTC
Description of problem:
The VM's CPU socket and core counts differ after migration. Please see the attached screenshots.

Version-Release number of selected component (if applicable):
master.20180411234937_802d601

How reproducible:
100%

Steps to Reproduce:
1. Migrate a VM that has more than one CPU core and more than one socket.

Actual results:
The CPU socket x CPU core count is incorrect; see the screenshots.

Expected results:
CPU cores and sockets should remain the same after migration.

Additional info:

Comment 2 Kedar Kulkarni 2018-04-16 20:36:12 UTC
Created attachment 1422731 [details]
PreMigration

Comment 3 Kedar Kulkarni 2018-04-16 20:36:34 UTC
Created attachment 1422732 [details]
PostMigration

Comment 5 Pino Toscano 2018-04-17 06:53:28 UTC
Please provide the logs for virt-v2v.

Comment 8 Richard W.M. Jones 2018-04-17 14:47:06 UTC
The source metadata (from libvirt) doesn't have any information about
CPU vendor, model or topology, only the number of cores.

What libvirt gave us:

<domain type='vmware' xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
  <name>v2v-windows-kkulkarn</name>
  <uuid>42009372-17da-be73-779d-007ccf1bd228</uuid>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <vcpu placement='static'>16</vcpu>
  <cputune>
    <shares>16000</shares>
  </cputune>
  <os>
    <type arch='x86_64'>hvm</type>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <disk type='file' device='disk'>
      <source file='[name redacted].vmdk'/>
      <target dev='sda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='scsi' index='0' model='lsisas1068'/>
    <interface type='bridge'>
      <mac address='00:50:56:80:b3:81'/>
      <source bridge='VM Network'/>
      <model type='vmxnet3'/>
    </interface>
    <video>
      <model type='vmvga' vram='8192' primary='yes'/>
    </video>
  </devices>
  <vmware:datacenterpath>Datacenter</vmware:datacenterpath>
  <vmware:moref>vm-7737</vmware:moref>
</domain>


What virt-v2v parses from the libvirt XML:

    source name: v2v-windows-kkulkarn
hypervisor type: vmware
         memory: 17179869184 (bytes)
       nr vCPUs: 16
   CPU features: 
       firmware: unknown
        display: 
          video: vmvga
          sound: 
disks:
        nbd:unix:/var/tmp/vddk.h3Mb23/nbdkit1.sock:exportname=/ (raw) [scsi]
removable media:

NICs:
        Bridge "VM Network" mac: 00:50:56:80:b3:81 [vmxnet3]
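
A quick way to confirm the missing data is to look for a <cpu><topology> element in the dumped XML. The following is only a minimal Python sketch, assuming the XML is available as a string; the function name get_topology is illustrative and is not part of libvirt or virt-v2v.

```python
import xml.etree.ElementTree as ET


def get_topology(domain_xml):
    """Return (sockets, cores, threads) from libvirt domain XML, or None if absent."""
    root = ET.fromstring(domain_xml)
    topo = root.find("./cpu/topology")
    if topo is None:
        return None
    return tuple(int(topo.get(k)) for k in ("sockets", "cores", "threads"))


# Trimmed copy of the XML above: only <vcpu> is present, there is no <cpu> element.
xml_without_topology = """
<domain type='vmware'>
  <name>v2v-windows-kkulkarn</name>
  <vcpu placement='static'>16</vcpu>
</domain>
"""
print(get_topology(xml_without_topology))  # None
```

With the XML shown in this comment the function returns None, which matches the empty topology in what virt-v2v parsed.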

Comment 9 Richard W.M. Jones 2018-04-17 14:52:06 UTC
BTW if you just want something that works, virt-v2v -i vmx can parse
at least #sockets and #cores.  However it appears that #threads, together
with CPU vendor and model, is simply not available in VMware or else
it's not stored in the obvious fashion in the .vmx file.

Comment 10 Pino Toscano 2018-04-17 15:09:48 UTC
Would it be possible to fetch the .vmx file of that guest? Check in the datastore that contains the VM itself.

Comment 12 kuwei@redhat.com 2018-04-19 06:30:00 UTC
With the builds below I can reproduce comment 8:
virt-v2v-1.36.10-6.el7_5.1.x86_64
libvirt-3.9.0-14.el7_5.2.x86_64

Steps:
1. Prepare a guest on VMware (coresPerSocket = "4") and use the virsh command to dump the source metadata.

virsh # dumpxml esx5.5-rhel7.4-x86_64
<domain type='vmware' xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
  <name>esx5.5-rhel7.4-x86_64</name>
  <uuid>4237df0e-8de9-2d1d-d486-908dabf60810</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <shares>8000</shares>
  </cputune>
  <os>
    <type arch='x86_64'>hvm</type>
  </os>
  ................
  ................
  <vmware:datacenterpath>data</vmware:datacenterpath>
  <vmware:moref>vm-418</vmware:moref>
</domain>

2. So we don't have any information about
CPU vendor, model or topology, only the number of cores.


Then I tried to convert the guest via "vmx" as suggested in comment 9.

Steps:
1. Mount the VMware NFS storage on the v2v conversion server.

2.# cat esx5.5-rhel7.4-x86_64.vmx | grep cpu
sched.cpu.min = "0"
sched.cpu.units = "mhz"
sched.cpu.shares = "normal"
numvcpus = "8"
cpuid.coresPerSocket = "4"

3.Convert the guest via "vmx"
# virt-v2v -i vmx esx5.5-rhel7.4-x86_64.vmx -on vmx-rhel7.4 -of qcow2

4. After conversion, dump the guest's XML; the sockets and cores information is still missing.
# virsh dumpxml vmx-rhel7.4
<domain type='kvm'>
  <name>vmx-rhel7.4</name>
  <uuid>3a27cb1f-de04-4a07-95f2-b9ce14f98f49</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static'>8</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.5.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
.........
.........

5. The v2v -x -v log shows that v2v can parse the sockets and cores information.
#parsed VMX tree:
namespace '':
    encoding = "UTF-8"
cleanshutdown = "TRUE"
namespace 'config':
    version = "8"
namespace 'cpuid':
    corespersocket = "4"

6. From comment 9, we know that #threads does not appear in the vmx file, so we also cannot get the exact sockets and cores in the guest XML.

Hi rjones, are my steps right?
And is it necessary to add a default #threads property so that we can get the exact CPU information?
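
The two values grepped out in step 2 can also be pulled from the file programmatically. This is only a rough Python sketch: the .vmx format is proprietary, so it assumes nothing beyond the simple key = "value" lines shown above, and parse_vmx is a hypothetical helper, not the virt-v2v parser. Keys are lowercased, which mirrors the "corespersocket" spelling in the parsed VMX tree of step 5.

```python
def parse_vmx(text):
    """Parse 'key = "value"' lines of a .vmx file into a dict with lowercased keys."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, '#' lines and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip().lower()] = value.strip().strip('"')
    return entries


# The two CPU-related lines from step 2 above.
sample = '''
numvcpus = "8"
cpuid.coresPerSocket = "4"
'''
vmx = parse_vmx(sample)
print(vmx.get("numvcpus"), vmx.get("cpuid.corespersocket"))  # 8 4
```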

Comment 13 Richard W.M. Jones 2018-04-19 10:28:50 UTC
As always in these cases I need the full virt-v2v debug log to make
any suggestions.

Comment 14 kuwei@redhat.com 2018-04-19 10:52:24 UTC
Created attachment 1424010 [details]
comment12-v2v.log

Comment 15 Richard W.M. Jones 2018-04-19 11:10:43 UTC
Right yes I understand.  Although we implemented CPU topology upstream,
it's not available in RHEL 7.5 (bug 1541908).  I've made this bug depend
on that RFE.

Comment 16 Pino Toscano 2018-04-19 11:30:37 UTC
(In reply to Richard W.M. Jones from comment #15)
> Right yes I understand.  Although we implemented CPU topology upstream,
> it's not available in RHEL 7.5 (bug 1541908).  I've made this bug depend
> on that RFE.

It's the other way round: to implement the proper conversion of the CPU topology in v2v, the esx driver in libvirt must report the correct CPU topology in the XML of the guest.

If this bug is not fixed, v2v with bug 1541908 fixed will be able to convert the CPU topology when used with other input modes than the ones based on libvirt (so it would work with vmx, ova, etc).

Comment 17 Pino Toscano 2018-04-19 15:38:25 UTC
Patch posted for this:
https://www.redhat.com/archives/libvir-list/2018-April/msg01878.html

Comment 18 Pino Toscano 2018-04-20 08:39:45 UTC
Reviewed, and pushed upstream:

commit f10a1a95a213b2266b6a7b5ac6e9d65cbb168cd5
Author:     Pino Toscano <ptoscano>
AuthorDate: Thu Apr 19 15:03:38 2018 +0200
Commit:     Ján Tomko <jtomko>
CommitDate: Fri Apr 20 09:11:01 2018 +0200

    vmx: write cpuid.coresPerSocket back from CPU topology
    
    When writing the VMX file from the domain XML, write
    cpuid.coresPerSocket if there is a specified CPU topology in the guest.
    
    Use the domain XML of esx-in-the-wild-9 in vmx2xml as testcase for
    xml2vmxtest.
    
    Signed-off-by: Pino Toscano <ptoscano>
    Acked-by: Richard W.M. Jones <rjones>
    Reviewed-by: Ján Tomko <jtomko>
    Signed-off-by: Ján Tomko <jtomko>

commit 5cceadcbac033349ee9621bc87f98e0ced19b2ad
Author:     Pino Toscano <ptoscano>
AuthorDate: Thu Apr 19 15:03:37 2018 +0200
Commit:     Ján Tomko <jtomko>
CommitDate: Fri Apr 20 09:09:29 2018 +0200

    vmx: convert cpuid.coresPerSocket for CPU topology
    
    Convert the cpuid.coresPerSocket key as both number of CPU sockets, and
    cores per socket.
    
    Add the VMX file attached to RHBZ#1568148 as testcase esx-in-the-wild-9;
    adapt the resulting XML of testcase esx-in-the-wild-8 to the CPU
    topology present in that VMX.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1568148
    
    Signed-off-by: Pino Toscano <ptoscano>
    Acked-by: Richard W.M. Jones <rjones>
    Reviewed-by: Ján Tomko <jtomko>
    Signed-off-by: Ján Tomko <jtomko>

v4.2.0-437-gf10a1a95a2
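
For readers who want the arithmetic behind the second commit (reading cpuid.coresPerSocket into a topology), here is a hedged Python sketch. It is not the libvirt C code. It assumes that sockets is numvcpus divided by coresPerSocket, that threads is fixed at 1 (comment 9 notes that no thread count is stored in the .vmx), and that a missing key means no topology is reported. The function name topology_from_vmx is made up for illustration.

```python
def topology_from_vmx(numvcpus, cores_per_socket=None):
    """Derive (sockets, cores, threads) from the two VMX values.

    Assumptions for this sketch: a missing coresPerSocket key means no
    topology is reported, and threads is always 1.
    """
    if cores_per_socket is None:
        return None
    if cores_per_socket <= 0 or numvcpus % cores_per_socket != 0:
        raise ValueError("numvcpus must be a positive multiple of coresPerSocket")
    return (numvcpus // cores_per_socket, cores_per_socket, 1)


# Values from the .vmx in comment 12: numvcpus = "8", cpuid.coresPerSocket = "4".
print(topology_from_vmx(8, 4))  # (2, 4, 1)
```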

Comment 20 mxie@redhat.com 2018-05-30 07:45:08 UTC
Verified the bug with the builds below:
libvirt-4.3.0-1.el7.x86_64
qemu-kvm-rhev-2.12.0-2.el7.x86_64

Steps:
Scenario1:
1.1 Set multiple sockets and cores for the guest's CPU on an ESXi6.7 host, such as: cpu = core(2) * socket(4)

1.2 Use virsh to check the guest's CPU topology
# virsh -c vpx://root@10.73.73.141/data/10.73.75.219/?no_verify=1
Enter root's password for 10.73.73.141: 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit
virsh # dumpxml esx6.7-rhel7.5-x86_64
<domain type='vmware' xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
  <name>esx6.7-rhel7.5-x86_64</name>
  <uuid>422c080c-c6b1-512d-6ff5-393308aa44d4</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <shares>8000</shares>
  </cputune>
  <os>
    <type arch='x86_64'>hvm</type>
  </os>
  <cpu>
    <topology sockets='4' cores='2' threads='1'/>
  </cpu>
.....

Result1:
     libvirt: the vpx:// driver can get the CPU topology from ESXi6.7 when multiple sockets and cores are set for the guest's CPU


Scenario2:
2.1 Set multiple sockets and cores for the guest's CPU on an ESXi5.5 host, such as: cpu = core(3) * socket(2)

2.2 Use virsh to check the guest's CPU topology
# virsh -c vpx://root@10.73.75.182/data/10.73.3.19/?no_verify=1
Enter root's password for 10.73.75.182: 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # dumpxml esx5.5-win2016-x86_64
<domain type='vmware' xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
  <name>esx5.5-win2016-x86_64</name>
  <uuid>42378dec-6f14-9ed3-ee28-7dcbfb703382</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>6</vcpu>
  <cputune>
    <shares>6000</shares>
  </cputune>
  <os>
    <type arch='x86_64'>hvm</type>
  </os>
  <cpu>
    <topology sockets='2' cores='3' threads='1'/>
  </cpu>
....

Result2:
     libvirt: the vpx:// driver can get the CPU topology from ESXi5.5 when multiple sockets and cores are set for the guest's CPU


Scenario3:
3.1 Set 8 sockets and 1 core for the guest's CPU on an ESXi5.5 host

3.2 Use virsh to check the guest's CPU topology
# virsh -c vpx://root@10.73.75.182/data/10.73.3.19/?no_verify=1
Enter root's password for 10.73.75.182: 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit
virsh # dumpxml esx5.5-win2016-x86_64
<domain type='vmware' xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
  <name>esx5.5-win2016-x86_64</name>
  <uuid>42378dec-6f14-9ed3-ee28-7dcbfb703382</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <shares>8000</shares>
  </cputune>
....


Result3:
     libvirt: the vpx:// driver cannot get the CPU topology from ESXi5.5 when multiple sockets and 1 core are set for the guest's CPU


Scenario4:
4.1 Set 1 socket and 7 cores for the guest's CPU on an ESXi5.5 host

4.2 Use virsh to check the guest's CPU topology
# virsh -c vpx://root@10.73.75.182/data/10.73.3.19/?no_verify=1
Enter root's password for 10.73.75.182: 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit
virsh # dumpxml esx5.5-win2016-x86_64
error: internal error: Expecting VMX entry 'numvcpus' to be an unsigned integer (1 or a multiple of 2) but found 7


Result4:
     libvirt: the vpx:// driver cannot get the CPU topology from ESXi5.5 when the guest's CPU count is odd

Scenario5:
5.1 Set 1 socket and 6 cores for the guest's CPU on an ESXi5.5 host

5.2 Use virsh to check the guest's CPU topology
# virsh -c vpx://root@10.73.75.182/data/10.73.3.19/?no_verify=1
Enter root's password for 10.73.75.182: 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit
virsh # dumpxml esx5.5-win2016-x86_64
<domain type='vmware' xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
  <name>esx5.5-win2016-x86_64</name>
  <uuid>42378dec-6f14-9ed3-ee28-7dcbfb703382</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>6</vcpu>
  <cputune>
    <shares>6000</shares>
  </cputune>
  <os>
    <type arch='x86_64'>hvm</type>
  </os>
  <cpu>
    <topology sockets='1' cores='6' threads='1'/>
  </cpu>

Result5:
     libvirt: the vpx:// driver can get the CPU topology from ESXi5.5 when 1 socket and multiple cores are set for the guest's CPU

Hi Pino,

   Could you please help check the problems in results 3 and 4? Thanks.
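
The passing scenarios above can be sanity-checked by confirming that sockets x cores x threads equals the <vcpu> count in the dumped XML. A small Python sketch of that check follows; the helper name topology_matches_vcpus is hypothetical, and the sample is a trimmed copy of the Scenario1 output.

```python
import xml.etree.ElementTree as ET


def topology_matches_vcpus(domain_xml):
    """Return True/False if a topology is present, or None if there is no <topology>."""
    root = ET.fromstring(domain_xml)
    vcpus = int(root.findtext("vcpu"))
    topo = root.find("./cpu/topology")
    if topo is None:
        return None
    product = (int(topo.get("sockets")) * int(topo.get("cores"))
               * int(topo.get("threads")))
    return product == vcpus


# Trimmed Scenario1 output: 8 vCPUs reported as 4 sockets x 2 cores x 1 thread.
scenario1 = """
<domain type='vmware'>
  <vcpu placement='static'>8</vcpu>
  <cpu>
    <topology sockets='4' cores='2' threads='1'/>
  </cpu>
</domain>
"""
print(topology_matches_vcpus(scenario1))  # True
```

For output like Scenario3, where no <cpu> element is present, the helper returns None rather than False, so the two situations stay distinguishable.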

Comment 21 Pino Toscano 2018-05-30 08:48:59 UTC
(In reply to mxie from comment #20)
> Verify the bug with below builds:
> libvirt-4.3.0-1.el7.x86_64
> qemu-kvm-rhev-2.12.0-2.el7.x86_64
> 
> Steps:
> Scenario2:
> Scenario3:
> 3.1 Set 8 sockets and 1 core for guest's cpu on ESXi5.5 host
> 
> 3.2 Use virsh to check guest's cpu topology
> # virsh -c vpx://root@10.73.75.182/data/10.73.3.19/?no_verify=1
> Enter root's password for 10.73.75.182: 
> Welcome to virsh, the virtualization interactive terminal.
> 
> Type:  'help' for help with commands
>        'quit' to quit
> virsh # dumpxml esx5.5-win2016-x86_64
> <domain type='vmware'
> xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
>   <name>esx5.5-win2016-x86_64</name>
>   <uuid>42378dec-6f14-9ed3-ee28-7dcbfb703382</uuid>
>   <memory unit='KiB'>4194304</memory>
>   <currentMemory unit='KiB'>4194304</currentMemory>
>   <vcpu placement='static'>8</vcpu>
>   <cputune>
>     <shares>8000</shares>
>   </cputune>
> ....
> 
> 
> Result3:
>      Libvirt: vpx:// driver can not get CPU topology from ESXi5.5 when set
> multiple sockets and 1 core for guest's cpu

Can you please get the .vmx file in ESXi for this guest? It should be in the datastore of the guest.

> Scenario4:
> 4.1 Set 1 sockets and 7 core for guest's cpu on ESXi5.5 host
> 
> 4.2 Use virsh to check guest's cpu topology
> # virsh -c vpx://root@10.73.75.182/data/10.73.3.19/?no_verify=1
> Enter root's password for 10.73.75.182: 
> Welcome to virsh, the virtualization interactive terminal.
> 
> Type:  'help' for help with commands
>        'quit' to quit
> virsh # dumpxml esx5.5-win2016-x86_64
> error: internal error: Expecting VMX entry 'numvcpus' to be an unsigned
> integer (1 or a multiple of 2) but found 7
> 
> 
> Result4:
>      Libvirt: vpx:// driver can not get CPU topology from ESXi5.5 when
> guest's cpu num is odd

It looks like the limitation of numvcpus to be 1 or a multiple of 2 was part of the initial commit of the VMX parser, in 2009. Not sure why it was added in the first place, and since the VMX file format is proprietary of VMware, I have no idea whether removing the limitation might break anything.
I'd say to open a new bug for this (with the .vmx file of the guest), and limit the testing of cores for this bug to 1 or a multiple of 2.
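
For illustration only, the rule as worded in the error message ("1 or a multiple of 2") can be written as the check below. This Python sketch is reconstructed from the message text, not from the libvirt source, and numvcpus_accepted is a made-up name.

```python
def numvcpus_accepted(numvcpus):
    """Mirror the rule stated in the error message: 1 or a multiple of 2."""
    return numvcpus == 1 or (numvcpus > 0 and numvcpus % 2 == 0)


for n in (1, 6, 7, 8):
    print(n, numvcpus_accepted(n))
# 1 True
# 6 True
# 7 False
# 8 True
```

Under this rule the 7-vCPU guest of scenario 4 is rejected before any topology is considered, while the 6-vCPU guest of scenario 5 passes.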

Comment 22 mxie@redhat.com 2018-05-30 09:05:19 UTC
Created attachment 1445732 [details]
window2016_8-sockets_1-core.vmx

Comment 23 mxie@redhat.com 2018-05-30 09:14:50 UTC
> > Result4:
> >      Libvirt: vpx:// driver can not get CPU topology from ESXi5.5 when
> > guest's cpu num is odd
> 
> It looks like the limitation of numvcpus to be 1 or a multiple of 2 was part
> of the initial commit of the VMX parser, in 2009. Not sure why it was added
> in the first place, and since the VMX file format is proprietary of VMware,
> I have no idea whether removing the limitation might break anything.
> I'd say to open a new bug for this (with the .vmx file of the guest), and
> limit the testing of cores for this bug to 1 or a multiple of 2.

I have filed bug 1584091 to track this problem. Thanks for the confirmation, Pino.

Comment 24 Pino Toscano 2018-06-11 08:44:35 UTC
To recap the testing done by Ming Xie:

- scenario 1
- scenario 2
- scenario 5
All OK

- scenario 3
I took a look at the attached VMX file (window2016_8-sockets_1-core.vmx), and unfortunately it does not provide any additional information that would let it be handled the way the guest is configured in ESXi; hence, I cannot improve this case, sorry.

- scenario 4
Reported as bug 1584091.

mxie: any other case/scenario I'm missing?

Comment 25 mxie@redhat.com 2018-06-11 08:58:13 UTC
Hi Pino,

    Nothing is missing. For scenario 3, do you think it needs a bug to track it, or should we just consider it a known problem?

Comment 26 Pino Toscano 2018-06-11 16:45:34 UTC
(In reply to mxie from comment #25)
>     No omission,for scenario3, do you think it needs a bug to track or just
> consider it as a known problem?

I'd say to report it as an upstream bug instead of a RHEL one: while I do not see much to do for this, a bug to track it is still better than nothing.

Comment 27 mxie@redhat.com 2018-06-12 03:13:21 UTC
I have filed upstream bug 1590079 to track the scenario 3 problem, so I am moving this bug from ON_QA to VERIFIED according to comment 20 through comment 26.

Comment 29 errata-xmlrpc 2018-10-30 09:53:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3113