Bug 1623932 - Not able to spawn DPDK VM's when Ceph as a backend
Summary: Not able to spawn DPDK VM's when Ceph as a backend
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Emilien Macchi
QA Contact: Gurenko Alex
URL:
Whiteboard:
Depends On:
Blocks: epmosp13bugs
 
Reported: 2018-08-30 13:33 UTC by David Hill
Modified: 2019-06-11 08:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-05 09:33:44 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Launchpad 1780932 0 None None None 2018-11-20 09:20:04 UTC
OpenStack gerrit 587773 0 'None' MERGED Add OVS-DPDK parameter as part of roles file 2021-02-17 13:58:04 UTC

Description David Hill 2018-08-30 13:33:06 UTC
Description of problem:
When trying to start DPDK VMs with a Ceph backend, the VMs remain in the Spawning state; they should start successfully. (Note: SR-IOV VMs start successfully with the Ceph backend.)


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Deploy OSP13 with Ceph and DPDK
2. Try spawning a DPDK VM
3.

Actual results:
VM spawning

Expected results:
VM created

Additional info:

Comment 1 David Hill 2018-08-30 13:34:21 UTC
It looks like we're creating the vhu devices:

[dhill@collab-shell qemu]$ cat instance-00000001.xml  | grep -i vhu
      <source type='unix' path='/var/lib/vhost_sockets/vhuc9d2ba5e-26' mode='server'/>


[dhill@collab-shell openvswitch]$ cat ovs-vsctl_-t_5_show  | grep vhu
        Port "vhuc9d2ba5e-26"
            Interface "vhuc9d2ba5e-26"
                options: {vhost-server-path="/var/lib/vhost_sockets/vhuc9d2ba5e-26"}


Nova gets a success when plugging the vif:
[dhill@collab-shell nova]$ grep vif nova-compute.log 
2018-08-28 18:45:50.441 1 INFO os_vif [-] Loaded VIF plugins: ovs, linux_bridge
2018-08-29 01:18:28.115 1 INFO oslo.privsep.daemon [req-5ceec186-ae7b-4936-be5e-686058fc9866 376f491e883e4f36afe95efe79a2ef44 9ba550af9c424380a498dc83edfb6eb0 - default default] Running privsep helper: ['sudo', 'nova-rootwrap', '/etc/nova/rootwrap.conf', 'privsep-helper', '--config-file', '/usr/share/nova/nova-dist.conf', '--config-file', '/etc/nova/nova.conf', '--privsep_context', 'vif_plug_ovs.privsep.vif_plug', '--privsep_sock_path', '/tmp/tmpvdSvQv/privsep.sock']
2018-08-29 01:18:28.727 1 INFO os_vif [req-5ceec186-ae7b-4936-be5e-686058fc9866 376f491e883e4f36afe95efe79a2ef44 9ba550af9c424380a498dc83edfb6eb0 - default default] Successfully plugged vif VIFVHostUser(active=False,address=fa:16:3e:ff:f3:27,has_traffic_filtering=False,id=c9d2ba5e-26c0-4ee8-bc15-d8f963bbfe6b,mode='server',network=Network(7f610612-3459-4ef8-9193-8875f7e69517),path='/var/lib/vhost_sockets/vhuc9d2ba5e-26',plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='vhuc9d2ba5e-26')


OVS successfully created the device and added it to br-int:
[dhill@collab-shell openvswitch]$ grep vhuc9d2ba5e-26  ovs-vswitchd.log 
2018-08-29T01:18:28.702Z|00213|netdev_dpdk|INFO|vHost User device 'vhuc9d2ba5e-26' created in 'client' mode, using client socket '/var/lib/vhost_sockets/vhuc9d2ba5e-26'
2018-08-29T01:18:28.706Z|00214|dpdk|WARN|VHOST_CONFIG: failed to connect to /var/lib/vhost_sockets/vhuc9d2ba5e-26: No such file or directory
2018-08-29T01:18:28.706Z|00215|dpdk|INFO|VHOST_CONFIG: /var/lib/vhost_sockets/vhuc9d2ba5e-26: reconnecting...
2018-08-29T01:18:28.706Z|00217|dpif_netdev|INFO|Core 7 on numa node 0 assigned port 'vhuc9d2ba5e-26' rx queue 0 (measured processing cycles 0).
2018-08-29T01:18:28.706Z|00218|bridge|INFO|bridge br-int: added interface vhuc9d2ba5e-26 on port 3

The VM sees the vhu device, but QEMU stalls waiting for a connection on that device:
[dhill@collab-shell qemu]$ cat instance-00000001.log 
2018-08-29 01:18:29.598+0000: starting up libvirt version: 3.9.0, package: 14.el7_5.6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2018-06-05-05:26:44, x86-041.build.eng.bos.redhat.com), qemu version: 2.10.0(qemu-kvm-rhev-2.10.0-21.el7_5.4), hostname: computeovsdpdk-0.localdomain
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOME=/root QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000001,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-instance-00000001/master-key.aes -machine pc-i440fx-rhel7.5.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Server-IBRS,ss=on,hypervisor=on,tsc_adjust=on,clflushopt=on,pku=on,stibp=on -m 4096 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/1-instance-00000001,share=yes,size=4294967296,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-3,memdev=ram-node0 -uuid 8a8d4ac4-150d-4ddb-a91b-d40534d24819 -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=17.0.3-0.20180420001141.el7ost,serial=aeef476e-7cb6-1000-9e72-a81e84a11c69,uuid=8a8d4ac4-150d-4ddb-a91b-d40534d24819,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-instance-00000001/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -object secret,id=virtio-disk0-secret0,data=3j5+coiwLGiIoiY/ArAZuV+BIzsoLvAOtRJ4uMRpGH0=,keyid=masterKey0,iv=5pmwLlOC+ttyAFyvkpne5w==,format=base64 -drive 'file=rbd:vms/8a8d4ac4-150d-4ddb-a91b-d40534d24819_disk:id=openstack:auth_supported=cephx\;none:mon_host=172.18.0.10\:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0,cache=writeback,discard=unmap' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev socket,id=charnet0,path=/var/lib/vhost_sockets/vhuc9d2ba5e-26,server -netdev vhost-user,chardev=charnet0,id=hostnet0 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:ff:f3:27,bus=pci.0,addr=0x3 -add-fd set=0,fd=80 -chardev pty,id=charserial0,logfile=/dev/fdset/0,logappend=on -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 172.17.0.11:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
2018-08-29T01:18:30.533348Z qemu-kvm: -chardev socket,id=charnet0,path=/var/lib/vhost_sockets/vhuc9d2ba5e-26,server: info: QEMU waiting for connection on: disconnected:unix:/var/lib/vhost_sockets/vhuc9d2ba5e-26,server

And the VM stays in "paused" state:
[dhill@collab-shell virsh]$ cat virsh_-r_list_--all 
 Id    Name                           State
----------------------------------------------------
 1     instance-00000001              paused


CPU affinity is configured for systemd processes:
[dhill@collab-shell computeovsdpdk-0]$ grep -ri cpuaff etc/*
etc/systemd/system.conf:#CPUAffinity=1 2
etc/systemd/system.conf:CPUAffinity=0 1 2 3 4 5 6 7 22 23 24 25 26 27 28 29 44 45 46 47 48 49 50 51 66 67 68 69 70 71 72 73

And isolated cores:
[dhill@collab-shell tuned]$ grep ^isolated_cores cpu-partitioning-variables.conf 
isolated_cores=8-21,52-65,30-43,74-87

The VM's vCPUs are free to float across the cores below (no 1:1 CPU pinning, which is not ideal):
    <vcpupin vcpu='0' cpuset='8-21,52-65'/>
    <vcpupin vcpu='1' cpuset='8-21,52-65'/>
    <vcpupin vcpu='2' cpuset='8-21,52-65'/>
    <vcpupin vcpu='3' cpuset='8-21,52-65'/>
    <emulatorpin cpuset='8-21,52-65'/>

[dhill@collab-shell proc]$ cat meminfo  | grep -i huge
AnonHugePages:      2048 kB
HugePages_Total:     200
HugePages_Free:      198
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB


This behavior looks like the ones described in [1] and [2]. Could it be a permission error on the /var/lib/vhost_sockets/vhuc9d2ba5e-26 socket?
Would it also be possible to get the "numactl --hardware" output?

At first glance, their environment is not properly configured: systemd's CPUAffinity includes core 7, and the PMD also selected core 7 on NUMA node 0 for the vhu device.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1156267
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1489631
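
The core overlap noted above can be checked mechanically. The following is only a sketch: the CPU lists are copied from the output in this comment, and parse_cpu_list is a hypothetical helper written for illustration, not part of any tool.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as '8-21,52-65' or '0 1 2 3' into a set of ints."""
    cpus = set()
    for part in text.replace(",", " ").split():
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


# Values copied from etc/systemd/system.conf and cpu-partitioning-variables.conf.
systemd_affinity = parse_cpu_list(
    "0 1 2 3 4 5 6 7 22 23 24 25 26 27 28 29 "
    "44 45 46 47 48 49 50 51 66 67 68 69 70 71 72 73"
)
isolated_cores = parse_cpu_list("8-21,52-65,30-43,74-87")

# Core the PMD assigned to the vhu port, per ovs-vswitchd.log.
pmd_core = 7

print("PMD core shared with systemd:", pmd_core in systemd_affinity)
print("PMD core is isolated:", pmd_core in isolated_cores)
print("systemd/isolated overlap:", sorted(systemd_affinity & isolated_cores))
```

With these values the PMD core falls inside the systemd affinity set and outside the isolated set, which is the misconfiguration described above.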

Comment 2 David Hill 2018-08-30 13:41:59 UTC
The other case shows the same message in ovs-vswitchd.log:

2018-08-30T10:15:52.506Z|00124|dpdk|WARN|VHOST_CONFIG: failed to connect to /var/lib/vhost_sockets/vhu5e5dabad-eb: No such file or directory


Is it possible that the nova/libvirt containers are not exporting /var/lib/vhost_sockets, or something like that?
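
One way to test the permission theory is to look at the owner, group and mode of the socket directory. This is a rough sketch using only the Python standard library; describe_path and group_can_write are hypothetical helper names, and where to run it (on the compute host or inside the nova/libvirt container) is an assumption left to the reader.

```python
import grp
import os
import pwd
import stat


def describe_path(path):
    """Return (owner, group, mode string) for path, using numeric IDs if names are unknown."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return owner, group, stat.filemode(st.st_mode)


def group_can_write(path):
    """True if the group permission bits allow writing to path."""
    return bool(os.stat(path).st_mode & stat.S_IWGRP)


if __name__ == "__main__":
    # Path taken from the libvirt XML and the OVS output in comment 1.
    socket_dir = "/var/lib/vhost_sockets"
    if os.path.exists(socket_dir):
        print(socket_dir, describe_path(socket_dir), group_can_write(socket_dir))
    else:
        print(socket_dir, "does not exist in this namespace")
```

If the directory is missing in one of the two places, or the group differs between them, that would be consistent with the "No such file or directory" message in ovs-vswitchd.log.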

Comment 10 Saravanan KR 2018-09-14 06:33:30 UTC
> Could you please help us in understanding the role data for ComputeHCI is affecting the configuration for our deployment when we are not using ComputeHCI role ?

All the roles defined in roles_data.yaml are used for processing, which means all the Heat stacks required for a ComputeHCI deployment will be created. If you look at the created stacks, you will see that service resources for ComputeHCI have been created.

The node count (ComputeHCICount: 0) ensures that ComputeHCI is NOT deployed to a node, but all the Heat processing is still done with the ComputeHCI role definition. So having correct parameters/services is important, even though the count is 0.

I see 3 possible options:
* Remove the ComputeHCI from the roles_data OR 
* Swap ComputeNeutronOvsDpdk to ComputeNeutronOvsAgent in ComputeHCI OR
* Define the required parameters for ComputeHCI (if OVS-DPDK is required)

Comment 12 Govardhan Chintha 2018-09-18 09:22:06 UTC
(In reply to Saravanan KR from comment #10)
> > Could you please help us in understanding the role data for ComputeHCI is affecting the configuration for our deployment when we are not using ComputeHCI role ?
> 
> All the roles defined in roles_data.yaml will be used for the processing,
> which means all the heat stack required for ComputeHCI deployment will be
> created. If you refer to the created stacks, it will be clear that service
> resources for ComputeHCI will be created.
> 
> The node count (ComputeHCICount: 0) will ensure ComputeHCI will NOT be
> deployed to a node, but all the heat processing will be done with ComputeHCI
> role definition. So having correct parameters/services is important, even
> though count is 0.
> 
> I see 3 possible options:
> * Remove the ComputeHCI from the roles_data OR 
> * Swap ComputeNeutronOvsDpdk to ComputeNeutronOvsAgent in ComputeHCI OR
> * Define the required parameters for ComputeHCI (if OVS-DPDK is required)

The Heat processing of all the roles specified in roles_data.yaml is understandable, but a role parameter should be specific to its role. In this case, the value of the role parameter VhostuserSocketGroup for each role is as follows.

For ComputeOvsDpdkSriov role:  
             VhostuserSocketGroup: "hugetlbfs" (configured from role parameters in templates)

For ComputeHCI role:
             VhostuserSocketGroup: "qemu" (default parameter value).

But after the installation, the value VhostuserSocketGroup: hugetlbfs is not reflected in the ComputeOvsDpdkSriov role; it is overridden by qemu.
This implies that there is some issue with the role parameter implementation (at least for the VhostuserSocketGroup parameter).

Comment 18 Jaison Raju 2018-11-19 10:45:52 UTC
I can summarize the resolution.

1. Define 'VhostuserSocketGroup' under RoleParametersDefault in roles_data.yaml for all roles that are expected to be ovs-dpdk based (i.e. every role that has the 'OS::TripleO::Services::ComputeNeutronOvsDpdk' service)

  RoleParametersDefault:
    VhostuserSocketGroup: "hugetlbfs"

2. If you have a non-dpdk role, then make sure that '    - OS::TripleO::Services::ComputeNeutronOvsDpdk' service is not defined in that role.

@Saravanan , do you think the above statement can summarize our solution?
If so, we can close this bz.

Regards,
Jaison R
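
The two rules in comment 18 could be checked against a parsed roles_data.yaml with something like the sketch below. It assumes the roles have already been loaded into a list of dicts (for example with a YAML parser) and that each role uses the 'name', 'ServicesDefault' and 'RoleParametersDefault' keys; the function name and the sample data are hypothetical, for illustration only.

```python
DPDK_SERVICE = "OS::TripleO::Services::ComputeNeutronOvsDpdk"


def dpdk_roles_missing_socket_group(roles):
    """Return names of roles that carry the OVS-DPDK service but do not set
    VhostuserSocketGroup under RoleParametersDefault."""
    missing = []
    for role in roles:
        services = role.get("ServicesDefault") or []
        params = role.get("RoleParametersDefault") or {}
        if DPDK_SERVICE in services and "VhostuserSocketGroup" not in params:
            missing.append(role["name"])
    return missing


# Hypothetical sample mirroring the situation discussed in this bug:
# ComputeHCI lists the DPDK service but relies on the default socket group.
sample_roles = [
    {
        "name": "ComputeOvsDpdkSriov",
        "ServicesDefault": [DPDK_SERVICE],
        "RoleParametersDefault": {"VhostuserSocketGroup": "hugetlbfs"},
    },
    {
        "name": "ComputeHCI",
        "ServicesDefault": [DPDK_SERVICE],
    },
]

print(dpdk_roles_missing_socket_group(sample_roles))
```

Any role this reports needs either the parameter added (rule 1) or the DPDK service removed (rule 2), even if its node count is 0.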

Comment 19 Saravanan KR 2018-11-19 11:05:13 UTC
(In reply to Jaison Raju from comment #18)
> I can summarize the resolution.
> 
> 1. Define 'VhostuserSocketGroup' under RoleParametersDefault in
> roles_data.yaml for all roles that is expected to be ovs-dpdk based (i.e.
> every role that has 'OS::TripleO::Services::ComputeNeutronOvsDpdk' service)
> 
>   RoleParametersDefault:
>     VhostuserSocketGroup: "hugetlbfs"
> 
> 2. If you have a non-dpdk role, then make sure that '    -
> OS::TripleO::Services::ComputeNeutronOvsDpdk' service is not defined in that
> role.
> 
> @Saravanan , do you think the above statement can summarize our solution?
> If so, we can close this bz.
> 
> Regards,
> Jaison R


Yes, the statement looks fine to me (in the ml2-ovs context).

For ml2-ODL, we need to add a statement that the OvS-DPDK role should be last in the list of roles in the roles_data.yaml file (until we fix this as discussed in BZ #1649700).

Comment 20 Chris Fields 2018-11-26 19:13:02 UTC
Saravanan, does your suggestion in comment #19 require a docs update?  If so do the docs get updated w/Jaison's suggestion in comment #18 with the caveat that you mention in #19?

Comment 21 Saravanan KR 2018-12-05 09:33:44 UTC
(In reply to Chris Fields from comment #20)
> Saravanan, does your suggestion in comment #19 require a docs update?  If so
> do the docs get updated w/Jaison's suggestion in comment #18 with the caveat
> that you mention in #19?

I don't see a need for a doc update. It was an issue with the templates, which has been fixed. As for the ODL part, Jaison has raised a BZ. Closing this one.

