Bug 1767013 - openstack server show --diagnostics - not reporting NIC statistics for vhost-user ports
Summary: openstack server show --diagnostics - not reporting NIC statistics for vhost-user ports
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.0
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: rc
Assignee: Michal Privoznik
QA Contact: Luyao Huang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-10-30 13:39 UTC by Mircea Vutcovici
Modified: 2024-10-01 16:22 UTC (History)
19 users

Fixed In Version: libvirt-7.0.0-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-25 06:41:20 UTC
Type: Bug
Target Upstream Version: 7.0.0
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1397940 0 unspecified CLOSED Gathers network statistics of openvswitch vhostuser interfaces 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1459091 0 high CLOSED virsh domiflist return NULL for the vhostuser interface name and source 2021-12-10 15:05:06 UTC

Description Mircea Vutcovici 2019-10-30 13:39:31 UTC
Description of problem:
Running "openstack server show --diagnostics" or "nova diagnostics" is not reporting nic_details like rx_packets, rx_drop... when the VM is using the Nuage/6wind version of DPDK



Version-Release number of selected component (if applicable):


How reproducible:
all the time


Actual results:
(tpapod4-vim4b) [stack@tpavcpaz096vim4bavrs ~]$ nova diagnostics 785bba76-4582-4e55-9a04-c17bf8178f41
+----------------+----------------------------------------------------------------------------------+
| Property       | Value                                                                            |
+----------------+----------------------------------------------------------------------------------+
| config_drive   | False                                                                            |
| cpu_details    | [{"utilisation": null, "id": 0, "time": 251480000000}, {"utilisation": null,     |
|                | "id": 1, "time": 57170000000}, {"utilisation": null, "id": 2, "time":            |
|                | 57880000000}, {"utilisation": null, "id": 3, "time": 62240000000},               |
|                | {"utilisation": null, "id": 4, "time": 41160000000}, {"utilisation": null, "id": |
|                | 5, "time": 48660000000}, {"utilisation": null, "id": 6, "time": 137420000000},   |
|                | {"utilisation": null, "id": 7, "time": 51560000000}, {"utilisation": null, "id": |
|                | 8, "time": 33570000000}, {"utilisation": null, "id": 9, "time": 48300000000}]    |
| disk_details   | [{"read_requests": 8880, "errors_count": -1, "read_bytes": 197445120,            |
|                | "write_requests": 24047, "write_bytes": 191307264}]                              |
| driver         | libvirt                                                                          |
| hypervisor     | kvm                                                                              |
| hypervisor_os  | linux                                                                            |
| memory_details | {"used": 16, "maximum": 16}                                                      |
| nic_details    | []                                                                               |
| num_cpus       | 10                                                                               |
| num_disks      | 1                                                                                |
| num_nics       | 0                                                                                |
| state          | running                                                                          |
| uptime         | 483809                                                                           |
+----------------+----------------------------------------------------------------------------------+

Expected results:
(tpapod4-vim4b) [stack@tpavcpaz096vim4bavrs ~]$ nova diagnostics 2784ca3c-147e-4234-99cb-2e300fb143a6
+----------------+------------------------------------------------------------------------------+
| Property       | Value                                                                        |
+----------------+------------------------------------------------------------------------------+
| config_drive   | False                                                                        |
| cpu_details    | [{"utilisation": null, "id": 0, "time": 368680000000}, {"utilisation": null, |
|                | "id": 1, "time": 281210000000}, {"utilisation": null, "id": 2, "time":       |
|                | 347200000000}, {"utilisation": null, "id": 3, "time": 206580000000},         |
|                | {"utilisation": null, "id": 4, "time": 417370000000}, {"utilisation": null,  |
|                | "id": 5, "time": 158560000000}, {"utilisation": null, "id": 6, "time":       |
|                | 144070000000}, {"utilisation": null, "id": 7, "time": 196110000000},         |
|                | {"utilisation": null, "id": 8, "time": 1039940000000}, {"utilisation": null, |
|                | "id": 9, "time": 303680000000}]                                              |
| disk_details   | [{"read_requests": 11668, "errors_count": -1, "read_bytes": 225243648,       |
|                | "write_requests": 11111, "write_bytes": 107361280}]                          |
| driver         | libvirt                                                                      |
| hypervisor     | kvm                                                                          |
| hypervisor_os  | linux                                                                        |
| memory_details | {"used": 16, "maximum": 16}                                                  |
| nic_details    | [{"rx_packets": 611444, "rx_drop": 0, "tx_octets": 57271546, "tx_errors": 0, |
|                | "mac_address": "fa:16:3e:a5:b7:59", "rx_octets": 57298260, "rx_rate": null,  |
|                | "rx_errors": 0, "tx_drop": 0, "tx_packets": 610829, "tx_rate": null}]        |
| num_cpus       | 10                                                                           |
| num_disks      | 1                                                                            |
| num_nics       | 1                                                                            |
| state          | running                                                                      |
| uptime         | 661544                                                                       |
+----------------+------------------------------------------------------------------------------+


Additional info:

Comment 1 smooney 2019-10-30 15:16:40 UTC
Note: updating to track this as an RFE, since support for vhost-user NIC statistics is not a supported feature upstream;
as such, extending the diagnostics command to work with vhost-user is not a bug but an RFE.

Upstream policy is pretty clear on this point: bugs of the nature "feature X is not supported for configuration Y", where
that support was never implemented or planned to be implemented when the initial feature was introduced, are feature requests,
not bugs, which is the case for this BZ.

Comment 2 Mircea Vutcovici 2019-10-31 13:22:16 UTC
Yes, I was thinking too that it's an RFE. Many thanks!

Comment 3 Jamie Fargen 2019-10-31 13:34:04 UTC
This doesn't seem to be an RFE for nova diagnostics as much as it seems to be an issue with libvirt.



If you run virsh domstats on an instance with interface type=bridge, the network stats are present:
# virsh domstats instance-00000008
Domain: 'instance-00000008'
...
  net.count=1
  net.0.name=tap05660aa7-90
  net.0.rx.bytes=532
  net.0.rx.pkts=6
  net.0.rx.errs=0
  net.0.rx.drop=0
  net.0.tx.bytes=10268
  net.0.tx.pkts=44
  net.0.tx.errs=0
  net.0.tx.drop=0
...

If you run virsh domstats on an instance with interface type=vhostuser, the network stats are not present:
# virsh domstats instance-0000000b
Domain: 'instance-0000000b'
...
net.count=1
...


This bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=1397940 includes patches to resolve the issue of instances with interface type=vhostuser not returning network statistics.

Comment 4 smooney 2019-10-31 18:18:06 UTC
Right, so https://bugzilla.redhat.com/show_bug.cgi?id=1459091#c8 

seems to suggest that adding <target dev='vhost-user1'/> to the interface definition

e.g. 

<interface type='vhostuser'>
      <mac address='52:54:00:93:51:db'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user1' mode='client'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

to 
<interface type='vhostuser'>
      <mac address='52:54:00:93:51:db'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user1' mode='client'/>
      <target dev='vhost-user1'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

should allow the domstats to work.

However, that is not documented anywhere outside of that bug.

So on the one hand I think it is a libvirt error that the target element is needed, since there is actually no netdev device
with that name present on the platform, and on the other hand the fact that this is not mentioned in the libvirt docs also seems like a bug.

From a nova point of view, if gathering NIC statistics from vhost-user ports is now a supported libvirt feature, as https://bugzilla.redhat.com/show_bug.cgi?id=1397940
would indicate, we can treat the enablement of that in nova as an RFE, but honestly I would expect it to work without the target element.

I suspect the target dev needs to be the OVS interface name in this case, but the fact that there is no documentation of this in libvirt or OVS makes me hesitant to enable
this until we get confirmation from the virt team that this should work and how <target dev='vhost-user1'/> should be generated.

Comment 5 smooney 2019-10-31 18:22:31 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1397940#c16 would seem to confirm that the target dev should be the OVS interface name.

Comment 6 smooney 2019-11-01 13:07:00 UTC
Just checking on one of my dev setups, I see that libvirt is auto-populating the target element:

 <interface type='vhostuser'>
      <mac address='fa:16:3e:b7:f5:66'/>
      <source type='unix' path='/var/run/openvswitch/vhud3edafa4-b1' mode='server'/>
      <target dev='vhud3edafa4-b1'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>


So this seems to be yet another case where libvirt on Ubuntu and CentOS auto-populates
the <target dev='vhud3edafa4-b1'/> element but libvirt on RHEL does not.
We saw the same behaviour delta between CentOS 7/Ubuntu 18.04 and RHEL 7 with macvtap SR-IOV previously.

So I suspect this is a regression in libvirt or the result of a slightly different compilation/config.
On the Ubuntu test VM I'm using libvirtd (libvirt) 4.0.0 and QEMU emulator version 2.11.1 (Debian 1:2.11+dfsg-1ubuntu7.19).

When the target dev is populated, the stats are in fact retrieved correctly:

virsh # domiflist instance-00000001
Interface  Type       Source     Model       MAC
-------------------------------------------------------
vhud3edafa4-b1 vhostuser  -          virtio      fa:16:3e:b7:f5:66

virsh # domifstat instance-00000001 vhud3edafa4-b1
vhud3edafa4-b1 rx_bytes 7395
vhud3edafa4-b1 rx_packets 68
vhud3edafa4-b1 rx_drop 0
vhud3edafa4-b1 tx_bytes 9837
vhud3edafa4-b1 tx_packets 110
vhud3edafa4-b1 tx_errs 0
vhud3edafa4-b1 tx_drop 0

and the diagnostics command works fine

ubuntu@dev:/opt/repos/devstack$ nova diagnostics test
+----------------+-------------------------------------------------------------------------+
| Property       | Value                                                                   |
+----------------+-------------------------------------------------------------------------+
| config_drive   | False                                                                   |
| cpu_details    | [{"utilisation": null, "id": 0, "time": 24560000000}]                   |
| disk_details   | [{"read_requests": 1013, "errors_count": -1, "read_bytes": 21753344,    |
|                | "write_requests": 675, "write_bytes": 46341120}]                        |
| driver         | libvirt                                                                 |
| hypervisor     | kvm                                                                     |
| hypervisor_os  | linux                                                                   |
| memory_details | {"used": 0, "maximum": 0}                                               |
| nic_details    | [{"rx_packets": 68, "rx_drop": 0, "tx_octets": 9837, "tx_errors": 0,    |
|                | "mac_address": "fa:16:3e:b7:f5:66", "rx_octets": 7395, "rx_rate": null, |
|                | "rx_errors": -1, "tx_drop": 0, "tx_packets": 110, "tx_rate": null}]     |
| num_cpus       | 1                                                                       |
| num_disks      | 1                                                                       |
| num_nics       | 1                                                                       |
| state          | running                                                                 |
| uptime         | 1574                                                                    |
+----------------+-------------------------------------------------------------------------+


I think we should retarget this to the virt DFG to clarify the behavioural difference we are seeing
with the downstream libvirt, given this is the second bug related to the <target dev='vhud3edafa4-b1'/>
element not being populated when it previously was.

We could address this by populating the element, but this appears to be an API breakage, as the semantics
of the XML parsing have changed in libvirt and it breaks other layered products beyond OpenStack that depend
on the previous behaviour.

If this was an intentional change in libvirt that was not communicated to the OpenStack team, feel free to kick this
back to us and we can address it in OpenStack and cite the release note or libvirt bug that covers the change.

Comment 7 Daniel Berrangé 2019-11-04 16:06:47 UTC
Libvirt gets the target interface name by calling 'ovs-vsctl get Interface', passing the last component of the UNIX domain socket path. The code can be seen here:

  https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/virnetdevopenvswitch.c;h=8607cf48e615f5c5eaf73ebecee33edb00f95229;hb=HEAD#l484

So if the <target> is not present, then this method / ovs-vsctl command is failing for some reason.
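
For illustration, assuming a socket path ending in vhost-user1, that lookup is roughly (the same form as the commands visible in the debug logs later in this bug; the interface name is just an example):

  # ovs-vsctl --timeout=5 get Interface vhost-user1 name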

Comment 8 Michal Privoznik 2019-11-04 16:38:37 UTC
Libvirt is filling in <target/> automatically since bug 1459091 (libvirt-3.7.0). Please attach full debug logs which should cover domain creation and the API call for querying stats.
Also, I don't think this is a FutureFeature but rather a regular bug - attempts to report stats for openvswitch-type interfaces are documented in bug 1397940. Therefore I'm removing the FutureFeature keyword and replacing it with Regression instead.
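
For reference, one way to capture such logs (a sketch only; the exact filter list here is just a suggestion):

  # in /etc/libvirt/libvirtd.conf, then restart libvirtd and reproduce (domain start + stats query):
  log_filters="1:libvirt 1:qemu 1:util"
  log_outputs="1:file:/var/log/libvirt/libvirtd.log"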

Comment 9 smooney 2019-11-05 13:02:36 UTC
The FutureFeature keyword was added for nova in case we needed to change the XML generation, but it is not required if this is a libvirt regression, so yes, it's correct to remove it.

@Daniel Berrangé
As an FYI, it's a convention that the last segment of the UNIX domain socket path is the interface name.
It was required to be so for vhost-user in client mode, where the vswitch created the socket.
In vhost-user server mode QEMU creates the socket based on the path we set in the libvirt XML,
and we pass the full socket path to OVS when we create the port, so they could actually be different,
but I made them the same in OpenStack so that we don't have to care whether QEMU is configured to be the vhost-user
client or server.

It looks like this is the issue:

-chardev socket,id=charnet0,path=/var/lib/vhost_sockets/vhost-sock-d5adbdf4-fdb6-454b-8387-953a55840191,server \
-netdev vhost-user,chardev=charnet0,queues=8,id=hostnet0 \
-device virtio-net-pci,mq=on,vectors=18,rx_queue_size=512,tx_queue_size=512,netdev=hostnet0,id=net0,mac=fa:16:3e:fc:37:ff,bus=pci.0,addr=0x3 \
-add-fd set=1,fd=31 \

They are using the neutron port UUID in the socket path "/var/lib/vhost_sockets/vhost-sock-d5adbdf4-fdb6-454b-8387-953a55840191".
This is because the case is related to the use of 6WIND's forked OVS-DPDK and not the standard one.
As a result they are also using 6WIND's ML2 driver, which does not follow the naming convention that we agreed to use,
i.e. the last segment is the interface name.

There are a couple of paths forward:
1.) 6WIND can change their ML2 driver to follow the naming convention.
2.) libvirt, when dealing with a server-mode vhost-user port, can look up the port in OVS via its socket path.
3.) nova can try to set the target dev.

In any case, I think the libvirt docs need to be updated to document the use of the target element:
https://libvirt.org/formatdomain.html#elementVhostuser

The problem with 3 is that both the neutron ML2 driver https://opendev.org/x/networking-6wind and
the os-vif plugin https://github.com/6WIND/os-vif-plugin-vhostuser-fp are out of tree and delivered/developed
by 6WIND, not Red Hat.

So option 2 would be within our control to address, but 1 and 3 would really require 6WIND to be involved, at a minimum to review the code changes.


For option 1, looking briefly at the 6WIND code (which I kind of regret), I believe that if we just modified
https://opendev.org/x/networking-6wind/src/branch/master/networking_6wind/ml2_drivers/openvswitch/mech_driver/mech_ovs_fp.py#L96-L99
to generate the socket path with the standard convention, then the current libvirt approach would work.

The name of the OVS port is being set here:
https://github.com/openstack/nova/blob/793213086c26381ec3927295eb5f5d3171e2be86/nova/network/os_vif_util.py#L54
('nic' + vif['id'])[:model.NIC_NAME_LEN], i.e. 'nic' concatenated with the neutron port UUID and then truncated to 14 characters to ensure
we do not exceed the max netdev name length on Linux.
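
Roughly, as a shell illustration of that expression (the UUID below is just the one from the QEMU command line above; 14 is the NIC_NAME_LEN limit mentioned here):

  $ port_uuid=d5adbdf4-fdb6-454b-8387-953a55840191
  $ name="nic${port_uuid}"; echo "${name:0:14}"
  nicd5adbdf4-fd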

For option 3 we could also just change the name we generate in nova, however I think that would cause issues such as exceeding the max netdev name length.
The other problem with 1 and 3 in my mind is that they pose a potential problem on upgrade: specifically, if we change how we generate the names we might
leak ports or introduce other errors, as we can't really change the OVS port name or socket path without impacting running VMs.
So with that in mind I think the libvirt change of using the socket path to find the port, instead of relying on the naming convention, is the best path forward,
although I recognize it's not ideal.

Comment 10 Daniel Berrangé 2019-11-05 13:08:16 UTC
(In reply to smooney from comment #9)
> There are a couple of paths forward:
> 1.) 6WIND can change their ML2 driver to follow the naming convention.
> 2.) libvirt, when dealing with a server-mode vhost-user port, can look up the
> port in OVS via its socket path.

This would be my preference as it avoids us having to rely on naming conventions which inevitably risk breaking and is transparent to apps.

> 3.) nova can try to set the target dev.

Comment 11 smooney 2019-11-05 13:09:47 UTC
Actually, if we make a change in nova, since I now know how the OVS interface name is generated, I could possibly populate the target element.
But the code that is generating the XML at that point really should not have to special-case different network backends to that extent, so
I would only do it there if the libvirt approach is not viable. I'll leave this assigned to the virt team for now, but if you think it does not make
sense to do in libvirt, I'll explore that option in nova.

Comment 12 Michal Privoznik 2019-11-06 15:57:17 UTC
I think I have a patch ready, but before I post it, can you please test it?

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=24511849

On one hand, I am able to query openvswitch and translate the socket into an interface name so that it can be put into the XML when the domain is starting up. However, I'm not sure how nova configures openvswitch - is it already configured when libvirt is starting qemu? Well, we'll see if you test my patch :-)

Comment 13 smooney 2019-11-06 17:45:37 UTC
nova creates the OVS interface before it generates the libvirt XML, so it will exist in the OVS DB before you start the domain.
It should be noted, however, that the vhost-user socket path is only in the ovsdb if the vhost mode is server, i.e. QEMU is the server and the vswitch is the client.
If OVS is the server and QEMU is the client, then the QEMU socket name is hardcoded so that the last segment is the interface name.
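
As a rough sketch of the two lookups that implies (socket paths taken from this bug; the exact ovs-vsctl invocation libvirt ends up using may differ):

  # client mode: the interface name is simply the last path component
  basename /var/run/openvswitch/vhost-user1
  # server mode: look the port up in the ovsdb by its vhost-server-path
  ovs-vsctl --no-headings --columns=name find Interface options:vhost-server-path=/var/lib/vhost_sockets/vhost-sock-d5adbdf4-fdb6-454b-8387-953a55840191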

I don't really have an OpenStack environment that I can test this in right now, although I might be able to set one up if you can provide a link to the libvirt patch.
Maybe Mircea Vutcovici can work with the customer to see if they can test it? I believe it should work based on what you described.

Comment 14 Mircea Vutcovici 2019-11-06 20:10:36 UTC
I am asking the customer if they have a test environment where we can verify the hotfix.

Comment 23 Michal Privoznik 2020-11-11 09:21:24 UTC
Patches posted upstream:

https://www.redhat.com/archives/libvir-list/2020-November/msg00505.html

Comment 26 Michal Privoznik 2020-11-12 07:26:34 UTC
Merged upstream as:

e4c29e2904 virnetdevopenvswitch: Get names for dpdkvhostuserclient too

v6.9.0-193-ge4c29e2904

Comment 28 Michal Privoznik 2020-11-12 07:30:39 UTC
Moving to POST per comment 26.

Comment 35 Luyao Huang 2020-12-16 07:41:03 UTC
Failed to verify this bug with libvirt-daemon-6.10.0-1.module+el8.4.0+8898+a84e86e1.x86_64 and openvswitch2.11-2.11.3-75.el8fdp.x86_64:

1. Prepare dpdkvhostuserclient and dpdkvhostuser port

# ovs-vsctl show
0dc890bf-222e-4ea6-b368-dcee7fecdf48
    Bridge "ovsbr0"
        Port "vhost-user1"
            Interface "vhost-user1"
                type: dpdkvhostuser
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "vhost-user2"
            Interface "vhost-user2"
                type: dpdkvhostuser
        Port "vhost-client-1"
            Interface "vhost-client-1"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/libvirt/qemu/vhost-client-1"}

2. add vhost-user interfaces in guest xml:

    <interface type='vhostuser'>
      <mac address='52:54:00:93:51:dd'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user2' mode='client'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:a8:55:a1'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user1' mode='client'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:70:c2:4b'/>
      <source type='unix' path='/var/lib/libvirt/qemu/vhost-client-1' mode='server'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </interface>

3. start guest
# virsh start vm1
Domain vm1 started

4. check guest live xml

    <interface type='vhostuser'>
      <mac address='52:54:00:93:51:dd'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user2' mode='client'/>
      <target dev='&quot;vhost-user2&quot;
'/>
      <model type='virtio'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:a8:55:a1'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user1' mode='client'/>
      <target dev='&quot;vhost-user1&quot;
'/>
      <model type='virtio'/>
      <alias name='net2'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:70:c2:4b'/>
      <source type='unix' path='/var/lib/libvirt/qemu/vhost-client-1' mode='server'/>
      <target dev=''/>
      <model type='virtio'/>
      <alias name='net3'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </interface>

5. Check the domstats output and find that there are no network stats for the vhostuser interfaces:

# virsh domstats vm1
...
  net.count=4
  net.0.name=vnet5
  net.0.rx.bytes=10108
  net.0.rx.pkts=178
  net.0.rx.errs=0
  net.0.rx.drop=0
  net.0.tx.bytes=1860
  net.0.tx.pkts=11
  net.0.tx.errs=0
  net.0.tx.drop=0
  net.1.name="vhost-user2"

  net.2.name="vhost-user1"

  net.3.name=
  block.count=0
...

6. check libvirtd debug log:

2020-12-16 07:25:50.999+0000: 28657: debug : virCommandRunAsync:2609 : About to run ovs-vsctl --timeout=5 get Interface vhost-user2 name
2020-12-16 07:25:50.999+0000: 28657: debug : virCommandRunAsync:2612 : Command result 0, with PID 28780
2020-12-16 07:25:51.006+0000: 28657: debug : virCommandRun:2454 : Result exit status 0, stdout: '"vhost-user2"
' stderr: ''
2020-12-16 07:25:51.006+0000: 28657: debug : virCommandRunAsync:2609 : About to run ovs-vsctl --timeout=5 get Interface vhost-user1 name
2020-12-16 07:25:51.007+0000: 28657: debug : virCommandRunAsync:2612 : Command result 0, with PID 28783
2020-12-16 07:25:51.012+0000: 28657: debug : virCommandRun:2454 : Result exit status 0, stdout: '"vhost-user1"
' stderr: ''
2020-12-16 07:25:51.012+0000: 28657: debug : virCommandRunAsync:2609 : About to run ovs-vsctl --timeout=5 --no-headings --columns=name find Interface options:vhost-server-path=path
2020-12-16 07:25:51.013+0000: 28657: debug : virCommandRunAsync:2612 : Command result 0, with PID 28785
2020-12-16 07:25:51.018+0000: 28657: debug : virCommandRun:2454 : Result exit status 0, stdout: '' stderr: ''
...

2020-12-16 07:31:10.255+0000: 28656: debug : virCommandRunAsync:2609 : About to run ovs-vsctl --timeout=5 --if-exists --format=list --data=json --no-headings --columns=statistics list Interface '"vhost-user2"
'
2020-12-16 07:31:10.256+0000: 28656: debug : virCommandRunAsync:2612 : Command result 0, with PID 28983
2020-12-16 07:31:10.262+0000: 28656: debug : virCommandRun:2454 : Result status 0, stdout: '' stderr: ''
2020-12-16 07:31:10.262+0000: 28656: error : virNetDevOpenvswitchInterfaceStats:389 : internal error: Interface not found
2020-12-16 07:31:10.262+0000: 28656: debug : virCommandRunAsync:2609 : About to run ovs-vsctl --timeout=5 --if-exists --format=list --data=json --no-headings --columns=statistics list Interface '"vhost-user1"
'
2020-12-16 07:31:10.263+0000: 28656: debug : virCommandRunAsync:2612 : Command result 0, with PID 28984
2020-12-16 07:31:10.268+0000: 28656: debug : virCommandRun:2454 : Result status 0, stdout: '' stderr: ''
2020-12-16 07:31:10.268+0000: 28656: error : virNetDevOpenvswitchInterfaceStats:389 : internal error: Interface not found
2020-12-16 07:31:10.268+0000: 28656: debug : virCommandRunAsync:2609 : About to run ovs-vsctl --timeout=5 --if-exists --format=list --data=json --no-headings --columns=statistics list Interface ''
2020-12-16 07:31:10.269+0000: 28656: debug : virCommandRunAsync:2612 : Command result 0, with PID 28985
2020-12-16 07:31:10.274+0000: 28656: debug : virCommandRun:2454 : Result status 0, stdout: '' stderr: ''
2020-12-16 07:31:10.274+0000: 28656: error : virNetDevOpenvswitchInterfaceStats:389 : internal error: Interface not found


You can see that the root cause is that libvirt doesn't get the right name of the dpdkvhostuserclient and dpdkvhostuser ports.

Comment 36 Michal Privoznik 2020-12-16 15:28:10 UTC
Okay, so the problem here is that the string escaping has changed in ovs-vsctl. What I developed my patches on was version 2.14.0, which doesn't escape a string if it contains only alphanumeric characters, underscore, dash or dot. Older versions (e.g. 2.11.4) allow only alpha, underscore, dash and dot. Hence, vhost-user1 (which contains a digit) might get escaped with one version but not with the other. I need to think about how to unescape this properly.

One more thing: ovs-vsctl successfully ignores output --format requests for "get Interface $if name". No matter what output format I choose, I always get the single string, sometimes escaped and sometimes not. Therefore, making it output JSON and parsing it back is not an option.
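
For illustration, this is the shape of the mismatch on the older ovs-vsctl (the first output matches the debug log above; the sed is just an ad-hoc way to strip the quotes, not what the libvirt fix does):

  # ovs-vsctl --timeout=5 get Interface vhost-user1 name
  "vhost-user1"
  # ovs-vsctl --timeout=5 get Interface vhost-user1 name | sed -e 's/^"//' -e 's/"$//'
  vhost-user1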

Comment 37 Michal Privoznik 2020-12-16 18:46:41 UTC
Patches proposed upstream:

https://www.redhat.com/archives/libvir-list/2020-December/msg00761.html

Comment 39 Michal Privoznik 2020-12-17 08:35:43 UTC
Merged upstream:

51d9af4c0c virnetdevopenvswitch: Try to unescape ovs-vsctl reply in one specific case
0dd029b7f2 virNetDevOpenvswitchGetVhostuserIfname: Actually use @path to lookup interface

v6.10.0-226-g51d9af4c0c

Comment 40 Luyao Huang 2021-01-19 02:34:51 UTC
Verified this bug with libvirt-daemon-7.0.0-1.module+el8.4.0+9464+3e71831a.x86_64:

S1: Use virsh domstats to get vhost-user vNIC stats

1. Set up vhost-user server and client ports using Open vSwitch (a sketch of the setup commands follows the output below):

# ovs-vsctl show
465a1f46-5e3a-4518-9f3e-9e02bdaa242a
    Bridge "ovsbr0"
        Port "vhost-user1"
            Interface "vhost-user1"
                type: dpdkvhostuser
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "vhost-client-1"
            Interface "vhost-client-1"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/libvirt/qemu/vhost-client-1"}
        Port "vhost-user2"
            Interface "vhost-user2"
                type: dpdkvhostuser
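
For reference, roughly the kind of commands that would produce this setup (bridge and port names taken from the output above; DPDK datapath configuration omitted):

  # ovs-vsctl add-port ovsbr0 vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuser
  # ovs-vsctl add-port ovsbr0 vhost-user2 -- set Interface vhost-user2 type=dpdkvhostuser
  # ovs-vsctl add-port ovsbr0 vhost-client-1 -- set Interface vhost-client-1 type=dpdkvhostuserclient options:vhost-server-path=/var/lib/libvirt/qemu/vhost-client-1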

2. Prepare a guest which uses different vhost-user interfaces and has no target elements:

# virsh dumpxml vm1
...
    <interface type='vhostuser'>
      <mac address='52:54:00:93:51:dd'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user2' mode='client'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:a8:55:a1'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user1' mode='client'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:70:c2:4b'/>
      <source type='unix' path='/var/lib/libvirt/qemu/vhost-client-1' mode='server'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </interface>
...

3. start guest
# virsh start vm1
Domain 'vm1' started

4. use virsh domstats
# virsh domstats
Domain: 'vm1'
...
  net.0.name=vhost-user2
  net.0.rx.bytes=0
  net.0.rx.pkts=0
  net.0.rx.drop=0
  net.0.tx.bytes=0
  net.0.tx.pkts=0
  net.0.tx.errs=0
  net.0.tx.drop=0
  net.1.name=vhost-user1
  net.1.rx.bytes=0
  net.1.rx.pkts=0
  net.1.rx.drop=0
  net.1.tx.bytes=0
  net.1.tx.pkts=0
  net.1.tx.errs=0
  net.1.tx.drop=0
  net.2.name=vhost-client-1
  net.2.rx.bytes=0
  net.2.rx.pkts=0
  net.2.rx.drop=0
  net.2.tx.bytes=0
  net.2.tx.pkts=0
  net.2.tx.errs=0
  net.2.tx.drop=0
  block.count=0

5. Check the active XML; each vhost-user interface has the right target name:

# virsh dumpxml vm1
...
    <interface type='vhostuser'>
      <mac address='52:54:00:93:51:dd'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user2' mode='client'/>
      <target dev='vhost-user2'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:a8:55:a1'/>
      <source type='unix' path='/var/run/openvswitch/vhost-user1' mode='client'/>
      <target dev='vhost-user1'/>
      <model type='virtio'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:70:c2:4b'/>
      <source type='unix' path='/var/lib/libvirt/qemu/vhost-client-1' mode='server'/>
      <target dev='vhost-client-1'/>
      <model type='virtio'/>
      <alias name='net2'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </interface>
...

Comment 42 errata-xmlrpc 2021-05-25 06:41:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2098

