Bug 1732991 - DPDK in guest causes vhostuser to go down
Summary: DPDK in guest causes vhostuser to go down
Keywords:
Status: CLOSED DUPLICATE of bug 1548112
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch
Version: FDP 19.D
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Maxime Coquelin
QA Contact: qding
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-07-24 22:07 UTC by Marc Methot
Modified: 2020-10-19 09:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-19 09:33:36 UTC
Target Upstream Version:
Embargoed:



Description Marc Methot 2019-07-24 22:07:13 UTC
When the DPDK (version 2.2) application (6WIND) within the guest (RHEL 7.2) starts, it causes the host's ovs-vswitchd process to segfault.

sos_commands/logs/journalctl_--no-pager_--catalog_--boot
~~~
Jul 20 09:20:40 compute-6 kernel: ovs-vswitchd[629694]: segfault at 0 ip 00007fae81105016 sp 00007ffd87853608 error 4 in libc-2.17.so[7fae80fc7000+1c2000]
Jul 20 09:20:41 compute-6 systemd[1]: ovs-vswitchd.service: main process exited, code=killed, status=11/SEGV
~~~
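For readers triaging a similar crash, the kernel line above can be decoded mechanically. The following is a minimal sketch, assuming the usual x86 `segfault at ... ip ... sp ... error ...` message layout; the function name `decode_segfault` is made up for illustration. It computes the offset of the faulting instruction inside the mapped library and interprets the low three bits of the page-fault error code.

```python
import re

# Regex for the x86 kernel segfault message layout seen in the journal excerpt.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Return a dict describing a kernel segfault log line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    info = {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        # x86 page-fault error code: bit 0 = protection violation,
        # bit 1 = write access, bit 2 = fault raised in user mode.
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }
    if m["obj"]:
        info["object"] = m["obj"]
        # Offset of the faulting instruction relative to the mapping base.
        info["offset"] = int(m["ip"], 16) - int(m["base"], 16)
    return info


line = ("Jul 20 09:20:40 compute-6 kernel: ovs-vswitchd[629694]: segfault at 0 "
        "ip 00007fae81105016 sp 00007ffd87853608 error 4 in "
        "libc-2.17.so[7fae80fc7000+1c2000]")
info = decode_segfault(line)
print(info["process"], info["access"], info["cause"], hex(info["offset"]))
```

For this log line the result is a user-mode read of address 0 (a NULL pointer dereference) at offset 0x13e016 inside libc-2.17.so. With matching debuginfo installed, that offset could be resolved to a function with a tool such as `addr2line`.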

With 4 PMDs in the guest it breaks; with 8 it works.

Comment 26 Andreas Karis 2019-08-06 21:49:08 UTC
Reminder: 
* issue is with 2.9.0-103
* The same test is working in openvswitch 2.9.0-56 and openvswitch 2.6

Things don't crash; it's just the port that goes down once the DPDK application in the VM starts. When I unbind the igb_uio driver, the port goes back up.

The issue occurs when the number of guest PMD CPUs is less than 8 while the host has 8 queues: e.g. with 4 guest PMDs and 8 host queues, the port goes down.
If the host has 4 queues, then 4 guest PMDs work fine.
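The three data points reported so far can be written down as a small check. This is only a sketch of the observed pattern, not of the actual vhost-user logic; the function name `port_expected_up` and the generalisation to "guest PMDs >= host queues" are assumptions extrapolated from those three observations.

```python
def port_expected_up(guest_pmds, host_queues):
    """Guess the port state from the pattern observed in this bug.

    Assumption: the port stays up only when the guest runs at least as many
    PMDs as the host has queues. Only the three cases below were reported.
    """
    return guest_pmds >= host_queues


# (guest PMDs, host queues, port observed up?) as reported in this bug
observations = [(4, 8, False), (8, 8, True), (4, 4, True)]
for guest, host, observed_up in observations:
    print(guest, host, port_expected_up(guest, host) == observed_up)
```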

Comment 27 Andreas Karis 2019-08-06 21:52:57 UTC
2.9.0-97 is the earliest version where the customer noticed it.

Comment 31 Andreas Karis 2019-08-06 22:37:31 UTC
We downgraded this system to -56 and could reproduce the issue, so this does not seem to be tied to the specific OVS version. The customer has another cluster on minor release -56 where he cannot reproduce the issue.

Comment 57 Andreas Karis 2019-08-23 15:22:47 UTC
This is from my lab, qemu-kvm-rhev runs from the nova_libvirt container:
[root@computeovsdpdk-0 qemu-test-rpm]# cat /proc/$(pgrep -f 4d498516-2e6a-473e-8595-310319bc5d54)/mountinfo
463 424 0:44 / / rw,relatime - overlay overlay rw,seclabel,lowerdir=/var/lib/docker/overlay2/l/6IWB576M62AOC35CXPKRMC4LAC:/var/lib/docker/overlay2/l/4VYJWYYFGPYEBOALNVSHLJ6VHA:/var/lib/docker/overlay2/l/BPAHILLCHSX2LLAZ6CGWHDKGAH:/var/lib/docker/overlay2/l/UFF6YDIKC7OBCFDYH2QKXVTAWI:/var/lib/docker/overlay2/l/BSNQ34QHDDH4Q4JRCHGNNXSRT7,upperdir=/var/lib/docker/overlay2/61883b7bd6fc3bea9afbc640de3e8b985a9cfd0c47aadafebf2eeb818bec7c7d/diff,workdir=/var/lib/docker/overlay2/61883b7bd6fc3bea9afbc640de3e8b985a9cfd0c47aadafebf2eeb818bec7c7d/work
464 463 0:3 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
465 463 0:17 / /sys rw,nosuid,nodev,noexec,relatime - sysfs sysfs rw,seclabel
466 465 0:15 / /sys/fs/selinux rw,relatime - selinuxfs selinuxfs rw
467 465 0:20 / /sys/fs/cgroup ro,nosuid,nodev,noexec - tmpfs tmpfs ro,seclabel,mode=755
468 467 0:21 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
469 467 0:23 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpuset
470 467 0:24 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,perf_event
471 467 0:25 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,net_prio,net_cls
472 467 0:26 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,devices
473 467 0:27 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpuacct,cpu
474 467 0:28 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory
475 467 0:29 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,pids
476 467 0:30 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,freezer
477 467 0:31 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,hugetlb
478 467 0:32 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,blkio
479 463 0:5 / /dev rw,nosuid - devtmpfs devtmpfs rw,seclabel,size=49061156k,nr_inodes=12265289,mode=755
480 479 0:18 / /dev/shm rw,nosuid,nodev - tmpfs tmpfs rw,seclabel
481 642 0:45 / /dev/shm rw,nosuid,nodev,noexec,relatime - tmpfs shm rw,seclabel,size=65536k
482 642 0:12 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
483 642 0:37 / /dev/hugepages rw,relatime - hugetlbfs hugetlbfs rw,seclabel
484 642 0:14 / /dev/mqueue rw,relatime - mqueue mqueue rw,seclabel
485 642 0:5 /log /dev/log rw,nosuid - devtmpfs devtmpfs rw,seclabel,size=49061156k,nr_inodes=12265289,mode=755
486 463 0:19 / /run rw,nosuid,nodev - tmpfs tmpfs rw,seclabel,mode=755
526 486 0:39 / /run/user/975 rw,nosuid,nodev,relatime - tmpfs tmpfs rw,seclabel,size=13180676k,mode=700,uid=975,gid=971
527 486 0:38 / /run/user/0 rw,nosuid,nodev,relatime - tmpfs tmpfs rw,seclabel,size=13180676k,mode=700
528 486 0:3 / /run/docker/netns/default rw,nosuid,nodev,noexec,relatime - proc proc rw
529 486 8:2 /var/lib/docker/containers/0caf80934ad90f075ce9ccac7a99d755bf8244f07a28145aa5e5843973931932/secrets//deleted /run/secrets rw,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
530 486 0:19 /libvirt /run/libvirt rw,nosuid,nodev - tmpfs tmpfs rw,seclabel,mode=755
531 463 8:2 /usr/lib/modules /usr/lib/modules ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
532 463 8:2 /etc/puppet /etc/puppet ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
533 463 8:2 /etc/libvirt /etc/libvirt rw,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
534 463 8:2 /usr/share/zoneinfo/UTC /usr/share/zoneinfo/UTC ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
535 463 8:2 /etc/hosts /etc/hosts ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
536 463 8:2 /var/lib/docker/containers/0caf80934ad90f075ce9ccac7a99d755bf8244f07a28145aa5e5843973931932/hostname /etc/hostname rw,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
537 463 8:2 /var/lib/docker/containers/0caf80934ad90f075ce9ccac7a99d755bf8244f07a28145aa5e5843973931932/resolv.conf /etc/resolv.conf rw,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
589 463 8:2 /var/log/containers/libvirt /var/log/libvirt rw,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
590 589 8:2 /var/log/libvirt/qemu /var/log/libvirt/qemu ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
591 463 8:2 /var/lib/nova /var/lib/nova rw,relatime master:1 - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
592 463 8:2 /etc/ssh/ssh_known_hosts /etc/ssh/ssh_known_hosts ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
593 463 8:2 /var/lib/libvirt /var/lib/libvirt rw,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
594 463 8:2 /var/lib/vhost_sockets /var/lib/vhost_sockets rw,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
595 463 8:2 /etc/pki/ca-trust/extracted /etc/pki/ca-trust/extracted ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
596 595 8:2 /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
597 596 8:2 /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
598 595 8:2 /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
599 463 8:2 /var/lib/config-data/puppet-generated/nova_libvirt /var/lib/kolla/config_files/src ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
600 463 8:2 /etc/pki/ca-trust/source/anchors /etc/pki/ca-trust/source/anchors ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
601 463 8:2 /var/lib/kolla/config_files/nova_libvirt.json /var/lib/kolla/config_files/config.json ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
641 463 8:2 /etc/ceph /var/lib/kolla/config_files/src-ceph ro,relatime - xfs /dev/sda2 rw,seclabel,attr2,inode64,noquota
642 479 0:85 / /dev rw,nosuid,relatime - tmpfs devfs rw,seclabel,size=64k,mode=755
[root@computeovsdpdk-0 qemu-test-rpm]#

[root@computeovsdpdk-0 qemu-test-rpm]# /var/lib/docker/overlay2/l/4VYJWYYFGPYEBOALNVSHLJ6VHA/usr/libexec/qemu-kvm --version
QEMU emulator version 2.12.0 (qemu-kvm-rhev-2.12.0-18.el7_6.1)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
[root@computeovsdpdk-0 qemu-test-rpm]#
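The lookup above (read the container's mountinfo, pick an overlay lower directory, run the binary found there) can be partly automated. Below is a minimal sketch, assuming the mountinfo field layout documented in proc(5); the helper name `overlay_lowerdirs` is hypothetical, and paths containing commas or colons are not handled.

```python
def overlay_lowerdirs(mountinfo_line):
    """Return the lowerdir paths of an overlay mount, or [] for any other line."""
    # Fields before " - " are per-mount; fields after it are
    # filesystem type, mount source, and super options.
    _, sep, post = mountinfo_line.partition(" - ")
    if not sep:
        return []
    parts = post.split(" ", 2)
    if len(parts) != 3 or parts[0] != "overlay":
        return []
    for opt in parts[2].strip().split(","):
        if opt.startswith("lowerdir="):
            # Overlay layers are colon-separated.
            return opt[len("lowerdir="):].split(":")
    return []


# Shortened version of the first mountinfo line above (two layers kept).
sample = ("463 424 0:44 / / rw,relatime - overlay overlay rw,seclabel,"
          "lowerdir=/var/lib/docker/overlay2/l/6IWB576M62AOC35CXPKRMC4LAC:"
          "/var/lib/docker/overlay2/l/4VYJWYYFGPYEBOALNVSHLJ6VHA,"
          "upperdir=/tmp/upper,workdir=/tmp/work")
for layer in overlay_lowerdirs(sample):
    print(layer + "/usr/libexec/qemu-kvm")
```

Each printed path is only a candidate location; the binary exists in whichever layer actually ships the qemu-kvm-rhev package, so each one still has to be checked on disk.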

Comment 59 Maxime Coquelin 2020-10-19 09:33:36 UTC
Hi Marc and Andreas,

My understanding is that 6WIND has backported the DPDK patches mentioned in Comment 51 into their VNF,
and so the issue is no longer reproduced.

Regarding QEMU/OVS-DPDK solution mentioned in Comment 40:
"
Regarding the solutions, the ideal one would be that we have a new vhost-user protocol
feature to notify the backend that the driver set DRIVER_OK. But it would take time
to get the spec accepted upstream and also get the implementation done, merged upstream
and implemented.
"
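For context, DRIVER_OK is one bit of the virtio device status field. The sketch below uses the bit values from the virtio specification; the helper `driver_ready` is hypothetical and only illustrates the kind of check a backend could make once it is told the status. It is not the OVS-DPDK or QEMU implementation.

```python
# Device status bits as defined in the virtio specification.
ACKNOWLEDGE = 1
DRIVER = 2
DRIVER_OK = 4
FEATURES_OK = 8
DEVICE_NEEDS_RESET = 64
FAILED = 128


def driver_ready(status):
    """True if the guest driver has set DRIVER_OK and no failure bit is set."""
    if status & (FAILED | DEVICE_NEEDS_RESET):
        return False
    return bool(status & DRIVER_OK)


# Status before and after the driver finishes initialisation.
negotiating = ACKNOWLEDGE | DRIVER | FEATURES_OK
running = negotiating | DRIVER_OK
print(driver_ready(negotiating), driver_ready(running))
```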

This is addressed in Bz1548112, and will be available in OVS-DPDK v2.14.

I propose to close as duplicate of Bz1548112 for the QEMU/OVS-DPDK part.

*** This bug has been marked as a duplicate of bug 1548112 ***

