Bug 1525446 - Host dpdk's testpmd "Segmentation fault" when migrating VM with vhost-user and packets flow
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: dpdk
Version: 7.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Aaron Conole
QA Contact: Pei Zhang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-12-13 10:54 UTC by Pei Zhang
Modified: 2018-04-10 23:59 UTC (History)

Fixed In Version: dpdk-17.11-6.el7.x86_64
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-10 23:59:23 UTC
Target Upstream Version:
Embargoed:


Attachments
XML of VM (2.97 KB, text/plain)
2017-12-13 10:54 UTC, Pei Zhang


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2018:1065 0 None None None 2018-04-10 23:59:47 UTC

Description Pei Zhang 2017-12-13 10:54:28 UTC
Created attachment 1367272
XML of VM

Description of problem:
Boot dpdk's testpmd with vhost-user on the host. Next, boot a VM using the same vhost-user socket. Then generate packets from another host to this VM.

After the migration finishes, testpmd crashes with "Segmentation fault".


Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.10.0-12.el7.x86_64
3.10.0-820.el7.x86_64
dpdk-17.11-3.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Boot testpmd on the source and destination hosts
# /usr/bin/testpmd -l 2,4,6,8,10,12,14 \
--socket-mem 1024,1024 \
-n 4  \
--vdev 'net_vhost0,iface=/tmp/vhost-user1,client=1'  \
--vdev 'net_vhost1,iface=/tmp/vhost-user2,client=1'  \
--vdev 'net_vhost2,iface=/tmp/vhost-user3,client=1'  \
-- \
--portmask=3F \
--disable-hw-vlan \
-i \
--rxq=1 --txq=1 \
--nb-cores=6 \
--forward-mode=io

2. Boot VM 
See the attachment to this comment.


3. Start testpmd in guest

modprobe vfio enable_unsafe_noiommu_mode=Y
modprobe vfio-pci

# /usr/bin/testpmd \
-l 1,2,3 \
-n 4 \
-d /usr/lib64/librte_pmd_virtio.so.1 \
-w 0000:00:03.0 -w 0000:00:06.0 \
-- \
--nb-cores=2 \
--disable-hw-vlan \
-i \
--disable-rss \
--rxq=1 --txq=1

4. Generate packets from another host
 ./build/MoonGen examples/l2-load-latency.lua 0 1 64


5. Migrate the VM from the source host to the destination host
# virsh migrate --verbose --persistent --live rhel7.5_nonrt qemu+ssh://192.168.1.2/system

6. After the migration finishes, dpdk's testpmd quits with "Segmentation fault". There is also error output in dmesg.

# dmesg
[16105.282031] testpmd[24507]: segfault at 24 ip 00007fef3bcf42d7 sp 00007fef333f6c40 error 4 in librte_vhost.so.4[7fef3bcef000+10000]
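
The dmesg line gives both the faulting instruction pointer and the base address at which librte_vhost.so.4 is mapped, so the offset of the crash inside the library can be derived by subtraction. A minimal sketch, assuming a POSIX shell; the helper name fault_offset is hypothetical and the two values are copied from the line above:

```shell
# Compute the offset of a faulting instruction pointer inside a mapped library.
# Arguments: <ip> <base>, both in hex without the 0x prefix, as the kernel prints them.
fault_offset() {
    printf '0x%x\n' $(( 0x$1 - 0x$2 ))
}

# ip and mapping base taken from the dmesg line in this report.
fault_offset 00007fef3bcf42d7 7fef3bcef000   # prints 0x52d7
```

An offset like this is the kind of value that symbolization tools such as addr2line accept, assuming the mapping starts at the beginning of the library file and matching debuginfo is installed.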


Actual results:
dpdk's testpmd quits unexpectedly.


Expected results:
dpdk's testpmd should keep running without errors during and after migration.


Additional info:

Comment 2 Pei Zhang 2017-12-14 09:35:24 UTC
Additional info:

1. This is a regression.
dpdk-16.11.2-6.el7.x86_64 works well.


2. More details about the error.
(1) After migrating from the source to the destination host, dpdk's testpmd on the source host crashes with "Segmentation fault".

(2) Another error in dmesg:
[15979.767740] lcore-slave-4[27362]: segfault at 2 ip 00007fdb728fbab1 sp 00007fdb6d7fe950 error 4
[15979.767813] lcore-slave-6[27363]: segfault at 2 ip 00007fdb728fd93a sp 00007fdb6cffe900 error 4
[15979.767816]  in librte_vhost.so.4[7fdb728f2000+10000]
[15979.790956]  in librte_vhost.so.4[7fdb728f2000+10000]
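
Both records report "error 4" at a very low address. On x86, the low three bits of the page-fault error code say whether the page was present, whether the access was a write, and whether it came from user mode. A minimal sketch that decodes those bits, assuming a POSIX shell; the helper name decode_pf_error is hypothetical:

```shell
# Decode the low three bits of an x86 page-fault error code
# (the number printed after "error" in a kernel segfault record).
decode_pf_error() {
    code="$1"
    if [ $(( code & 1 )) -ne 0 ]; then p="protection violation"; else p="page not present"; fi
    if [ $(( code & 2 )) -ne 0 ]; then w="write"; else w="read"; fi
    if [ $(( code & 4 )) -ne 0 ]; then u="user mode"; else u="kernel mode"; fi
    echo "$u, $w, $p"
}

decode_pf_error 4   # prints: user mode, read, page not present
```

A user-mode read of an unmapped page at address 2 (or 24 in the original description) suggests a read through a NULL or near-NULL pointer, although the log alone does not prove which pointer was involved.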

Comment 6 Maxime Coquelin 2018-01-29 14:09:12 UTC
Looking at the logs, it crashed within the same second that SET_VRING_ADDR was
being handled while the device was running.

I already reproduced such a crash with DPDK v17.11 and posted a fix for this
specific one [0]. However, this patch has been discarded, as Victor has
fixed asynchronous virtio_net struct changes more generally by introducing a new lock.

Victor's patch has been accepted upstream and queued for the v17.11 LTS release:
https://dpdk.org/dev/patchwork/patch/33921/
This patch is also needed for Bz1450680.

Adding Victor to cc.

Regards,
Maxime
[0]: http://dpdk.org/dev/patchwork/patch/31659/

Comment 8 Pei Zhang 2018-02-07 05:16:14 UTC
==Verification==

Versions:
3.10.0-843.el7.x86_64
qemu-kvm-rhev-2.10.0-19.el7.x86_64
libvirt-3.9.0-11.el7.x86_64
dpdk-17.11-7.el7.x86_64

Steps:
Followed the steps in the Description.

All 10 migration runs work well; both dpdk and the guest keep working, with no errors.

So this bug has been fixed. Moving status to 'VERIFIED'.
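
A check for kernel segfault records after each migration run could be scripted roughly as follows. This is a minimal sketch, not the procedure actually used for the verification above: the helper name count_segfaults is hypothetical, and it only looks for the kernel's "segfault at" records, not for other failure modes.

```shell
# Count kernel segfault records in dmesg-style text read from stdin.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# helper usable under "set -e"; the count (possibly 0) is still printed.
count_segfaults() {
    grep -c 'segfault at' || true
}

# Example with the record quoted in the Description; prints 1.
printf '%s\n' \
  '[16105.282031] testpmd[24507]: segfault at 24 ip 00007fef3bcf42d7 sp 00007fef333f6c40 error 4 in librte_vhost.so.4[7fef3bcef000+10000]' \
  | count_segfaults
```

On a live host the input would come from `dmesg | count_segfaults`, run once after each migration; a result of 0 on every run corresponds to the "no errors" outcome reported here.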

Comment 11 errata-xmlrpc 2018-04-10 23:59:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1065

