Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1618791

Summary: [OSP-10] Fix for vhost-user backend crash on SET_MEM_TABLE request handling while port enabled
Product: Red Hat OpenStack
Reporter: atelang <atelang>
Component: openvswitch
Assignee: Maxime Coquelin <maxime.coquelin>
Status: CLOSED ERRATA
QA Contact: Yariv <yrachman>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 10.0 (Newton)
CC: agurenko, apevec, aschultz, atelang, cfontain, chrisw, dbecker, fbaudin, fleitner, jschluet, lhh, mburns, morazi, ovs-team, qding, rhos-maint, slinaber, srevivo, supadhya, yrachman
Target Milestone: z9
Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openvswitch-2.9.0-56.el7fdp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1618788
Environment:
Last Closed: 2018-09-17 17:01:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1618488, 1618788
Bug Blocks:

Comment 7 Alex McLeod 2018-09-03 07:58:39 UTC
Hi there,

If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field.

The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Thanks,
Alex

Comment 9 Yariv 2018-09-11 21:12:28 UTC
This is verified with the following puddle:

cat core_puddle_version 
2018-08-21.2

Guest with vfio and direct ports attached:
[stack@undercloud-0 ~]$ openstack server list --all
+----------------------+----------------------+--------+----------------------+------------------------+
| ID                   | Name                 | Status | Networks             | Image Name             |
+----------------------+----------------------+--------+----------------------+------------------------+
| fd42c063-c06d-499e-  | tempest-             | ACTIVE | sriov-1=10.50.135.10 | rhel-guest-image-7.5-1 |
| 95ef-dc2159de17bd    | TestNfvBasic-        |        | 6;                   | 80.x86_64.qcow2        |
|                      | server-1217997283    |        | data1=10.10.135.101, |                        |
|                      |                      |        | 10.35.141.22         |                        |

yum update kernel
uname -a
Linux tempest-testnfvbasic-server-1217997283 3.10.0-862.11.6.el7.x86_64

yum -y install gcc gcc-c++ make cmake glibc-devel glibc-headers kernel-devel kernel-headers

git clone git://dpdk.org/dpdk
cd dpdk
git checkout v17.11

Set these entries to "n" in the config/common_linuxapp file:
CONFIG_RTE_KNI_KMOD=n
CONFIG_RTE_LIBRTE_KNI=n

export RTE_SDK=$PWD
export RTE_TARGET=x86_64-native-linuxapp-gcc
make config T=x86_64-native-linuxapp-gcc O=x86_64-native-linuxapp-gcc
make O=x86_64-native-linuxapp-gcc -j 1

dpdk-17.11-11.el7.x86_64 
 
lspci
00:05.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 01)

Load the vfio module and bind the SR-IOV port to DPDK:
modprobe -r vfio
modprobe -v vfio enable_unsafe_noiommu_mode=1
modprobe -v vfio-pci

cd dpdk/usertools/
./dpdk-devbind.py --bind vfio-pci 00:05.0
./dpdk-devbind.py --status 

Network devices using DPDK-compatible driver
============================================
0000:00:05.0 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused=


Chris, Franck: is this enough to mark this BZ as verified?
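
The binding check above can also be scripted. The following is only a minimal sketch, assuming the `--status` output keeps the format pasted above (a `drv=<name>` token on the device line); `bound_driver` is a hypothetical helper name and is not part of DPDK.

```shell
#!/usr/bin/env bash
# bound_driver PCI_ADDR
# Hypothetical helper: reads `dpdk-devbind.py --status` text on stdin and
# prints the driver named in the drv= token of the matching device line.
# Prints nothing if the device is not listed or has no drv= token.
bound_driver() {
    awk -v addr="$1" '
        index($1, addr) && match($0, /drv=[^ ]+/) {
            # Skip the 4 characters of "drv=" and print the driver name.
            print substr($0, RSTART + 4, RLENGTH - 4)
            exit
        }'
}

# Sample line copied from the status output in this comment.
sample="0000:00:05.0 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused="
driver=$(printf '%s\n' "$sample" | bound_driver "00:05.0")
echo "$driver"
```

On the guest, the same function could be fed live output, for example `./dpdk-devbind.py --status | bound_driver 00:05.0`, and the result compared against `vfio-pci`.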

Comment 10 Franck Baudin 2018-09-12 07:47:10 UTC
# downgrading ovs-vswitchd to openvswitch-2.9.0-54

[root@overcloud-compute-0 openvswitch]# ps awwx | grep ovs-vswitchd

  29537 ?        S<Lsl  33:37 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach

  32444 pts/3    S+     0:00 grep --color=auto ovs-vswitchd

# Starting testpmd in the VM

[root@overcloud-compute-0 openvswitch]# ps awwx | grep ovs-vswitchd

  32619 ?        S<     0:00 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach

  32620 ?        R<Lsl   0:02 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach

  32627 pts/3    S+     0:00 grep --color=auto ovs-vswitchd

# dmesg shows the crash

[ 7109.687965] pmd76[29665]: segfault at 2aad33745002 ip 000055a58741391c sp 00007fe5f9ffa500 error 4 in ovs-vswitchd[55a58721a000+4b8000]
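
To map the faulting instruction back to a symbol, the offset inside the binary can be derived from the `ip` value and the mapping base in the segfault line. This is a sketch only: it assumes the reported mapping starts at file offset 0 of a position-independent `ovs-vswitchd` binary, and the symbol lookup (left as a comment) additionally assumes matching debuginfo is installed and that the binary lives at the path shown.

```shell
#!/usr/bin/env bash
# Values copied from the dmesg segfault line above.
ip=0x000055a58741391c      # instruction pointer at the time of the fault
base=0x55a58721a000        # start address of the ovs-vswitchd mapping
size=0x4b8000              # length of that mapping

# Offset of the faulting instruction relative to the start of the mapping.
offset=$(( $ip - $base ))
printf 'offset=0x%x\n' "$offset"

# Sanity check: the offset must fall inside the mapping.
if [ "$offset" -lt $(( $size )) ]; then
    echo "offset is inside the mapping"
fi

# Possible follow-up (binary path is an assumption, not taken from this bug):
#   addr2line -f -e /usr/sbin/ovs-vswitchd 0x1f991c
```

For the values in this report the script prints `offset=0x1f991c`, which is below the mapping length of `0x4b8000`.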


With the latest OVS included in z9, the same sequence does not lead to a crash, and the ovs-vswitchd PID stays the same.
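
The PID comparison used above can be automated. The following is a minimal sketch, assuming the pidfile path shown in the `ps` output (`/var/run/openvswitch/ovs-vswitchd.pid`); `pid_unchanged` is a hypothetical helper name, and the demonstration uses a temporary file so it can run anywhere.

```shell
#!/usr/bin/env bash
# pid_unchanged BEFORE_PID PIDFILE
# Hypothetical helper: succeeds only if PIDFILE still holds BEFORE_PID,
# i.e. the daemon was not restarted between the two observations.
# Returns 2 if the pidfile is missing or unreadable.
pid_unchanged() {
    before="$1"
    pidfile="$2"
    [ -r "$pidfile" ] || return 2
    after=$(tr -d '[:space:]' < "$pidfile")
    [ "$before" = "$after" ]
}

# Self-contained demonstration with a temporary pidfile. On a compute node
# the second argument would be /var/run/openvswitch/ovs-vswitchd.pid.
demo=$(mktemp)
echo 29537 > "$demo"
pid_unchanged 29537 "$demo" && echo "same PID"
echo 32620 > "$demo"        # simulate the restart seen after the crash
pid_unchanged 29537 "$demo" || echo "PID changed"
rm -f "$demo"
```

In a real run, the PID would be recorded before starting testpmd in the VM and the check repeated afterwards; a changed PID corresponds to the restart observed with openvswitch-2.9.0-54.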

Comment 12 errata-xmlrpc 2018-09-17 17:01:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2671