Bug 1304507

Summary: A multi-queue vhostuser failed to start testpmd due to port start failure
Product: Red Hat Enterprise Linux 7
Reporter: Jean-Tsung Hsiao <jhsiao>
Component: openvswitch-dpdk
Assignee: Flavio Leitner <fleitner>
Status: CLOSED WORKSFORME
QA Contact: Jean-Tsung Hsiao <jhsiao>
Severity: high
Docs Contact:
Priority: high
Version: 7.3
CC: aloughla, fleitner, jhsiao, kzhang, rcain
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-04-12 14:36:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jean-Tsung Hsiao 2016-02-03 20:55:29 UTC
Description of problem: A multi-queue vhostuser failed to start testpmd due to port start failure.

. testpmd-mq.sh
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Detected lcore 3 as core 0 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 4 lcore(s)
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Setting up physically contiguous memory...
EAL: Ask a virtual area of 0x1000000 bytes
EAL: Virtual area found at 0x7f33bea00000 (size = 0x1000000)
EAL: Ask a virtual area of 0x1f000000 bytes
EAL: Virtual area found at 0x7f339f800000 (size = 0x1f000000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f339f400000 (size = 0x200000)
EAL: Ask a virtual area of 0x1fc00000 bytes
EAL: Virtual area found at 0x7f337f600000 (size = 0x1fc00000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f337f200000 (size = 0x200000)
EAL: Requesting 512 pages of size 2MB from socket 0
EAL: TSC frequency is ~3399993 KHz
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: open shared lib /usr/lib64/librte_pmd_virtio.so
EAL: Master lcore 0 is ready (tid=c439e8c0;cpuset=[0])
EAL: lcore 1 is ready (tid=7e7f8700;cpuset=[1])
EAL: lcore 2 is ready (tid=7dff7700;cpuset=[2])
EAL: lcore 3 is ready (tid=7d7f6700;cpuset=[3])
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL:   probe driver: 1af4:1000 rte_virtio_pmd
EAL: PCI device 0000:00:09.0 on NUMA socket -1
EAL:   probe driver: 1af4:1000 rte_virtio_pmd
Interactive-mode selected
Configuring Port 0 (socket 0)
Fail to start port 0
Configuring Port 1 (socket 0)

NOTE: Hitting the return key does not bring the prompt back. CTRL-C is needed to get it back.




Version-Release number of selected component (if applicable):
* On vhostuser guest
[root@localhost jhsiao]# rpm -qa | grep dpdk
dpdk-tools-2.1.0-5.el7.x86_64
dpdk-2.1.0-5.el7.x86_64

* On host
[root@netqe5 jhsiao]# rpm -qa | grep dpdk
dpdk-2.2.0-1.el7.x86_64
dpdk-tools-2.2.0-1.el7.x86_64
[root@netqe5 jhsiao]# rpm -qa | grep dpdk
dpdk-2.2.0-1.el7.x86_64
dpdk-tools-2.2.0-1.el7.x86_64
[root@netqe5 jhsiao]# rpm -qa | grep openvswitch
openvswitch-2.5.90-1.el7.x86_64
[root@netqe5 jhsiao]# uname -a
Linux netqe5.knqe.lab.eng.bos.redhat.com 3.10.0-327.el7.x86_64 #1 SMP Thu Oct 29 17:29:29 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux


How reproducible: Reproducible


Steps to Reproduce:
1. On the host, configure an OVS with two dpdk ports and two vhu ports.
2. Configure a vhostuser guest with four CPUs and two 4-queue vhostuser interfaces.
3. Run "ethtool -L eth0 combined 4; ethtool -L eth1 combined 4"
4. Run testpmd with two cores and two queues, as follows:
[root@localhost jhsiao]# cat testpmd-mq.sh
queues=2
cores=$queues
#cores=2
testpmd -l 0,1,2,3 -n 1 -d /usr/lib64/librte_pmd_virtio.so  \
        -w 0000:00:03.0 \
        -w 0000:00:09.0 \
        -- \
        --nb-cores=${cores} \
        --disable-hw-vlan -i \
        --rxq=${queues} --txq=${queues}


Actual results:
Fail to start port 0

Expected results:
Both ports should be started successfully.

Additional info:

Comment 2 Flavio Leitner 2016-02-15 16:36:13 UTC
Same here.
I could reproduce this issue with vhost MQ (not vhost-user).
Could you please confirm whether you can reproduce it as well?

If so, that means MQ is broken for testpmd with virtio, and the problem is not related to the vhost-user MQ implementation in the host.

Thanks,
fbl

Comment 3 Jean-Tsung Hsiao 2016-02-18 18:06:51 UTC
(In reply to Flavio Leitner from comment #2)
> Same here.
> I could reproduce this issue with vhost MQ (not vhost-user)
> Could you please confirm if you also can reproduce?

There is no such issue with vhostuser. NOTE: I am using exactly the same host OVS vhostuser configuration. The only change is removing the <driver queues='4'> section from the xml file.

> 
> If so, that means MQ is broken for testpmd and virtio and is not related to
> vhost-user MQ implementation in the host.

Agree.

> 
> Thanks,
> fbl

Comment 4 Flavio Leitner 2016-04-08 22:34:35 UTC
> 3.Run "ethtool -L eth0 combined 4;ethtool -L eth1 combined 4"

Please skip that command. DPDK does not need ethtool, and the command might change the driver in a way that DPDK does not expect.

> testpmd -l 0,1,2,3 -n 1 -d /usr/lib64/librte_pmd_virtio.so  \

Please also add the number of descriptors, which is needed for virtio:
--rxd=256 --txd=256 
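
As a rough sketch, the reproducer with those two options appended could look like the following. Everything except --rxd and --txd is copied from testpmd-mq.sh above; the rxd/txd variables and the testpmd_cmdline helper are illustrative names only, and the helper just prints the command line so it can be checked before running.

```shell
# Sketch of testpmd-mq.sh with descriptor counts added (not verified on the guest).
queues=2
cores=$queues
rxd=256   # RX descriptors per queue, value from this comment
txd=256   # TX descriptors per queue, value from this comment

# Hypothetical helper: print the full testpmd command line for inspection.
testpmd_cmdline() {
    echo "testpmd -l 0,1,2,3 -n 1 -d /usr/lib64/librte_pmd_virtio.so" \
         "-w 0000:00:03.0 -w 0000:00:09.0 --" \
         "--nb-cores=${cores} --disable-hw-vlan -i" \
         "--rxq=${queues} --txq=${queues} --rxd=${rxd} --txd=${txd}"
}

testpmd_cmdline
```

On the guest, the printed line can be pasted into the shell or run with eval "$(testpmd_cmdline)".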

> Version-Release number of selected component (if applicable):
> * On vhostuser guest
> [root@localhost jhsiao]# rpm -qa | grep dpdk
> dpdk-tools-2.1.0-5.el7.x86_64
> dpdk-2.1.0-5.el7.x86_64

Please update to DPDK 2.2.0, which supports multiple queues.
We don't support DPDK 2.1.0 anymore.
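
A quick way to check the guest before rerunning the test might be something like the sketch below. The version_ge helper is a made-up name and relies on GNU sort -V; the package name dpdk is taken from the rpm output quoted above.

```shell
# Hypothetical helper: succeeds when version $1 >= version $2 (needs GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=2.2.0

# Query the installed dpdk version; stays empty if rpm or the package is missing.
installed=""
if command -v rpm >/dev/null 2>&1; then
    installed=$(rpm -q --qf '%{VERSION}' dpdk 2>/dev/null) || installed=""
fi

if [ -z "$installed" ]; then
    echo "dpdk package not found"
elif version_ge "$installed" "$required"; then
    echo "dpdk $installed is new enough"
else
    echo "dpdk $installed is older than $required, please update"
fi
```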

testpmd has a very simple way of allocating CPUs to queues, which is literally one core for each queue. So, for 4 queues doing forwarding between 2 devices, you actually need 8 cores. (There is a plan to improve this algorithm.)
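
The arithmetic behind that number, as a small shell sketch. The variable names are illustrative and this only mirrors the one-core-per-queue rule described above, not testpmd's actual code.

```shell
# One forwarding core per queue, per device.
queues_per_port=4
ports=2
needed_cores=$((queues_per_port * ports))
echo "forwarding cores needed: ${needed_cores}"
```

By the same rule, the 2-queue setup from the reproducer would call for 2 x 2 = 4 forwarding cores.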

Thanks,

Comment 5 Jean-Tsung Hsiao 2016-04-12 14:36:10 UTC
(In reply to Flavio Leitner from comment #4)
> > 3.Run "ethtool -L eth0 combined 4;ethtool -L eth1 combined 4"

Yes, this is not needed for running testpmd.
> 
> Please skip that command as for DPDK we don't need ethtool, but it might
> change the driver in such way that DPDK might not expect.
> 
> > testpmd -l 0,1,2,3 -n 1 -d /usr/lib64/librte_pmd_virtio.so  \
> 
> Please also add the number of descriptors, which is needed for virtio:
> --rxd=256 --txd=256 

Already added.

> 
> > Version-Release number of selected component (if applicable):
> > * On vhostuser guest
> > [root@localhost jhsiao]# rpm -qa | grep dpdk
> > dpdk-tools-2.1.0-5.el7.x86_64
> > dpdk-2.1.0-5.el7.x86_64
> 
> Please update to DPDK 2.2.0 which supports multiple queue.

After replacing DPDK 2.1.0 with 2.2.0, this bug went away.

I'll close this BZ.

> We don't support DPDK 2.1.0 anymore.
> 
> testpmd has a very simple way to allocate CPUs for queues which is literally
> one core for each queue.  So, for 4 queues doing forwarding between 2
> devices you actually need 8 cores. (There is a plan to improve this
> algorithm)
> 
> Thanks,