Bug 1491314 - DPDK networks don't work using RHOS11 docs
Summary: DPDK networks don't work using RHOS11 docs
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openvswitch
Version: 11.0 (Ocata)
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Open vSwitch development team
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-13 13:47 UTC by Jaison Raju
Modified: 2017-09-24 11:58 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-24 11:58:08 UTC
Target Upstream Version:



Description Jaison Raju 2017-09-13 13:47:20 UTC
Description of problem:
The instance is not pingable.
The instance doesn't get an IP address.
With a static IP set on the instance, it still can't ping the gateway in the network namespace.
The same configuration works on RHOS10.


Version-Release number of selected component (if applicable):
RHOS11

How reproducible:
Always

Steps to Reproduce:
1. Deploy a DPDK-based OpenStack infrastructure using VLAN for the DPDK/provider network.

Actual results:
Ping and any other access to the instance never work.

Expected results:


Additional info:
I am not raising this as a documentation bug because I don't know what is actually going wrong, so we need the engineering team to help investigate.

Comment 7 Jaison Raju 2017-09-13 20:50:25 UTC
I reviewed the default templates.
There is not much difference, although I did include vxlan for tunnel networks:
  NeutronTunnelTypes: 'vxlan'
  NeutronNetworkType: 'vlan,vxlan'

But this works well in RHOS10.
I will test this again and update the BZ.

Comment 8 Jaison Raju 2017-09-15 19:09:27 UTC
I redeployed without vxlan, but still got the same result.
One thing I noticed is that the vhu ports below are in the down state and have no MAC address.

ovs-ofctl show br-int
OFPT_FEATURES_REPLY (xid=0x2): dpid:0000fe0b95a56041
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst
 1(int-br-link0): addr:0a:b6:f8:bf:77:d0
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 2(vhu9a3d7764-39): addr:00:00:00:00:00:00
     config:     0
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
 3(vhu1444a55c-4f): addr:00:00:00:00:00:00
     config:     0
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
 LOCAL(br-int): addr:fe:0b:95:a5:60:41
     config:     PORT_DOWN
     state:      LINK_DOWN
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
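
The check described above (vhu ports that are LINK_DOWN and report an all-zero MAC) can be automated when reviewing a sosreport. The following is only a rough sketch: the function names and regular expressions are hypothetical, the sample text is trimmed from the output above, and a flagged port shows a symptom, not its cause.

```python
import re

# Trimmed copy of the port section of the `ovs-ofctl show br-int` output above.
SAMPLE = """\
 1(int-br-link0): addr:0a:b6:f8:bf:77:d0
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 2(vhu9a3d7764-39): addr:00:00:00:00:00:00
     config:     0
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
 3(vhu1444a55c-4f): addr:00:00:00:00:00:00
     config:     0
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
 LOCAL(br-int): addr:fe:0b:95:a5:60:41
     config:     PORT_DOWN
     state:      LINK_DOWN
"""

# Matches a port header line such as " 2(vhu9a3d7764-39): addr:00:00:...".
PORT_RE = re.compile(r"^\s*(\w+)\(([^)]+)\):\s+addr:([0-9a-f:]{17})\s*$")
# Matches the "state:" line that follows a port header.
STATE_RE = re.compile(r"^\s*state:\s+(\S+)")


def parse_ports(text):
    """Return one dict per port with its id, name, MAC and state."""
    ports = []
    current = None
    for line in text.splitlines():
        header = PORT_RE.match(line)
        if header:
            current = {
                "id": header.group(1),
                "name": header.group(2),
                "mac": header.group(3),
                "state": None,
            }
            ports.append(current)
            continue
        state = STATE_RE.match(line)
        if state and current is not None:
            current["state"] = state.group(1)
    return ports


def suspect_vhost_ports(ports):
    """Names of vhu* ports that are LINK_DOWN with an all-zero MAC."""
    return [
        p["name"]
        for p in ports
        if p["name"].startswith("vhu")
        and p["mac"] == "00:00:00:00:00:00"
        and p["state"] == "LINK_DOWN"
    ]


if __name__ == "__main__":
    for name in suspect_vhost_ports(parse_ports(SAMPLE)):
        print(name)
```

The helper reads text only, so it can be pointed at a saved command output from a sosreport rather than a live node.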

Comment 9 Yariv 2017-09-17 06:12:53 UTC
Hi Jaison

Looking at the compute sosreport, I see the following in the compute log file
/var/log/tuned/tuned.log:

2017-09-10 15:01:35,259 INFO     tuned.daemon.controller: terminating controller
2017-09-10 15:01:35,260 INFO     tuned.daemon.daemon: stopping tunning
2017-09-10 15:01:36,126 INFO     tuned.daemon.daemon: terminating Tuned, rolling back all changes
2017-09-10 15:01:36,127 INFO     tuned.plugins.plugin_bootloader: removing grub2 tuning previously added by Tuned

It is related to a known issue we had in 11z2:
https://bugzilla.redhat.com/show_bug.cgi?id=1488369

Can you check what your tuned version is? Can you reboot the compute node?

Comment 10 Jaison Raju 2017-09-17 12:18:09 UTC
(In reply to Yariv from comment #9)
> Hi Jaison
> 
> It is related to known issue we had 11z2
> https://bugzilla.redhat.com/show_bug.cgi?id=1488369
> 
> Can you check what is your tuned version? can you reboot compute?

Hi Yariv,

I noticed the above bug earlier, so I upgraded compute-1 to the latest kernel/packages and then rebooted it.
But there is still no change.

[jraju@localhost dpdk]$ cat sosreport-compute-1-20170913131320/etc/tuned/cpu-partitioning-variables.conf 
# Examples:
# isolated_cores=2,4-7
# isolated_cores=2-23
#
isolated_cores=2,4,6,7,8,9,10,11
[jraju@localhost dpdk]$ cat sosreport-compute-1-20170913131320/proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.2.1.el7.x86_64 root=LABEL=img-rootfs ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet default_hugepagesz=1GB hugepagesz=1G hugepages=16 iommu=pt intel_iommu=on selinux=0 isolcpus=2,4,6,7,8,9,10,11 nohz=on nohz_full=2,4,6,7,8,9,10,11 rcu_nocbs=2,4,6,7,8,9,10,11 tuned.non_isolcpus=0000002b intel_pstate=disable nosoftlockup LANG=en_US.UTF-8
[jraju@localhost dpdk]$ grep tuned sosreport-compute-1-20170913131320/installed-rpms 
tuned-2.8.0-5.el7.noarch                                    Thu Jul 27 17:43:33 2017
tuned-profiles-cpu-partitioning-2.8.0-5.el7.noarch          Thu Jul 27 18:51:13 2017
[jraju@localhost dpdk]$ grep kernel sosreport-compute-1-20170913131320/installed-rpms
erlang-kernel-18.3.4.4-1.el7ost.x86_64                      Thu Jul 27 18:35:41 2017
kernel-3.10.0-693.2.1.el7.x86_64                            Mon Sep 11 14:12:22 2017
kernel-3.10.0-693.el7.x86_64                                Thu Jul 27 17:43:26 2017
kernel-tools-3.10.0-693.2.1.el7.x86_64                      Mon Sep 11 14:13:41 2017
kernel-tools-libs-3.10.0-693.2.1.el7.x86_64                 Mon Sep 11 14:13:00 2017
[jraju@localhost dpdk]$ cat sosreport-compute-1-20170913131320/uname
Linux compute-1 3.10.0-693.2.1.el7.x86_64 #1 SMP Fri Aug 11 04:58:43 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

Regards,
Jaison R
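
The manual comparison in the comment above (tuned's isolated_cores against the kernel command line) can be sketched in a few lines. This is only an illustration: the helper names are hypothetical, the values are copied from the sosreport excerpts above, and treating tuned.non_isolcpus as a hexadecimal CPU bitmask is an assumption.

```python
# Values copied from the sosreport excerpts above (command line trimmed).
ISOLATED_CORES = "2,4,6,7,8,9,10,11"
CMDLINE = (
    "default_hugepagesz=1GB hugepagesz=1G hugepages=16 iommu=pt intel_iommu=on "
    "isolcpus=2,4,6,7,8,9,10,11 nohz=on nohz_full=2,4,6,7,8,9,10,11 "
    "rcu_nocbs=2,4,6,7,8,9,10,11 tuned.non_isolcpus=0000002b"
)


def parse_cpu_list(spec):
    """Expand a CPU list such as '2,4-7' into a set of integers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cmdline_params(cmdline):
    """Map key=value tokens of a kernel command line to a dict (last one wins)."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
    return params


def mask_to_cpus(hex_mask):
    """Decode a hex bitmask into CPU numbers (assumed meaning of the mask)."""
    value = int(hex_mask, 16)
    return {bit for bit in range(value.bit_length()) if (value >> bit) & 1}


def isolation_is_consistent(isolated_cores, cmdline):
    """True if isolcpus, nohz_full and rcu_nocbs all equal isolated_cores
    and the non-isolated mask does not overlap the isolated set."""
    wanted = parse_cpu_list(isolated_cores)
    params = cmdline_params(cmdline)
    for key in ("isolcpus", "nohz_full", "rcu_nocbs"):
        if parse_cpu_list(params.get(key, "")) != wanted:
            return False
    housekeeping = mask_to_cpus(params.get("tuned.non_isolcpus", "0"))
    return wanted.isdisjoint(housekeeping)


if __name__ == "__main__":
    print(isolation_is_consistent(ISOLATED_CORES, CMDLINE))
```

For the values in this report the sets agree, which matches the observation that the tuned settings survived the reboot and were not the cause here.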

Comment 11 Eyal Dannon 2017-09-18 12:58:21 UTC
Hi,

I took a look at your templates and did not see any reference to the Internal API/tenant network.

Are you trying to set VLAN/VXLAN as the tunnel type? If so, you should configure the relevant fields in the network-environment and nic-configs files.

For compute.yaml:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/11/html/network_functions_virtualization_configuration_guide/appe-sample-ovsdpdk-files#ap-ovsdpdk-compute

Network-environment:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/11/html/network_functions_virtualization_configuration_guide/appe-sample-ovsdpdk-files#ap-ovsdpdk-2-network-environment

Let me know if you need any help with those templates,
Eyal

Comment 13 Jaison Raju 2017-09-21 08:26:07 UTC
(In reply to Eyal Dannon from comment #11)
> Hi,
> 
> I took a look at your templates and I did not see any reference to Internal/
> tenant API  network.
I use the controlplane/provisioning network for the rest of the communication, including the tunnel/tenant network.

> 
> Are you trying to set VLAN/VXLAN as tunnel? if you do, you should configure
> the relevant fields at the network-environment and nic-configs
> 
> For compute.yaml
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/11/
> html/network_functions_virtualization_configuration_guide/appe-sample-
> ovsdpdk-files#ap-ovsdpdk-compute
> 
> Network-environment:
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/11/
> html/network_functions_virtualization_configuration_guide/appe-sample-
> ovsdpdk-files#ap-ovsdpdk-2-network-environment
> 
> Let me know if you need any help with those templates,
> Eyal
I redeployed without tunnels, but got the same results:

  NeutronTunnelTypes: ''
  NeutronNetworkType: 'vlan'

Comment 17 Eyal Dannon 2017-09-24 11:25:53 UTC
Hi,

Using your environment, I successfully booted an instance and pinged it using the following:
[stack@dell-fc430-1 ~]$ nova boot --image rhel7 --flavor dpdk.test --nic net-id=983bab6c-c758-4541-86b9-0bd39db18245 instance015

| 46bcb2fa-074d-4d98-909e-24fc55b55bfe | instance015 | ACTIVE | -          | Running     | dpdk-provider-172=10.65.199.203 |

[root@controller-0 ~]# ip netns exec qdhcp-983bab6c-c758-4541-86b9-0bd39db18245 ping 10.65.199.203
PING 10.65.199.203 (10.65.199.203) 56(84) bytes of data.
64 bytes from 10.65.199.203: icmp_seq=1 ttl=64 time=0.163 ms
64 bytes from 10.65.199.203: icmp_seq=2 ttl=64 time=0.168 ms

A DPDK instance requires huge pages in order to get an address over DHCP. To set this on the flavor, use:

[stack@dell-fc430-1 ~]$ openstack flavor set dpdk.test --property hw:mem_page_size=large
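
Since the root cause was a flavor missing hw:mem_page_size, a pre-boot check on the flavor properties may help catch this earlier. The sketch below is hypothetical: it assumes the flavor's properties have already been loaded into a plain dict by whatever tooling is in use, and the function names are illustrative only.

```python
# Named values accepted by hw:mem_page_size; anything else is treated
# here as an explicit page size (for example "1GB" or "2048").
NAMED_VALUES = {"small", "large", "any"}


def classify_mem_page_size(extra_specs):
    """Return 'unset', 'small', 'large', 'any' or 'explicit' for a flavor."""
    raw = extra_specs.get("hw:mem_page_size")
    if raw is None:
        return "unset"
    value = str(raw).strip().lower()
    if value in NAMED_VALUES:
        return value
    return "explicit"


def likely_lacks_hugepages(extra_specs):
    """True when the flavor does not request huge pages at all.

    'any' and explicit sizes are not flagged, because whether they end up
    on huge pages depends on the value and on the host.
    """
    return classify_mem_page_size(extra_specs) in {"unset", "small"}


if __name__ == "__main__":
    # Hypothetical flavors: one without the property, one as set above.
    print(likely_lacks_hugepages({}))
    print(likely_lacks_hugepages({"hw:mem_page_size": "large"}))
```

A flavor flagged by this check would be a candidate for the `openstack flavor set` command shown above.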

Comment 18 Jaison Raju 2017-09-24 11:58:08 UTC
True, that seems to be the issue.
I think someone may have removed the huge pages setting earlier to bypass a NUMATopologyFilter failure and forgot to add it back.
Sorry, team, for wasting your cycles.

Regards,
Jaison R

