Bug 1974062 - [OVN] Metadata ports can no longer talk to SR-IOV ports
Summary: [OVN] Metadata ports can no longer talk to SR-IOV ports
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.13
Version: FDP 20.F
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Ihar Hrachyshka
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks: 1974061 1985717 1986484
 
Reported: 2021-06-20 09:15 UTC by Roman Safronov
Modified: 2021-07-29 20:18 UTC
CC List: 17 users

Fixed In Version: ovn-2021-21.06.0-12.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1974061
: 1985717
Environment:
Last Closed: 2021-07-29 20:18:11 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2021:2971 0 None None None 2021-07-29 20:18:24 UTC

Description Roman Safronov 2021-06-20 09:15:13 UTC
+++ This bug was initially created as a clone of Bug #1974061 +++

Description of problem:
The instance is not able to get metadata on creation, and SSH to the instance does not work.

Version-Release number of selected component (if applicable):
RHOS-16.2-RHEL-8-20210610.n.1

How reproducible:
Happens very often, mainly on SR-IOV environments.

Steps to Reproduce:
1. Deploy an SR-IOV environment and make sure that an external network exists.
2. Create a security group allowing ICMP and SSH, and a keypair.
3. Create a new network, create a router, and connect the internal network to the external one through the router.
4. Launch a VM connected to the internal network.
5. Create a FIP and attach it to the VM's port.
6. Try to ping the VM's FIP.
Result: ping works - OK
7. Try to SSH to the VM.
Result: access using SSH fails - NOK (BUG)
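For reference, steps 1-7 can be sketched with the OpenStack CLI. This is a hedged sketch, not the exact commands used in this report: the network, flavor, image, and key names (internal_A, external, rhel-flavor, rhel-8, test-key) mirror the server-create command quoted later in this description but are placeholders for the deployment at hand. The commands are wrapped in a function and only run when explicitly requested, since they need a deployed cloud.

```shell
# Hedged sketch of the reproduction steps; resource names are placeholders.
reproduce() {
    # Step 2: security group allowing ICMP and SSH, plus a keypair.
    openstack security group create repro_sg
    openstack security group rule create --protocol icmp repro_sg
    openstack security group rule create --protocol tcp --dst-port 22 repro_sg
    openstack keypair create test-key > test-key.pem
    chmod 600 test-key.pem

    # Step 3: internal network routed to the pre-existing external network.
    openstack network create internal_A
    openstack subnet create --network internal_A \
        --subnet-range 192.168.2.0/24 internal_A_subnet
    openstack router create router1
    openstack router set --external-gateway external router1
    openstack router add subnet router1 internal_A_subnet

    # Steps 4-5: boot the VM and attach a floating IP.
    openstack server create --flavor rhel-flavor --image rhel-8 \
        --nic net-id=internal_A --security-group repro_sg \
        --key-name test-key vm1
    fip=$(openstack floating ip create external -f value -c floating_ip_address)
    openstack server add floating ip vm1 "$fip"

    # Steps 6-7: ping succeeds, SSH fails while the bug is present.
    ping -c 3 "$fip"
    ssh -i test-key.pem cloud-user@"$fip"
}

# Only attempt the reproduction when explicitly requested.
if [ "${RUN_REPRO:-0}" = "1" ]; then
    reproduce
fi
```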

Actual results:
The metadata service is not accessible from the VM, so the SSH key cannot be obtained. It is not possible to connect to the VM using SSH.

Expected results:
The metadata service is accessible from the VM. It is possible to connect to the VM using SSH.

Additional info:

Run "openstack console log show <VM UUID>".
The console log shows that the VM is not able to access metadata:
[   35.102236] cloud-init[797]: 2021-06-18 18:11:07,494 - util.py[WARNING]: No active metadata service found

Connect to the compute node where the VM is running and try to ping the VM from the metadata namespace:
sudo ip netns exec ovnmeta-<DATAPATH UUID> ping 192.168.2.225
Result: no replies
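The namespace name does not have to be typed by hand; a hedged sketch of the same check, assuming the standard ovnmeta-<datapath> naming used by the OVN metadata agent, and using the VM's fixed IP from this report as a placeholder:

```shell
# Locate an ovnmeta-* namespace on this compute node and ping the VM from it.
# VM_IP is a placeholder; substitute the VM's fixed IP.
VM_IP=192.168.2.225

ns=$(ip netns list 2>/dev/null | awk '/^ovnmeta-/ {print $1; exit}')
if [ -n "$ns" ]; then
    sudo ip netns exec "$ns" ping -c 3 "$VM_IP"
else
    echo "no ovnmeta namespace found on this node"
fi
```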

Try to trace a packet from the VM to the metadata port:

[heat-admin@computesriov-1 ~]$ sudo ovs-appctl ofproto/trace br-int in_port=493,dl_src=fa:16:3e:35:7c:7d,dl_dst=fa:16:3e:52:37:e5
Flow: in_port=493,vlan_tci=0x0000,dl_src=fa:16:3e:35:7c:7d,dl_dst=fa:16:3e:52:37:e5,dl_type=0x0000

bridge("br-int")
----------------
 0. in_port=493, priority 100, cookie 0xcee76ab7
    set_field:0xd->reg13
    set_field:0xf->reg11
    set_field:0xe->reg12
    set_field:0x4->metadata
    set_field:0x3->reg14
    resubmit(,8)
 8. reg14=0x3,metadata=0x4,dl_src=fa:16:3e:35:7c:7d, priority 50, cookie 0x278e23bd
    resubmit(,9)
 9. metadata=0x4, priority 0, cookie 0x773414cc
    resubmit(,10)
10. metadata=0x4, priority 0, cookie 0x6c0ff7a9
    resubmit(,11)
11. metadata=0x4, priority 0, cookie 0x781aaddf
    resubmit(,12)
12. metadata=0x4, priority 0, cookie 0xf29f0a11
    resubmit(,13)
13. metadata=0x4, priority 0, cookie 0x29be1854
    resubmit(,14)
14. metadata=0x4, priority 0, cookie 0xe29979c2
    resubmit(,15)
15. metadata=0x4, priority 0, cookie 0x1ba260d6
    resubmit(,16)
16. ct_state=-trk,metadata=0x4, priority 5, cookie 0x3cd9fb46
    set_field:0x100000000000000000000000000/0x100000000000000000000000000->xxreg0
    set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
    resubmit(,17)
17. metadata=0x4, priority 0, cookie 0xe50c1ff9
    resubmit(,18)
18. metadata=0x4, priority 0, cookie 0x32efad62
    resubmit(,19)
19. metadata=0x4, priority 0, cookie 0x12f09e82
    resubmit(,20)
20. metadata=0x4, priority 0, cookie 0x5c1d64dd
    resubmit(,21)
21. metadata=0x4, priority 0, cookie 0x7c74f24b
    resubmit(,22)
22. metadata=0x4, priority 0, cookie 0xbdcdc10b
    resubmit(,23)
23. metadata=0x4, priority 0, cookie 0x71ce481b
    resubmit(,24)
24. metadata=0x4, priority 0, cookie 0x9c0631be
    resubmit(,25)
25. metadata=0x4, priority 0, cookie 0x3337a67f
    resubmit(,26)
26. metadata=0x4, priority 0, cookie 0xe80589b0
    resubmit(,27)
27. metadata=0x4, priority 0, cookie 0xfceb4a8d
    resubmit(,28)
28. metadata=0x4, priority 0, cookie 0x3d2bd176
    resubmit(,29)
29. metadata=0x4, priority 0, cookie 0xa31797e7
    resubmit(,30)
30. metadata=0x4, priority 0, cookie 0xd33564d0
    resubmit(,31)
31. metadata=0x4,dl_dst=fa:16:3e:52:37:e5, priority 50, cookie 0xdbb8b7c4
    set_field:0x2->reg15
    resubmit(,37)
37. priority 0
    resubmit(,38)
38. reg15=0x2,metadata=0x4, priority 100, cookie 0xe2f53bb7
    set_field:0x1->reg15
    resubmit(,38)
38. reg15=0x1,metadata=0x4, priority 100
    set_field:0x10->reg13
    set_field:0xf->reg11
    set_field:0xe->reg12
    resubmit(,39)
39. priority 0
    set_field:0->reg0
    set_field:0->reg1
    set_field:0->reg2
    set_field:0->reg3
    set_field:0->reg4
    set_field:0->reg5
    set_field:0->reg6
    set_field:0->reg7
    set_field:0->reg8
    set_field:0->reg9
    resubmit(,40)
40. metadata=0x4, priority 0, cookie 0x2d2084a7
    resubmit(,41)
41. metadata=0x4, priority 0, cookie 0x9a0d473
    resubmit(,42)
42. metadata=0x4, priority 0, cookie 0xa37266fe
    resubmit(,43)
43. metadata=0x4, priority 0, cookie 0xbf5498f8
    resubmit(,44)
44. ct_state=-trk,metadata=0x4, priority 5, cookie 0x3af4b3cc
    set_field:0x100000000000000000000000000/0x100000000000000000000000000->xxreg0
    set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
    resubmit(,45)
45. metadata=0x4, priority 0, cookie 0xa775f992
    resubmit(,46)
46. metadata=0x4, priority 0, cookie 0x76593a3
    resubmit(,47)
47. metadata=0x4, priority 0, cookie 0xb0395be2
    resubmit(,48)
48. metadata=0x4, priority 0, cookie 0x1ac6c088
    resubmit(,49)
49. metadata=0x4, priority 0, cookie 0x50392d97
    resubmit(,50)
50. reg15=0x1,metadata=0x4, priority 50, cookie 0x9de154a2
    resubmit(,64)
64. priority 0
    resubmit(,65)
65. reg15=0x1,metadata=0x4, priority 100, cookie 0x149767bd
    push_vlan:0x8100
    set_field:4418->vlan_vid
    output:487

    bridge("br-ext-int")
    --------------------
         0. priority 0
            NORMAL
             -> no learned MAC for destination, flooding

            bridge("br-int")
            ----------------
                 0. No match.
                    drop

        bridge("br-int")
        ----------------
         0. No match.
            drop
    pop_vlan

Final flow: reg0=0x300,reg11=0xf,reg12=0xe,reg13=0x10,reg14=0x3,reg15=0x1,metadata=0x4,in_port=493,vlan_tci=0x0000,dl_src=fa:16:3e:35:7c:7d,dl_dst=fa:16:3e:52:37:e5,dl_type=0x0000
Megaflow: recirc_id=0,ct_state=-new-est-rel-rpl-inv-trk,ct_label=0/0x1,eth,in_port=493,dl_src=fa:16:3e:35:7c:7d,dl_dst=fa:16:3e:52:37:e5,dl_type=0x0000
Datapath actions: push_vlan(vid=322,pcp=0),1,3
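As an aside, the ofproto/trace arguments above can be derived rather than copied by hand. A hedged sketch, where PORT_ID is a hypothetical placeholder for the VM's Neutron port UUID and the MACs are the ones from this trace (VM port as dl_src, metadata localport as dl_dst):

```shell
# Derive in_port for ofproto/trace from the VM's Neutron port UUID.
PORT_ID="<neutron-port-uuid>"     # placeholder, fill in before running
vm_mac="fa:16:3e:35:7c:7d"        # VM port MAC (dl_src)
meta_mac="fa:16:3e:52:37:e5"      # metadata localport MAC (dl_dst)

if command -v ovs-vsctl >/dev/null 2>&1; then
    # Look up the OpenFlow port number of the VM's interface on br-int.
    in_port=$(sudo ovs-vsctl --bare --columns=ofport \
                  find interface external_ids:iface-id="$PORT_ID")
    sudo ovs-appctl ofproto/trace br-int \
        "in_port=$in_port,dl_src=$vm_mac,dl_dst=$meta_mac"
fi
```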


Note: if I launch a VM on the network with '--config-drive True', i.e. like this:
openstack server create --flavor rhel-flavor --security-group overcloud_sg --image rhel-8 --nic net-id=internal_A vm2 --key-name test-key --config-drive True
it succeeds in getting metadata. After this, all new VMs on the same network are able to access the metadata service on this network.

Comment 2 spower 2021-06-29 17:26:55 UTC
Can you get this triaged so we can give the Blocker+ when approved?

Comment 3 Yaniv Kaul 2021-06-30 08:07:47 UTC
(In reply to spower from comment #2)
> Can you get this triaged so we can give the Blocker+ when approved?

+NEEDINFO on owner.

Comment 14 Jianlin Shi 2021-07-16 07:34:51 UTC
Tested with the following script:

systemctl start openvswitch          
systemctl start ovn-northd                                          
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:1.1.172.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=1.1.172.25
systemctl restart ovn-controller                                             
                                                    
ovs-vsctl add-br br-phys                                                   
ip link set br-phys up  
                              
ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
                                 
ovn-nbctl ls-add ls                                                                    
                                           
ovn-nbctl --wait=sb ha-chassis-group-add hagrp
ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp hv1 10         
ovn-nbctl lsp-add ls lext                                               
ovn-nbctl lsp-set-addresses lext "00:00:00:00:00:04 10.0.0.4 2001::4"  
ovn-nbctl lsp-set-type lext external                                     
hagrp_uuid=`ovn-nbctl --bare --columns _uuid find ha_chassis_group name=hagrp`
ovn-nbctl set logical_switch_port lext ha_chassis_group=$hagrp_uuid
                                             
ovn-nbctl lsp-add ls lp \                                                                                                                       
    -- lsp-set-type lp localport \
    -- lsp-set-addresses lp "00:00:00:00:00:01 10.0.0.1 2001::1" \
    -- lsp-add ls lsp \                                    
    -- lsp-set-addresses lsp "00:00:00:00:00:02 10.0.0.2 2001::2"
                                                 
ovn-nbctl lsp-add ls lext2                      
ovn-nbctl lsp-set-addresses lext2 "00:00:00:00:00:10 10.0.0.10 2001::10"
ovn-nbctl lsp-set-type lext2 external          
ovn-nbctl set logical_switch_port lext2 ha_chassis_group=$hagrp_uuid
ovn-nbctl --wait=hv sync                       
                                               
ovn-nbctl lsp-add ls lext-deleted     
ovn-nbctl lsp-set-addresses lext-deleted "00:00:00:00:00:03 10.0.0.3 2001::3"
ovn-nbctl lsp-set-type lext-deleted external        
ovn-nbctl set logical_switch_port lext-deleted ha_chassis_group=$hagrp_uuid
ovn-nbctl --wait=hv sync
ovn-nbctl lsp-del lext-deleted
ovn-nbctl --wait=hv sync

ovs-vsctl add-port br-int lp -- set interface lp type=internal external_ids:iface-id=lp
ip netns add lp
ip link set lp netns lp
ip netns exec lp ip link set lp address 00:00:00:00:00:01
ip netns exec lp ip link set lp up
ip netns exec lp ip addr add 10.0.0.1/24 dev lp
ip netns exec lp ip addr add 2001::1/64 dev lp

ovn-nbctl --wait=hv sync
ovs-vsctl add-port br-int lsp -- set interface lsp type=internal external_ids:iface-id=lsp options:tx_pcap=lsp.pcap options:rxq_pcap=lsp-rx.pcap
ip netns add lsp
ip link set lsp netns lsp
ip netns exec lsp ip link set lsp address 00:00:00:00:00:02
ip netns exec lsp ip link set lsp up
ip netns exec lsp ip addr add 10.0.0.2/24 dev lsp
ip netns exec lsp ip addr add 2001::2/64 dev lsp
ip netns exec lsp tcpdump -i lsp -w lsp.pcap &

ovs-vsctl add-port br-phys ext1 -- set interface ext1 type=internal
ip netns add ext1
ip link set ext1 netns ext1
ip netns exec ext1 ip link set ext1 up
ip netns exec ext1 ip addr add 10.0.0.101/24 dev ext1
ip netns exec ext1 ip addr add 2001::101/64 dev ext1
ip netns exec ext1 tcpdump -i ext1 -w ext1.pcap &
sleep 2

ovn-nbctl lsp-add ls ln \
    -- lsp-set-type ln localnet \
    -- lsp-set-addresses ln unknown \
    -- lsp-set-options ln network_name=phys
    
ip netns exec lp ip neigh add 10.0.0.4 lladdr 00:00:00:00:00:04 dev lp
ip netns exec lp ip -6 neigh add 2001::4 lladdr 00:00:00:00:00:04 dev lp
ip netns exec lp ip neigh add 10.0.0.10 lladdr 00:00:00:00:00:10 dev lp
ip netns exec lp ip -6 neigh add 2001::10 lladdr 00:00:00:00:00:10 dev lp
ip netns exec lp ping 10.0.0.4 -c 1 -w 1 -W 1
ip netns exec lp ping 10.0.0.10 -c 1 -w 1 -W 1
ip netns exec lp ping6 2001::4 -c 1 -w 1 -W 1
ip netns exec lp ping6 2001::10 -c 1 -w 1 -W 1
sleep 1
pkill tcpdump
sleep 1
tcpdump -r ext1.pcap -nnle

Reproduced on ovn-2021-21.06.0-4:

[root@wsfd-advnetlab16 4]# tcpdump  -r ext1.pcap  -nnle host 10.0.0.4 or host 10.0.0.10 or host 2001::4 or host 2001::10
reading from file ext1.pcap, link-type EN10MB (Ethernet)                                              
dropped privs to tcpdump 

<=== no packets on localnet

Verified on ovn-2021-21.06.0-12:

[root@wsfd-advnetlab16 bz1974062]# tcpdump  -r ext1.pcap  -nnle host 10.0.0.4 or host 10.0.0.10 or host 2001::4 or host 2001::10
reading from file ext1.pcap, link-type EN10MB (Ethernet)
dropped privs to tcpdump
03:30:47.849584 00:00:00:00:00:01 > 00:00:00:00:00:04, ethertype IPv4 (0x0800), length 98: 10.0.0.1 > 10.0.0.4: ICMP echo request, id 25014, seq 1, length 64
03:30:48.870469 00:00:00:00:00:01 > 00:00:00:00:00:10, ethertype IPv4 (0x0800), length 98: 10.0.0.1 > 10.0.0.10: ICMP echo request, id 25015, seq 1, length 64
03:30:49.902329 00:00:00:00:00:01 > 00:00:00:00:00:04, ethertype IPv6 (0x86dd), length 118: 2001::1 > 2001::4: ICMP6, echo request, seq 1, length 64
03:30:50.930259 00:00:00:00:00:01 > 00:00:00:00:00:10, ethertype IPv6 (0x86dd), length 118: 2001::1 > 2001::10: ICMP6, echo request, seq 1, length 64

<=== packets sent on localnet

Comment 18 Jianlin Shi 2021-07-19 04:14:38 UTC
Verified on ovn2.13-20.12.0-149.el8:

[root@dell-per740-12 bz1974062]# tcpdump  -r ext1.pcap  -nnle host 10.0.0.4 or host 10.0.0.10 or host 2001::4 or host 2001::10
reading from file ext1.pcap, link-type EN10MB (Ethernet)
dropped privs to tcpdump
00:12:52.027244 00:00:00:00:00:01 > 00:00:00:00:00:04, ethertype IPv4 (0x0800), length 98: 10.0.0.1 > 10.0.0.4: ICMP echo request, id 25268, seq 1, length 64
00:12:53.055108 00:00:00:00:00:01 > 00:00:00:00:00:10, ethertype IPv4 (0x0800), length 98: 10.0.0.1 > 10.0.0.10: ICMP echo request, id 25269, seq 1, length 64
00:12:54.078155 00:00:00:00:00:01 > 00:00:00:00:00:04, ethertype IPv6 (0x86dd), length 118: 2001::1 > 2001::4: ICMP6, echo request, seq 1, length 64
00:12:55.105044 00:00:00:00:00:01 > 00:00:00:00:00:10, ethertype IPv6 (0x86dd), length 118: 2001::1 > 2001::10: ICMP6, echo request, seq 1, length 64
[root@dell-per740-12 bz1974062]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"
ovn2.13-central-20.12.0-149.el8fdp.x86_64
ovn2.13-20.12.0-149.el8fdp.x86_64
openvswitch2.13-2.13.0-117.el8fdp.x86_64
ovn2.13-host-20.12.0-149.el8fdp.x86_64

Comment 20 errata-xmlrpc 2021-07-29 20:18:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2971

